class Client
Bases: BaseClient
Client to connect to Sumatra GraphQL API
Humans: First, log in via the CLI: sumatra login
Bots: Set the SUMATRA_INSTANCE and SUMATRA_SDK_KEY environment variables
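For bot setups, a pre-flight check can confirm that both variables are present before a client is constructed. The sketch below uses only the standard library; `missing_sumatra_env` is a hypothetical helper and not part of the SDK.

```python
import os

# The two variables named in the class description above.
REQUIRED_VARS = ("SUMATRA_INSTANCE", "SUMATRA_SDK_KEY")


def missing_sumatra_env(environ=None):
    """Return the names of required variables that are unset or empty.

    `environ` defaults to the real process environment; passing a plain
    dict makes the check easy to exercise in isolation.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
missing = missing_sumatra_env({"SUMATRA_INSTANCE": "yourco.sumatra.ai"})
```

An empty result suggests the bot environment is ready; any names returned are the variables still to be set.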
Attributes
boto: boto3.Session
property
Boto3 session object
branch: str
property
writable
Default branch name
instance: str
property
Instance name from client config, e.g. 'yourco.sumatra.ai'
workspace: Optional[str]
property
User's current workspace slug, e.g. my-workspace
workspace_id: Optional[str]
property
User's current workspace id, e.g. 01ee8330-edf4-07ae-ae19-3ab915f227c8
Functions
__init__(instance=None, branch=None, workspace=None)
Create connection object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
instance | Optional[str] | Sumatra instance url, e.g. | None |
branch | Optional[str] | Set default branch. If unspecified, your config default will be used. | None |
workspace | Optional[str] | Sumatra workspace name to connect to. | None |
api_key()
Return the API key for the connected workspace
Returns:
Type | Description |
---|---|
str | API key |
clone_branch(dest, branch=None)
Copy branch to another branch name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dest | str | Name of branch to be created or overwritten. | required |
branch | Optional[str] | Specify a source branch other than the client default. | None |
create_branch_from_dir(scowl_dir=None, branch=None, deps_file=None)
Create (or overwrite) branch with local scowl files.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
scowl_dir | Optional[str] | Path to local .scowl files. | None |
branch | Optional[str] | Specify a source branch other than the client default. | None |
deps_file | Optional[str] | Path to deps file [default: | None |
Returns:
Type | Description |
---|---|
str | Name of branch created |
create_branch_from_scowl(scowl, branch=None)
Create (or overwrite) branch with single file of scowl source code.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
scowl | str | Scowl source code as string. | required |
branch | Optional[str] | Specify a source branch other than the client default. | None |
Returns:
Type | Description |
---|---|
str | Name of branch created |
create_model_from_pmml(model, filename, comment=None)
Create (or overwrite) model from PMML file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | str | Model name, e.g. "churn_predictor". | required |
filename | str | Local PMML file, e.g. "my_model.xml" | required |
comment | Optional[str] | A comment string to store with the model version. Max 60 characters. Optional | None |
Returns:
Type | Description |
---|---|
str | A |
create_table_from_dataframe(table, df, key_column, include_index=False)
Create (or overwrite) table from a DataFrame
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table | str | Table name. | required |
df | DataFrame | DataFrame to upload as table | required |
key_column | str | Name of column containing the primary index for the table | required |
include_index | bool | Include the DataFrame's index as a column named | False |
Returns:
Type | Description |
---|---|
TableVersion | A |
create_timeline_from_dataframes(timeline, df_dict)
Create (or overwrite) timeline from a collection of DataFrames—one per event type.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
timeline | str | Timeline name. | required |
df_dict | dict | Dictionary from event type name to DataFrame of events. | required |
create_timeline_from_file(timeline, filename)
Create (or overwrite) timeline from events stored in a file.
Supported file types: .jsonl, .jsonl.gz
Parameters:
Name | Type | Description | Default |
---|---|---|---|
timeline | str | Timeline name. | required |
filename | str | Name of events file to upload. | required |
create_timeline_from_jsonl(timeline, jsonl)
Create (or overwrite) timeline from JSON events passed in as a string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
timeline | str | Timeline name. | required |
jsonl | str | JSON event data, one JSON dict per line. | required |
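As a rough illustration of the "one JSON dict per line" format, the sketch below serializes a list of Python dicts into a JSONL string. The helper name `events_to_jsonl` and the sample event fields are hypothetical; which fields a timeline event actually requires is not stated in this section.

```python
import json


def events_to_jsonl(events):
    """Serialize a list of dicts to JSONL: one compact JSON object per line."""
    return "\n".join(
        json.dumps(event, separators=(",", ":")) for event in events
    )


# Hypothetical sample events; the field names are illustrative only.
events = [
    {"_type": "login", "_time": "2024-01-01T00:00:00Z", "user": "a"},
    {"_type": "purchase", "_time": "2024-01-01T00:05:00Z", "amount": 9.5},
]
jsonl = events_to_jsonl(events)

# The string could then be passed along the lines of:
#   client.create_timeline_from_jsonl("my_timeline", jsonl)
```

Because `json.dumps` never emits raw newlines, each event is guaranteed to occupy exactly one line.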
create_timeline_from_log(timeline, start_ts, end_ts, event_types=None)
Create (or overwrite) timeline from the Event Log
Parameters:
Name | Type | Description | Default |
---|---|---|---|
timeline | str | Timeline name. | required |
start_ts | Union[DateTime, str] | Earliest event timestamp to fetch (local client timezone). | required |
end_ts | Union[DateTime, str] | Latest event timestamp to fetch (local client timezone). | required |
event_types | Optional[list[str]] | Event types to include (default: all). | None |
create_timeline_from_s3(timeline, s3_uri, time_path, data_path, id_path=None, type_path=None, default_type=None)
Create (or overwrite) timeline from a JSON file on S3
Parameters:
Name | Type | Description | Default |
---|---|---|---|
timeline | str | Timeline name. | required |
s3_uri | str | S3 bucket URI. | required |
time_path | str | JSON path where event timestamp is found (e.g. $._time) | required |
data_path | str | JSON path where event payload is found (e.g. $) | required |
id_path | Optional[str] | JSON path where event ID is found (e.g. $.event_id) | None |
type_path | Optional[str] | JSON path where event type is found (e.g. $._type) | None |
default_type | Optional[str] | Event type to use in case none found at | None |
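To make the path parameters more concrete, the sketch below resolves the simple path forms used in the examples above (`$`, `$._time`, `$.event_id`) against one record. It is a simplified illustration only: `resolve_simple_path` is a hypothetical helper, and the service's own JSON path evaluation may support more syntax than this.

```python
def resolve_simple_path(event, path):
    """Resolve paths of the form "$" or "$.a.b" against a nested dict.

    Returns None when any key along the path is missing. Array indexing
    and other JSON path features are deliberately not handled here.
    """
    if path == "$":
        return event
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path!r}")
    current = event
    for key in path[2:].split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Hypothetical record as it might appear in the uploaded JSON file.
record = {"_time": "2024-01-01T00:00:00Z", "_type": "login", "event_id": "abc"}

timestamp = resolve_simple_path(record, "$._time")   # time_path
payload = resolve_simple_path(record, "$")           # data_path
event_id = resolve_simple_path(record, "$.event_id")  # id_path
```

Checking a few sample records this way before uploading can catch a mistyped path early.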
delete_branch(branch=None)
Delete server-side branch
Parameters:
Name | Type | Description | Default |
---|---|---|---|
branch | Optional[str] | Specify a branch other than the client default. | None |
delete_model(model)
Delete a model permanently.
If the model is referenced in the LIVE topology, it cannot be deleted.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | str | Model name. | required |
delete_model_version(model, version)
Delete a specific version of a model permanently.
If the model version is referenced in the LIVE topology, it cannot be deleted.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | str | Model name. | required |
version | str | Version identifier. | required |
delete_openai_config()
Delete the current OpenAI configuration
delete_table(table)
Delete a table permanently.
If the table is referenced in the LIVE topology, it cannot be deleted.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table | str | Table name. | required |
delete_table_version(table, version)
Delete a specific version of a table permanently.
If the table version is referenced in the LIVE topology, it cannot be deleted.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table | str | Table name. | required |
version | str | Version identifier. | required |
delete_timeline(timeline)
Delete timeline
Parameters:
Name | Type | Description | Default |
---|---|---|---|
timeline | str | Timeline name. | required |
diff_branch_with_live(branch=None)
Compare branch to LIVE topology and return diff.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
branch | Optional[str] | Specify a source branch other than the client default. | None |
Returns:
Type | Description |
---|---|
dict[str, list[str]] | Events and features added, redefined, and deleted. |
execute_athena(sql)
Execute a SQL query against the Athena backend
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sql | str | SQL query (e.g. "select * from event_log where event_type='login' limit 10") | required |
Returns:
Type | Description |
---|---|
DataFrame | Query execution id |
get_branch(branch=None)
Return metadata about the branch.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
branch | Optional[str] | Specify a branch other than the client default. | None |
Returns:
Type | Description |
---|---|
dict | Branch metadata |
get_branches()
Return all branches and their metadata.
Returns:
Type | Description |
---|---|
list[dict] | Branch metadata. |
get_deps(live=False)
Fetch latest dependencies from server as Scowl source require statements.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
live | bool | Return the LIVE versions of dependencies instead of latest. | False |
Returns:
Type | Description |
---|---|
str | Scowl source code as string. |
get_error_counts(start_ts, end_ts, event_types=None)
Return the number of errors from CloudWatch logs in the given time range.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
start_ts | Union[DateTime, str] | Earliest event timestamp to count (local client timezone). | required |
end_ts | Union[DateTime, str] | Latest event timestamp to count (local client timezone). | required |
event_types | Optional[list[str]] | List of event types to include. If None, include all event types in LIVE topology. | None |
get_event_counts(start_ts, end_ts, event_types=None)
Return the number of events from CloudWatch logs in the given time range.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
start_ts | Union[DateTime, str] | Earliest event timestamp to count (local client timezone). | required |
end_ts | Union[DateTime, str] | Latest event timestamp to count (local client timezone). | required |
event_types | Optional[list[str]] | List of event types to include. If None, include all event types in LIVE topology. | None |
get_features_from_feed(event_type, start_ts=None, end_ts=None, count=None, where={}, batch_size=10000, ascending=False)
For a given event type, return the feature values as they were calculated at event time.
Fetches events in descending time order from end_ts. May specify count or start_ts, but not both.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
event_type | str | Event type name. | required |
start_ts | Optional[Union[DateTime, str]] | Earliest event timestamp to fetch (local client timezone). If not specified, | None |
end_ts | Optional[Union[DateTime, str]] | Latest event timestamp to fetch (local client timezone) [default: now]. | None |
count | Optional[int] | Number of rows to return (if start_ts not specified) [default: 10]. | None |
where | dict[str, str] | dictionary of equality conditions (all must be true for a match), e.g. {"zipcode": "90210", "email_domain": "gmail.com"}. | {} |
batch_size | int | Maximum number of records to fetch per GraphQL call. | 10000 |
ascending | bool | Sort results in ascending chronological order instead of descending. | False |
Returns:
Name | Type | Description |
---|---|---|
rows | list[dict] | _id, _time, [features...] (in descending time order). |
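The two rules described above (count and start_ts are mutually exclusive, and every condition in where must hold) can be sketched client-side as follows. Both helpers are hypothetical and only mirror the documented semantics; how the SDK itself reacts when both count and start_ts are passed is not stated here.

```python
def check_feed_window(start_ts=None, count=None):
    """Raise if both start_ts and count are given, mirroring the documented rule."""
    if start_ts is not None and count is not None:
        raise ValueError("specify count or start_ts, but not both")


def matches_where(row, where):
    """True when every key in `where` equals the corresponding value in `row`."""
    return all(row.get(key) == value for key, value in where.items())


# Hypothetical rows shaped like the documented return value.
rows = [
    {"_id": "1", "_time": "t2", "zipcode": "90210", "email_domain": "gmail.com"},
    {"_id": "2", "_time": "t1", "zipcode": "10001", "email_domain": "gmail.com"},
]
where = {"zipcode": "90210", "email_domain": "gmail.com"}
matching = [row for row in rows if matches_where(row, where)]
```

An empty `where` dict matches every row, which is consistent with `{}` being the default.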
get_features_from_log(event_type, start_ts=None, end_ts=None, features=None, include_inputs=False, where=None, deserialize_json=True)
For a given event type, fetch the historical values for features, as calculated in the LIVE environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
event_type | str | Event type name. | required |
start_ts | Optional[Union[DateTime, str]] | Earliest event timestamp to fetch (local client timezone). If not specified, will start from beginning of log. | None |
end_ts | Optional[Union[DateTime, str]] | Latest event timestamp to fetch (local client timezone) [default: now]. | None |
features | Optional[list[str]] | Subset of features to fetch. [default: all]. | None |
include_inputs | bool | Include request json as "_inputs" column. | False |
where | Optional[str] | SQL clauses (not including "where" keyword), e.g. "col1 is not null" | None |
deserialize_json | bool | Deserialize complex data types from JSON strings to Python objects. | True |
Returns:
Name | Type | Description |
---|---|---|
rows | DataFrame | _id, _time, [features...] (in ascending time order). |
get_inputs_from_feed(start_ts=None, end_ts=None, count=None, event_types=None, where={}, batch_size=10000, ascending=False)
Return the raw input events from the Event Feed.
Fetches events in descending time order from end_ts. May specify count or start_ts, but not both.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
start_ts | Optional[Union[DateTime, str]] | Earliest event timestamp to fetch (local client timezone). If not specified, | None |
end_ts | Optional[Union[DateTime, str]] | Latest event timestamp to fetch (local client timezone) [default: now]. | None |
count | Optional[int] | Number of rows to return (if start_ts not specified) [default: 10]. | None |
event_types | Optional[list[str]] | Subset of event types to fetch. [default: all] | None |
where | dict[str, str] | dictionary of equality conditions (all must be true for a match), e.g. {"zipcode": "90210", "email_domain": "gmail.com"}. | {} |
batch_size | int | Maximum number of records to fetch per GraphQL call. | 10000 |
ascending | bool | Sort results in ascending chronological order instead of descending. | False |
Returns:
Type | Description |
---|---|
list[dict] | list of events: [{"_id": , "_type": , "_time": , [inputs...]}] (in descending time order). |
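Since the result is a plain list of dicts carrying a "_type" key, it can be summarized with the standard library alone. The sketch below tallies events per type; `count_by_type` and the sample events are hypothetical.

```python
from collections import Counter


def count_by_type(events):
    """Count events per "_type" value in a list shaped like the return value above."""
    return Counter(event["_type"] for event in events)


# Hypothetical events in the documented shape.
sample = [
    {"_id": "3", "_type": "login", "_time": "t3"},
    {"_id": "2", "_type": "purchase", "_time": "t2"},
    {"_id": "1", "_type": "login", "_time": "t1"},
]
counts = count_by_type(sample)
```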
get_live_schema()
Return the feature names and types for every event in the LIVE topology
Returns:
Type | Description |
---|---|
dict[str, dict[str, str]] | dictionary {'event_name': {'f1': 'int', 'f2': 'bool', ...} ...} |
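The nested dictionary returned here lends itself to simple filtering. As a sketch, the hypothetical helper below lists, per event, the features declared with a given type; the sample schema is invented for illustration.

```python
def features_of_type(schema, type_name):
    """Map each event name to the sorted feature names having `type_name`.

    Events with no matching feature are omitted from the result.
    """
    result = {}
    for event_name, features in schema.items():
        names = sorted(
            name for name, ftype in features.items() if ftype == type_name
        )
        if names:
            result[event_name] = names
    return result


# Hypothetical schema in the documented shape.
schema = {
    "login": {"attempts": "int", "is_new": "bool"},
    "purchase": {"item_count": "int", "amount": "float"},
}
int_features = features_of_type(schema, "int")
```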
get_live_scowl()
Return scowl source code for LIVE topology as single cleansed string.
Returns:
Type | Description |
---|---|
str | Scowl source code as string. |
get_model_history(name)
Return list of versions for the given model along with their metadata.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Model name. | required |
Returns:
Type | Description |
---|---|
list[dict] | Model version metadata. |
get_model_version(name, version)
Return handle to a specific model version.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Model name. | required |
version | str | Model version. | required |
Returns:
Type | Description |
---|---|
ModelVersion | Model version future object. |
get_models()
Return all models and their metadata.
Returns:
Type | Description |
---|---|
list[dict] | Model metadata. |
get_models_openai()
Return all OpenAI models and their metadata.
Returns:
Type | Description |
---|---|
list[dict] | OpenAI Model metadata. |
get_openai_config()
Return the current OpenAI model configuration, if any
Returns:
Type | Description |
---|---|
dict | OpenAI Model configuration state. |
get_settings()
Return settings metadata about the current workspace.
Returns:
Type | Description |
---|---|
dict | Workspace settings |
get_table_history(name)
Return list of versions for the given table along with their metadata.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Table name. | required |
Returns:
Type | Description |
---|---|
list[dict] | DataFrame of version metadata. |
get_table_version(name, version)
Return handle to a specific table version.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Table name. | required |
version | str | Table version. | required |
Returns:
Type | Description |
---|---|
TableVersion | Table version future object. |
get_tables()
Return all tables and their metadata.
Returns:
Type | Description |
---|---|
list[dict] | Table metadata. |
get_timeline(timeline)
Return metadata about the timeline.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
timeline | str | Timeline name. | required |
Returns:
Type | Description |
---|---|
dict | Timeline metadata. |
get_timelines()
Return all timelines and their metadata.
Returns:
Type | Description |
---|---|
list[dict] | Timeline metadata. |
infer_schema_from_timeline(timeline)
Attempt to infer the paths and data types of all fields in the timeline's input data. Generate the scowl to parse all JSON paths.
This function helps bootstrap scowl code for new event types, with the expectation that most feature names will need to be modified.
e.g.
account_id := $.account.id as int
purchase_items_0_amount := $.purchase.items[0].amount as float
Parameters:
Name | Type | Description | Default |
---|---|---|---|
timeline | str | Timeline name. | required |
Returns:
Type | Description |
---|---|
str | Scowl source code as string. |
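To show how nested JSON paths can map onto flat feature names like the ones in the example above, here is a small local sketch that walks one sample event and emits lines in the same shape. It is not the server's inference logic: `sketch_scowl` is hypothetical, and the type name used for string values ("string") is an assumption, since only int, float, and bool appear in this section.

```python
def _scalar_type(value):
    """Pick a type name for a leaf value. "string" is an assumed type name."""
    if isinstance(value, bool):  # bool must be tested before int
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    return "string"


def sketch_scowl(obj, path="$", name_parts=()):
    """Walk nested dicts/lists and emit one `name := path as type` line per leaf."""
    lines = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            lines += sketch_scowl(value, f"{path}.{key}", name_parts + (key,))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            lines += sketch_scowl(
                value, f"{path}[{index}]", name_parts + (str(index),)
            )
    else:
        name = "_".join(name_parts)
        lines.append(f"{name} := {path} as {_scalar_type(obj)}")
    return lines


# Hypothetical event matching the paths in the example above.
event = {"account": {"id": 7}, "purchase": {"items": [{"amount": 1.5}]}}
generated = sketch_scowl(event)
```

As the description notes, names generated this way are a starting point and would usually be renamed by hand.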
invite_user(email, role, resend_email=True, app=None)
Invite a user to this workspace, with the given role.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
email | str | The user's email address | required |
role | str | The desired role for the user. One of {'owner', 'publisher', 'writer', 'reader'} | required |
resend_email | bool | If True, resend the invitation email if the user has already been invited to Sumatra | True |
app | Optional[str] | The name of the app to invite the user to ('optimize' or None) | None |
Returns:
Type | Description |
---|---|
dict | A dict of the user's metadata |
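Because role must be one of four values, a thin wrapper can reject typos before any request is made. The sketch below assumes a `client` object exposing `invite_user(email, role)` as documented above; `invite_with_check` itself is hypothetical.

```python
# Roles listed in the parameter description above.
VALID_ROLES = {"owner", "publisher", "writer", "reader"}


def invite_with_check(client, email, role):
    """Validate `role` locally, then delegate to client.invite_user."""
    if role not in VALID_ROLES:
        raise ValueError(
            f"invalid role {role!r}; expected one of {sorted(VALID_ROLES)}"
        )
    return client.invite_user(email, role)
```

The same check would apply to any other call that takes a role from this set.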
list_users()
List all of the users in this workspace
Returns:
Type | Description |
---|---|
list[dict] | A dataframe of the users and their metadata |
materialize(timelines, features=None, start_ts=None, end_ts=None, branch=None)
Enrich collection of timelines using topology at branch. Timelines are merged based on timestamp.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
timelines | list[str] | Timeline names. | required |
features | list[str] | list of features to materialize, e.g. | None |
start_ts | Optional[Union[DateTime, str]] | Earliest event timestamp to materialize (local client timezone). | None |
end_ts | Optional[Union[DateTime, str]] | Latest event timestamp to materialize (local client timezone). | None |
branch | Optional[str] | Specify a source branch other than the client default. | None |
Returns:
Type | Description |
---|---|
Materialization | Handle to Materialization job |
publish_branch(branch=None)
Promote branch to LIVE.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
branch | Optional[str] | Specify a branch other than the client default. | None |
publish_dir(scowl_dir=None, deps_file=None)
Push local scowl dir to branch and promote to LIVE.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
scowl_dir | Optional[str] | Path to .scowl files. Default: | None |
deps_file | Optional[str] | Path to deps file [default: | None |
publish_scowl(scowl)
Push local scowl source to branch and promote to LIVE.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
scowl | str | Scowl source code as string. | required |
query_athena(sql)
Execute a SQL query against the Athena backend and return the result as a dataframe.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sql | str | SQL query (e.g. "select * from event_log where event_type='login' limit 10") | required |
Returns:
Type | Description |
---|---|
DataFrame | Result of query |
refresh_openai_config()
Refresh the OpenAI model list using the existing configuration
Returns:
Type | Description |
---|---|
dict | OpenAI Model configuration state. |
remove_user(email)
Remove a user from this workspace
Parameters:
Name | Type | Description | Default |
---|---|---|---|
email | str | The user's email address | required |
Returns:
Type | Description |
---|---|
dict | A dict of the user's metadata |
replay(features, start_ts, end_ts, extra_timelines=None, branch=None)
Recompute historical feature values from LIVE event log on given topology branch.
This is the primary function of the SDK.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
features | list[str] | list of features to materialize, e.g. | required |
start_ts | Union[DateTime, str] | Earliest event timestamp to materialize (local client timezone). | required |
end_ts | Union[DateTime, str] | Latest event timestamp to materialize (local client timezone). | required |
extra_timelines | Optional[list[str]] | Names of supplemental timelines. | None |
branch | Optional[str] | Specify a source branch other than the client default. | None |
Returns:
Type | Description |
---|---|
Materialization | Handle to Materialization job |
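A common pattern is to replay a trailing window of history. The sketch below computes the window with the standard library and forwards it to `client.replay` using the signature documented above. `replay_last_days` is a hypothetical helper, and it assumes that standard `datetime` objects are accepted where the table says DateTime, which this section does not confirm.

```python
from datetime import datetime, timedelta


def replay_last_days(client, features, days, now=None, branch=None):
    """Replay `features` over the `days` days ending at `now`.

    Assumption: plain datetime objects are valid start_ts/end_ts values.
    Timestamps are naive, matching the "local client timezone" wording.
    """
    end_ts = now if now is not None else datetime.now()
    start_ts = end_ts - timedelta(days=days)
    return client.replay(features, start_ts, end_ts, branch=branch)
```

Passing `now` explicitly keeps the window reproducible, which helps when comparing two branches over the same period.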
resolve_deps(requires)
Return the resolved resources (i.e. table schemas) from the given requires statements.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
requires | str | Scowl requires statement as code blob | required |
Returns:
Type | Description |
---|---|
str | Resolved resource definitions (table schemas) as scowl code. |
resolve_deps_from_file(deps_file=None)
Return the resolved resources (i.e. table schemas) from the local deps.scowl file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
deps_file | Optional[str] | Path to deps file [default: ./deps.scowl] | None |
Returns:
Type | Description |
---|---|
str | Resolved resource definitions (table schemas) as scowl code. |
save_branch_to_dir(scowl_dir=None, branch=None, deps_file=None)
Save remote branch scowl files to local dir.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
scowl_dir | Optional[str] | Path to save .scowl files. | None |
branch | Optional[str] | Specify a source branch other than the client default. | None |
deps_file | Optional[str] | Path to deps file [default: | None |
Returns:
Type | Description |
---|---|
str | Name of branch created |
save_deps(live=False, deps_file=None)
Fetch latest dependencies from server and save to file
Parameters:
Name | Type | Description | Default |
---|---|---|---|
live | bool | Return the LIVE versions of dependencies instead of latest. | False |
deps_file | Optional[str] | Path to save deps file [default: ./deps.scowl] | None |
Returns:
Type | Description |
---|---|
str | Full path to saved dependency file. |
sdk_key()
Return the SDK key for the connected workspace
Returns:
Type | Description |
---|---|
str | SDK key |
set_openai_config(api_key, timeout_ms=None, retry_limit=None, max_tokens=None)
Create or update OpenAI model configuration
Parameters:
Name | Type | Description | Default |
---|---|---|---|
api_key | str | OpenAI API key | required |
timeout_ms | int | Timeout in milliseconds. Default 5000 | None |
retry_limit | int | Number of retries to perform on API error. Default 3 | None |
max_tokens | int | Maximum number of tokens to generate in a single request. Default 8192 | None |
Returns:
Type | Description |
---|---|
dict | OpenAI Model configuration state. |
set_user_role(email, role)
Set a user's role within this workspace.
Note that the user must already be a member of the workspace. You can use invite_user to add a new user.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
email | str | The user's email address | required |
role | str | The desired role for the user. One of {'owner', 'publisher', 'writer', 'reader'} | required |
Returns:
Type | Description |
---|---|
dict | A dict of the user's metadata |
update_settings(slug=None, nickname=None, billing_email=None, icon=None)
Update workspace settings metadata.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
slug | Optional[str] | Desired slug of the new workspace. Must consist only of letters, numbers, '-', and '_'. If this slug is taken, a random one will be generated instead, which may be changed later. | None |
nickname | Optional[str] | A human readable name for the new workspace | None |
billing_email | Optional[str] | Billing email address on the account | None |
icon | Optional[bytes] | Binary encoding of a PNG image to use as the workspace icon. Max size 50kb | None |
Returns:
Type | Description |
---|---|
dict | A dict of the updated workspace settings |
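The slug and icon constraints above can be checked locally before calling update_settings. The sketch below is one reading of those constraints, with two labeled assumptions: "letters" is taken to mean ASCII letters, and "50kb" is taken to mean 50 * 1024 bytes. `settings_problems` is a hypothetical helper.

```python
import re

# Assumption: "letters, numbers, '-', and '_'" means ASCII letters and digits.
SLUG_PATTERN = re.compile(r"[A-Za-z0-9_-]+")
# Assumption: "50kb" is interpreted as 50 KiB.
MAX_ICON_BYTES = 50 * 1024
# Fixed 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def settings_problems(slug=None, icon=None):
    """Return a list of human-readable problems; empty when inputs look valid."""
    problems = []
    if slug is not None and not SLUG_PATTERN.fullmatch(slug):
        problems.append("slug may contain only letters, numbers, '-', and '_'")
    if icon is not None:
        if not icon.startswith(PNG_SIGNATURE):
            problems.append("icon does not look like a PNG image")
        if len(icon) > MAX_ICON_BYTES:
            problems.append("icon exceeds the 50kb size limit")
    return problems
```

Only the PNG signature is checked, not the full image structure, so a passing result is a necessary condition rather than a guarantee.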
user_email()
Return the email address of the connected user.
Returns:
Type | Description |
---|---|
str | Email address |
version()
Return the server-side version number.
Returns:
Type | Description |
---|---|
str | Version identifier |