0.8.0

Released: April 27, 2022

Distributed Replay

This release enables timelines of arbitrary length to be created and replayed quickly through a branch topology, supporting use cases such as large-scale model training.

Create Timeline from Log

The event log is Sumatra's durable record of every event sent to the REST API. Unlike the event feed, which stores indexed recent events for low-latency querying, the event log stores raw events as line-separated JSON in S3 storage, making it better suited to large-scale processing.
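
As an illustration of why line-separated JSON suits large-scale processing, the sketch below parses a log one line at a time, so the whole file never needs to be held in memory. It is a minimal example using only the Python standard library; the sample events, their field names, and the iter_events helper are hypothetical and do not reflect Sumatra's actual event schema or client API.

```python
import io
import json

# Hypothetical event log contents: one JSON object per line.
# Field names are illustrative only, not Sumatra's actual schema.
SAMPLE_LOG = io.StringIO(
    '{"type": "login", "user": "alice"}\n'
    '{"type": "payment", "user": "bob", "amount": 12.5}\n'
    '{"type": "login", "user": "carol"}\n'
)


def iter_events(lines, event_type=None):
    """Yield parsed events one line at a time, optionally filtered by type."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        event = json.loads(line)
        if event_type is None or event.get("type") == event_type:
            yield event


logins = list(iter_events(SAMPLE_LOG, event_type="login"))
print(len(logins))  # 2
```

Because each line is an independent JSON document, a large log can also be split at line boundaries and processed by several workers in parallel.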

With this release, the Python client supports a create_timeline_from_log method, which constructs a timeline from duration and event filter parameters.

This method replaces the now-deprecated create_timeline_from_feed method.

Distributed Materialize

To support materialization of very large timelines, the Python client now includes a distributed_materialize_many method. This method is functionally similar to materialize_many, but with a very different implementation behind the scenes.

Distributed materialization decomposes the Scowl topology into feature subsets that can be computed in parallel. Each replay task runs as a step function, so jobs can run for as long as they need to finish, without the time restrictions of a Lambda.

To help track the progress of the replay tasks, a Materialization object now supports a progress property to display "X of Y jobs completed."

The materialized results may be accessed as dataframes using get_events() and get_errors() as usual.

Feature Cache

After the first execution of a distributed replay, Sumatra populates a feature cache to make future, related replay jobs execute much faster. This is particularly helpful when running interactive experiments.

Topology Changes

If you update the Scowl code used for a distributed replay, whether on the same branch or a different one, distributed replay reuses the cached values of any features unaffected by the change.

Sumatra tracks a feature's complete dependency tree to identify the downstream effects of any Scowl code changes. For example, given c := a + b, if the definition of a changes, Sumatra recognizes that a and c must be recomputed but b can be retrieved from the cache.
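
The propagation described above can be illustrated with a small sketch. This is not Sumatra's implementation: the DEPENDENCIES mapping and the features_to_recompute helper are hypothetical names, used only to show how a change to one feature marks everything downstream of it as stale while leaving unrelated features cached.

```python
from collections import deque

# Hypothetical dependency map: feature -> features it reads from.
# Mirrors the example above, c := a + b.
DEPENDENCIES = {
    "a": [],
    "b": [],
    "c": ["a", "b"],
}


def features_to_recompute(dependencies, changed):
    """Return the changed features plus everything downstream of them."""
    # Invert the map so we can walk from a feature to its dependents.
    dependents = {}
    for name, inputs in dependencies.items():
        for upstream in inputs:
            dependents.setdefault(upstream, []).append(name)

    # Breadth-first walk from each changed feature.
    stale = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for downstream in dependents.get(current, []):
            if downstream not in stale:
                stale.add(downstream)
                queue.append(downstream)
    return stale


stale = features_to_recompute(DEPENDENCIES, ["a"])
cached = set(DEPENDENCIES) - stale
print(sorted(stale))   # ['a', 'c']
print(sorted(cached))  # ['b']
```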

Timeline Changes

If you change the set of timelines used in your distributed replay, Sumatra may still be able to use some cached values from a previous run.

For each stateless feature F from event type E, if none of the timelines that include E have changed, then the cached values of F can be reused in full, regardless of which other timelines have changed.

Similarly, for each aggregate feature F that reads from event type ER and writes to (possibly the same) event type EW, if none of the timelines that include either ER or EW have changed, then the cached values of F can be reused.
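
Both reuse rules above reduce to the same check: a feature's cache is reusable when no changed timeline contains an event type the feature depends on. The sketch below expresses that check under stated assumptions. The TIMELINE_EVENT_TYPES mapping, the timeline and event type names, and the can_reuse_cache helper are all hypothetical, and "changed" is assumed to cover timelines that were added, removed, or modified.

```python
# Hypothetical mapping: timeline name -> event types it contains.
TIMELINE_EVENT_TYPES = {
    "t_login": {"login"},
    "t_payments": {"payment", "refund"},
    "t_mixed": {"login", "payment"},
}


def can_reuse_cache(feature_event_types, changed_timelines, timeline_event_types):
    """Return True if no changed timeline includes an event type the feature uses.

    For a stateless feature on event type E, pass {E}.
    For an aggregate feature, pass {ER, EW} (read and write event types).
    """
    for timeline in changed_timelines:
        if timeline_event_types[timeline] & feature_event_types:
            return False
    return True


# Stateless feature on "login": unaffected by a change to t_payments.
print(can_reuse_cache({"login"}, {"t_payments"}, TIMELINE_EVENT_TYPES))  # True

# Aggregate feature reading "login" and writing "refund":
# t_payments includes "refund", so the cache cannot be reused.
print(can_reuse_cache({"login", "refund"}, {"t_payments"}, TIMELINE_EVENT_TYPES))  # False
```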

User-Defined Types

Scowl now supports user-defined types with the type keyword to allow for code reuse in JSON input features. See User-Defined Types for details.

This release also includes a change to how complex nulls are handled.

User Notes

Important

Old timelines may need to be re-saved.

The timeline format has changed. As a result, attempting a distributed replay on an old timeline may produce the following error:

error: {"errorMessage":"timeline XXXXXX has not been initialized","errorType":"errorString"}

To continue to use an old timeline, open and re-save the timeline in the UI.

Administrator Notes

Important

This release requires additional steps before running terraform apply:

  • In the Terraform configuration, update the sumatra version, as usual.
  • In addition, update the hashicorp/aws provider version from 3.55.0 to 3.74.3.
  • Run terraform init -upgrade.
  • Finally, run terraform apply, as usual.