0.7.0

Released: February 9, 2022

Approximate Aggregates

The cost of computing exact time-windowed aggregates, e.g. Sum, grows as the number of events falling in the time window increases. This can result in increased latency in situations such as:

  1. No by clause, e.g. Average(amount last hour). Every event writes to the aggregate, so even medium-volume streams can generate a high number of values.
  2. "Hot keys". For example, in Count(by ip last 2 hours), if the majority of traffic comes from a handful of IPs, the read-time situation is similar to the previous case.
  3. Very long time windows, e.g. CountUnique(ip by domain last 90 days). Even if the key (domain in this case) is well-behaved, the number of values in a very long time window can still grow large.

This release introduces a library of approximate aggregates designed to address these use cases. Approximate aggregates sacrifice some accuracy for the sake of much faster computation. The great news is that, when data volumes are high, the loss in accuracy is typically very small.

Bottom Line

Continue to use exact aggregates most of the time. In specific cases when an exact aggregate is too expensive, use one of these new approximations.

Decayed Aggregates

SCOWL now includes three decayed aggregate functions, which use SCOWL's standard time windowing (e.g. last 2 hours) to set the decay rate. Values contribute to the aggregate with an exponentially decayed weight, based on how long ago the value was seen.
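
To illustrate why a decayed aggregate is cheap to compute, here is a minimal Python sketch of an exponentially decayed sum. It is not SCOWL's implementation: the class name DecayedSum is hypothetical, and the sketch assumes that the window length is used directly as the exponential time constant and that events arrive in timestamp order. The release notes do not specify the exact mapping from window to decay rate.

```python
import math


class DecayedSum:
    """Exponentially decayed sum that keeps O(1) state per key (illustrative only)."""

    def __init__(self, window_seconds):
        # Assumption: the window length acts as the exponential time constant.
        self.tau = float(window_seconds)
        self.value = 0.0
        self.last_ts = None

    def add(self, ts, amount):
        """Decay the stored value forward to ts, then add the new amount."""
        if self.last_ts is not None:
            if ts < self.last_ts:
                raise ValueError("events must arrive in timestamp order")
            self.value *= math.exp(-(ts - self.last_ts) / self.tau)
        self.last_ts = ts
        self.value += amount

    def read(self, now):
        """Return the decayed sum as of time now, without mutating state."""
        if self.last_ts is None:
            return 0.0
        elapsed = max(0.0, now - self.last_ts)
        return self.value * math.exp(-elapsed / self.tau)


agg = DecayedSum(window_seconds=2 * 60 * 60)  # "last 2 hours"
agg.add(ts=0, amount=10.0)
print(agg.read(now=0))        # full weight immediately after the event
print(agg.read(now=7200))     # weight has decayed by a factor of e
```

Under these assumptions, an amount of 10 read one window length later contributes about 3.68. The state is a single number and a timestamp, regardless of how many events fall in the window, which is what makes reads fast for hot keys.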

HyperLogLog (beta)

To approximate the CountUnique aggregate on high-cardinality sets, SCOWL now includes the HyperLogLog approximate aggregate. This function implements the sliding window version of HyperLogLog, as described in:

"Sliding HyperLogLog: Estimating cardinality in a data stream" by Yousra Chabchoub and Georges Hébrail
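
For readers unfamiliar with the technique, the following Python sketch shows the general idea of a sliding-window HyperLogLog: each register keeps a short list of (timestamp, rank) pairs that could still become the maximum for some window ending in the future. This is a simplified reading of the paper, not SCOWL's implementation; the class name, the precision parameter, the use of SHA-1 as the hash, and the small-range correction are all assumptions made for illustration.

```python
import hashlib
import math


class SlidingHyperLogLog:
    """Illustrative sliding-window HyperLogLog (not SCOWL's implementation)."""

    def __init__(self, window_seconds, precision=10):
        self.window = window_seconds          # longest window that can be queried
        self.p = precision                    # assumption: 2**10 = 1024 registers
        self.m = 1 << precision
        self.alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant, valid for m >= 128
        # Each register holds (timestamp, rank) pairs with strictly decreasing rank.
        self.registers = [[] for _ in range(self.m)]

    def _hash64(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, ts, item):
        h = self._hash64(item)
        idx = h >> (64 - self.p)                       # first p bits pick the register
        rest = h & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1   # position of the leftmost 1 bit
        cutoff = ts - self.window
        # Drop expired pairs and pairs that the new one dominates (older, rank <= new).
        kept = [(t, r) for (t, r) in self.registers[idx] if t > cutoff and r > rank]
        kept.append((ts, rank))
        self.registers[idx] = kept

    def count(self, now, window_seconds=None):
        """Estimate distinct items seen in (now - window, now]."""
        w = self.window if window_seconds is None else window_seconds
        cutoff = now - w
        total, zeros = 0.0, 0
        for pairs in self.registers:
            rank = max((r for (t, r) in pairs if t > cutoff), default=0)
            total += 2.0 ** -rank
            zeros += rank == 0
        estimate = self.alpha * self.m * self.m / total
        if estimate <= 2.5 * self.m and zeros > 0:
            # Small-range correction (linear counting) from the original HyperLogLog.
            estimate = self.m * math.log(self.m / zeros)
        return estimate


hll = SlidingHyperLogLog(window_seconds=90 * 24 * 3600)
for i in range(20000):
    hll.add(ts=i, item=f"ip-{i}")
print(round(hll.count(now=20000)))
```

With 1024 registers the expected relative error of plain HyperLogLog is roughly 3%, so the printed estimate should land near 20000 rather than exactly on it. Memory per key is bounded by the number of registers and the short per-register lists, not by the number of events in the window.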

Other Enhancements

  • Added not in operator, e.g. 5 not in [1, 2]
  • Added FromUnixSeconds function
  • Added CIDRBlock function
  • More lenient naming for branches and timelines (/, -, ., and uppercase letters are now allowed)

Bug Fixes

  • Branch main no longer cloned on promotion when promoted branch is main
  • GibberishNameScore returns null instead of error on empty input