Exact Aggregates
Standard numerical aggregates return exact values calculated over precise trailing windows, using a combination of read-time and write-time aggregation.
When no by
key is specified, or the by
key is low-cardinality, this "hot key"
may result in execssive run-time latency. In these situations, consider using
an approximate aggregate instead.
Count
Counts the total number of events or non-null values of given feature
Syntax:
Count<event>(query)
Code Examples:
Count()
Count(post_id by account_id)
Count(by email last week)
Count(by ip, user_agent where country!="us" last 10 minutes)
Count(by email after latest_login last 2.5 hours)
Count(by ip where verdict="Block" exclusive)
Count<login>(by email_domain last 10 minutes)
Data Example:
user | ip | Count() | Count(by ip) | Count(by ip, user) |
---|---|---|---|---|
anna | 1.1.1.1 | 1 | 1 | 1 |
brad | 2.2.2.2 | 2 | 1 | 1 |
cela | 1.1.1.1 | 3 | 2 | 1 |
anna | 1.1.1.1 | 4 | 3 | 2 |
CountUnique
Counts the number of unique values for a given feature
Syntax:
CountUnique<event>(query)
Code Examples:
CountUnique(device_id by user)
CountUnique(email by ip last week)
CountUnique(ip, user_agent by device_id where is_signed_in last 2 hours)
CountUnique(sku by user after latest_billing_change exclusive)
CountUnique<login>(email by ip where verdict="Allow")
Data Example:
user | ip | CountUnique(user) | CountUnique(user by ip) |
---|---|---|---|
anna | 1.1.1.1 | 1 | 1 |
brad | 2.2.2.2 | 2 | 1 |
cela | 1.1.1.1 | 3 | 2 |
anna | 1.1.1.1 | 3 | 2 |
Sum
Computes the sum of values for a given feature
Syntax:
Sum<event>(query)
Code Examples:
Sum(amount by user)
Sum(price by device_id last week)
Sum(bad_word_count by commenter where not is_admin last week)
Sum(price by email where price > 9.99)
Sum(price by email exclusive)
Sum<purchase>(price by account_id last week)
Data Example:
price | user | Sum(price) | Sum(price by user) | Sum(price by user exclusive) |
---|---|---|---|---|
1.50 | anna | 1.50 | 1.50 | 0.00 |
2.25 | brad | 3.75 | 2.25 | 0.00 |
0.50 | cela | 4.25 | 0.50 | 0.00 |
10.00 | anna | 14.25 | 11.50 | 1.50 |
Max
Computes the maximum of values for a given feature
Syntax:
Max<event>(query)
Code Examples:
Max(amount by user)
Max(risk_score by device_id last week)
Max(price by email where risk_score > 75 last 4 hours)
Max<purchase>(price by email)
Data Example:
risk_score | user | Max(risk_score) | Max(risk_score by user) |
---|---|---|---|
68 | anna | 68 | 68 |
95 | brad | 95 | 95 |
84 | brad | 95 | 95 |
89 | anna | 95 | 89 |
Min
Computes the minimum of values for a given feature
Syntax:
Min<event>(query)
Code Examples:
Min(amount by user)
Min(risk_score by device_id last week)
Min(price by email where risk_score > 75 last 4 hours)
Min<purchase>(price by email)
Data Example:
risk_score | user | Min(risk_score) | Min(risk_score by user) |
---|---|---|---|
68 | anna | 68 | 68 |
95 | brad | 68 | 95 |
84 | brad | 68 | 84 |
89 | anna | 68 | 68 |
Average
Computes the average of values for a given feature
Syntax:
Average<event>(query)
Code Examples:
Average(amount by user)
Average(risk_score by device_id last week)
Average(price by email where risk_score > 75 last 4 hours)
Average<purchase>(price by email)
Data Example:
amount | user | Average(amount) | Average(amount by user) |
---|---|---|---|
15 | anna | 15 | 15 |
13.5 | brad | 14.25 | 13.5 |
16.5 | brad | 15 | 15 |
3 | anna | 12 | 9 |
StdDev
Computes the (population) standard deviation of values for a given feature
Syntax:
StdDev<event>(query)
Code Examples:
StdDev(amount by user)
StdDev(risk_score by device_id last week)
StdDev(price by email where risk_score > 75 last 4 hours)
StdDev<purchase>(price by email)
Data Example:
amount | user | StdDev(amount) | StdDev(amount by user) |
---|---|---|---|
15 | anna | 0 | 0 |
13.5 | brad | 0.75 | 0 |
16.5 | brad | 1.2247 | 1.5 |
3 | anna | 5.3033 | 6 |
Median
Computes the median of values for a given feature
Syntax:
Median<event>(query)
Code Examples:
Median(amount by user)
Median(risk_score by device_id last week)
Median(price by email where risk_score > 75 last 4 hours)
Median<purchase>(price by email)
Data Example:
amount | user | Median(amount) | Median(amount by user) |
---|---|---|---|
10 | anna | 10 | 10 |
3 | brad | 6.5 | 3 |
4 | brad | 4 | 3.5 |
2 | anna | 3.5 | 6 |
1 | anna | 3 | 2 |
MedianAbsoluteDeviation
Computes the median absolute deviation of values for a given feature
Syntax:
MedianAbsoluteDeviation<event>(query)
Code Examples:
MedianAbsoluteDeviation(amount by user)
MedianAbsoluteDeviation(risk_score by device_id last week)
MedianAbsoluteDeviation(price by email where risk_score > 75 last 4 hours)
MedianAbsoluteDeviation<purchase>(price by email)
Data Example:
amount | user | Median(amount by user) | MedianAbsoluteDeviation(amount by user) |
---|---|---|---|
10 | anna | 10 | 0 |
3 | anna | 6.5 | 3.5 |
8 | anna | 8 | 2 |
2 | anna | 5.5 | 3 |
9 | anna | 8 | 2 |