ML Functions

EmbedText

Vectorizes a document by calling an external embedding API (e.g. OpenAI).

Syntax:

EmbedText<integration_name>(model_name: string, text: string) -> [float]

Examples:

EmbedText<openai>(
  "text-embedding-ada-002",
  "hello world"
) = [0.4881309663835091, ..., 0.41235976063198154]

ModelPredict

Executes a model on the given inputs and returns a struct that maps output variable names to numeric values. The input and output schemas are enforced at compile time.

Syntax:

ModelPredict<model_name>(inputs: struct) -> struct

Examples:

ModelPredict<regressor>({x: 1.2, y: -7.5}) = {z: 2.8}

risk_score := ModelPredict<cash_out>({
  amount,
  dollars_in_out_1h,
  dollars_out_by_email,
  emails_per_bank,
  emails_per_device
}).probability_fraud

GPTCompletion

Returns a chat completion from OpenAI.

Syntax:

GPTCompletion(apiToken: string, policy: string, state: string, prompt: string) -> string?

Examples:

GPTCompletion(api_token, "You are a polite AI", "We have not met", "Hello, how are you?") = "I'm doing well, thank you!"

GibberishNameScore

Scores a name, username, or email handle by how likely it is to be gibberish. A negative (low) score indicates a low likelihood of gibberish; a positive (high) score indicates a high likelihood. Zero is a good threshold for most use cases.

Syntax:

GibberishNameScore(name: string) -> float?

Examples:

GibberishNameScore("Heather Lowenfish") ~= -1.6666666666666667
GibberishNameScore("cris.johnson1992") ~= -1.0857142857142856
GibberishNameScore("Uisdfhkj Jasdfyb") ~= 0.7642857142857142
GibberishNameScore("jsdfhsdfjkhsdf") ~= 1.6384615384615384
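
The zero threshold described above can be illustrated outside the rules language. The following Python sketch is not part of this function library: the helper name is_gibberish is hypothetical, and treating a missing score (the "?" in the return type) as "no decision" is an assumption. The scores are the documented example values.

```python
from typing import Optional


def is_gibberish(score: Optional[float], threshold: float = 0.0) -> Optional[bool]:
    """Classify a gibberish score against a threshold (hypothetical helper)."""
    # Assumption: a missing score means the name could not be scored,
    # so no decision is returned rather than guessing.
    if score is None:
        return None
    # Scores above the threshold are treated as likely gibberish.
    return score > threshold


# Scores copied from the documented examples.
documented_scores = {
    "Heather Lowenfish": -1.6666666666666667,
    "cris.johnson1992": -1.0857142857142856,
    "Uisdfhkj Jasdfyb": 0.7642857142857142,
    "jsdfhsdfjkhsdf": 1.6384615384615384,
}

flagged = [name for name, score in documented_scores.items() if is_gibberish(score)]
print(flagged)  # ['Uisdfhkj Jasdfyb', 'jsdfhsdfjkhsdf']
```

Raising the threshold above zero flags fewer names; lowering it flags more.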

SplitDocument

Splits a text document into chunks that each contain fewer than chunkSize tokens.

Syntax:

SplitDocument(input: string, chunkSize: int) -> [string]

Examples:

SplitDocument("Hello, how are you?", 3) = ["Hello, how", " are you?"]
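
To make the chunking behavior concrete, here is a rough Python sketch of one way such a splitter could work. It is not the actual implementation: the tokenizer is not specified here, so the sketch assumes a token is a run of non-whitespace characters together with any whitespace before it. The name split_document is hypothetical. Under that assumption the sketch reproduces the example above.

```python
import re


def split_document(text: str, chunk_size: int) -> list[str]:
    """Split text into chunks holding fewer than chunk_size tokens each."""
    if chunk_size < 2:
        raise ValueError("chunk_size must be at least 2 so a chunk can hold a token")
    # Assumption: a token is a run of non-whitespace characters plus any
    # whitespace that precedes it. A real tokenizer may split differently.
    # Trailing whitespace after the last token is dropped by this pattern.
    tokens = re.findall(r"\s*\S+", text)
    per_chunk = chunk_size - 1  # strictly fewer than chunk_size tokens
    return [
        "".join(tokens[i:i + per_chunk])
        for i in range(0, len(tokens), per_chunk)
    ]


print(split_document("Hello, how are you?", 3))  # ['Hello, how', ' are you?']
```

Because each token keeps its leading whitespace, joining the chunks back together recovers the original text, apart from any trailing whitespace.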