ML Functions
EmbedText
Vectorizes a text document by calling an external embedding API (e.g. OpenAI) and returns the embedding as a list of floats.
Syntax:
EmbedText<integration_name>(model_name: string, text: string) -> [float]
Examples:
EmbedText<openai>(
"text-embedding-ada-002",
"hello world"
) = [0.4881309663835091, ..., 0.41235976063198154]
ModelPredict
Executes a model on the given inputs and returns a struct that maps output variable names to numeric values. The input and output schemas are enforced at compile time.
Syntax:
ModelPredict<model_name>(inputs: struct) -> struct
Examples:
ModelPredict<regressor>({x: 1.2, y: -7.5}) = {z: 2.8}
risk_score := ModelPredict<cash_out>({
amount,
dollars_in_out_1h,
dollars_out_by_email,
emails_per_bank,
emails_per_device
}).probability_fraud
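The schema enforcement described above happens at compile time inside the rules language. As a rough runtime analogue only, the Python sketch below compares the field names of an input struct against an expected list. The helper `check_inputs` and the constant `CASH_OUT_INPUTS` are hypothetical names introduced for illustration; the field names are copied from the cash_out example, and the sample values are made up.

```python
# Hypothetical input schema for the cash_out model, copied from the example above.
CASH_OUT_INPUTS = (
    "amount",
    "dollars_in_out_1h",
    "dollars_out_by_email",
    "emails_per_bank",
    "emails_per_device",
)


def check_inputs(inputs, expected_fields):
    """Return (missing, unexpected) field names, each sorted alphabetically.

    Only field names are compared; value types are not checked in this sketch.
    """
    missing = sorted(set(expected_fields) - set(inputs))
    unexpected = sorted(set(inputs) - set(expected_fields))
    return missing, unexpected


# Illustrative struct with made-up values: three fields absent, one misspelled extra.
missing, unexpected = check_inputs(
    {"amount": 10.0, "emails_per_bank": 2, "typo_field": 1},
    CASH_OUT_INPUTS,
)
```

A struct that matches the schema yields two empty lists; anything else names the offending fields.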
GPTCompletion
Returns a chat completion from OpenAI. The result may be null.
Syntax:
GPTCompletion(apiToken: string, policy: string, state: string, prompt: string) -> string?
Examples:
GPTCompletion(api_token, "You are a polite AI", "We have not met", "Hello, how are you?") = "I'm doing well, thank you!"
GibberishNameScore
Scores a given name, username, or email handle for its likelihood of being gibberish. A negative (low) score indicates a low likelihood of gibberish; a positive (high) score indicates a high likelihood. Zero is a good threshold for most use cases.
Syntax:
GibberishNameScore(name: string) -> float?
Examples:
GibberishNameScore("Heather Lowenfish") ~= -1.6666666666666667
GibberishNameScore("cris.johnson1992") ~= -1.0857142857142856
GibberishNameScore("Uisdfhkj Jasdfyb") ~= 0.7642857142857142
GibberishNameScore("jsdfhsdfjkhsdf") ~= 1.6384615384615384
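To illustrate the zero threshold, the Python sketch below applies it to the four example scores above. It does not compute scores itself. The helper `is_gibberish` is a hypothetical name, and returning None when no score is available (the `float?` return type) is an assumption about how a caller might handle a missing score.

```python
from typing import Optional


def is_gibberish(score: Optional[float], threshold: float = 0.0) -> Optional[bool]:
    """Return True if score is above threshold, or None if there is no score."""
    if score is None:
        return None  # assumed handling of a missing score
    return score > threshold


# Scores copied from the examples above.
example_scores = {
    "Heather Lowenfish": -1.6666666666666667,
    "cris.johnson1992": -1.0857142857142856,
    "Uisdfhkj Jasdfyb": 0.7642857142857142,
    "jsdfhsdfjkhsdf": 1.6384615384615384,
}

flagged = [name for name, score in example_scores.items() if is_gibberish(score)]
# flagged == ["Uisdfhkj Jasdfyb", "jsdfhsdfjkhsdf"]
```

Raising the threshold above zero flags fewer names; lowering it flags more.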
SplitDocument
Splits a text document into chunks of at most chunkSize tokens each.
Syntax:
SplitDocument(input: string, chunkSize: int) -> [string]
Examples:
SplitDocument("Hello, how are you?", 3) = ["Hello, how", " are you?"]
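The tokenizer behind SplitDocument is not specified here. The Python sketch below assumes a toy tokenizer (a word or a single punctuation mark, together with any whitespace before it) and groups the tokens into runs of `chunk_size`. Under that assumption it reproduces the example above. `split_document` and `TOKEN_PATTERN` are hypothetical names, and the real function may tokenize differently.

```python
import re
from typing import List

# Assumed toy tokenizer: a word, or one punctuation character, each including
# any whitespace that precedes it. Trailing whitespace is not captured.
TOKEN_PATTERN = re.compile(r"\s*\w+|\s*[^\w\s]")


def split_document(text: str, chunk_size: int) -> List[str]:
    """Split text into chunks holding at most chunk_size toy tokens each."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    tokens = TOKEN_PATTERN.findall(text)
    # Join each run of chunk_size consecutive tokens back into a string.
    return [
        "".join(tokens[i:i + chunk_size])
        for i in range(0, len(tokens), chunk_size)
    ]


chunks = split_document("Hello, how are you?", 3)
# chunks == ["Hello, how", " are you?"]
```

Because each token keeps its leading whitespace, concatenating the chunks gives back the original text, apart from any trailing whitespace.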