Syntax
newrelic.agent.set_llm_token_count_callback(callback, application=None)
Registers a callback function that calculates token counts for Large Language Model (LLM) events.
Requirements
Python agent version 9.8.0 or higher.
Description
This API registers a callback to calculate and store token counts on LlmEmbedding
and LlmChatCompletionMessage
events.
- This function should be used when `ai_monitoring.record_content.enabled` is set to `false`. That setting prevents the agent from sending AI message content to the New Relic server, where token counts are normally attached server side.
- If you'd still like to capture token counts for LLM events, you can implement a callback in your app code that determines the token counts locally and sends this information to New Relic.
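For reference, content recording is disabled through agent configuration. In a `newrelic.ini` config file the setting might look like the following (the section placement shown is an assumption based on the standard agent configuration layout):

```ini
[newrelic]
ai_monitoring.record_content.enabled = false
```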
In most cases, this API will be called exactly once, but you can make multiple calls to it. Each fresh call overwrites the previously registered callback with the new one that is provided. To unset the callback completely, pass `None` in place of the original callback.
API Parameters
Parameter | Description
---|---
`callback` (callable or None) | Required. The callback to calculate token counts. To unset the current callback, pass `None`.
`application` (object) | Optional. The specific application object to associate the API call with. An application object can be obtained using the `newrelic.agent.register_application` API.
Return values
None.
Callback Requirements
Provided callbacks must return a positive integer token count; otherwise, no token count will be captured on LLM events.
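For instance, a minimal callback satisfying this contract might use a characters-per-token heuristic (the 4-characters-per-token ratio is an assumption, a common rule of thumb for English text, not something the agent prescribes):

```python
def approximate_token_count(model, content):
    # Rough heuristic: assume ~4 characters per token, and always return
    # a positive integer so a count is captured on every LLM event.
    return max(1, len(content) // 4)

# Sanity-check locally before registering it with the agent:
print(approximate_token_count("gpt-4", "The quick brown fox"))  # → 4
```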
Callback Parameters
Parameter | Description
---|---
`model` (string) | Required. The name of the LLM model.
`content` (string) | Required. The message content, prompt, or embedding input.
Examples
Calculating token counts and registering the callback
Example with tiktoken:
```python
import newrelic.agent

def token_count_callback(model, content):
    """
    Calculate token counts locally based on the model being used and the
    content. This callback will be invoked for each message sent or received
    during an LLM call. If the application supports more than one model, it
    may require finding libraries for each model to support token counts
    appropriately.

    Arguments:
    model -- name of the LLM model
    content -- the LLM message content
    """
    import tiktoken

    try:
        enc = tiktoken.encoding_for_model(model)
    except KeyError:
        return None  # Unknown model
    return len(enc.encode(content))

newrelic.agent.set_llm_token_count_callback(token_count_callback)
```
Example API usage with an application object passed in:
```python
application = newrelic.agent.register_application(timeout=10.0)
newrelic.agent.set_llm_token_count_callback(token_count_callback, application)
```