Syntax
newrelic.agent.set_llm_token_count_callback(callback, application=None)
Registers a callback function that calculates token counts for Large Language Model (LLM) events.
Requirements
Python agent version 9.8.0 or higher.
Description
This API registers a callback to calculate and store token counts on LlmEmbedding
and LlmChatCompletionMessage
events.
- This function should be used when `ai_monitoring.record_content.enabled` is set to `false`. That setting prevents the agent from sending AI message content to the New Relic server, where token counts are normally attached server side.
- If you'd still like to capture token counts for LLM events, you can implement a callback in your app code that determines the token counts locally and sends this information to New Relic.
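For reference, content recording is disabled through agent configuration. In a `newrelic.ini` config file the setting might look like the following (the section placement shown is an assumption based on the standard agent configuration layout):

```ini
[newrelic]
ai_monitoring.record_content.enabled = false
```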
In most cases, this API will be called exactly once, but you can make multiple calls to it. Each fresh call overwrites the previously registered callback with the new one that is provided. To unset the callback completely, pass `None` in place of the original callback.
API Parameters
Parameter | Description
---|---
`callback` (callable or None) | Required. The callback to calculate token counts. To unset the current callback, pass `None`.
`application` (object) | Optional. The specific application object to associate the API call with. An application object can be obtained using the `newrelic.agent.register_application` API.
Return values
None.
Callback Requirements
Provided callbacks must return a positive integer token count; otherwise, no token count will be captured on LLM events.
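For instance, a minimal callback satisfying this contract might use a characters-per-token heuristic (the 4-characters-per-token ratio is an assumption, a common rule of thumb for English text, not something the agent prescribes):

```python
def approximate_token_count(model, content):
    # Rough heuristic: assume ~4 characters per token, and always return
    # a positive integer so a count is captured on every LLM event.
    return max(1, len(content) // 4)

# Sanity-check locally before registering it with the agent:
print(approximate_token_count("gpt-4", "The quick brown fox"))  # → 4
```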
Callback Parameters
Parameter | Description
---|---
`model` (string) | Required. The name of the LLM model.
`content` (string) | Required. The message content, prompt, or embedding input.
Examples
Calculating token counts and registering the callback
Example with tiktoken:
```python
import newrelic.agent

def token_count_callback(model, content):
    """
    Calculate token counts locally based on the model being used and the
    content. This callback will be invoked for each message sent or received
    during an LLM call. If the application supports more than one model, it
    may require finding libraries for each model to support token counts
    appropriately.

    Arguments:
    model -- name of the LLM model
    content -- the LLM message content
    """
    import tiktoken

    try:
        enc = tiktoken.encoding_for_model(model)
    except KeyError:
        return None  # Unknown model
    return len(enc.encode(content))

newrelic.agent.set_llm_token_count_callback(token_count_callback)
```
Example API usage with an application object passed in:
```python
application = newrelic.agent.register_application(timeout=10.0)
newrelic.agent.set_llm_token_count_callback(token_count_callback, application)
```