New Relic has supported native OTLP ingest for years, and in that time we've identified the common issues users face when troubleshooting data ingestion. Working through a support case can be time-consuming and at times frustrating for customers (and for New Relic). We've put together this troubleshooting guide to establish a shared understanding and to provide tools for self-diagnosing and fixing issues when possible.
First, please review the New Relic OTLP configuration requirements and recommendations, which contain essential advice and context that anyone looking to use OTLP with New Relic should be aware of.
Next, see the sections below:
- General troubleshooting: Troubleshooting for general OTLP issues
- Issue catalog: Common customer issues and mitigation steps
General troubleshooting
When you encounter an issue with the New Relic OTLP endpoint, first follow these basic troubleshooting steps. If you end up opening a support ticket, these are the first things we ask:
- Enable diagnostic logs. Ensure that your OTLP client has logging enabled so that you can see details about any errors that may be occurring. The mechanism for enabling logs varies by implementation, so consult the relevant documentation.
- Test that New Relic's OTLP endpoint is reachable. Running a simple shell command such as `curl http://otlp.nr-data.net` from the machine in question can help determine whether a local network configuration issue (for example, a firewall restriction) is preventing connectivity to New Relic.
- Check for `NrIntegrationError` events. New Relic OTLP ingest performs minimal validation synchronously before returning a success status code. If your application logs show no export errors but you still don't see data in New Relic, try querying for `NrIntegrationError`. There may be issues with your data that were detected during asynchronous validation.
- Determine if the problem is localized. Errors are often localized to a specific application or environment. In these cases, it's useful to compare the problematic areas with those that are functioning properly.
- Look for signs of an invalid API key. The New Relic OTLP endpoint requires an `api-key` header. An invalid or missing API key is a common issue that presents as an HTTP 403 or 401 status code, or as a gRPC Unauthenticated or PermissionDenied status code. If you see these, check that your API key is valid and is being set properly.
- Check if the export succeeds after retry. We recommend enabling retry, and we expect export attempts to occasionally fail at first with transient errors but succeed after retrying. However, we do have an SLA. If you suspect that transient errors are frequent enough to exceed our SLA, please open a support case.
- Check for signs that transient errors are not being retried. Despite our best efforts, there may be corner cases where the New Relic OTLP endpoint returns non-retriable status codes for transient errors. If you believe you've encountered this scenario, please open a support case.
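The reachability check described above can also be scripted when `curl` is not available or when you want to tell failure modes apart. The following is a minimal Python sketch, not an official New Relic tool: the function name `probe_endpoint`, its result strings, and the injectable `connect` parameter are hypothetical, and port 443 is an assumption taken from the endpoint URLs that appear in the issue catalog (use 4317 if your exporter targets that port).

```python
import socket


def probe_endpoint(host="otlp.nr-data.net", port=443, timeout=5.0,
                   connect=socket.create_connection):
    """Attempt a TCP connection and return a short diagnosis string.

    `connect` is injectable (hypothetical design choice) so the logic
    can be exercised without real network access.
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except socket.gaierror as exc:
        # The hostname could not be resolved: check DNS configuration.
        return f"dns-failure: {exc}"
    except socket.timeout:
        # No answer within `timeout` seconds: often a firewall silently
        # dropping packets.
        return "timeout"
    except ConnectionRefusedError:
        # Something actively rejected the connection.
        return "refused"
    except OSError as exc:
        # Any other network-level failure (for example, no route to host).
        return f"network-error: {exc}"
    conn.close()
    return "reachable"
```

Calling `probe_endpoint()` from the affected machine returns `"reachable"` when a TCP connection can be established. This only confirms network connectivity; it says nothing about whether the API key or the payload is valid.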
Issue catalog
The table below catalogs issues we've seen customers encounter with the New Relic OTLP endpoint. Many are straightforward to resolve with proper configuration. The Fingerprint column shows a typical log message produced when an application encounters the particular class of issue. See the Known Resolution and Notes columns for mitigation steps.
OTLP Protocol | Type | Language / Ecosystem | Fingerprint | Known Resolution | Notes |
---|---|---|---|---|---|
HTTP | 401 - Unauthorized | Java | io.opentelemetry.exporter.internal.http.HttpExporter - Failed to export spans. Server responded with HTTP status code 401. | Include API Key | Missing api-key header |
HTTP | 401 - Unauthorized | Collector | Exporting failed. The error is not retryable. Dropping data. {"kind": "exporter", "data_type": "traces", "name": "otlphttp", "error": "Permanent error: error exporting items, request to https://otlp.nr-data.net/v1/traces responded with HTTP Status Code 401, Message=, Details=[]", "dropped_items": 4} | Include API Key | Missing api-key header |
HTTP | 401 - Unauthorized | Go | failed to upload metrics: failed to send metrics to https://otlp.nr-data.net/v1/metrics: 401 Unauthorized | Include API Key | Missing api-key header |
HTTP | 403 - Forbidden | Java | io.opentelemetry.exporter.internal.http.HttpExporter - Failed to export spans. Server responded with HTTP status code 403. | Verify API key | Invalid api-key header |
HTTP | 403 - Forbidden | Collector | Exporting failed. The error is not retryable. Dropping data. {"kind": "exporter", "data_type": "traces", "name": "otlphttp", "error": "Permanent error: error exporting items, request to https://otlp.nr-data.net/v1/traces responded with HTTP Status Code 403, Message=, Details=[]", "dropped_items": 14} | Verify API key | Invalid api-key header |
HTTP | 403 - Forbidden | Go | traces export: failed to send to https://otlp.nr-data.net/v1/traces: 403 Forbidden | Verify API key | Invalid api-key header |
HTTP | 403 - Forbidden | .NET | Exporter failed send data to Collector to {0} endpoint. Data will not be sent. Exception: {1}{https://otlp.nr-data.net:4317/v1/traces}{System.Net.Http.HttpRequestException: Response status code does not indicate success: 403 (Forbidden). | Verify API key | Invalid api-key header |
HTTP | Timeout | Java | io.opentelemetry.exporter.internal.http.HttpExporter - Failed to export spans. The request could not be executed. Full error message: timeout | Tune batching / timeout | Occurs after export times out. Check timeout settings and client network status. If you've ruled out client side network and configuration, open support case. |
HTTP | Timeout | Collector | max elapsed time expired failed to make an HTTP request: Post \"https://otlp.nr-data.net/v1/traces\": context deadline exceeded (Client.Timeout exceeded while awaiting headers) | Tune batching / timeout | Typically occurs after retry attempts fail and export times out. Can be related to client network, client retry / timeout configuration, or New Relic network / servers. If you've ruled out client side network and configuration, open support case. |
HTTP | Timeout | Go | failed to upload metrics: context deadline exceeded: retry-able request failure | Tune batching / timeout | Occurs after export times out. Check timeout settings and client network status. If you've ruled out client side network and configuration, open support case. |
HTTP | Rate limit | Collector | Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "metrics", "name": "otlphttp", "error": "Throttle (29s), error: error exporting items, request to https://otlp.nr-data.net:443/v1/metrics responded with HTTP Status Code 429", "interval": "29s"} | Tune batching | Rate limit exceeded. Adjust batching configuration to reduce request rate. |
gRPC | Code 2 - Unknown Timeout | Java | io.opentelemetry.exporter.internal.grpc.GrpcExporter - Failed to export spans. Server responded with gRPC status code 2. Error message: timeout | Tune batching / timeout | Occurs after export times out. Check timeout settings and client network status. If you've ruled out client side network and configuration, open support case. |
gRPC | Code 2 - Unknown HTTP 500 | Collector | rpc error: code = Unknown desc = unexpected HTTP status code received from server: 500 (Internal Server Error); malformed header: missing HTTP content-type | New Relic networking vendor produced a non-retriable status code for a transient error. If this happens repeatedly, open a support case. | |
gRPC | Code 2 - Unknown HTTP 530 | Collector | rpc error: code = Unknown desc = unexpected HTTP status code received from server: 530 (); transport: received unexpected content-type \"text/html; charset=UTF-8\" | New Relic networking vendor produced a non-retriable status code for a transient error. If this happens repeatedly, open a support case. | |
gRPC | Code 4 - DeadlineExceeded | Collector | rpc error: code = DeadlineExceeded desc = context deadline exceeded | Tune batching / timeout | Typically occurs after retry attempts fail and export times out. Can be related to client network, client retry / timeout configuration, or New Relic network / servers. If you've ruled out client side network and configuration, open support case. |
gRPC | Code 7 - PermissionDenied | Java | io.opentelemetry.exporter.internal.grpc.GrpcExporter - Failed to export spans. Server responded with gRPC status code 7. | Verify API key | Invalid api-key header |
gRPC | Code 7 - PermissionDenied | .NET | Exporter failed send data to Collector to {0} endpoint. Data will not be sent. Exception: {1}{https://otlp.nr-data.net:4317/}{Grpc.Core.RpcException: Status(StatusCode="PermissionDenied", Detail="") | Verify API key | Invalid api-key header |
gRPC | Code 8 - ResourceExhausted | Collector | rpc error: code = ResourceExhausted desc = Too many requests", "dropped_items": 1024 | Tune batching | Rate limit exceeded. Adjust batching configuration to reduce request rate. |
gRPC | Code 13 - Internal | Java | io.opentelemetry.exporter.internal.grpc.GrpcExporter - Failed to export spans. Server responded with gRPC status code 13. | Not enough information to diagnose. Could be New Relic networking vendor produced non-retriable status code for a transient error. If this happens repeatedly, open a support case. | |
gRPC | Code 13 - Internal HTTP 400 | Collector | rpc error: code = Internal desc = unexpected HTTP status code received from server: 400 (Bad Request) | New Relic networking vendor produced non-retriable status code for a transient error. If this happens repeatedly, open a support case. | |
gRPC | Code 14 - Unavailable Connection reset | Collector | rpc error: code = Unavailable desc = error reading from server: read tcp 100.127.0.171:47470->162.247.241.110:4317: read: connection reset by peer | Tune retry | Should solve with retry. Ensure Collector has sufficient resources to handle retry backpressure. |
gRPC | Code 14 - Unavailable HTTP 502 | Collector | rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 502 (Bad Gateway); transport: received unexpected content-type "text/html" | Tune retry | Should solve with retry. Ensure Collector has sufficient resources to handle retry backpressure. |
gRPC | Code 14 - Unavailable HTTP 503 | Collector | rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 503 (Service Unavailable) | Tune retry | Should solve with retry. Ensure Collector has sufficient resources to handle retry backpressure. |
gRPC | Code 16 - Unauthenticated | Java | io.opentelemetry.exporter.internal.grpc.GrpcExporter - Failed to export spans. Server responded with gRPC status code 16. | Include API Key | Missing api-key header |
gRPC | Code 16 - Unauthenticated | .NET | Exporter failed send data to Collector to {0} endpoint. Data will not be sent. Exception: {1}{https://otlp.nr-data.net:4317/}{Grpc.Core.RpcException: Status(StatusCode="Unauthenticated", Detail="") | Include API Key | Missing api-key header |
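
As a rough illustration of how the catalog can be used for first-pass triage, the Python sketch below maps a handful of the statuses from the table to their known resolutions. It is a sketch under stated assumptions, not part of any New Relic or OpenTelemetry API: the function name `suggest_resolution`, the dictionary names, and the fallback message are all hypothetical. gRPC statuses are keyed by name rather than by number so the lookup matches the status names that appear in exporter logs.

```python
# Hypothetical lookup tables distilled from the issue catalog above.
HTTP_HINTS = {
    401: "Include API key: the api-key header is missing.",
    403: "Verify API key: the api-key header is invalid.",
    429: "Tune batching: rate limit exceeded, reduce the request rate.",
}

GRPC_HINTS = {
    "Unauthenticated": "Include API key: the api-key header is missing.",
    "PermissionDenied": "Verify API key: the api-key header is invalid.",
    "ResourceExhausted": "Tune batching: rate limit exceeded, reduce the request rate.",
    "DeadlineExceeded": "Tune batching / timeout: check timeout settings and client network.",
    "Unavailable": "Tune retry: should resolve with retry.",
}

# Used when the status is not one of the entries above.
FALLBACK = "No direct match; if this happens repeatedly, open a support case."


def suggest_resolution(protocol, status):
    """Return the catalog's known resolution for an export failure.

    protocol: "http" or "grpc" (case-insensitive).
    status:   an int HTTP status code, or a gRPC status name string.
    """
    kind = protocol.lower()
    if kind == "http":
        return HTTP_HINTS.get(status, FALLBACK)
    if kind == "grpc":
        return GRPC_HINTS.get(status, FALLBACK)
    raise ValueError(f"unknown protocol: {protocol!r}")
```

For example, `suggest_resolution("http", 403)` returns the "Verify API key" hint, while a status that is not in the lookup tables falls back to the generic advice to open a support case if the error keeps recurring.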