4.1.0rc1

Pre-release

@emmettbutler emmettbutler released this 15 Dec 20:09
· 2 commits to main since this release
e408b3e

Estimated end-of-life date, accurate to within three months: 05-2027
See the support level definitions for more information.

Prelude

dd-trace-py now includes an OpenFeature provider implementation, enabling feature flag evaluation through the OpenFeature API.
This integration is under active design and development. Functionality and APIs are experimental and may change without notice.
For more information, see the Datadog documentation at https://docs.datadoghq.com/feature_flags/#overview

Upgrade Notes

  • 32-bit Linux is no longer supported. Please contact us if this blocks upgrading dd-trace-py.
  • LLM Observability
    • Experiment spans now contain metadata from the dataset record.
    • Experiment spans' input, output, and expected_output fields are now emitted as is, so that object values in any of these columns are searchable in Datadog.
    • Experiment spans and their child spans are now tagged with human-readable names to allow better analysis of experiment data. The new tags are: dataset_name, project_name, project_id, experiment_name.
  • aioredis
    • The aioredis integration has been removed.
  • tornado
    • The minimum supported version is now v6.1.
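
The upgrade notes above can be checked against an environment before upgrading. The following is a minimal sketch using only the standard library; the helper names (version_tuple, upgrade_warnings, installed_versions) are hypothetical and are not part of dd-trace-py.

```python
import re
import sys
from importlib import metadata


def version_tuple(text):
    """Return (major, minor) parsed from the start of a version string."""
    match = re.match(r"(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def upgrade_warnings(platform_name, is_64bit, installed):
    """List the upgrade notes that apply to the described environment.

    `installed` maps distribution names to version strings.
    """
    warnings = []
    if platform_name.startswith("linux") and not is_64bit:
        warnings.append("32-bit Linux is no longer supported")
    if "aioredis" in installed:
        warnings.append("aioredis is installed, but its integration has been removed")
    tornado = installed.get("tornado")
    if tornado is not None and version_tuple(tornado) < (6, 1):
        warnings.append(f"tornado {tornado} is older than the supported v6.1")
    return warnings


def installed_versions(names):
    """Look up the installed version of each distribution, skipping absent ones."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    current = installed_versions(["aioredis", "tornado"])
    # sys.maxsize exceeds 2**32 only on 64-bit interpreters.
    for reason in upgrade_warnings(sys.platform, sys.maxsize > 2**32, current):
        print("warning:", reason)
```

Keeping upgrade_warnings free of direct environment access makes it easy to exercise with made-up inputs, while the final block applies it to the running interpreter.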

Deprecation Notes

  • tornado
    • Deprecated support for Tornado versions older than v6.1. Use Tornado v6.1 or later.
    • Deprecated programmatic tracing configuration via the ddtrace.contrib.tornado module. Configure tracing using environment variables and import ddtrace.auto instead.
  • LLM Observability
    • The rows and summary_evaluations attributes of the ExperimentResult class are deprecated and will be removed in the next major release. For multi-run experiments, these attributes only store the results of the first run iteration. Use the ExperimentResult.runs attribute instead to access experiment results and summary evaluations.

New Features

  • profiling
    • Add support for threading.BoundedSemaphore locking type profiling in Python. The implementation follows the same approach as threading.Semaphore, properly handling internal lock detection to prevent double-counting of the underlying threading.Lock object.
    • Add support for threading.Semaphore locking type profiling in Python. The Lock profiler now detects and marks "internal" Lock objects, i.e. those that are part of the implementation of higher-level locking types. One example of such a higher-level primitive is threading.Semaphore, which is implemented with threading.Condition, which itself uses threading.Lock internally. Marking an internal lock as "internal" prevents it from being sampled, ensuring that the high-level (e.g. Semaphore) sample is processed.
    • This adds support for Python 3.14 in the Continuous Profiler.
    • This adds the process_id tag to profiles. The value of this tag is the current process ID (PID).
    • The stack sampler supports async generators and asyncio.wait.
    • Shows the fully qualified names of functions, using codeobject.co_qualname, in memory profiler and lock profiler flamegraphs for Python 3.11+. The stack profiler already did this, so the change aligns the user experience across profile types.
    • This introduces tracking for the asyncio.as_completed util in the Profiler.
    • This introduces tracking for asyncio.wait in the Profiler. This makes it possible to track dependencies between Tasks/Coroutines that await/are awaited through asyncio.wait.
  • AAP
    • Attaches Application and API Protection findings to API Gateway inferred spans to enable AppSec API Catalog coverage of Lambda functions.
    • This introduces proper support for API10 for redirected requests on urllib3.
  • anthropic
    • Adds support for the Anthropic Beta client API (client.beta.messages.create() and client.beta.messages.stream()). This feature requires Anthropic client version 0.37.0 or higher.
  • aiokafka
    • Adds DSM instrumentation support.
    • Adds instrumentation support for aiokafka>=0.9.0. See the aiokafka documentation (https://ddtrace.readthedocs.io/en/stable/integrations.html#aiokafka) for more information.
  • Added support for uWSGI with gevent when threads are also patched. The use of the keyword argument thread=False is no longer required when performing monkey-patching with gevent via gevent.monkey.patch_all.
  • LLM Observability
    • Reasoning token counts are now captured from Google GenAI responses.
    • The OpenAI integration now captures prompt metadata (id, version, variables, and chat template) for reusable prompts when using the responses endpoint (available in OpenAI SDK >= 1.87.0).
    • Experiments can now be run multiple times by using the optional runs argument, to assess the true performance of an experiment given the non-determinism of LLMs. Use the runs attribute of the new ExperimentResult class to access the results and summary evaluations by run iteration.
    • Non-root experiment spans are now tagged with experiment ID, run ID, and run iteration tags.
    • Adds additional tags to MCP client session and tool call spans to power LLM Observability MCP tool call features.
    • Reasoning token counts are now captured from OpenAI and OpenAI Agents responses.
    • openai
      • This introduces support for capturing server-side MCP tool calls invoked via the OpenAI Responses API as a separate span.
  • langchain
    • Adds support for tracing RunnableLambda instances.
  • mcp
    • Marks client MCP tool call spans as errors when the corresponding server tool call errored.
  • Crashtracker
    • This introduces a fallback to capture runtime stack frames when Python's _Py_DumpTracebackThreads function is not available.
  • ASGI
    • Enables context propagation between websocket message spans.
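
The layering described in the profiling notes above (a Semaphore built on a Condition, which wraps a Lock) can be observed directly. This is a minimal sketch, assuming CPython's private _cond and _lock attributes, which are implementation details and may change; the helper lock_layers is hypothetical and does not reflect how the Lock profiler itself detects internal locks.

```python
import threading


def lock_layers(primitive):
    """Return the (condition, lock) pair inside a Semaphore-like primitive.

    Relies on CPython's private `_cond` and `_lock` attributes, which are
    implementation details and not a public API.
    """
    condition = primitive._cond
    return condition, condition._lock


for factory in (threading.Semaphore, threading.BoundedSemaphore):
    sem = factory(1)
    condition, inner = lock_layers(sem)

    # The innermost object is an ordinary threading.Lock instance.
    assert isinstance(inner, type(threading.Lock()))

    # Entering the condition takes the inner lock. In CPython, acquire() and
    # release() on the semaphore enter the condition the same way.
    with condition:
        held = inner.locked()
    print(factory.__name__, "inner lock held inside the condition:", held)
```

Because each operation on the semaphore also takes the inner lock, a profiler that sampled both objects would record the same wait twice, which is the double-counting the notes describe.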

Bug Fixes

  • avro
    • Fixes an issue where Avro instrumentation does not return method results when DSM is enabled.
  • crashtracker
    • Fixes missing inheritance of environment variables by the receiver process.
  • dynamic instrumentation
    • Fixes an issue where line probes matched the wrong source file when multiple source files from different Python path entries share the same name.
    • Snapshot uploads are now retried on all HTTP error codes.
  • exception replay
    • Fixed the order in which frames are captured to ensure that the values of frames close to the point where the initial exception was thrown are always attached to the relevant spans.
    • Fixed an infinite loop that could cause memory leaks when capturing exceptions, and improved overall speed and memory performance.
    • Ensures exception information is captured when exceptions are raised by the GraphQL client library.
  • Code Security
    • Fixes a critical memory safety issue in IAST when used with forked worker processes (MCP servers with Gunicorn and Uvicorn). Workers previously crashed with segmentation faults due to stale PyObject pointers in native taint maps after fork.
  • openai
    • Resolves an issue where instantiating an OpenAI client with a non-string API key resulted in parsing issues.
  • tracing
    • Fixed a potential IndexError in partial flush when the finished span counter was out of sync with actual finished spans.
    • DD_TRACE_PARTIAL_FLUSH_MIN_SPANS values less than 1 now default to 1 with a warning.
    • CI Visibility: Ensure the HTTP connection is correctly reset in all error scenarios.
  • ray
    • This fix resolves an issue where Ray jobs that did not explicitly call ray.init() at the top of their scripts were not properly instrumented, resulting in incomplete traces. To ensure full tracing capabilities, use ddtrace-run when starting your Ray cluster: DD_PATCH_MODULES="ray:true,aiohttp:false,grpc:false,requests:false" ddtrace-run ray start --head.
  • AAP
    • This fix resolves an issue where the AppSec layer was no longer compatible with the lambda/serverless version of the tracer.
  • lib-injection
    • No longer injects into the gsutil tool.
  • LLM Observability
    • Fixes an issue where LLMObs.export_span() would raise when LLMObs is disabled.
    • Resolves an issue where self was being annotated as an input parameter using LLM Observability function decorators.
    • This fix resolves an issue where LLMObs.annotation_context() properties (tags, prompt, and name) were not applied to subsequent LLM operations within the same context block. This occurred when multiple sequential operations (such as Langchain batch calls with structured outputs) were performed, causing only the first operation to receive the annotations.
    • This fix resolves an issue where evaluation-metric labels containing dots could be interpreted as nested objects by adding validation that rejects such labels and provides a clear error message instructing users to use alternative naming conventions.
    • Fixes an issue where the Google ADK integration would throw an AttributeError when trying to access the name or description attributes of a tool.
  • opentelemetry
    • Fixed spans going unsampled when using opentelemetry.trace.get_current_span() or NonRecordingSpan. Spans are now kept and appear in the UI unless explicitly dropped by the Agent or sampling rules.
  • profiling
    • This fix resolves a critical issue where the Lock Profiler generated release samples for non-sampled lock acquires, resulting in inflated or, on integer overflow, negative lock hold times (e.g., "3.24k days per minute", "-970 days per minute"). This affected virtually all customers using sampling rates < 100% (which should be the majority).
    • This fix prevents a use-after-free crash from the memory profiler on Python versions 3.10 and 3.11. The previous attempt to fix this bug itself had a bug, which this fix addresses.
    • Improves reliability when parsing an empty span.
    • Fixes a segmentation fault caused by accessing frame.f_locals while trying to retrieve the class name of a PyFrameObject.
    • This fix improves the detection of on-CPU asyncio Tasks. Previously, the Profiler would only consider a Task as running if its coroutine was running. The Profiler now recursively checks if any coroutine in the await chain of the Task's coroutine is running.
    • This fix makes stack sampling more accurate for on-CPU asyncio Tasks.
    • This fix resolves a race condition leading to incorrect stacks being reported for asyncio parent/child Tasks (e.g. when using asyncio.gather).
    • This fix resolves a possible crash coming from the experimental "fast memory copy" feature of the Stack Sampler. It occurred when the Profiler's signal handlers were replaced by other ones (from the application code).
    • This updates the stack sampler to fix a bug that would lead to OOMs when the sampler read invalid data from the Python process.
    • This fixes a bug where asyncio stacks would only get partial data, with some coroutines not showing.
    • This improves stack unwinding for asyncio workloads running Python 3.13+ by replicating the official PyGen_yf function from CPython 3.13. Previously, the sampler used the implementation from an older version of CPython, which could lead to incomplete asyncio stacks.
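
Several of the profiling fixes above concern the await chain of an asyncio Task's coroutine. As a rough illustration of what that chain is, the sketch below walks it from pure Python using the standard cr_await attribute. The Profiler does this natively by reading interpreter state, so this is not its implementation; Suspend, fetch, handler, and await_chain are hypothetical names.

```python
import inspect


class Suspend:
    """Minimal awaitable that yields control once, standing in for real I/O."""

    def __await__(self):
        yield


async def fetch():
    await Suspend()


async def handler():
    await fetch()


def await_chain(coro):
    """Return the names of the coroutines reachable by following cr_await."""
    names = []
    while inspect.iscoroutine(coro):
        names.append(coro.cr_code.co_name)
        coro = coro.cr_await  # what this coroutine is currently suspended on
    return names


root = handler()
root.send(None)  # run until the first suspension point
print(await_chain(root))  # ['handler', 'fetch']
root.close()
```

A check that only looks at the outermost coroutine (handler here) sees one link of the chain; following cr_await reaches the coroutines it is waiting on.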

Other Changes

  • Code Origin for Spans
    • Outgoing requests are no longer included with code origin for spans.
  • profiling
    • Moves echion, the Python stack sampler, to the dd-trace-py repository.
    • Stores memalloc samples as native objects, avoiding calls into CPython internals.