getsentry · philipphofmann · Nov 13, 2025 · Oct 14, 2025 · Oct 16, 2025 · Oct 21, 2025
diff --git a/develop-docs/sdk/telemetry/spans/batch-processor.mdx b/develop-docs/sdk/telemetry/spans/batch-processor.mdx
@@ -10,7 +10,7 @@ title: Batch Processor
   This document uses key words such as "MUST", "SHOULD", and "MAY" as defined in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) to indicate requirement levels.
 </Alert>
 
-The BatchProcessor batches spans and logs into one envelope to reduce the number of HTTP requests. When an SDK implements span streaming or logs, it MUST use a BatchProcessor, which is similar to [OpenTelemetry's Batch Processor](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md). The BatchProcessor holds logs and finished spans in memory and batches them together into envelopes. It uses a combination of time and size-based batching. When writing this, the BatchProcessor only handles spans and logs, but an SDK MAY use it for other telemetry data in the future.
+The BatchProcessor batches spans and logs into one envelope to reduce the number of HTTP requests. When an SDK implements span streaming or logs, it MUST use a BatchProcessor, which is similar to [OpenTelemetry's Batch Processor](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md). The BatchProcessor tracks logs and finished spans, allowing it to batch them together into envelopes. It uses a combination of time- and size-based batching. At the time of writing, the BatchProcessor only handles spans and logs, but an SDK MAY use it for other telemetry data in the future.
 
 ## Specification
 
@@ -22,17 +22,16 @@ The BatchProcessor MUST send all items after the SDK when containing spans or lo
 
 When the BatchProcessor sends all spans or logs, it MUST reset its timeout and remove all spans and logs. The SDK MUST apply filtering and sampling before adding spans or logs to the BatchProcessor. The SDK MUST apply rate limits to spans and logs after they leave the BatchProcessor to send as much data as possible by dropping data as late as possible.
 
-The BatchProcessor MUST send all spans and logs in memory to avoid data loss in the following scenarios:
+The BatchProcessor MUST send all spans and logs to avoid data loss in the following scenarios:
 
 1. When the user calls `SentrySDK.flush()`, the BatchProcessor MUST send all data in memory.
 2. When the user calls  `SentrySDK.close()`, the BatchProcessor MUST send all data in memory.
 3. When the application shuts down gracefully, the BatchProcessor SHOULD send all data in memory. This is mostly relevant for mobile SDKs already subscribed to these hooks, such as [applicationWillTerminate](https://developer.apple.com/documentation/uikit/uiapplicationdelegate/applicationwillterminate(_:)) on iOS.
 4. When the application moves to the background, the BatchProcessor SHOULD send all data in memory and stop the timer. This is mostly relevant for mobile SDKs.
-5. We're working on concept for crashes, and will update the specification when we have more details.
+5. If applicable to your environment, SDKs MUST minimize data loss when sudden process terminations occur. Refer to the [Sudden Process Terminations](#sudden-process-terminations) section for more details.
 
 The detailed specification is written in the [Gherkin syntax](https://cucumber.io/docs/gherkin/reference/). The specification uses spans as an example, but the same applies to logs or any other future telemetry data.
 
-
 ```Gherkin
 Scenario: No spans in BatchProcessor 1 span added
     Given no spans in the BatchProcessor
@@ -95,3 +94,94 @@ Scenario: 1 span added application crashes
   And loses the spans in the BatchProcessor
 
 ```
+
+## Sudden Process Terminations
+
+The BatchProcessor MUST minimize data loss during sudden process terminations, such as crashes or watchdog terminations.
+
+Each SDK environment is unique. Therefore, SDKs have three options to choose from to minimize data loss. The options are ordered by increasing complexity: the first is the simplest, and the last is the most complex. SDKs SHOULD implement the least complex option that is suitable for their environment.
+
+### 1. Flush All Data
+
+When the SDK detects a sudden process termination, it MUST put all remaining items in the BatchProcessor into one envelope and flush it. If your SDK has an offline cache, it MAY flush the envelope to disk and skip sending it to Sentry, as long as it ensures the envelope is sent the next time the SDK starts. The BatchProcessor MUST keep its existing logic described in the [specification](#specification) above.
+
+Suppose your SDK can't reliably detect sudden process terminations, or it can't reliably flush envelopes to Sentry or disk when a sudden process termination happens. In that case, it SHOULD implement the [FileStream Cache](#2-filestream-cache) or the [DoubleRotatingBuffer](#3-doublerotatingbuffer). It's acceptable to start with this option as a best-effort interim solution before adding one of the more complex options.
+
+### 2. FileStream Cache
+
+SDKs for which blocking the main thread is a no-go, such as the Android and Apple SDKs, MUST NOT implement this option. They SHOULD implement the [DoubleRotatingBuffer](#3-doublerotatingbuffer) instead.
+
+With this option, the BatchProcessor stores the data directly to disk on the calling thread. The SDK SHOULD store the BatchProcessor files in a folder named `batch-processor` that is a sibling of the `envelopes` or `replay` folder. This folder is scoped per DSN, so SDKs don't mix up data from different DSNs. In the `batch-processor` folder, the SDK MUST store two types of cache files:
+
+- **`cache`** - The file the processor is actively writing to
+- **`flushing`** - The file being converted to an envelope and sent to Sentry
+
+When the timeout expires or the cache file hits the size limit, the BatchProcessor renames the `cache` file to `flushing`, creates a new `cache` file for incoming data, converts the data in the `flushing` file to an envelope, sends it to Sentry, and then deletes the `flushing` file. When the SDK starts again, it MUST check if there are any cache files in the cache directory (both `cache` and `flushing`) and if so, it MUST load the data from the files and send it to Sentry.
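The rotation described above can be sketched roughly as follows. This is a minimal, single-threaded illustration, not a reference implementation: the `FileStreamCache` class, the `send` callback, and the one-item-per-line file format are assumptions made for the example, and the timeout trigger is left out for brevity.

```python
import os
from pathlib import Path


class FileStreamCache:
    """Hypothetical sketch of the `cache` / `flushing` file rotation."""

    def __init__(self, folder: Path, max_bytes: int, send):
        self.max_bytes = max_bytes
        self.send = send  # hypothetical callback that takes a list of items
        self.cache = folder / "cache"
        self.flushing = folder / "flushing"
        folder.mkdir(parents=True, exist_ok=True)
        self.recover()

    def recover(self):
        # On SDK start, send whatever a previous run left behind.
        for path in (self.flushing, self.cache):
            if path.exists():
                self._send_file(path)

    def add(self, item: str):
        # Written on the calling thread; assumes one single-line item per line.
        # Opening in append mode creates a new `cache` file when none exists.
        with open(self.cache, "a", encoding="utf-8") as f:
            f.write(item + "\n")
            f.flush()
            os.fsync(f.fileno())
        if self.cache.stat().st_size >= self.max_bytes:
            self.flush()

    def flush(self):
        # Never overwrite a `flushing` file that has not been sent yet.
        if self.flushing.exists():
            self._send_file(self.flushing)
        if not self.cache.exists():
            return
        os.replace(self.cache, self.flushing)  # rename `cache` to `flushing`
        self._send_file(self.flushing)

    def _send_file(self, path: Path):
        items = path.read_text(encoding="utf-8").splitlines()
        if items:
            self.send(items)
        path.unlink()  # delete only after sending
```

Because the file is deleted only after `send` returns, a termination between the rename and the delete leaves a `flushing` file on disk that the next start picks up.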
+
+
+### 3. DoubleRotatingBuffer
+
+SDKs should only consider implementing this option when options [1](#1-flush-all-data) or [2](#2-filestream-cache) are insufficient to prevent data loss within their ecosystem. We recommend this option only if SDKs are unable to reliably detect sudden process terminations or to consistently store envelopes to disk during such terminations, as can occur on Android or Apple devices.
+
+The BatchProcessor uses two buffers to minimize data loss in the event of an abnormal process termination:
+* **Crash-Safe List**: A list stored in a crash-safe space to prevent data loss during detectable abnormal process terminations.
+* **Async IO Cache**: When a process terminates without the SDK being able to detect it, the crash-safe list loses all its elements. Therefore, the BatchProcessor uses a second buffer, the async IO cache, that stores elements to disk on a background thread to avoid blocking the calling thread, which ensures minimal data loss when such terminations occur.
+
+Furthermore, the BatchProcessor MUST prevent data loss when flushing. Therefore, it uses a double-buffering solution, meaning the two buffers alternate. The crash-safe list consists of two lists, and the async IO cache consists of two files. When list1 is full, the BatchProcessor stores new items in list2 until it has successfully stored the items in list1 to disk as an envelope. Then it can delete the items in list1. The same applies to the async IO cache.
+
+#### BatchProcessor Files
+
+The SDK SHOULD store the BatchProcessor files in a folder named `batch-processor` that is a sibling of the `envelopes` or `replay` folder. This folder is scoped per DSN, so SDKs don't mix up data from different DSNs. The `batch-processor` folder MAY contain the following files:
+
+- `file-buffer1` and `file-buffer2` - The active IO buffers for the BatchProcessor.
+- `detected-termination-x` - The file containing items from a previous detected abnormal termination.
+- `envelope-to-flush-x` - The envelope that the BatchProcessor is about to move to the envelopes cache folder, so the SDK can send it to Sentry, where `x` is an increasing index of the file starting from 0.
+
+
+#### Receiving Items
+
+The BatchProcessor has two lists, `crash-safe-list1` and `crash-safe-list2`, and two files, `file-buffer1` and `file-buffer2`. When it receives items, it performs the following steps:
+
+1. Put the item into `crash-safe-list1` on the calling thread.
+2. On a background thread, store the item in the `file-buffer1`.
+
+#### Flushing
+
+When `crash-safe-list1` exceeds the 1MiB size limit [described above](#specification) or the timeout expires, the BatchProcessor performs the following flushing steps:
+
+1. Store new incoming items to the `crash-safe-list2` and `file-buffer2`.
+2. Put the items of `crash-safe-list1` into an envelope named `envelope-to-flush-x`.
+3. Delete the items in `crash-safe-list1` and `file-buffer1`.
+4. Move the `envelope-to-flush-x` to the envelopes cache folder, in which all the other envelopes are stored, so the SDK can send it to Sentry.
+
+The BatchProcessor doesn't store the `envelope-to-flush-x` directly in the envelopes cache folder because, if an abnormal process termination occurs before it deletes the items in `crash-safe-list1` and `file-buffer1`, the SDK might send duplicate items.
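The four flushing steps can be illustrated with the sketch below. It is only an approximation under stated assumptions: the `DoubleBuffer` class is hypothetical, plain Python lists stand in for the crash-safe lists, the file write happens inline instead of on a background thread, and the staged file is a JSON array rather than a real Sentry envelope.

```python
import json
import os
from pathlib import Path


class DoubleBuffer:
    """Hypothetical sketch of the alternating lists and file buffers."""

    def __init__(self, folder: Path, envelopes: Path):
        self.folder, self.envelopes = folder, envelopes
        folder.mkdir(parents=True, exist_ok=True)
        envelopes.mkdir(parents=True, exist_ok=True)
        self.lists = [[], []]  # stand-ins for crash-safe-list1 and crash-safe-list2
        self.files = [folder / "file-buffer1", folder / "file-buffer2"]
        self.active = 0  # index of the buffer pair that receives new items
        self.index = 0   # the `x` in envelope-to-flush-x

    def add(self, item: dict):
        self.lists[self.active].append(item)
        # A real SDK would perform this write on a background thread.
        with open(self.files[self.active], "a", encoding="utf-8") as f:
            f.write(json.dumps(item) + "\n")

    def flush(self) -> Path:
        full = self.active
        # Step 1: route new incoming items to the other list and file.
        self.active = 1 - full
        # Step 2: stage the items of the full list inside the batch-processor folder.
        staged = self.folder / f"envelope-to-flush-{self.index}"
        self.index += 1
        staged.write_text(json.dumps(self.lists[full]), encoding="utf-8")
        # Step 3: delete the items that are now safely staged.
        self.lists[full].clear()
        self.files[full].unlink(missing_ok=True)
        # Step 4: move the staged envelope to the envelopes cache folder.
        final = self.envelopes / staged.name
        os.replace(staged, final)
        return final
```

Staging first and moving last means that a termination between steps 2 and 4 leaves the envelope in the `batch-processor` folder, where initialization logic can decide what to do with it, instead of in the envelopes folder next to buffers that still hold the same items.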
+
+
+#### Abnormal Process Termination
+
+When SDKs detect an abnormal process termination, they MUST write the items in both `crash-safe-list1` and `crash-safe-list2` to the `detected-termination-x` file, where `x` is an increasing index of the file starting from 0.
+
+When the process terminates abnormally and the SDKs can't detect it, the SDKs lose the items in the crash-safe lists. We accept this loss over blocking the calling thread, which could be the main thread.
+
+#### SDK Initialization
+
+Whenever the SDKs initialize, they must check if there is any data in the batch processor folder that needs to be recovered. SDKs MUST perform the following steps when initializing:
+
+1. If there are items in the `file-buffer1` or `file-buffer2` file, store all items into a file named `undetected-termination-x`.
+2. Create new `file-buffer1` and `file-buffer2` files and store new items to these files.
+3. Load all items from the `undetected-termination-x` and `detected-termination-x` files and deduplicate them based on the IDs of the items.
+4. Put the deduplicated items into the `envelope-to-flush-x` in the batch processor cache folder.
+5. Delete the `undetected-termination-x` and `detected-termination-x` files.
+6. Move the `envelope-to-flush-x` to the envelopes cache folder.
+
+As abnormal terminations can occur at any time, there may be multiple `undetected-termination-x` and `detected-termination-x` files. SDKs MUST handle multiple file pairs at each of the above-described steps. For example, if there are two pairs of `undetected-termination-x` and `detected-termination-x`, the SDKs should perform steps 3 to 6 for both pairs.
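A rough sketch of the six initialization steps is shown below. Several details are assumptions made for the example and not part of the specification: the `recover` function and its helpers are hypothetical, items are stored as one JSON object per line with an `id` field, the staged file is a JSON array rather than a real Sentry envelope, and all termination files are merged into a single envelope instead of being processed pair by pair.

```python
import json
import os
from pathlib import Path


def read_items(path: Path) -> list:
    # Assumes one JSON object per line.
    lines = path.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


def next_index(prefix: str, *folders: Path) -> int:
    # Smallest `x` that is larger than every existing `<prefix>x` file.
    used = [int(p.name[len(prefix):])
            for d in folders for p in d.glob(prefix + "[0-9]*")]
    return max(used, default=-1) + 1


def recover(folder: Path, envelopes: Path):
    folder.mkdir(parents=True, exist_ok=True)
    envelopes.mkdir(parents=True, exist_ok=True)
    buffers = [folder / "file-buffer1", folder / "file-buffer2"]

    # Step 1: leftover buffer items indicate an undetected termination.
    leftover = [i for b in buffers if b.exists() for i in read_items(b)]
    if leftover:
        x = next_index("undetected-termination-", folder)
        target = folder / f"undetected-termination-{x}"
        target.write_text("".join(json.dumps(i) + "\n" for i in leftover),
                          encoding="utf-8")

    # Step 2: start with fresh, empty file buffers.
    for b in buffers:
        b.write_text("", encoding="utf-8")

    # Step 3: load all termination files and deduplicate by item ID.
    sources = (sorted(folder.glob("undetected-termination-*"))
               + sorted(folder.glob("detected-termination-*")))
    if not sources:
        return None
    unique = {}
    for source in sources:
        for item in read_items(source):
            unique.setdefault(item["id"], item)

    # Step 4: stage the envelope inside the batch-processor folder.
    x = next_index("envelope-to-flush-", folder, envelopes)
    staged = folder / f"envelope-to-flush-{x}"
    staged.write_text(json.dumps(list(unique.values())), encoding="utf-8")

    # Step 5: delete the termination files.
    for source in sources:
        source.unlink()

    # Step 6: move the envelope to the envelopes cache folder.
    final = envelopes / staged.name
    os.replace(staged, final)
    return final
```

Deduplication matters because an item that was written to both a crash-safe list and a file buffer before a detected termination appears in both a `detected-termination-x` file and an `undetected-termination-x` file.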
+
+#### SDK Closes
+
+Whenever the user closes the SDK or the application terminates normally, the BatchProcessor MUST perform the steps described in the [Flushing](#flushing) section and the SDK MUST delete all items in the `file-buffer1` and `file-buffer2` files.
+
+#### Miscellaneous
+
+The BatchProcessor maintains its logic of batching multiple logs and spans together into a single envelope to avoid multiple HTTP requests.
+
+Hybrid SDKs pass every log and span down to the native SDKs, which put them in their BatchProcessor and its cache. Hybrid SDKs do this once logs and spans are ready for sending, meaning after they have gone through `beforeLog`, integrations, processors, etc.