Description
Enhancement
When deployed on cloud providers, block storage is typically subject to bandwidth quotas between the storage volume and the compute instance. If read/write throughput exceeds this quota, it can result in significantly increased I/O latency for both reads and writes.
Currently, TiFlash Compute's FileCache downloads data from S3 into local block storage as follows: it obtains a std::istream via the S3 GetObject API, then makes a single request to the RateLimiter for the total size of the S3 object before writing. Once the rate limiter grants the request, the whole object is written to local block storage in one burst.
tiflash/dbms/src/Storages/S3/FileCache.cpp
Lines 1010 to 1029 in 707139c
    auto temp_fname = toTemporaryFilename(local_fname);
    {
        Aws::OFStream ostr(temp_fname, std::ios_base::out | std::ios_base::binary);
        RUNTIME_CHECK_MSG(ostr.is_open(), "Open {} failed: {}", temp_fname, strerror(errno));
        if (content_length > 0)
        {
            if (write_limiter)
                write_limiter->request(content_length);
            GET_METRIC(tiflash_storage_remote_cache_bytes, type_dtfile_download_bytes).Increment(content_length);
            ostr << result.GetBody().rdbuf();
            // If content_length == 0, ostr.good() is false. Does not know the reason.
            RUNTIME_CHECK_MSG(
                ostr.good(),
                "Write {} content_length {} failed: {}",
                temp_fname,
                content_length,
                strerror(errno));
            ostr.flush();
        }
    }
This approach may keep the average write throughput below the quota, so it appears compliant, but the instantaneous write rate during the actual download can temporarily exceed the quota. Such bursts can trigger throttling by the underlying block storage, leading to elevated I/O latency.
For example, fine-grained monitoring metrics from the cloud provider might show that the average throughput stays below the 470 MB/s quota while the maximum (peak) throughput clearly exceeds it, indicating short bursts that violate the quota and degrade performance.
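One possible way to smooth these bursts is to request permission from the limiter per chunk rather than once for the whole object, so that each grant covers only a small write. The sketch below illustrates the idea and is not necessarily the change TiFlash will adopt. `ByteLimiter`, `RecordingLimiter`, and `copyWithChunkedLimit` are hypothetical names introduced for illustration; only the `request(bytes)` call shape is taken from the snippet above. The chunk size is left as a parameter because a suitable value is an assumption that would need tuning.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the write limiter; a real limiter would block
// inside request() until `bytes` may be written.
struct ByteLimiter
{
    virtual ~ByteLimiter() = default;
    virtual void request(std::size_t bytes) = 0;
};

// Illustration-only limiter that records the size of every request
// instead of sleeping, so the chunking behaviour can be inspected.
struct RecordingLimiter : ByteLimiter
{
    std::vector<std::size_t> requests;
    void request(std::size_t bytes) override { requests.push_back(bytes); }
};

// Copies `in` to `out` in chunks of at most `chunk_size` bytes, asking the
// limiter for permission before each chunk is written.
// Returns the number of bytes successfully written.
inline std::size_t copyWithChunkedLimit(
    std::istream & in,
    std::ostream & out,
    ByteLimiter * limiter,
    std::size_t chunk_size)
{
    assert(chunk_size > 0);
    std::vector<char> buf(chunk_size);
    std::size_t total = 0;
    while (in)
    {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        const auto n = static_cast<std::size_t>(in.gcount());
        if (n == 0)
            break; // nothing left to copy
        if (limiter)
            limiter->request(n); // per-chunk grant instead of one large grant
        out.write(buf.data(), static_cast<std::streamsize>(n));
        if (!out)
            break; // caller should check the stream state and report the error
        total += n;
    }
    return total;
}
```

With a limiter that actually blocks, the largest write issued after any single grant is bounded by `chunk_size`, which limits how far the instantaneous rate can overshoot the configured rate. The caller would still need to check `out.good()` and flush afterwards, as the existing code does.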
