-
Notifications
You must be signed in to change notification settings - Fork 591
feat(ts): add the support of TWA aggregator to Range and MRange #3262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: unstable
Are you sure you want to change the base?
Conversation
rangeCommon() in redis_timeseries.cc will add two samples, 'prev_sample' and 'next_sample' to the 'res' vector when the aggregator is twa. These two samples are used to compute the area of the polygons that are at the front of the first bucket and at the end of last bucket. prev_sample is the biggest sample in the data with timestamp less than or equal to first sample of filtered range and next_sample is the smallest sample in the data with timestamp greater than or equal to the last sample of filtered range. TODO: test TWA with FILTER_BY_TS/FILTER_BY_VALUE option.
When filtered with FILTER_BY_TS/FILTER_BY_VALUE, the `next` and `prev` samples are discarded while computing the area.
|
279c9bd will correctly calculate TWA when samples are filtered with |
The CI failed due to a clang-tidy report in the changes. Could you fix it to pass the CI? |
|
Some of these errors didn't show up when I ran |
|
Sorry for the wait! Been a bit busy these days. I'll review this later today. : ) |
some code is refactored for readability.
|
Thanks @yezhizi . I was about to push the clang-tidy fixes. |
|
| TSSample prev_sample, next_sample; | ||
| bool is_twa_aggregator = aggregator.type == TSAggregatorType::TWA, prev_available = false, next_available = false; | ||
| if (is_twa_aggregator) { | ||
| const bool discard_boundaries = !option.filter_by_ts.empty() || option.filter_by_value.has_value(); | ||
| next_sample = samples.back(); | ||
| samples.pop_back(); | ||
| prev_sample = samples.back(); | ||
| samples.pop_back(); | ||
| // When FILTER_BY_TS/FILTER_BY_VALUE is enabled, discard out-of-boundary samples. | ||
| prev_available = discard_boundaries ? false : !samples.empty() && (samples.front().ts != prev_sample.ts); | ||
| next_available = discard_boundaries ? false : !samples.empty() && (samples.back().ts != next_sample.ts); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can pass next_sample, next_available, etc. as function parameters instead of calculating them inside the AggregateSamplesByRangeOption function. For example, we can add a struct:
struct TWABounds {
std::optional<TSSample> prev_sample;
std::optional<TSSample> next_sample;
};
And modify the function interface to:
AggregateSamplesByRangeOption(std::vector<TSSample> samples, const TSRangeOption &option, const TWABounds&)
| auto non_empty_left_bucket_idx = [&spans](size_t curr) { | ||
| while (--curr && spans[curr].empty()); | ||
| return curr; | ||
| }; | ||
| auto non_empty_right_bucket_idx = [&spans](size_t curr) { | ||
| while (++curr < spans.size() && spans[curr].empty()); | ||
| return curr; | ||
| }; | ||
|
|
||
| std::vector<std::pair<TSSample, TSSample>> neighbors; | ||
| neighbors.reserve(spans.size()); | ||
| for (size_t i = 0; i < spans.size(); i++) { | ||
| TSSample prev = (i != 0) ? spans[non_empty_left_bucket_idx(i)].back() : prev_sample; | ||
| TSSample next = (i != (spans.size() - 1)) ? spans[non_empty_right_bucket_idx(i)].front() : next_sample; | ||
| neighbors.emplace_back(prev, next); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nested while loops inside the for loop result in O(N^2) time complexity, which can be optimized to O(N). We could:
- Iterate from 0 to N to resolve all prev neighbors by maintaining a "last seen non-empty" variable.
- Iterate from N to 0 to resolve all next neighbors similarly.



It closes #3217 .
rangeCommon()in redis_timeseries.cc will add two samples,prev_sampleandnext_sampleto theresvector when the aggregator is twa. These two samples are used to compute the area of the polygons that are at the front of the first bucket and at the end of last bucket.prev_sampleis the biggest sample in the data with timestamp less than or equal to first sample of filtered range andnext_sampleis the smallest sample in the data with timestamp greater than or equal to the last sample of filtered range.TODO: test TWA with FILTER_BY_TS/FILTER_BY_VALUE option.