diff --git a/docs/source/developer-guide/add_metrics.md b/docs/source/developer-guide/add_metrics.md new file mode 100644 index 000000000..a629944da --- /dev/null +++ b/docs/source/developer-guide/add_metrics.md @@ -0,0 +1,106 @@ +# Custom Metrics +UCM supports custom metrics with bidirectional updates from both Python and C++ runtimes. The unified monitoring interface provides the ability to mutate stats across language boundaries through a shared metrics registry. + +## Architecture Overview +The metrics consists of these components below: +- **monitor** : Central stats registry that manages all metric lifecycle operations (registration, creation, updates, queries) +- **observability.py** : Prometheus integration layer that handles metric exposition +- **metrics_config.yaml** : Declarative configuration that defines which custom metrics to register and their properties + +## Getting Started +### Define Metrics in YAML +Prometheus provides three fundamental metric types: Counter, Gauge, and Histogram. UCM implements corresponding wrappers for each type. The method for adding new metrics is as follows; please refer to the [example YAML](https://github.com/ModelEngine-Group/unified-cache-management/blob/develop/examples/metrics/metrics_configs.yaml) for more detailed information. +```yaml +log_interval: 5 # Interval in seconds for logging metrics + +prometheus: + multiproc_dir: "/vllm-workspace" # Directory for Prometheus multiprocess mode + + metric_prefix: "ucm:" + + # Enable/disable metrics by category + enabled_metrics: + counters: true + gauges: true + histograms: true + + # Counter metrics configuration + counters: + - name: "received_requests" + documentation: "Total number of requests sent to ucm" + + # Gauge metrics configuration + gauges: + - name: "external_lookup_hit_rate" + documentation: "Hit rate of ucm lookup requests" + multiprocess_mode: "livemostrecent" + + # Histogram metrics configuration + histograms: + - name: "load_requests_num" + documentation: "Number of requests loaded from ucm" + buckets: [1, 5, 10, 20, 50, 100, 200, 500, 1000] +``` + +### Use Monitor APIs to Update Stats +The monitor provides a unified interface for metric operations. Users only need to create stats and update them, while the observability component is responsible for fetching the stats and pushing them to Prometheus. +:::::{tab-set} +:sync-group: install + +::::{tab-item} Python side interfaces +:selected: +:sync: py +**Lifecycle Methods** +- `create_stats(name)`: Create and initialize a registered stats object. + +**Operation Methods** +- `update_stats(name, dict)`: Update specific fields of a specific stats object. +- `get_stats(name)`: Retrieve current values of a specific stats object. +- `get_stats_and_clear(name)`: Retrieve and reset a specific stats object. +- `get_all_stats_and_clear()`: Retrieve and reset all stats objects. +- `reset_stats(name)`: Reset a specific stats object to initial state. +- `reset_all()`: Reset all stats registered in monitor. + +**Example:** Using built-in ConnStats +```python +from ucm.shared.metrics import ucmmonitor + +ucmmonitor.create_stats("ConnStats") # Create a stats obj + +# Update stats +ucmmonitor.update_stats( + "ConnStats", + {"interval_lookup_hit_rates": external_hit_blocks / len(ucm_block_ids)}, +) + +``` +See more detailed example in [test case](https://github.com/ModelEngine-Group/unified-cache-management/tree/develop/ucm/shared/test/example). + +:::: + +::::{tab-item} C++ side interfaces +:sync: cc +**Lifecycle Methods** +- `CreateStats(const std::string& name)`: Create and initialize a registered stats object. + +**Operation Methods** +- `UpdateStats(const std::string& name, const std::unordered_map& params)`: Update specific fields of a specific stats object. +- `ResetStats(const std::string& name)`: Retrieve current values of a specific stats object. +- `ResetAllStats()`: Retrieve and reset a specific stats object. +- `GetStats(const std::string& name)`: Retrieve and reset all stats objects. +- `GetStatsAndClear(const std::string& name)`: Reset a specific stats object to initial state. +- `GetAllStatsAndClear()`: Reset all stats registered in monitor. + +**Example:** Implementing custom stats in C++ +UCM supports custom metrics by following steps: +- Step 1: linking the static library monitor_static + ```c++ + target_link_libraries(xxxstore PUBLIC storeinfra monitor_static) + ``` +- Step 2: Create stats object using function **CreateStats** +- Step 3: Update using function **UpdateStats** + +See more detailed example in [test case](https://github.com/ModelEngine-Group/unified-cache-management/tree/develop/ucm/shared/test/case). + +:::: +::::: \ No newline at end of file diff --git a/docs/source/index.md b/docs/source/index.md index b494115bb..d7b441431 100644 --- a/docs/source/index.md +++ b/docs/source/index.md @@ -63,6 +63,7 @@ user-guide/metrics/metrics :caption: Developer Guide :maxdepth: 1 developer-guide/contribute +developer-guide/add_metrics ::: :::{toctree} diff --git a/ucm/integration/vllm/ucm_connector.py b/ucm/integration/vllm/ucm_connector.py index 66216a255..69f30df39 100644 --- a/ucm/integration/vllm/ucm_connector.py +++ b/ucm/integration/vllm/ucm_connector.py @@ -18,8 +18,8 @@ from vllm.v1.core.sched.output import SchedulerOutput from ucm.logger import init_logger +from ucm.observability import UCMStatsLogger from ucm.shared.metrics import ucmmonitor -from ucm.shared.metrics.observability import UCMStatsLogger from ucm.store.factory import UcmConnectorFactory from ucm.store.ucmstore import Task, UcmKVStoreBase from ucm.utils import Config @@ -172,12 +172,12 @@ def __init__(self, vllm_config: "VllmConfig", role: KVConnectorRole): self.metrics_config = self.launch_config.get("metrics_config_path", "") if self.metrics_config: + ucmmonitor.create_stats("ConnStats") self.stats_logger = UCMStatsLogger( vllm_config.model_config.served_model_name, self.global_rank, self.metrics_config, ) - self.monitor = ucmmonitor.StatsMonitor.get_instance() self.synchronize = ( torch.cuda.synchronize @@ -236,7 +236,7 @@ def get_num_new_matched_tokens( f"hit external: {external_hit_blocks}" ) if self.metrics_config: - self.monitor.update_stats( + ucmmonitor.update_stats( "ConnStats", {"interval_lookup_hit_rates": external_hit_blocks / len(ucm_block_ids)}, ) @@ -532,7 +532,7 @@ def start_load_kv(self, forward_context: "ForwardContext", **kwargs) -> None: / 1024 ) # GB/s if self.metrics_config and is_load: - self.monitor.update_stats( + ucmmonitor.update_stats( "ConnStats", { "load_requests_num": num_loaded_request, @@ -622,7 +622,7 @@ def wait_for_save(self) -> None: / 1024 ) # GB/s if self.metrics_config and is_save: - self.monitor.update_stats( + ucmmonitor.update_stats( "ConnStats", { "save_requests_num": num_saved_request, diff --git a/ucm/shared/metrics/observability.py b/ucm/observability.py similarity index 75% rename from ucm/shared/metrics/observability.py rename to ucm/observability.py index fb33400ce..dd195a1cf 100644 --- a/ucm/shared/metrics/observability.py +++ b/ucm/observability.py @@ -21,22 +21,17 @@ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. # - +# Be impressed by https://github.com/LMCache/LMCache/blob/dev/lmcache/observability.py import os import threading import time from dataclasses import dataclass -from pathlib import Path -from typing import Any, Dict, List, Optional, Union +from typing import Any, List import prometheus_client import yaml -# Third Party -from prometheus_client import REGISTRY -from vllm.distributed.parallel_state import get_world_group - from ucm.logger import init_logger from ucm.shared.metrics import ucmmonitor @@ -56,7 +51,7 @@ class PrometheusLogger: _counter_cls = prometheus_client.Counter _histogram_cls = prometheus_client.Histogram - def __init__(self, metadata: UCMEngineMetadata, config: Dict[str, Any]): + def __init__(self, metadata: UCMEngineMetadata, config: dict[str, Any]): # Ensure PROMETHEUS_MULTIPROC_DIR is set before any metric registration prometheus_config = config.get("prometheus", {}) multiproc_dir = prometheus_config.get("multiproc_dir", "/vllm-workspace") @@ -74,7 +69,7 @@ def __init__(self, metadata: UCMEngineMetadata, config: Dict[str, Any]): self._init_metrics_from_config(labelnames, prometheus_config) def _init_metrics_from_config( - self, labelnames: List[str], prometheus_config: Dict[str, Any] + self, labelnames: List[str], prometheus_config: dict[str, Any] ): """Initialize metrics based on configuration""" enabled = prometheus_config.get("enabled_metrics", {}) @@ -85,7 +80,7 @@ def _init_metrics_from_config( # Store metric mapping: metric_name -> (metric_type, attribute_name, stats_field_name) # This mapping will be used in log_prometheus to dynamically log metrics - self.metric_mappings: Dict[str, Dict[str, str]] = {} + self.metric_mappings: dict[str, dict[str, str]] = {} # Initialize counters if enabled.get("counters", True): @@ -172,49 +167,46 @@ def _init_metrics_from_config( "attr": attr_name, } - def _log_gauge(self, gauge, data: Union[int, float]) -> None: + def _set_gauge(self, gauge, data: List) -> None: # Convenience function for logging to gauge. + if not data: + return gauge.labels(**self.labels).set(data) - def _log_counter(self, counter, data: Union[int, float]) -> None: + def _inc_counter(self, counter, data: List) -> None: # Convenience function for logging to counter. # Prevent ValueError from negative increment - if data < 0: - return - counter.labels(**self.labels).inc(data) + counter.labels(**self.labels).inc(sum(data)) - def _log_histogram(self, histogram, data: Union[List[int], List[float]]) -> None: + def _observe_histogram(self, histogram, data: List) -> None: # Convenience function for logging to histogram. for value in data: histogram.labels(**self.labels).observe(value) - def log_prometheus(self, stats: Any): + def update_stats(self, stats: dict[str, List]): """Log metrics to Prometheus based on configuration file""" # Dynamically log metrics based on what's configured in YAML for stat_name, value in stats.items(): try: metric_mapped = self.metric_mappings[stat_name] if metric_mapped is None: - logger.warning(f"Stat {stat_name} not initialized.") + logger.debug(f"Stat {stat_name} not initialized.") continue metric_obj = getattr(self, metric_mapped["attr"], None) metric_type = metric_mapped["type"] # Log based on metric type if metric_type == "counter": - self._log_counter(metric_obj, value) + self._set_gauge(metric_obj, value) elif metric_type == "gauge": - self._log_gauge(metric_obj, value) + self._inc_counter(metric_obj, value) elif metric_type == "histogram": # Histograms expect a list - if not isinstance(value, list): - if value: - value = [value] - else: - value = [] - self._log_histogram(metric_obj, value) - except Exception as e: - logger.warning(f"Failed to log metric {stat_name}: {e}") + self._observe_histogram(metric_obj, value) + else: + logger.error(f"Not found metric type for {stat_name}") + except Exception: + logger.debug(f"Failed to log metric {stat_name}") @staticmethod def _metadata_to_labels(metadata: UCMEngineMetadata): @@ -223,40 +215,6 @@ def _metadata_to_labels(metadata: UCMEngineMetadata): "worker_id": metadata.worker_id, } - _instance = None - - @staticmethod - def GetOrCreate( - metadata: UCMEngineMetadata, - config_path: str = "", - ) -> "PrometheusLogger": - if PrometheusLogger._instance is None: - PrometheusLogger._instance = PrometheusLogger(metadata, config_path) - # assert PrometheusLogger._instance.metadata == metadata, \ - # "PrometheusLogger instance already created with different metadata" - if PrometheusLogger._instance.metadata != metadata: - logger.error( - "PrometheusLogger instance already created with" - "different metadata. This should not happen except " - "in test" - ) - return PrometheusLogger._instance - - @staticmethod - def GetInstance() -> "PrometheusLogger": - assert ( - PrometheusLogger._instance is not None - ), "PrometheusLogger instance not created yet" - return PrometheusLogger._instance - - @staticmethod - def GetInstanceOrNone() -> Optional["PrometheusLogger"]: - """ - Returns the singleton instance of PrometheusLogger if it exists, - otherwise returns None. - """ - return PrometheusLogger._instance - class UCMStatsLogger: def __init__(self, model_name: str, rank: int, config_path: str = ""): @@ -267,15 +225,13 @@ def __init__(self, model_name: str, rank: int, config_path: str = ""): # Load configuration config = self._load_config(config_path) self.log_interval = config.get("log_interval", 10) - - self.monitor = ucmmonitor.StatsMonitor.get_instance() - self.prometheus_logger = PrometheusLogger.GetOrCreate(self.metadata, config) + self.prometheus_logger = PrometheusLogger(self.metadata, config) self.is_running = True self.thread = threading.Thread(target=self.log_worker, daemon=True) self.thread.start() - def _load_config(self, config_path: str) -> Dict[str, Any]: + def _load_config(self, config_path: str) -> dict[str, Any]: """Load configuration from YAML file""" try: with open(config_path, "r") as f: @@ -295,11 +251,16 @@ def _load_config(self, config_path: str) -> Dict[str, Any]: def log_worker(self): while self.is_running: - # Use UCMStatsMonitor.get_states_and_clear() from external import - stats = self.monitor.get_stats_and_clear("ConnStats") - self.prometheus_logger.log_prometheus(stats) + stats = ucmmonitor.get_all_stats_and_clear().data + self.prometheus_logger.update_stats(stats) time.sleep(self.log_interval) def shutdown(self): self.is_running = False self.thread.join() + + def __del__(self): + try: + self.shutdown() + except Exception: + pass diff --git a/ucm/shared/metrics/CMakeLists.txt b/ucm/shared/metrics/CMakeLists.txt index 585e6a547..37dc054a8 100644 --- a/ucm/shared/metrics/CMakeLists.txt +++ b/ucm/shared/metrics/CMakeLists.txt @@ -1,17 +1,34 @@ -file(GLOB_RECURSE CORE_SRCS CONFIGURE_DEPENDS - "${CMAKE_CURRENT_SOURCE_DIR}/cc/stats/*.cc" - "${CMAKE_CURRENT_SOURCE_DIR}/cc/*.cc") -add_library(monitor_static STATIC ${CORE_SRCS}) +file(GLOB_RECURSE DOMAIN_SRCS + "${CMAKE_CURRENT_SOURCE_DIR}/cc/domain/*.cc" + "${CMAKE_CURRENT_SOURCE_DIR}/cc/domain/stats/*.cc" +) + +file(GLOB_RECURSE API_SRCS + "${CMAKE_CURRENT_SOURCE_DIR}/cc/api/*.cc" +) + +add_library(monitor_static STATIC + ${DOMAIN_SRCS} + ${API_SRCS} +) + set_property(TARGET monitor_static PROPERTY POSITION_INDEPENDENT_CODE ON) + target_include_directories(monitor_static PUBLIC - $ - $) + $ + $ + $ +) + set_target_properties(monitor_static PROPERTIES OUTPUT_NAME monitor) file(GLOB_RECURSE BINDINGS_SRCS CONFIGURE_DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/cpy/*.cc") pybind11_add_module(ucmmonitor ${BINDINGS_SRCS}) -target_link_libraries(ucmmonitor PRIVATE -Wl,--whole-archive monitor_static -Wl,--no-whole-archive) -target_include_directories(ucmmonitor PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/cc) +target_link_libraries(ucmmonitor PRIVATE monitor_static) + +target_include_directories(ucmmonitor PRIVATE + "${CMAKE_CURRENT_SOURCE_DIR}/cc/api" +) file(RELATIVE_PATH INSTALL_REL_PATH ${CMAKE_SOURCE_DIR} diff --git a/ucm/shared/metrics/cc/stats_registry.cc b/ucm/shared/metrics/cc/api/stats_monitor_api.cc similarity index 57% rename from ucm/shared/metrics/cc/stats_registry.cc rename to ucm/shared/metrics/cc/api/stats_monitor_api.cc index c2551d9ad..f2f7c7aab 100644 --- a/ucm/shared/metrics/cc/stats_registry.cc +++ b/ucm/shared/metrics/cc/api/stats_monitor_api.cc @@ -21,39 +21,39 @@ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ -#include "stats_registry.h" - +#include "stats_monitor_api.h" namespace UC::Metrics { -StatsRegistry& StatsRegistry::GetInstance() +void CreateStats(const std::string& name) { StatsMonitor::GetInstance().CreateStats(name); } + +void UpdateStats(const std::string& name, const std::unordered_map& params) { - static StatsRegistry inst; - return inst; + StatsMonitor::GetInstance().UpdateStats(name, params); } -void StatsRegistry::RegisterStats(std::string name, Creator creator) +void ResetStats(const std::string& name) { StatsMonitor::GetInstance().ResetStats(name); } + +void ResetAllStats() { StatsMonitor::GetInstance().ResetAllStats(); } + +StatsResult GetStats(const std::string& name) { - auto& reg = GetInstance(); - std::lock_guard lk(reg.mutex_); - reg.registry_[name] = creator; + StatsResult result; + result.data = StatsMonitor::GetInstance().GetStats(name); + return result; } -std::unique_ptr StatsRegistry::CreateStats(const std::string& name) +StatsResult GetStatsAndClear(const std::string& name) { - auto& reg = GetInstance(); - std::lock_guard lk(reg.mutex_); - if (auto it = reg.registry_.find(name); it != reg.registry_.end()) return it->second(); - return nullptr; + StatsResult result; + result.data = StatsMonitor::GetInstance().GetStatsAndClear(name); + return result; } -std::vector StatsRegistry::GetRegisteredStatsNames() +StatsResult GetAllStatsAndClear() { - auto& reg = GetInstance(); - std::lock_guard lk(reg.mutex_); - std::vector names; - names.reserve(reg.registry_.size()); - for (auto& [n, _] : reg.registry_) names.push_back(n); - return names; + StatsResult result; + result.data = StatsMonitor::GetInstance().GetAllStatsAndClear(); + return result; } } // namespace UC::Metrics \ No newline at end of file diff --git a/ucm/shared/metrics/cc/stats_registry.h b/ucm/shared/metrics/cc/api/stats_monitor_api.h similarity index 61% rename from ucm/shared/metrics/cc/stats_registry.h rename to ucm/shared/metrics/cc/api/stats_monitor_api.h index c22b6617c..a77115d6f 100644 --- a/ucm/shared/metrics/cc/stats_registry.h +++ b/ucm/shared/metrics/cc/api/stats_monitor_api.h @@ -21,38 +21,23 @@ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ -#ifndef UNIFIEDCACHE_REGISTRY_H -#define UNIFIEDCACHE_REGISTRY_H - -#include -#include -#include -#include "stats/istats.h" +#ifndef UNIFIEDCACHE_MONITOR_API_H +#define UNIFIEDCACHE_MONITOR_API_H +#include "stats_monitor.h" namespace UC::Metrics { - -using Creator = std::unique_ptr (*)(); - -class StatsRegistry { -public: - static StatsRegistry& GetInstance(); - - static void RegisterStats(std::string name, Creator creator); - - std::unique_ptr CreateStats(const std::string& name); - - std::vector GetRegisteredStatsNames(); - -private: - StatsRegistry() = default; - ~StatsRegistry() = default; - StatsRegistry(const StatsRegistry&) = delete; - StatsRegistry& operator=(const StatsRegistry&) = delete; - - std::mutex mutex_; - std::unordered_map registry_; +struct StatsResult { + StatsResult() = default; + std::unordered_map> data; }; -} // namespace UC::Metrics +void CreateStats(const std::string& name); +void UpdateStats(const std::string& name, const std::unordered_map& params); +void ResetStats(const std::string& name); +void ResetAllStats(); +StatsResult GetStats(const std::string& name); +StatsResult GetStatsAndClear(const std::string& name); +StatsResult GetAllStatsAndClear(); -#endif // UNIFIEDCACHE_REGISTRY_H \ No newline at end of file +} // namespace UC::Metrics +#endif \ No newline at end of file diff --git a/ucm/shared/metrics/cc/stats/istats.h b/ucm/shared/metrics/cc/domain/stats/stats.h similarity index 70% rename from ucm/shared/metrics/cc/stats/istats.h rename to ucm/shared/metrics/cc/domain/stats/stats.h index 6e8de7b32..a535ed64c 100644 --- a/ucm/shared/metrics/cc/stats/istats.h +++ b/ucm/shared/metrics/cc/domain/stats/stats.h @@ -21,23 +21,29 @@ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ -#ifndef UNIFIEDCACHE_ISTATS_H -#define UNIFIEDCACHE_ISTATS_H +#ifndef UNIFIEDCACHE_STATS_H +#define UNIFIEDCACHE_STATS_H -#include #include #include #include namespace UC::Metrics { -class IStats { +class Stats { public: - virtual ~IStats() = default; - virtual std::string Name() const = 0; - virtual void Update(const std::unordered_map& params) = 0; - virtual void Reset() = 0; - virtual std::unordered_map> Data() = 0; + explicit Stats(const std::string& name) : name_(name) {} + std::string Name() { return name_; } + void Update(const std::unordered_map& params) + { + for (const auto& [key, val] : params) { data_[key].push_back(val); } + } + void Reset() { data_.clear(); } + std::unordered_map> Data() { return data_; } + +private: + std::string name_; + std::unordered_map> data_; }; } // namespace UC::Metrics diff --git a/ucm/shared/metrics/cc/stats_monitor.cc b/ucm/shared/metrics/cc/domain/stats_monitor.cc similarity index 64% rename from ucm/shared/metrics/cc/stats_monitor.cc rename to ucm/shared/metrics/cc/domain/stats_monitor.cc index 2d3d80266..03a8dc2ea 100644 --- a/ucm/shared/metrics/cc/stats_monitor.cc +++ b/ucm/shared/metrics/cc/domain/stats_monitor.cc @@ -22,61 +22,76 @@ * SOFTWARE. * */ #include "stats_monitor.h" -#include -#include -#include "stats/istats.h" -#include "stats_registry.h" namespace UC::Metrics { -StatsMonitor::StatsMonitor() -{ - auto& registry = StatsRegistry::GetInstance(); - for (const auto& name : registry.GetRegisteredStatsNames()) { - stats_map_[name] = registry.CreateStats(name); - } -} - void StatsMonitor::CreateStats(const std::string& name) { std::lock_guard lock(mutex_); - auto& registry = StatsRegistry::GetInstance(); - stats_map_[name] = registry.CreateStats(name); + stats_map_[name] = std::make_unique(name); } -std::unordered_map> StatsMonitor::GetStats(const std::string& name) +void StatsMonitor::UpdateStats(const std::string& name, + const std::unordered_map& params) { std::lock_guard lock(mutex_); - return stats_map_[name]->Data(); + auto it = stats_map_.find(name); + if (it == stats_map_.end() || !it->second) { return; } + try { + it->second->Update(params); + } catch (...) { + return; + } } -void StatsMonitor::ResetStats(const std::string& name) +std::unordered_map> StatsMonitor::GetStats(const std::string& name) { std::lock_guard lock(mutex_); - stats_map_[name]->Reset(); + auto it = stats_map_.find(name); + if (it == stats_map_.end() || !it->second) { return {}; } + return it->second->Data(); } std::unordered_map> StatsMonitor::GetStatsAndClear(const std::string& name) { std::lock_guard lock(mutex_); - auto result = stats_map_[name]->Data(); - stats_map_[name]->Reset(); + auto it = stats_map_.find(name); + if (it == stats_map_.end() || !it->second) { return {}; } + auto result = it->second->Data(); + it->second->Reset(); return result; } -void StatsMonitor::UpdateStats(const std::string& name, - const std::unordered_map& params) +std::unordered_map> StatsMonitor::GetAllStatsAndClear() +{ + std::lock_guard lock(mutex_); + std::unordered_map> all_stats; + + for (const auto& [name, stats_ptr] : stats_map_) { + if (stats_ptr) { + auto data = stats_ptr->Data(); + all_stats.insert(data.begin(), data.end()); + stats_ptr->Reset(); + } + } + return all_stats; +} + +void StatsMonitor::ResetStats(const std::string& name) { std::lock_guard lock(mutex_); auto it = stats_map_.find(name); - if (it != stats_map_.end()) { it->second->Update(params); } + if (it == stats_map_.end() || !it->second) { return; } + it->second->Reset(); } void StatsMonitor::ResetAllStats() { std::lock_guard lock(mutex_); - for (auto& [n, ptr] : stats_map_) { ptr->Reset(); } + for (const auto& [name, stats_ptr] : stats_map_) { + if (stats_ptr) { stats_ptr->Reset(); } + } } } // namespace UC::Metrics \ No newline at end of file diff --git a/ucm/shared/metrics/cc/stats_monitor.h b/ucm/shared/metrics/cc/domain/stats_monitor.h similarity index 91% rename from ucm/shared/metrics/cc/stats_monitor.h rename to ucm/shared/metrics/cc/domain/stats_monitor.h index 1545d4b5c..70d10d46f 100644 --- a/ucm/shared/metrics/cc/stats_monitor.h +++ b/ucm/shared/metrics/cc/domain/stats_monitor.h @@ -29,7 +29,7 @@ #include #include #include -#include "stats/istats.h" +#include "stats/stats.h" namespace UC::Metrics { @@ -45,26 +45,27 @@ class StatsMonitor { void CreateStats(const std::string& name); - std::unordered_map> GetStats(const std::string& name); + void UpdateStats(const std::string& name, + const std::unordered_map& params); - void ResetStats(const std::string& name); + std::unordered_map> GetStats(const std::string& name); std::unordered_map> GetStatsAndClear(const std::string& name); - void UpdateStats(const std::string& name, - const std::unordered_map& params); + std::unordered_map> GetAllStatsAndClear(); + + void ResetStats(const std::string& name); void ResetAllStats(); private: std::mutex mutex_; - std::unordered_map> stats_map_; + std::unordered_map> stats_map_; - StatsMonitor(); + StatsMonitor() = default; StatsMonitor(const StatsMonitor&) = delete; StatsMonitor& operator=(const StatsMonitor&) = delete; }; - } // namespace UC::Metrics #endif // UNIFIEDCACHE_MONITOR_H \ No newline at end of file diff --git a/ucm/shared/metrics/cc/stats/conn_stats.cc b/ucm/shared/metrics/cc/stats/conn_stats.cc deleted file mode 100644 index edf18ac2e..000000000 --- a/ucm/shared/metrics/cc/stats/conn_stats.cc +++ /dev/null @@ -1,83 +0,0 @@ -/** - * MIT License - * - * Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved. - * - * Permission is hereby granted, free of charge, to any person obtaining a copy - * of this software and associated documentation files (the "Software"), to deal - * in the Software without restriction, including without limitation the rights - * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell - * copies of the Software, and to permit persons to whom the Software is - * furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice shall be included in all - * copies or substantial portions of the Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE - * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * */ -#include "conn_stats.h" - -namespace UC::Metrics { - -ConnStats::ConnStats() = default; - -std::string ConnStats::Name() const { return "ConnStats"; } - -void ConnStats::Reset() -{ - for (auto& v : data_) v.clear(); -} - -void ConnStats::Update(const std::unordered_map& params) -{ - for (const auto& [k, v] : params) { - Key id = KeyFromString(k); - if (id == Key::COUNT) continue; - EmplaceBack(id, v); - } -} - -std::unordered_map> ConnStats::Data() -{ - std::unordered_map> result; - result["save_requests_num"] = data_[static_cast(Key::save_requests_num)]; - result["save_blocks_num"] = data_[static_cast(Key::save_blocks_num)]; - result["save_duration"] = data_[static_cast(Key::save_duration)]; - result["save_speed"] = data_[static_cast(Key::save_speed)]; - result["load_requests_num"] = data_[static_cast(Key::load_requests_num)]; - result["load_blocks_num"] = data_[static_cast(Key::load_blocks_num)]; - result["load_duration"] = data_[static_cast(Key::load_duration)]; - result["load_speed"] = data_[static_cast(Key::load_speed)]; - result["interval_lookup_hit_rates"] = - data_[static_cast(Key::interval_lookup_hit_rates)]; - return result; -} - -Key ConnStats::KeyFromString(const std::string& k) -{ - if (k == "save_requests_num") return Key::save_requests_num; - if (k == "save_blocks_num") return Key::save_blocks_num; - if (k == "save_duration") return Key::save_duration; - if (k == "save_speed") return Key::save_speed; - if (k == "load_requests_num") return Key::load_requests_num; - if (k == "load_blocks_num") return Key::load_blocks_num; - if (k == "load_duration") return Key::load_duration; - if (k == "load_speed") return Key::load_speed; - if (k == "interval_lookup_hit_rates") return Key::interval_lookup_hit_rates; - return Key::COUNT; -} - -void ConnStats::EmplaceBack(Key id, double value) -{ - data_[static_cast(id)].push_back(value); -} - -static Registrar registrar; - -} // namespace UC::Metrics \ No newline at end of file diff --git a/ucm/shared/metrics/cc/stats/conn_stats.h b/ucm/shared/metrics/cc/stats/conn_stats.h deleted file mode 100644 index e8cc94559..000000000 --- a/ucm/shared/metrics/cc/stats/conn_stats.h +++ /dev/null @@ -1,78 +0,0 @@ -/** - * MIT License - * - * Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved. - * - * Permission is hereby granted, free of charge, to any person obtaining a copy - * of this software and associated documentation files (the "Software"), to deal - * in the Software without restriction, including without limitation the rights - * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell - * copies of the Software, and to permit persons to whom the Software is - * furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice shall be included in all - * copies or substantial portions of the Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE - * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * */ -#ifndef UNIFIEDCACHE_CONNSTATS_H -#define UNIFIEDCACHE_CONNSTATS_H - -#include -#include -#include -#include -#include -#include "istats.h" -#include "stats_registry.h" - -namespace UC::Metrics { - -enum class Key : uint8_t { - interval_lookup_hit_rates = 0, - save_requests_num, - save_blocks_num, - save_duration, - save_speed, - load_requests_num, - load_blocks_num, - load_duration, - load_speed, - COUNT -}; - -class ConnStats : public IStats { -public: - ConnStats(); - ~ConnStats() = default; - - std::string Name() const override; - void Reset() override; - void Update(const std::unordered_map& params) override; - std::unordered_map> Data() override; - -private: - static constexpr std::size_t N = static_cast(Key::COUNT); - std::array, N> data_; - - static Key KeyFromString(const std::string& k); - void EmplaceBack(Key id, double value); -}; - -struct Registrar { - Registrar() - { - StatsRegistry::RegisterStats( - "ConnStats", []() -> std::unique_ptr { return std::make_unique(); }); - } -}; - -} // namespace UC::Metrics - -#endif // UNIFIEDCACHE_CONNSTATS_H \ No newline at end of file diff --git a/ucm/shared/metrics/cpy/metrics.py.cc b/ucm/shared/metrics/cpy/metrics.py.cc index 10bfc2f97..085f74033 100644 --- a/ucm/shared/metrics/cpy/metrics.py.cc +++ b/ucm/shared/metrics/cpy/metrics.py.cc @@ -23,19 +23,23 @@ * */ #include #include -#include "stats_monitor.h" +#include "stats_monitor_api.h" namespace py = pybind11; namespace UC::Metrics { void bind_monitor(py::module_& m) { - py::class_(m, "StatsMonitor") - .def_static("get_instance", &StatsMonitor::GetInstance, py::return_value_policy::reference) - .def("update_stats", &StatsMonitor::UpdateStats) - .def("reset_all", &StatsMonitor::ResetAllStats) - .def("get_stats", &StatsMonitor::GetStats) - .def("get_stats_and_clear", &StatsMonitor::GetStatsAndClear); + py::class_(m, "StatsResult") + .def(py::init<>()) + .def_readonly("data", &StatsResult::data); + m.def("create_stats", &CreateStats); + m.def("update_stats", &UpdateStats); + m.def("get_stats", &GetStats); + m.def("get_stats_and_clear", &GetStatsAndClear); + m.def("get_all_stats_and_clear", &GetAllStatsAndClear); + m.def("reset_stats", &ResetStats); + m.def("reset_all", &ResetAllStats); } } // namespace UC::Metrics diff --git a/ucm/shared/metrics/test/test.py b/ucm/shared/metrics/test/test.py deleted file mode 100644 index 246e6f880..000000000 --- a/ucm/shared/metrics/test/test.py +++ /dev/null @@ -1,58 +0,0 @@ -# -*- coding: utf-8 -*- -# -# MIT License -# -# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved. -# -# Permission is hereby granted, free of charge, to any person obtaining a copy -# of this software and associated documentation files (the "Software"), to deal -# in the Software without restriction, including without limitation the rights -# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -# copies of the Software, and to permit persons to whom the Software is -# furnished to do so, subject to the following conditions: -# -# The above copyright notice and this permission notice shall be included in all -# copies or substantial portions of the Software. -# -# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -# SOFTWARE. -# - - -import os -import sys - -sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) -from ucm.shared.metrics import ucmmonitor - -# import monitor - -mon = ucmmonitor.StatsMonitor.get_instance() -mon.update_stats( - "ConnStats", - { - "save_duration": 1.2, - "save_speed": 300.5, - "load_duration": 0.8, - "load_speed": 450.0, - "interval_lookup_hit_rates": 0.95, - }, -) -mon.update_stats( - "ConnStats", - { - "save_duration": 1.2, - "save_speed": 300.5, - "load_duration": 0.8, - "load_speed": 450.0, - "interval_lookup_hit_rates": 0.95, - }, -) - -data = mon.get_stats("ConnStats") -print(data) diff --git a/ucm/shared/test/CMakeLists.txt b/ucm/shared/test/CMakeLists.txt index 07241d814..cf5355f27 100644 --- a/ucm/shared/test/CMakeLists.txt +++ b/ucm/shared/test/CMakeLists.txt @@ -4,7 +4,7 @@ if(BUILD_UNIT_TESTS) add_executable(ucmshared.test ${UCMSHARED_TEST_SOURCE_FILES}) target_include_directories(ucmshared.test PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/case) target_link_libraries(ucmshared.test PRIVATE - trans + trans monitor_static gtest_main gtest mockcpp ) gtest_discover_tests(ucmshared.test) diff --git a/ucm/shared/test/case/metrics/monitor_test.cc b/ucm/shared/test/case/metrics/monitor_test.cc new file mode 100644 index 000000000..a53c0f419 --- /dev/null +++ b/ucm/shared/test/case/metrics/monitor_test.cc @@ -0,0 +1,123 @@ +/** + * MIT License + * + * Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in all + * copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * */ +#include +#include +#include "stats_monitor_api.h" + +using namespace UC::Metrics; + +class UCStatsMonitorUT : public testing::Test { +protected: + void SetUp() override + { + try { + CreateStats("test_stats"); + CreateStats("stats1"); + CreateStats("stats2"); + } catch (const std::exception& e) { + throw; + } + } +}; + +TEST_F(UCStatsMonitorUT, UpdateAndGetStats) +{ + std::string statsName = "test_stats"; + + std::unordered_map params; + params["value1"] = 10.5; + params["value2"] = 20.0; + UpdateStats(statsName, params); + + StatsResult result = GetStats(statsName); + ASSERT_EQ(result.data.size(), 2); + ASSERT_EQ(result.data["value1"][0], 10.5); + ASSERT_EQ(result.data["value2"][0], 20.0); + + params["value1"] = 30.5; + UpdateStats(statsName, params); + + result = GetStats(statsName); + ASSERT_EQ(result.data["value1"].size(), 2); + ASSERT_EQ(result.data["value1"][1], 30.5); + ASSERT_EQ(result.data["value2"].size(), 2); + ASSERT_EQ(result.data["value2"][1], 20.0); + + StatsResult clearResult = GetStatsAndClear(statsName); + ASSERT_EQ(clearResult.data.size(), 2); + ASSERT_EQ(clearResult.data["value1"].size(), 2); + ASSERT_EQ(clearResult.data["value2"].size(), 2); + + result = GetStats(statsName); + EXPECT_TRUE(result.data.empty() || + (result.data["value1"].empty() && result.data["value2"].empty())); + + UpdateStats(statsName, params); + ResetStats(statsName); + result = GetStats(statsName); + EXPECT_TRUE(result.data.empty() || + (result.data["value1"].empty() && result.data["value2"].empty())); +} + +TEST_F(UCStatsMonitorUT, MultipleStatsAndResetAll) +{ + std::string stats1 = "stats1"; + std::string stats2 = "stats2"; + + UpdateStats(stats1, { + {"a", 1.0}, + {"b", 2.0} + }); + UpdateStats(stats2, { + {"c", 3.0}, + {"d", 4.0} + }); + + ASSERT_EQ(GetStats(stats1).data.size(), 2); + ASSERT_EQ(GetStats(stats2).data.size(), 2); + + ResetAllStats(); + EXPECT_TRUE(GetStats(stats1).data.empty() || + (GetStats(stats1).data["a"].empty() && GetStats(stats1).data["b"].empty())); + EXPECT_TRUE(GetStats(stats2).data.empty() || + (GetStats(stats2).data["c"].empty() && GetStats(stats2).data["d"].empty())); +} + +TEST_F(UCStatsMonitorUT, MultipleStatsAndGetAll) +{ + std::string statsA = "stats1"; + std::string statsB = "stats2"; + + UpdateStats(statsA, { + {"x", 100.0} + }); + UpdateStats(statsB, { + {"y", 200.0} + }); + + StatsResult allStats = GetAllStatsAndClear(); + ASSERT_EQ(allStats.data.size(), 2); + + EXPECT_TRUE(!allStats.data.count(statsA) || !allStats.data.count(statsB)); +} \ No newline at end of file diff --git a/ucm/shared/test/example/metrics/monitor_stats_example.py b/ucm/shared/test/example/metrics/monitor_stats_example.py new file mode 100644 index 000000000..aa7d68264 --- /dev/null +++ b/ucm/shared/test/example/metrics/monitor_stats_example.py @@ -0,0 +1,108 @@ +# -*- coding: utf-8 -*- +# +# MIT License +# +# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved. +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in all +# copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. +# +from functools import wraps + +from ucm.shared.metrics import ucmmonitor + + +def test_wrap(func): + @wraps(func) + def wrapper(*args, **kwargs): + print(f"========>> Running in {func.__name__}:") + result = func(*args, **kwargs) + print() + return result + + return wrapper + + +@test_wrap +def metrics_with_update_stats(): + ucmmonitor.create_stats("PyStats") + ucmmonitor.update_stats( + "PyStats", + { + "save_duration": 1.2, + "save_speed": 300.5, + "load_duration": 0.8, + "load_speed": 450.0, + "interval_lookup_hit_rates": 0.95, + }, + ) + + data = ucmmonitor.get_stats("PyStats").data + assert data["save_duration"][0] == 1.2 + assert len(data) == 5 + print(f"Get PyStats stats: {data}") + + data = ucmmonitor.get_stats_and_clear("PyStats").data + assert data["save_duration"][0] == 1.2 + assert len(data) == 5 + print(f"Get PyStats stats and clear: {data}") + + data = ucmmonitor.get_stats_and_clear("PyStats").data + assert len(data) == 0 + print(f"After clear then get PyStats: {data}") + + +@test_wrap +def metrics_with_update_all_stats(): + ucmmonitor.create_stats("PyStats1") + ucmmonitor.create_stats("PyStats2") + ucmmonitor.update_stats( + "PyStats1", + { + "save_duration": 1.2, + "save_speed": 300.5, + }, + ) + + ucmmonitor.update_stats( + "PyStats2", + { + "load_duration": 0.8, + "load_speed": 450.0, + }, + ) + + data = ucmmonitor.get_stats("PyStats1").data + assert data["save_duration"][0] == 1.2 + assert len(data) == 2 + print(f"Only get PyStats1 stats: {data}") + + data = ucmmonitor.get_all_stats_and_clear().data + assert data["save_duration"][0] == 1.2 + assert data["load_duration"][0] == 0.8 + assert len(data) == 4 + print(f"Get all stats and clear: {data}") + + data = ucmmonitor.get_stats("PyStats2").data + assert len(data) == 0 + print(f"After clear then get PyStats2: {data}") + + +if __name__ == "__main__": + metrics_with_update_stats() + metrics_with_update_all_stats()