@rinfx rinfx commented Oct 28, 2025

Ⅰ. Describe what this PR did

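  1. Supports load balancing based on a chosen metric. Users set the name of the metric used for load balancing in the plugin configuration.
  2. Uses cluster_header routes to load balance across multiple clusters, with support for metrics such as concurrency, TTFT, and RT.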
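
A minimal sketch of the cross-cluster idea in item 2, not the plugin's actual code: given the latest value of the chosen metric for each cluster, pick the cluster with the lowest value and write its name into a request header that a cluster_header route could read. The type `clusterMetric`, the function `pickCluster`, the cluster names, and the header name `x-target-cluster` are all hypothetical.

```go
package main

import "fmt"

// clusterMetric pairs a cluster name with the latest observed value of the
// balancing metric (for example in-flight requests, TTFT, or RT).
// Hypothetical type, for illustration only.
type clusterMetric struct {
	Name  string
	Value float64
}

// pickCluster returns the name of the cluster with the lowest metric value.
// On ties the earliest entry wins; ok is false when there are no candidates.
func pickCluster(cs []clusterMetric) (name string, ok bool) {
	if len(cs) == 0 {
		return "", false
	}
	best := cs[0]
	for _, c := range cs[1:] {
		if c.Value < best.Value {
			best = c
		}
	}
	return best.Name, true
}

func main() {
	clusters := []clusterMetric{
		{Name: "cluster-a", Value: 12},
		{Name: "cluster-b", Value: 4},
		{Name: "cluster-c", Value: 9},
	}
	if name, ok := pickCluster(clusters); ok {
		// Assumed header name; the route's cluster_header setting decides
		// which header is actually read.
		fmt.Printf("x-target-cluster: %s\n", name)
	}
}
```

For a metric where a higher value is better, the comparison would be flipped; the sketch covers only the "lowest wins" case.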

Ⅱ. Does this pull request fix one issue?

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews


rinfx added 2 commits October 27, 2025 09:57
Change-Id: I6ecd81daa83b300c6058afee4c9d1c30b71f3a97
Co-developed-by: Cursor <[email protected]>
Change-Id: Id286d1a8241f4fc7cf812b17d9be8c2c272616e1
Co-developed-by: Cursor <[email protected]>
lingma-agents bot commented Oct 28, 2025

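AI load balancer supports user-defined metrics

Change overview
  • New features

    • Introduces a load-balancing policy based on user-defined metrics (metrics_based), which lets the plugin configuration name the metric used for scheduling.
    • Adds support for UserSelectedMetric: the backend pod information gains fields for the name and value of the user-selected metric, and scheduling filters and sorts on that metric.
    • Adds three scheduling policies: default, least, and most, each with its own metric-filtering logic.
  • Refactoring

    • Renames the existing least_busy module to metrics_based, which describes its function more accurately and leaves room for future extension.
    • Adjusts package import paths and internal structure so that all references point to the new directory.
  • Configuration changes

    • Adds a MetricsBased constant to the main configuration file as one of the selectable load-balancing policies.
    • Updates the configuration-parsing function so that it handles the metrics_based load balancer type and its parameters (such as metricPolicy and targetMetric).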
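
As a rough illustration of the configuration handling described above, the following sketch reads `metricPolicy` and `targetMetric` from a JSON fragment and validates them. It uses the standard `encoding/json` package as a stand-in for whatever parser the plugin really uses. The struct `metricsConfig`, the function `parseMetricsConfig`, the literal policy strings, the default applied when `metricPolicy` is empty, and the rule that non-default policies require `targetMetric` are all assumptions.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// metricsConfig mirrors the two parameters named in the summary.
// The struct itself is hypothetical.
type metricsConfig struct {
	MetricPolicy string `json:"metricPolicy"`
	TargetMetric string `json:"targetMetric"`
}

// parseMetricsConfig decodes raw JSON and validates the policy name.
// Assumption: an empty policy falls back to "default".
func parseMetricsConfig(raw []byte) (metricsConfig, error) {
	var cfg metricsConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return cfg, err
	}
	switch cfg.MetricPolicy {
	case "":
		cfg.MetricPolicy = "default"
	case "default", "least", "most":
		// Accepted as-is.
	default:
		return cfg, fmt.Errorf("unknown metricPolicy %q", cfg.MetricPolicy)
	}
	// Assumption: "least" and "most" need a metric to compare on.
	if cfg.MetricPolicy != "default" && cfg.TargetMetric == "" {
		return cfg, errors.New("targetMetric is required for this metricPolicy")
	}
	return cfg, nil
}

func main() {
	cfg, err := parseMetricsConfig([]byte(`{"metricPolicy":"least","targetMetric":"queue_depth"}`))
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("policy=%s metric=%s\n", cfg.MetricPolicy, cfg.TargetMetric)
}
```

Rejecting unknown policy names at parse time, rather than silently falling back, is one possible design choice; it surfaces typos in the plugin configuration early.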
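Changed files

| File path | Change description |
| --- | --- |
| plugins/wasm-go/extensions/ai-load-balancer/main.go | Imports the new `metrics_based` package and adds a `MetricsBased` option to the load-balancing policy enumeration. Also updates the configuration-parsing logic so that the load balancer based on user-defined metrics is recognized and initialized. |
| plugins/wasm-go/extensions/ai-load-balancer/metrics_based/backend/types.go | Defines the `UserSelectedMetric` struct to hold the user-selected metric and extends the `PodMetrics` type with this field. Also improves the string representation to make debugging easier. |
| plugins/wasm-go/extensions/ai-load-balancer/metrics_based/backend/vllm/metrics.go | Updates the Prometheus metric conversion function to read the user-configured target metric first and fall back to the default metric when it is not found. |
| plugins/wasm-go/extensions/ai-load-balancer/metrics_based/lb_policy.go | Implements the `NewMetricsBasedLoadBalancer` constructor, which extracts the `metricPolicy` and `targetMetric` parameters from the JSON configuration, and calls the corresponding scheduler during request handling. |
| plugins/wasm-go/extensions/ai-load-balancer/metrics_based/scheduling/filter.go | Updates import paths to the new `metrics_based` module. |
| plugins/wasm-go/extensions/ai-load-balancer/metrics_based/scheduling/scheduler.go | Adds scheduling policy constants (default, least, most) and implements the corresponding filter algorithms. Improves the `GetScheduler` factory function so that it creates the appropriate scheduler instance for each policy. |
| plugins/wasm-go/extensions/ai-load-balancer/metrics_based/scheduling/types.go | No substantive change; the file only moved with the directory restructuring. |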
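
The fallback lookup and the least/most filtering summarized above might look roughly like the sketch below. It is not taken from the PR: the `pod` type, the functions `metricValue` and `filterByPolicy`, the metric names `queue_depth` and `default_load`, and the behavior of the default policy (returning every pod that has a usable metric) are assumptions made for illustration.

```go
package main

import "fmt"

// pod is a hypothetical stand-in for a backend endpoint and its scraped metrics.
type pod struct {
	Name    string
	Metrics map[string]float64
}

// metricValue prefers the user-selected target metric and falls back to a
// default metric when the target is missing.
func metricValue(p pod, target, fallback string) (float64, bool) {
	if v, ok := p.Metrics[target]; ok {
		return v, true
	}
	v, ok := p.Metrics[fallback]
	return v, ok
}

// filterByPolicy keeps the pods whose value equals the minimum ("least") or
// maximum ("most"). Any other policy returns every pod that has a value.
func filterByPolicy(pods []pod, policy, target, fallback string) []pod {
	type scored struct {
		p pod
		v float64
	}
	var cands []scored
	for _, p := range pods {
		if v, ok := metricValue(p, target, fallback); ok {
			cands = append(cands, scored{p, v})
		}
	}
	if len(cands) == 0 {
		return nil
	}
	best := cands[0].v
	for _, c := range cands[1:] {
		if (policy == "least" && c.v < best) || (policy == "most" && c.v > best) {
			best = c.v
		}
	}
	var out []pod
	for _, c := range cands {
		if (policy != "least" && policy != "most") || c.v == best {
			out = append(out, c.p)
		}
	}
	return out
}

func main() {
	pods := []pod{
		{Name: "pod-a", Metrics: map[string]float64{"queue_depth": 5}},
		{Name: "pod-b", Metrics: map[string]float64{"queue_depth": 2}},
		{Name: "pod-c", Metrics: map[string]float64{"default_load": 9}},
	}
	for _, p := range filterByPolicy(pods, "least", "queue_depth", "default_load") {
		fmt.Println("selected:", p.Name)
	}
}
```

Returning a slice rather than a single pod leaves room for a later tie-breaking step, such as a random pick among equally loaded endpoints.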

codecov-commenter commented Oct 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 43.46%. Comparing base (ef31e09) to head (94af382).
⚠️ Report is 774 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3063      +/-   ##
==========================================
+ Coverage   35.91%   43.46%   +7.55%     
==========================================
  Files          69       82      +13     
  Lines       11576    10917     -659     
==========================================
+ Hits         4157     4745     +588     
+ Misses       7104     5844    -1260     
- Partials      315      328      +13     

see 97 files with indirect coverage changes

rinfx added 5 commits October 28, 2025 11:37
Change-Id: Ie3a642c2d4d4a474f39866a3939269fe33e1af12
Co-developed-by: Cursor <[email protected]>
Change-Id: Idb6dcd1fd96f5e120ba6ce45c587cce4281920e5
Co-developed-by: Cursor <[email protected]>
Change-Id: I6c38e4e5d5da6d79d2b4987124db8ac1f36e8346
Co-developed-by: Cursor <[email protected]>
Change-Id: I9de1c1b6b7554eae242b86b4ffaae45052af1de6
Co-developed-by: Cursor <[email protected]>
Change-Id: I78717907311fd607a1ce40a7a2ccaaaaa310004d
Co-developed-by: Cursor <[email protected]>
rinfx changed the title from "[feat] ai load balancer support user defined metrics" to "[feat] load balancing across different clusters and endpoints based on vllm metrics" on Nov 6, 2025
rinfx changed the title from "[feat] load balancing across different clusters and endpoints based on vllm metrics" to "[feat] load balancing across different clusters and endpoints based on metrics" on Nov 6, 2025