Analytics: Analytics Dashboard APIs #888
Merged
11 commits
b785e30 feat(analytics): create the analytics related apis (Ayush8923)
5ca387b fix(analytics): few fixes (Ayush8923)
ef396c9 fix(analytics): remove the unwanted js comments (Ayush8923)
a0e4142 Merge branch 'main' into feat/analytics-dashboard-api (Ayush8923)
351d632 fix(analytics): added the test cases for this (Ayush8923)
0df3a31 Merge branch 'feat/analytics-dashboard-api' of https://github.com/Pro… (Ayush8923)
84d1123 fix(analytics): added the test cases for this (Ayush8923)
b876837 fix(analytics): added the test cases for this (Ayush8923)
07059e6 fix(analytics): refactor business (Ayush8923)
917608d fix(analytics): test cases (Ayush8923)
b27ac4f Merge branch 'main' of https://github.com/ProjectTech4DevAI/kaapi-bac… (Ayush8923)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,167 @@ | ||
| Read live monthly analytics for the current organization. | ||
|
|
||
| The response is shaped as a list of data points, one per | ||
| `(month, modality, provider)` combination, aggregated across every project | ||
| in scope (see "Authentication & default scope" below). Each point contains | ||
| a single numeric `value`; what that value represents depends on the | ||
| `metric` query parameter. This lets the frontend pivot the response | ||
| directly into chart series without further post-processing. | ||
|
|
||
| Data is computed on-demand from `llm_call`, `llm_chain`, and | ||
| `evaluation_run`, so every request reflects the current database state with | ||
| no caching layer in between. A row inserted seconds ago will already be | ||
| visible in the response. | ||
|
|
||
| --- | ||
|
|
||
| ## Authentication & default scope | ||
|
|
||
| Any authenticated user with an organization context can call this endpoint. | ||
| Scope is decided per-request from the caller's auth context: | ||
|
|
||
| | Caller's context | Default scope | | ||
| | ---------------- | ------------- | | ||
| | Currently selected project | Analytics for **just that project**. | | ||
| | Org-level (no project selected) | Analytics across **all projects in the caller's org**. | | ||
|
|
||
| The implicit org-id filter is always applied first, so data from other | ||
| organizations is never returned. To override the default and look at a | ||
| specific project (e.g. an org admin comparing two projects), pass the | ||
| `project_id` query parameter — it must reference a project inside the | ||
| caller's organization. A `project_id` from a different org returns an | ||
| empty result, not a leak. | ||
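The scope rules above can be sketched as a small set operation. This is a minimal illustration, not the endpoint's actual code: the function name `resolve_project_scope` and its parameters are hypothetical, and the caller's organization is represented simply as the set of project ids it owns.

```python
from typing import Optional, Set


def resolve_project_scope(
    org_project_ids: Set[int],
    current_project_id: Optional[int],
    project_id_param: Optional[int],
) -> Set[int]:
    """Return the project ids whose rows may be aggregated for one request.

    Hypothetical helper: the names and signature are assumptions made for
    illustration, not taken from the codebase.
    """
    # An explicit `project_id` query parameter overrides the caller's
    # currently selected project.
    requested = project_id_param if project_id_param is not None else current_project_id
    if requested is None:
        # Org-level context: every project in the caller's organization.
        return set(org_project_ids)
    # The org filter is applied first, so a project id from another
    # organization intersects to the empty set (empty result, not a leak).
    return {requested} & org_project_ids
```

With an organization owning projects `{1, 2, 3}`, a caller inside project 2 gets `{2}`, an org-level caller gets all three, and a `project_id` of 99 yields an empty set.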
|
|
||
| --- | ||
|
|
||
| ## Query parameters | ||
|
|
||
| | Parameter | Type | Required | Default | Description | | ||
| | ----------- | -------- | -------- | ------- | ----------- | | ||
| | `metric` | enum | **yes** | — | Which metric the `value` field carries on each point. One of: `requests`, `cost`, `eval_runs`, `eval_cost`. | | ||
| | `from_month`| date | no | 24 months before `to_month` (or before today if `to_month` is also omitted) | Inclusive lower bound. Must be a first-of-month date, e.g. `2026-01-01`. Pass an explicit value to query further back. The default exists to cap worst-case scan size as `llm_call` grows. | | ||
| | `to_month` | date | no | — (no upper bound) | Inclusive upper bound. Must be a first-of-month date, e.g. `2026-05-01`. | | ||
| | `modality` | enum | no | — (all) | Filter to a single modality bucket. One of: `T-FS-T`, `S-FS-S`, `STT`, `TTS`, `OTHER`. | | ||
| | `provider` | string | no | — (all) | Filter to a single provider, e.g. `openai`, `google`, `sarvamai`, `elevenlabs`. | | ||
| | `project_id`| integer | no | Caller's current project, if any; else all projects in the org. | Override the default scope. Must reference a project inside the caller's organization. Cross-organization access is not possible: the org filter is always applied first, so a `project_id` from another organization returns an empty result. | |
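The `from_month` / `to_month` defaults in the table can be illustrated with a short sketch. The helper names are hypothetical, and one detail is an assumption: "24 months before today" is read here as 24 months before the first day of the current month, which the text does not spell out.

```python
from datetime import date
from typing import Optional, Tuple


def months_back(first_of_month: date, n: int) -> date:
    """Return the first-of-month date `n` months before `first_of_month`."""
    index = first_of_month.year * 12 + (first_of_month.month - 1) - n
    return date(index // 12, index % 12 + 1, 1)


def resolve_window(
    from_month: Optional[date],
    to_month: Optional[date],
    today: date,
) -> Tuple[date, Optional[date]]:
    """Apply the documented defaults (hypothetical helper, for illustration)."""
    for bound in (from_month, to_month):
        # Both bounds must be first-of-month dates, e.g. 2026-01-01.
        if bound is not None and bound.day != 1:
            raise ValueError("month bounds must be first-of-month dates")
    if from_month is None:
        # Assumption: "today" is truncated to the first of its month.
        anchor = to_month if to_month is not None else today.replace(day=1)
        from_month = months_back(anchor, 24)
    # `to_month` stays None when omitted (no upper bound).
    return from_month, to_month
```

Passing an explicit `from_month` bypasses the 24-month default, which is how a caller queries further back.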
|
|
||
| ### `metric` values | ||
|
|
||
| | Value | What `value` contains on each point | | ||
| | ------------ | ----------------------------------- | | ||
| | `requests` | `total_llm_call_requests + total_llm_chain_requests` — the total number of inference requests in the bucket (LLM calls plus chain orchestrations). | | ||
| | `cost` | Sum of LLM call cost in USD for the bucket. Chains are NOT added on top — a chain's cost equals the sum of its child calls, which are already counted. | | ||
| | `eval_runs` | Count of evaluation runs in the bucket. | | ||
| | `eval_cost` | Sum of evaluation run cost in USD for the bucket. | | ||
|
|
||
| ### `modality` values and how they're derived | ||
|
|
||
| | Modality | LLM call (`input_type` → `output_type`) | Evaluation run `type` | | ||
| | -------- | --------------------------------------- | --------------------- | | ||
| | `T-FS-T` | `text` → `text` | `text` | | ||
| | `S-FS-S` | `audio` → `audio` | — | | ||
| | `STT` | `audio` → `text` | `stt` | | ||
| | `TTS` | `text` → `audio` | `tts` | | ||
| | `OTHER` | anything else (image, pdf, multimodal) | `assessment`, any other type | | ||
|
|
||
| LLM chains are attributed to the modality of their **first child call**. | ||
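The mapping in the table above is small enough to express as two lookups. The sketch below mirrors the table rows; the function names are hypothetical, and the handling of a chain with no child calls (falling back to `OTHER`) is an assumption, since the text does not cover that case.

```python
from typing import Optional, Sequence, Tuple

# (input_type, output_type) -> modality bucket, as listed in the table.
_CALL_MODALITY = {
    ("text", "text"): "T-FS-T",
    ("audio", "audio"): "S-FS-S",
    ("audio", "text"): "STT",
    ("text", "audio"): "TTS",
}

# Evaluation run `type` -> modality bucket, as listed in the table.
_EVAL_MODALITY = {"text": "T-FS-T", "stt": "STT", "tts": "TTS"}


def call_modality(input_type: Optional[str], output_type: Optional[str]) -> str:
    """Bucket an LLM call; anything not in the table falls into OTHER."""
    return _CALL_MODALITY.get((input_type, output_type), "OTHER")


def eval_modality(run_type: Optional[str]) -> str:
    """Bucket an evaluation run; `assessment` and unknown types map to OTHER."""
    return _EVAL_MODALITY.get(run_type, "OTHER")


def chain_modality(child_calls: Sequence[Tuple[str, str]]) -> str:
    """A chain takes the modality of its first child call."""
    if not child_calls:
        # Assumption: an empty chain is bucketed as OTHER.
        return "OTHER"
    first_input, first_output = child_calls[0]
    return call_modality(first_input, first_output)
```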
|
|
||
| --- | ||
|
|
||
| ## Response shape | ||
|
|
||
| ```json | ||
| { | ||
| "success": true, | ||
| "data": [ | ||
| { | ||
| "month": "2026-03-01", | ||
| "modality": "T-FS-T", | ||
| "provider": "openai", | ||
| "value": "12450", | ||
| "input_tokens": 1250000, | ||
| "output_tokens": 820000, | ||
| "total_tokens": 2070000 | ||
| }, | ||
| { | ||
| "month": "2026-04-01", | ||
| "modality": "T-FS-T", | ||
| "provider": "openai", | ||
| "value": "18230", | ||
| "input_tokens": 1840000, | ||
| "output_tokens": 1210000, | ||
| "total_tokens": 3050000 | ||
| }, | ||
| { | ||
| "month": "2026-04-01", | ||
| "modality": "STT", | ||
| "provider": "sarvamai", | ||
| "value": "1402", | ||
| "input_tokens": 0, | ||
| "output_tokens": 0, | ||
| "total_tokens": 0 | ||
| } | ||
| ], | ||
| "error": null, | ||
| "metadata": null | ||
| } | ||
| ``` | ||
|
|
||
| Rows are sorted by `month`, then `modality`, then `provider`. Cost values | ||
| are decimal strings with up to 6 decimal places (e.g. `"12.450000"`). | ||
|
|
||
| Token fields (`input_tokens`, `output_tokens`, `total_tokens`) are sourced | ||
| from `llm_call.usage` and are independent of the chosen `metric` — they | ||
| are populated on every point regardless of whether you asked for | ||
| `requests`, `cost`, or eval metrics. This lets the frontend render token | ||
| usage in a tooltip or secondary axis without a second API call. | ||
|
|
||
| Tokens are contributed only by `llm_call` rows. Chains and evaluation runs | ||
| add nothing to token totals: chain tokens are the sum of their child | ||
| calls (adding them would double-count), and eval tokens live in a separate domain. | ||
|
|
||
| If no data matches the filters, `data` is an empty array — this is not an | ||
| error. | ||
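Because `value` arrives as a string, a client that wants exact arithmetic on cost figures may prefer `Decimal` over floats. A minimal sketch of summing the flat points per month follows; `monthly_totals` is a hypothetical client-side helper, and the payload is assumed to have the shape shown above.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict


def monthly_totals(payload: dict) -> Dict[str, Decimal]:
    """Sum `value` per month across all modality/provider points."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)  # Decimal() is 0
    # An empty `data` array is a valid, non-error response.
    for point in payload.get("data") or []:
        totals[point["month"]] += Decimal(point["value"])
    return dict(totals)
```

Applied to the three sample points above, this yields one total for `2026-03-01` and a combined total for the two `2026-04-01` points; an empty `data` array yields an empty dict.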
|
|
||
| --- | ||
|
|
||
| ## Example requests | ||
|
|
||
| ### 1. Total monthly cost across all modalities and providers | ||
|
|
||
| ``` | ||
| GET /api/analytics/monthly?metric=cost&from_month=2026-01-01&to_month=2026-05-01 | ||
| ``` | ||
|
|
||
| ### 2. Just the OpenAI text-to-text request volume | ||
|
|
||
| ``` | ||
| GET /api/analytics/monthly?metric=requests&modality=T-FS-T&provider=openai | ||
| ``` | ||
|
|
||
| ### 3. STT evaluation run costs this year | ||
|
|
||
| ``` | ||
| GET /api/analytics/monthly?metric=eval_cost&modality=STT&from_month=2026-01-01 | ||
| ``` | ||
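The request lines above can also be assembled programmatically. The following sketch builds only the path and query string with the standard library; `monthly_url` is a hypothetical helper, and the host, authentication headers, and HTTP client are deliberately left out because the text does not specify them.

```python
from typing import Optional
from urllib.parse import urlencode

# Metric names as documented in the query-parameter table.
METRICS = {"requests", "cost", "eval_runs", "eval_cost"}


def monthly_url(metric: str, **filters: Optional[object]) -> str:
    """Build the path and query string for the flat monthly endpoint."""
    if metric not in METRICS:
        raise ValueError(f"unknown metric: {metric!r}")
    params = {"metric": metric}
    # Omitted (None) filters are dropped so the server-side defaults apply.
    params.update({key: value for key, value in filters.items() if value is not None})
    return "/api/analytics/monthly?" + urlencode(params)
```

For instance, `monthly_url("cost", from_month="2026-01-01", to_month="2026-05-01")` reproduces the first example request.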
|
|
||
| --- | ||
|
|
||
| ## Notes on accuracy | ||
|
|
||
| - **Live reads**: every request runs a fresh `GROUP BY` against the source | ||
| tables, so the response always reflects the current database. There is | ||
| no daily aggregation cron and no staleness window. | ||
| - **Default time window** is the last 24 months. When `from_month` is | ||
| omitted, the query is bounded to that range so an unfiltered call can't | ||
| trigger a full-table scan as the source tables grow. Pass an explicit | ||
| `from_month` to query further back. | ||
| - **Missing pricing** for a provider/model yields a cost of `0` for those | ||
| rows rather than failing the whole query. Make sure your | ||
| `ModelConfig.pricing` is populated for every provider/model you use if | ||
| you want accurate cost numbers. | ||
| - **Cost is not double-counted across chains**: a chain row contributes | ||
| only to the `requests` metric (via the chain count), never to `cost` — | ||
| its dollars come from the underlying `llm_call` rows. | ||
| - **Cost is computed on summed tokens per (provider, model) group**, which is | ||
| equivalent to per-row pricing because `estimate_model_cost` is linear in | ||
| token counts. | ||
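The last point, that pricing summed tokens equals summing per-row prices, follows from linearity and can be checked numerically. The sketch below uses a hypothetical stand-in, `linear_cost`, with made-up per-million-token prices; it is not the signature of `estimate_model_cost`, which the text does not show.

```python
from decimal import Decimal


def linear_cost(input_tokens: int, output_tokens: int,
                in_price: Decimal, out_price: Decimal) -> Decimal:
    """Hypothetical linear price function: USD per one million tokens."""
    return (Decimal(input_tokens) * in_price
            + Decimal(output_tokens) * out_price) / Decimal(1_000_000)


# Illustrative (input_tokens, output_tokens) rows for one (provider, model) group.
rows = [(1200, 300), (800, 450), (5000, 1250)]
in_price, out_price = Decimal("2.50"), Decimal("10.00")  # assumed prices

# Price every row, then sum the costs.
per_row = sum((linear_cost(i, o, in_price, out_price) for i, o in rows), Decimal(0))

# Sum the tokens first, then price the group once.
grouped = linear_cost(sum(i for i, _ in rows), sum(o for _, o in rows),
                      in_price, out_price)

assert per_row == grouped
```

The equality would not hold for a non-linear price function (for example, tiered pricing), so the grouping shortcut depends on that property.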
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,147 @@ | ||
| Chart-shaped live monthly analytics for the current organization. | ||
|
|
||
| Use this endpoint when you want to render the data directly as a line, bar, | ||
| or stacked-area chart. Numbers are computed on-demand from `llm_call`, | ||
| `llm_chain`, and `evaluation_run` — no caching layer, so the chart always | ||
| reflects the current database state. The response shape is compatible with | ||
| most chart libraries (Recharts, Chart.js, ApexCharts, Highcharts, ECharts): | ||
|
|
||
| - `labels[]` — the x-axis values (one entry per month). | ||
| - `series[]` — one entry per chart line/bar, each with a human-readable | ||
| `name` and a `data[]` array. `series[i].data[j]` corresponds to | ||
| `labels[j]`. Missing months are filled with `0` so every series has the | ||
| same length as `labels`. | ||
|
|
||
| For a flat row-per-bucket shape (suitable when you want to do your own | ||
| pivoting), use `GET /api/analytics/monthly` instead. | ||
|
|
||
| --- | ||
|
|
||
| ## Authentication & default scope | ||
|
|
||
| Any authenticated user with an organization context can call this endpoint. | ||
| By default it returns data scoped to the caller's **currently selected | ||
| project**; if the caller has no project selected, it falls back to all | ||
| projects in the caller's organization. Pass `project_id` to override the | ||
| default — it must reference a project inside the caller's organization, so | ||
| cross-organization access is never possible. | ||
|
|
||
| --- | ||
|
|
||
| ## Query parameters | ||
|
|
||
| | Parameter | Type | Required | Default | Description | | ||
| | ----------- | ------- | -------- | ---------------------- | ----------- | | ||
| | `metric` | enum | **yes** | — | Which metric to plot. One of: `requests`, `cost`, `eval_runs`, `eval_cost`. | | ||
| | `group_by` | enum | no | `modality_provider` | How to split the data into series. See the table below. | | ||
| | `from_month`| date | no | 24 months before `to_month` (or before today if `to_month` is also omitted) | Inclusive lower bound (first-of-month), e.g. `2026-01-01`. Pass an explicit value to query further back. The default exists to cap worst-case scan size as the source tables grow. | | ||
| | `to_month` | date | no | — (no upper bound) | Inclusive upper bound (first-of-month), e.g. `2026-05-01`. | | ||
| | `modality` | enum | no | — (all) | Pre-filter to a single modality bucket. | | ||
| | `provider` | string | no | — (all) | Pre-filter to a single provider. | | ||
| | `project_id`| integer | no | Caller's current project, if any; else all projects in the org. | Override the default scope. Must reference a project inside the caller's organization. | | ||
|
|
||
| ### `group_by` values | ||
|
|
||
| | Value | Series produced | | ||
| | --------------------- | ---------------------------------------------------------------------------- | | ||
| | `modality_provider` | One series per `(modality, provider)` combination. Series name: `"T-FS-T · openai"`. | | ||
| | `modality` | One series per modality, summed across providers. Series name: `"T-FS-T"`. | | ||
| | `provider` | One series per provider, summed across modalities. Series name: `"openai"`. | | ||
| | `total` | A single series containing the per-month grand total. Series name: `"total"`. | | ||
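To make the `group_by` options concrete, here is a sketch of how flat `(month, modality, provider)` points could be folded into the `labels` / `series` shape with zero-filling. It is an illustration under assumptions, not the server's implementation: `series_name` and `to_chart` are hypothetical names, and `labels` here contains only months that appear in the input.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, List


def series_name(point: dict, group_by: str) -> str:
    """Name the series a point belongs to, following the table above."""
    if group_by == "modality_provider":
        return f"{point['modality']} · {point['provider']}"
    if group_by in ("modality", "provider"):
        return point[group_by]
    if group_by == "total":
        return "total"
    raise ValueError(f"unknown group_by: {group_by!r}")


def to_chart(points: List[dict], group_by: str = "modality_provider") -> dict:
    """Pivot flat points into aligned series, filling missing months with 0."""
    # ISO first-of-month strings sort chronologically.
    labels = sorted({p["month"] for p in points})
    position = {month: i for i, month in enumerate(labels)}
    buckets: Dict[str, List[Decimal]] = defaultdict(
        lambda: [Decimal(0)] * len(labels)
    )
    for p in points:
        buckets[series_name(p, group_by)][position[p["month"]]] += Decimal(p["value"])
    # Series are emitted in alphabetical order by name.
    series = [{"name": name, "data": buckets[name]} for name in sorted(buckets)]
    return {"labels": labels, "series": series}
```

Every `data` list has the same length as `labels`, so `series[i]["data"][j]` always lines up with `labels[j]`.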
|
|
||
| --- | ||
|
|
||
| ## Response shape | ||
|
|
||
| ```json | ||
| { | ||
| "success": true, | ||
| "data": { | ||
| "metric": "cost", | ||
| "group_by": "modality_provider", | ||
| "labels": ["2026-01-01", "2026-02-01", "2026-03-01", "2026-04-01"], | ||
| "series": [ | ||
| { | ||
| "name": "T-FS-T · openai", | ||
| "data": ["10.500000", "15.400000", "18.700000", "22.100000"], | ||
| "total_input_tokens": 4250000, | ||
| "total_output_tokens": 2810000, | ||
| "total_tokens": 7060000 | ||
| }, | ||
| { | ||
| "name": "T-FS-T · google", | ||
| "data": ["5.100000", "6.300000", "8.200000", "12.400000"], | ||
| "total_input_tokens": 1820000, | ||
| "total_output_tokens": 1240000, | ||
| "total_tokens": 3060000 | ||
| }, | ||
| { | ||
| "name": "STT · sarvamai", | ||
| "data": ["0", "0.800000", "1.200000", "1.900000"], | ||
| "total_input_tokens": 0, | ||
| "total_output_tokens": 0, | ||
| "total_tokens": 0 | ||
| } | ||
| ] | ||
| }, | ||
| "error": null, | ||
| "metadata": null | ||
| } | ||
| ``` | ||
|
|
||
| - `labels` are sorted chronologically (oldest → newest). | ||
| - `series` are sorted alphabetically by `name`. | ||
| - All `series[].data` arrays have the same length as `labels`. Months with | ||
| no data for a given series are filled with `0`, so the chart library | ||
| doesn't have to align points itself. | ||
| - Cost values are decimal strings with up to 6 decimal places. | ||
| - `total_input_tokens`, `total_output_tokens`, and `total_tokens` on each | ||
| series are series-wide sums across every label, sourced from | ||
| `llm_call.usage`. They are independent of the chosen `metric` — populated | ||
| whether you're charting requests, cost, or eval numbers. Chains and | ||
| evaluation runs contribute zero to token totals. | ||
| - An empty result returns `labels: []` and `series: []`. | ||
|
|
||
| --- | ||
|
|
||
| ## Example requests | ||
|
|
||
| ### 1. Monthly cost grouped by provider (one line per provider) | ||
|
|
||
| ``` | ||
| GET /api/analytics/monthly/chart?metric=cost&group_by=provider | ||
| ``` | ||
|
|
||
|
|
||
| ### 2. Total request volume across all dimensions (single line) | ||
|
|
||
| ``` | ||
| GET /api/analytics/monthly/chart?metric=requests&group_by=total | ||
| ``` | ||
|
|
||
| ### 3. STT-only eval cost trend for the year | ||
|
|
||
| ``` | ||
| GET /api/analytics/monthly/chart?metric=eval_cost&modality=STT&from_month=2026-01-01 | ||
| ``` | ||
|
|
||
| ### 4. Cost split by modality for a specific project | ||
|
|
||
| ``` | ||
| GET /api/analytics/monthly/chart?metric=cost&group_by=modality&project_id=42 | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Frontend integration tips | ||
|
|
||
| **Recharts**: pass `labels` as the X-axis source and render one `<Line>`, | ||
| `<Bar>`, or `<Area>` per item in `series`, using `series[i].name` as the | ||
| key and the values from `series[i].data`. | ||
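Recharts consumes an array of row objects, one per x-axis value, with one key per series. A rough sketch of that reshaping is shown below in Python for consistency with the other examples; the same loop translates directly to JavaScript. The name `to_rows` and the `month` key are hypothetical, and values are converted to floats purely for plotting, which gives up the exactness of the decimal strings.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, float]]


def to_rows(chart: dict, x_key: str = "month") -> List[Row]:
    """Turn `labels` + `series` into one row object per label."""
    rows: List[Row] = []
    for j, label in enumerate(chart["labels"]):
        row: Row = {x_key: label}
        for series in chart["series"]:
            # series["data"][j] corresponds to labels[j] by construction.
            row[series["name"]] = float(series["data"][j])
        rows.append(row)
    return rows
```

Each series name then serves as the `dataKey` of one `<Line>`, `<Bar>`, or `<Area>`, and `x_key` as the `dataKey` of the X axis.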
|
|
||
| **Chart.js / ApexCharts**: the shape is almost their native config — the | ||
| `labels` array maps to their `labels`/`categories`, and each series object | ||
| maps to `datasets[]` / `series[]`. | ||
|
|
||
| For a **stacked area chart** of cost by provider over time, use | ||
| `metric=cost` and `group_by=provider` — the response is already | ||
| chart-ready. | ||