Proposal: Query, Signal, and Ranking Pipeline for Drop Discovery

## Goal

Extend Drop's `DiscoveryPolicy` so image discovery can rank images with more than a single usage-count score.

The proposed model is:

```text
queries -> signals -> ranking -> selected images
```

This separates data collection from scoring:

- **Queries** fetch raw data from systems such as Prometheus or Loki.
- **Signals** derive named per-image metrics from query results.
- **Ranking strategies** combine one or more signals into the final ordered image list.
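
As a rough illustration of the three stages, the sketch below runs a tiny in-memory pipeline. The row layout follows the normalized `timestamp,image,value` form used in this proposal; the function names `total_usage` and `rank_by_signal` and the sample values are hypothetical and not part of Drop.

```python
from collections import defaultdict

# Hypothetical normalized query output: (timestamp, image, value) rows.
samples = [
    ("2026-06-18T09:00:00Z", "ci/node-build:22", 18.0),
    ("2026-06-18T09:01:00Z", "ci/node-build:22", 21.0),
    ("2026-06-18T09:00:00Z", "ci/java-gradle:21", 30.0),
]


def total_usage(rows):
    """Signal stage: sum all sample values per image."""
    totals = defaultdict(float)
    for _timestamp, image, value in rows:
        totals[image] += value
    return dict(totals)


def rank_by_signal(signal, max_images):
    """Ranking stage: order images by one signal, highest value first."""
    ordered = sorted(signal, key=signal.get, reverse=True)
    return ordered[:max_images]


signal = total_usage(samples)                      # queries -> signals
selected = rank_by_signal(signal, max_images=30)   # signals -> ranking
print(selected)
```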

The goal is to support practical image prewarming strategies for Kubernetes CI/CD workloads, especially GitLab Kubernetes executor node pools.

---

## Problem

A simple count-based discovery strategy answers:

> Which images appeared most often?

That is useful, but incomplete.

CI workloads have different shapes:

- some images are used steadily throughout the day,
- some images are used mainly during developer feedback hours,
- some images appear in short high-concurrency bursts,
- some images are used in nightly validation jobs,
- some images are not frequent but are expensive when cold,
- some images matter because node rotation leaves many nodes cold for them.

To support these cases, Drop needs named input data, reusable derived signals, and explicit ranking logic.

---

## Design Overview

A `DiscoveryPolicy` should define:

```yaml
spec:
  queries: []
  signals: []
  ranking: {}
```

### Query

A query fetches raw observations.

Examples:

- Prometheus range query for image usage.
- Loki range query for Kubernetes image pull events.
- Future external pull-cost profile.

### Signal

A signal derives a named per-image value from query results.

Examples:

- `total-usage`
- `peak-concurrency`
- `developer-weighted-usage`
- `recent-usage`
- `p50-cold-pull-time`

### Ranking

A ranking strategy combines signals into the final score.

Examples:

- rank by one signal,
- weighted sum of normalized signals,
- model-aware exposure score.

---

# Discovery Strategies

## 1. Total Usage

Ranks images by total observed usage over a lookback window `W`, where `count_I(t)` is the sampled usage count for image `I` at time `t`.

```text
score(I) = sum(count_I(t) for t in W)
```

Required signal:

```text
total-usage
```

Required query:

```text
Prometheus image-usage range query
```

Use when:

- the workload is stable,
- the goal is a simple hot-image baseline,
- the user wants the most commonly observed images.

Limitation:

- May miss images that are not globally frequent but appear in large bursts.

---

## 2. Peak Same-Image Concurrency

Ranks images by maximum observed concurrent usage.

```text
score(I) = max(count_I(t) for t in W)
```

Required signal:

```text
peak-concurrency
```

Required query:

```text
Prometheus image-usage range query
```

Use when:

- CI has fan-out stages,
- CI has scheduled high-volume jobs,
- nightly validation jobs create many Pods using the same image,
- registry pressure from synchronized cold pulls is a concern.

Limitation:

- A rare spike can dominate if this is used alone.

---

## 3. Developer-Time Weighted Usage

Ranks images by usage during configured developer feedback windows.

```text
score(I) = sum(weight(t) * count_I(t) for t in W)
```

Example weighting:

| Time window | Weight |
|---|---:|
| 07:00-09:00 | 0.3 |
| 09:00-17:00 | 1.0 |
| 17:00-20:00 | 0.3 |
| otherwise | 0.0 |

Required signal:

```text
developer-weighted-usage
```

Required query:

```text
Prometheus image-usage range query
```

Use when:

- optimizing developer feedback time,
- the team has known working-hour patterns,
- interactive CI matters more than background/nightly work.

Limitation:

- Requires timezone and window configuration.
- May not fit globally distributed teams without multiple windows or broader policies.

---

## 4. Recent Usage

Ranks images by usage in a short recent window.

```text
score(I) = sum(count_I(t) for t in recent window)
```

Required signal:

```text
recent-usage
```

Required query:

```text
Prometheus image-usage range query
```

Use when:

- image usage changes quickly,
- new images are introduced often,
- short-lived project activity should influence prewarming.

Limitation:

- Can overreact to temporary spikes.

---

## 5. Hybrid Usage + Peak Concurrency

Balances generally hot images and burst-heavy images.

```text
score(I) =
  alpha * normalize(total_usage(I))
  + (1 - alpha) * normalize(peak_concurrency(I))
```

Example:

```text
alpha = 0.7
```

Meaning:

```text
70% total usage
30% peak concurrency
```

Required signals:

```text
total-usage
peak-concurrency
```

Required query:

```text
Prometheus image-usage range query
```

Use when:

- the cluster has mixed workloads,
- both steady hot images and bursty images matter,
- pure count and pure max are both too narrow.

Limitation:

- Requires normalization and explainable status output.

---

## 6. Hybrid Developer-Time Usage + Peak Concurrency

Balances developer-feedback relevance with burst detection.

```text
score(I) =
  alpha * normalize(developer_weighted_usage(I))
  + (1 - alpha) * normalize(peak_concurrency(I))
```

Required signals:

```text
developer-weighted-usage
peak-concurrency
```

Required query:

```text
Prometheus image-usage range query
```

Use when:

- developer feedback is the primary goal,
- but off-hour bursts still matter operationally.

Limitation:

- Requires both time-window weighting and normalization.

---

## 7. Count × Pull Time

Ranks images by total usage multiplied by `p_hat(I)`, the measured or estimated image availability time (cold pull time) for image `I`.

```text
score(I) = total_usage(I) * p_hat(I)
```

Required signals:

```text
total-usage
p50-cold-pull-time
```

or:

```text
total-usage
p95-cold-pull-time
```

Required queries:

```text
Prometheus image-usage query
Loki pull-event query or external pull-cost profile
```

Use when:

- image pull costs vary significantly,
- a medium-frequency but expensive image should outrank a tiny frequent image.

Limitation:

- Requires per-image pull-time estimates.
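
A small worked example may help show why this score can reorder images. The usage and pull-time numbers below are invented for illustration, and skipping images without a pull-time estimate is an assumption of this sketch rather than behavior defined by the proposal.

```python
def count_times_pull_time(total_usage, pull_seconds):
    """Score each image as total usage * estimated cold pull time.

    Images without a pull-time estimate are skipped (sketch assumption).
    """
    return {
        image: usage * pull_seconds[image]
        for image, usage in total_usage.items()
        if image in pull_seconds
    }


# Hypothetical inputs: a small, very frequent image and a large, less frequent one.
usage = {"ci/alpine-lint:3": 1000.0, "ci/java-gradle:21": 400.0}
p50_pull_seconds = {"ci/alpine-lint:3": 2.0, "ci/java-gradle:21": 40.0}

scores = count_times_pull_time(usage, p50_pull_seconds)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

With these numbers the less frequent but slower image scores higher, which is the case this strategy is meant to capture.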

---

## 8. Developer-Weighted Count × Pull Time

Ranks developer-relevant images by estimated cold-start cost.

```text
score(I) = developer_weighted_usage(I) * p_hat(I)
```

Required signals:

```text
developer-weighted-usage
p50-cold-pull-time
```

Required queries:

```text
Prometheus image-usage query
Loki pull-event query or external pull-cost profile
```

Use when:

- the goal is reducing developer-facing affected job-minutes.

Limitation:

- Requires time-window configuration and pull-time estimates.

---

## 9. Model-Aware Exposure

Ranks images by estimated post-rotation cold-node exposure.

```text
score(I) =
  J_target(I)
  * cold_fraction_hat(I)
  * p_hat(I)
```

with:

```text
cold_fraction_hat(I) = (1 - 1/N) ^ J_pre(I)
```

Where:

- `N` is the number of eligible CI nodes,
- `J_pre(I)` is usage before the target window,
- `J_target(I)` is usage during the target window,
- `p_hat(I)` is measured or estimated image availability time.

Required signals:

```text
pre-window-usage
target-window-usage
p50-cold-pull-time
```

Required configuration:

```text
nodeCount
```

Required queries:

```text
Prometheus image-usage query
Loki pull-event query or external pull-cost profile
```

Use when:

- prewarming should be node-rotation-aware,
- enough observability exists to estimate pull time,
- the user wants a closer approximation of affected job-minutes.

Limitation:

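- Makes more assumptions than usage-only strategies: `cold_fraction_hat(I)` treats each pre-window job as landing on one of the `N` nodes uniformly at random and independently of the others.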
- Should be implemented as a typed ranking strategy.

---

# Required Pipeline Capabilities

## Query Types

### Prometheus

Used for:

- total usage,
- peak concurrency,
- developer-time usage,
- recent usage,
- pre-window usage,
- target-window usage.

Normalized output:

```text
timestamp,image,value
```

### Loki

Used for Kubernetes image-pull event analysis when Prometheus does not expose useful per-image pull durations.

Normalized output:

```text
timestamp,pod,image,reason,message
```

### Pull Cost Profile

Optional future alternative to Loki.

Normalized output:

```text
image,p50ColdPullSeconds,p95ColdPullSeconds,sampleCount
```

This can be generated by an external analyzer if pull-time parsing should not live inside the Drop controller.

---

## Signal Types

| Signal type | Purpose | Example signals |
|---|---|---|
| `aggregate` | Aggregate all samples per image | `total-usage`, `peak-concurrency` |
| `timeWeightedAggregate` | Apply time-window weights before aggregation | `developer-weighted-usage` |
| `windowAggregate` | Aggregate a specific sub-window | `recent-usage`, `pre-window-usage`, `target-window-usage` |
| `eventPullTime` | Derive pull-time stats from events | `p50-cold-pull-time`, `p95-cold-pull-time` |

---

## Ranking Strategies

| Ranking strategy | Purpose |
|---|---|
| `signal` | Rank directly by one signal |
| `weightedSum` | Combine normalized signals |
| `modelExposure` | Rank by expected post-rotation exposure |

---

# Proposed CRD Shape

## Overview

```yaml
apiVersion: drop.corewire.io/v1alpha1
kind: DiscoveryPolicy
metadata:
  name: gitlab-runner-discovery
spec:
  syncInterval: 1h
  maxImages: 30

  queries: []
  signals: []
  ranking: {}
```

---

# Queries

## Prometheus Image Usage Query

```yaml
queries:
  - name: runner-image-usage
    type: prometheus
    prometheus:
      endpoint: https://mimir.example.com
      queryType: range
      lookback: 168h
      step: 1m
      query: |
        count(
          container_memory_working_set_bytes{
            container!="",
            container!="POD",
            namespace="gitlab-runner",
            pod=~"runner-.*"
          }
        ) by (image)
```

The query must return an `image` label.

Normalized result:

```text
timestamp,image,value
```

Example:

```text
2026-06-18T09:00:00Z,registry.example.com/ci/node-build:22,18
2026-06-18T09:01:00Z,registry.example.com/ci/node-build:22,21
```
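
One way the normalization step could look is sketched below. It flattens a Prometheus range-query response (`resultType: matrix`) into `timestamp,image,value` rows. The function name `normalize_matrix` is hypothetical, the response body is hand-written, and silently skipping series without an `image` label is only one of the two behaviors this proposal leaves open.

```python
import json
from datetime import datetime, timezone


def normalize_matrix(body):
    """Flatten a Prometheus matrix response into (timestamp, image, value) rows."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("query failed")
    rows = []
    for series in payload["data"]["result"]:
        image = series["metric"].get("image")
        if not image:
            continue  # sketch assumption: ignore series without an image label
        for unix_seconds, value in series["values"]:
            ts = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
            # Prometheus encodes sample values as strings.
            rows.append((ts.strftime("%Y-%m-%dT%H:%M:%SZ"), image, float(value)))
    return rows


# Hand-written response in the shape of a range-query result.
body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {
                "metric": {"image": "registry.example.com/ci/node-build:22"},
                "values": [[1781773200, "18"], [1781773260, "21"]],
            },
            {"metric": {}, "values": [[1781773200, "3"]]},  # no image label
        ],
    },
})
rows = normalize_matrix(body)
for row in rows:
    print(",".join(str(field) for field in row))
```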

---

## Loki Image Pull Event Query

```yaml
queries:
  - name: image-pull-events
    type: loki
    loki:
      endpoint: https://loki.example.com
      queryType: range
      lookback: 168h
      query: |
        {job="kubernetes-events", namespace="gitlab-runner"}
        | json
        | involvedObject_name =~ "runner-.*"
        | reason =~ "Pulling|Pulled|Failed|BackOff"
      parser:
        type: kubernetesEvents
        podField: involvedObject_name
        reasonField: reason
        messageField: message
        imageField: message
```

Normalized result:

```text
timestamp,pod,image,reason,message
```

Expected event messages include:

```text
Pulling image "registry.example.com/ci/java-gradle:21"
Successfully pulled image "registry.example.com/ci/java-gradle:21" in 42.3s
Container image "registry.example.com/ci/java-gradle:21" already present on machine
Failed to pull image "registry.example.com/ci/java-gradle:21"
Back-off pulling image "registry.example.com/ci/java-gradle:21"
```
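
Because `imageField` points at the message, the parser has to extract the image reference (and, for `Pulled` events, the duration) from free text. The sketch below shows one possible approach using regular expressions. The helper names are hypothetical, and the patterns are assumptions based only on the message shapes listed above; real kubelet messages may vary between versions.

```python
import re

IMAGE_RE = re.compile(r'image "([^"]+)"')
PULLED_RE = re.compile(r'^Successfully pulled image "[^"]+" in (\S+)')
# Go-style duration parts such as "42.3s", "1m2.5s", or "850ms".
PART_RE = re.compile(r"(\d+(?:\.\d+)?)(ms|us|µs|ns|h|m|s)")
UNIT_SECONDS = {
    "h": 3600.0, "m": 60.0, "s": 1.0,
    "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9,
}


def parse_duration_seconds(text):
    """Convert a Go-style duration string to seconds, or None if malformed."""
    parts = PART_RE.findall(text)
    if not parts or "".join(number + unit for number, unit in parts) != text:
        return None
    return sum(float(number) * UNIT_SECONDS[unit] for number, unit in parts)


def parse_event_message(message):
    """Return (image, pulled_seconds, cache_hit) for one event message."""
    image_match = IMAGE_RE.search(message)
    image = image_match.group(1) if image_match else None
    pulled = PULLED_RE.match(message)
    seconds = parse_duration_seconds(pulled.group(1)) if pulled else None
    cache_hit = "already present on machine" in message
    return image, seconds, cache_hit


print(parse_event_message(
    'Successfully pulled image "registry.example.com/ci/java-gradle:21" in 42.3s'
))
```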

---

# Signals

## `aggregate`

Aggregates all samples per image.

Supported methods:

```text
sum
max
avg
count
min
```

Total usage:

```yaml
signals:
  - name: total-usage
    queryRef: runner-image-usage
    type: aggregate
    aggregate:
      method: sum
```

Peak concurrency:

```yaml
signals:
  - name: peak-concurrency
    queryRef: runner-image-usage
    type: aggregate
    aggregate:
      method: max
```
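
A minimal sketch of how the `aggregate` signal type might be evaluated over normalized rows is shown below. The `aggregate` function, the `METHODS` table, and the sample data are hypothetical. The data is chosen so that `sum` and `max` disagree on which image ranks first.

```python
from collections import defaultdict
from statistics import fmean

# Maps each supported method name to a reducer over a list of sample values.
METHODS = {"sum": sum, "max": max, "min": min, "avg": fmean, "count": len}


def aggregate(rows, method):
    """Group (timestamp, image, value) rows by image and reduce with one method."""
    if method not in METHODS:
        raise ValueError(f"unsupported aggregate method: {method}")
    by_image = defaultdict(list)
    for _timestamp, image, value in rows:
        by_image[image].append(value)
    return {image: float(METHODS[method](values)) for image, values in by_image.items()}


# Hypothetical samples: one steady image and one short burst.
rows = [(f"t{i}", "ci/steady:1", 10.0) for i in range(5)]
rows.append(("t2", "ci/bursty:1", 40.0))

total_usage = aggregate(rows, "sum")        # steady image scores higher
peak_concurrency = aggregate(rows, "max")   # bursty image scores higher
print(total_usage, peak_concurrency)
```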

---

## `timeWeightedAggregate`

Applies configured time weights before aggregation.

```yaml
signals:
  - name: developer-weighted-usage
    queryRef: runner-image-usage
    type: timeWeightedAggregate
    timeWeightedAggregate:
      method: sum
      timezone: Europe/Berlin
      defaultWeight: "0"
      windows:
        - startHour: 7
          endHour: 9
          weight: "0.3"
        - startHour: 9
          endHour: 17
          weight: "1.0"
        - startHour: 17
          endHour: 20
          weight: "0.3"
```
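
The weighting above could be evaluated roughly as follows. This is a sketch under stated assumptions: windows are treated as half-open `[startHour, endHour)`, weights are parsed to floats, and a fixed UTC+2 offset stands in for `Europe/Berlin` summer time so the example has no tz database dependency. All function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# (startHour, endHour, weight), mirroring the YAML windows above.
WINDOWS = [(7, 9, 0.3), (9, 17, 1.0), (17, 20, 0.3)]
DEFAULT_WEIGHT = 0.0


def weight_at(ts, tz):
    """Return the weight for a timestamp based on its local hour.

    Windows are half-open [startHour, endHour); that is a sketch assumption.
    """
    hour = ts.astimezone(tz).hour
    for start, end, weight in WINDOWS:
        if start <= hour < end:
            return weight
    return DEFAULT_WEIGHT


def developer_weighted_usage(rows, tz):
    """Sum weight(t) * value per image over (datetime, image, value) rows."""
    totals = {}
    for ts, image, value in rows:
        totals[image] = totals.get(image, 0.0) + weight_at(ts, tz) * value
    return totals


# Stand-in for Europe/Berlin in summer; a real implementation would use a
# tz database zone so daylight saving changes are handled.
berlin_summer = timezone(timedelta(hours=2))
utc = timezone.utc
rows = [
    (datetime(2026, 6, 18, 6, 30, tzinfo=utc), "ci/node-build:22", 10.0),  # 08:30 local
    (datetime(2026, 6, 18, 8, 0, tzinfo=utc), "ci/node-build:22", 20.0),   # 10:00 local
    (datetime(2026, 6, 18, 23, 0, tzinfo=utc), "ci/node-build:22", 50.0),  # 01:00 local
]
weighted = developer_weighted_usage(rows, berlin_summer)
print(weighted)
```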

---

## `windowAggregate`

Aggregates a specific time window.

Recent usage:

```yaml
signals:
  - name: recent-usage
    queryRef: runner-image-usage
    type: windowAggregate
    windowAggregate:
      method: sum
      relativeWindow: 2h
```

Pre-window usage:

```yaml
signals:
  - name: pre-window-usage
    queryRef: runner-image-usage
    type: windowAggregate
    windowAggregate:
      method: sum
      timezone: Europe/Berlin
      window:
        start: "00:00"
        end: "09:00"
```

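Target-window usage (named `developer-window-usage` here and in the `modelExposure` examples; the strategy overview refers to the same signal as `target-window-usage`):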

```yaml
signals:
  - name: developer-window-usage
    queryRef: runner-image-usage
    type: windowAggregate
    windowAggregate:
      method: sum
      timezone: Europe/Berlin
      window:
        start: "09:00"
        end: "17:00"
```
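
The two window forms (`relativeWindow` and a daily `window`) could be evaluated along these lines. The sketch assumes that a daily window applies to every day in the lookback and that boundaries are half-open; neither is specified by this proposal. A fixed UTC+2 offset stands in for `Europe/Berlin`, and the function names and sample rows are hypothetical.

```python
from datetime import datetime, time, timedelta, timezone


def _sum_where(rows, keep):
    """Sum values per image for rows whose timestamp satisfies keep(ts)."""
    totals = {}
    for ts, image, value in rows:
        if keep(ts):
            totals[image] = totals.get(image, 0.0) + value
    return totals


def relative_window_sum(rows, now, window):
    """relativeWindow: samples in (now - window, now]."""
    return _sum_where(rows, lambda ts: now - window < ts <= now)


def daily_window_sum(rows, tz, start, end):
    """window: samples whose local time of day falls in [start, end)."""
    return _sum_where(rows, lambda ts: start <= ts.astimezone(tz).time() < end)


utc = timezone.utc
local = timezone(timedelta(hours=2))  # stand-in for Europe/Berlin summer time
image = "ci/node-build:22"
rows = [
    (datetime(2026, 6, 17, 12, 0, tzinfo=utc), image, 7.0),  # 14:00 local, previous day
    (datetime(2026, 6, 18, 5, 0, tzinfo=utc), image, 5.0),   # 07:00 local
    (datetime(2026, 6, 18, 8, 30, tzinfo=utc), image, 8.0),  # 10:30 local
    (datetime(2026, 6, 18, 9, 30, tzinfo=utc), image, 4.0),  # 11:30 local
]
now = datetime(2026, 6, 18, 10, 0, tzinfo=utc)

recent = relative_window_sum(rows, now, timedelta(hours=2))
pre = daily_window_sum(rows, local, time(0, 0), time(9, 0))
target = daily_window_sum(rows, local, time(9, 0), time(17, 0))
print(recent, pre, target)
```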

---

## `eventPullTime`

Derives image pull-time statistics from event records.

```yaml
signals:
  - name: p50-cold-pull-time
    queryRef: image-pull-events
    type: eventPullTime
    eventPullTime:
      statistic: p50
      includeCacheHits: false
      durationMode: eventPair
```

Supported statistics:

```text
p50
p90
p95
avg
max
count
failureCount
cacheHitCount
```

Supported duration modes:

| Mode | Meaning |
|---|---|
| `eventPair` | `Pulled.timestamp - Pulling.timestamp` for the same Pod/image |
| `messageDuration` | parse duration from a `Pulled` event message |

Cache hits should be detected separately and excluded from cold-pull duration when:

```yaml
includeCacheHits: false
```
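
The `eventPair` mode with `statistic: p50` might be computed as in the sketch below. It assumes events arrive sorted by time and that a cache hit shows up as a `Pulled` event with no preceding `Pulling` for the same Pod/image, so it never forms a pair. The function name `cold_pull_p50` and the event data are hypothetical.

```python
from datetime import datetime
from statistics import median


def cold_pull_p50(events):
    """Median (Pulled - Pulling) duration in seconds per image.

    `events` are (timestamp, pod, image, reason) tuples sorted by timestamp.
    A `Pulled` event without a matching `Pulling` is treated as a cache hit
    and ignored, which corresponds to includeCacheHits: false.
    """
    started = {}
    durations = {}
    for ts, pod, image, reason in events:
        key = (pod, image)
        if reason == "Pulling":
            started[key] = ts
        elif reason == "Pulled" and key in started:
            seconds = (ts - started.pop(key)).total_seconds()
            durations.setdefault(image, []).append(seconds)
    return {image: median(values) for image, values in durations.items()}


image = "registry.example.com/ci/java-gradle:21"
events = [
    (datetime(2026, 6, 18, 9, 0, 0), "runner-a", image, "Pulling"),
    (datetime(2026, 6, 18, 9, 0, 40), "runner-a", image, "Pulled"),   # 40 s
    (datetime(2026, 6, 18, 9, 1, 0), "runner-b", image, "Pulling"),
    (datetime(2026, 6, 18, 9, 1, 5), "runner-c", image, "Pulled"),    # cache hit
    (datetime(2026, 6, 18, 9, 1, 50), "runner-b", image, "Pulled"),   # 50 s
    (datetime(2026, 6, 18, 9, 2, 0), "runner-d", image, "Pulling"),
    (datetime(2026, 6, 18, 9, 2, 30), "runner-d", image, "Pulled"),   # 30 s
]
p50 = cold_pull_p50(events)
print(p50)
```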

---

# Ranking Strategies

## `signal`

Ranks directly by one signal.

```yaml
ranking:
  strategy: signal
  signal:
    signalRef: total-usage
```

---

## `weightedSum`

Combines normalized signals.

```yaml
ranking:
  strategy: weightedSum
  weightedSum:
    normalize: minMax
    missingSignal: zero
    terms:
      - signalRef: total-usage
        weight: "0.7"
      - signalRef: peak-concurrency
        weight: "0.3"
```

Formula:

```text
final_score(I) =
  0.7 * normalize(total_usage(I))
  + 0.3 * normalize(peak_concurrency(I))
```

Initial normalization method:

```text
minMax
```

Formula:

```text
normalized(x) = (x - min) / (max - min)
```

If all values are equal:

```text
normalized(x) = 1
```
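
Put together, `weightedSum` with `minMax` normalization could be sketched as follows. The sketch interprets `missingSignal: zero` as "a missing value contributes a normalized value of 0", which is one possible reading of this proposal. The function names and input values are hypothetical.

```python
def min_max(values):
    """Min-max normalize a {image: value} map; all-equal inputs map to 1."""
    if not values:
        return {}
    low, high = min(values.values()), max(values.values())
    if high == low:
        return {image: 1.0 for image in values}
    return {image: (v - low) / (high - low) for image, v in values.items()}


def weighted_sum(signals, terms):
    """Combine normalized signals; `terms` is a list of (signalRef, weight)."""
    normalized = {name: min_max(signals[name]) for name, _weight in terms}
    images = set().union(*(signals[name] for name, _weight in terms))
    return {
        image: sum(weight * normalized[name].get(image, 0.0) for name, weight in terms)
        for image in images
    }


# Hypothetical signal values; image "c" has no peak-concurrency value.
signals = {
    "total-usage": {"a": 100.0, "b": 50.0, "c": 0.0},
    "peak-concurrency": {"a": 10.0, "b": 30.0},
}
terms = [("total-usage", 0.7), ("peak-concurrency", 0.3)]
scores = weighted_sum(signals, terms)
print(sorted(scores.items()))
```

Here image `a` scores 0.7 (highest total usage, lowest peak) and image `b` scores 0.65, so the burst signal narrows the gap without overturning the usage ranking.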

---

## `modelExposure`

Ranks by expected post-rotation exposure.

```yaml
ranking:
  strategy: modelExposure
  modelExposure:
    nodeCount: 100
    preWindowUsageSignalRef: pre-window-usage
    targetWindowUsageSignalRef: developer-window-usage
    pullTimeSignalRef: p50-cold-pull-time
```

Formula:

```text
score(I) =
  J_target(I)
  * (1 - 1/N) ^ J_pre(I)
  * p_hat(I)
```
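
A numeric sketch of the formula is given below. Reading `(1 - 1/N) ^ J_pre(I)` as the chance that a given node received none of the pre-window jobs assumes uniform, independent placement; that reading, the function name `model_exposure`, and the input numbers are assumptions for illustration.

```python
def model_exposure(node_count, j_pre, j_target, pull_seconds):
    """score = J_target * (1 - 1/N) ** J_pre * p_hat."""
    cold_fraction = (1.0 - 1.0 / node_count) ** j_pre
    return j_target * cold_fraction * pull_seconds


N = 100
# Hypothetical image A: unused before the target window, moderate use during it.
score_a = model_exposure(N, j_pre=0, j_target=50, pull_seconds=40.0)
# Hypothetical image B: heavily used before the window, so most nodes are warm.
score_b = model_exposure(N, j_pre=500, j_target=200, pull_seconds=40.0)
print(score_a, score_b)
```

With these inputs image A outranks image B even though B is used four times as often in the target window, because almost no node is expected to still be cold for B.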

---

# Complete Examples

## Example 1: Hybrid Usage and Peak Concurrency

```yaml
apiVersion: drop.corewire.io/v1alpha1
kind: DiscoveryPolicy
metadata:
  name: gitlab-hybrid-usage-concurrency
spec:
  syncInterval: 1h
  maxImages: 30

  queries:
    - name: runner-image-usage
      type: prometheus
      prometheus:
        endpoint: https://mimir.example.com
        queryType: range
        lookback: 168h
        step: 1m
        query: |
          count(
            container_memory_working_set_bytes{
              container!="",
              container!="POD",
              namespace="gitlab-runner",
              pod=~"runner-.*"
            }
          ) by (image)

  signals:
    - name: total-usage
      queryRef: runner-image-usage
      type: aggregate
      aggregate:
        method: sum

    - name: peak-concurrency
      queryRef: runner-image-usage
      type: aggregate
      aggregate:
        method: max

  ranking:
    strategy: weightedSum
    weightedSum:
      normalize: minMax
      missingSignal: zero
      terms:
        - signalRef: total-usage
          weight: "0.7"
        - signalRef: peak-concurrency
          weight: "0.3"
```

---

## Example 2: Developer-Time Usage and Peak Concurrency

```yaml
apiVersion: drop.corewire.io/v1alpha1
kind: DiscoveryPolicy
metadata:
  name: gitlab-developer-and-burst
spec:
  syncInterval: 1h
  maxImages: 30

  queries:
    - name: runner-image-usage
      type: prometheus
      prometheus:
        endpoint: https://mimir.example.com
        queryType: range
        lookback: 168h
        step: 1m
        query: |
          count(
            container_memory_working_set_bytes{
              container!="",
              container!="POD",
              namespace="gitlab-runner",
              pod=~"runner-.*"
            }
          ) by (image)

  signals:
    - name: developer-weighted-usage
      queryRef: runner-image-usage
      type: timeWeightedAggregate
      timeWeightedAggregate:
        method: sum
        timezone: Europe/Berlin
        defaultWeight: "0"
        windows:
          - startHour: 7
            endHour: 9
            weight: "0.3"
          - startHour: 9
            endHour: 17
            weight: "1.0"
          - startHour: 17
            endHour: 20
            weight: "0.3"

    - name: peak-concurrency
      queryRef: runner-image-usage
      type: aggregate
      aggregate:
        method: max

  ranking:
    strategy: weightedSum
    weightedSum:
      normalize: minMax
      missingSignal: zero
      terms:
        - signalRef: developer-weighted-usage
          weight: "0.7"
        - signalRef: peak-concurrency
          weight: "0.3"
```

---

## Example 3: Model-Aware Exposure

```yaml
apiVersion: drop.corewire.io/v1alpha1
kind: DiscoveryPolicy
metadata:
  name: gitlab-model-aware-exposure
spec:
  syncInterval: 1h
  maxImages: 30

  queries:
    - name: runner-image-usage
      type: prometheus
      prometheus:
        endpoint: https://mimir.example.com
        queryType: range
        lookback: 168h
        step: 5m
        query: |
          count(
            container_memory_working_set_bytes{
              container!="",
              container!="POD",
              namespace="gitlab-runner",
              pod=~"runner-.*"
            }
          ) by (image)

    - name: image-pull-events
      type: loki
      loki:
        endpoint: https://loki.example.com
        queryType: range
        lookback: 168h
        query: |
          {job="kubernetes-events", namespace="gitlab-runner"}
          | json
          | involvedObject_name =~ "runner-.*"
          | reason =~ "Pulling|Pulled|Failed|BackOff"
        parser:
          type: kubernetesEvents
          podField: involvedObject_name
          reasonField: reason
          messageField: message
          imageField: message

  signals:
    - name: pre-window-usage
      queryRef: runner-image-usage
      type: windowAggregate
      windowAggregate:
        method: sum
        timezone: Europe/Berlin
        window:
          start: "00:00"
          end: "09:00"

    - name: developer-window-usage
      queryRef: runner-image-usage
      type: windowAggregate
      windowAggregate:
        method: sum
        timezone: Europe/Berlin
        window:
          start: "09:00"
          end: "17:00"

    - name: p50-cold-pull-time
      queryRef: image-pull-events
      type: eventPullTime
      eventPullTime:
        statistic: p50
        includeCacheHits: false
        durationMode: eventPair

  ranking:
    strategy: modelExposure
    modelExposure:
      nodeCount: 100
      preWindowUsageSignalRef: pre-window-usage
      targetWindowUsageSignalRef: developer-window-usage
      pullTimeSignalRef: p50-cold-pull-time
```

---

# Status and Observability

The controller should expose enough status to explain every selected image.

Example:

```yaml
status:
  lastRunTime: "2026-06-18T10:00:00Z"
  observedGeneration: 4

  queryResults:
    - name: runner-image-usage
      type: prometheus
      series: 30
      samples: 60480
      status: success

    - name: image-pull-events
      type: loki
      records: 1820
      status: success

  signalResults:
    - name: total-usage
      images: 30
      status: success

    - name: peak-concurrency
      images: 30
      status: success

  discoveredImages:
    - image: registry.example.com/ci/java-gradle:21
      rank: 1
      finalScore: "0.8768"
      selected: true
      signals:
        - name: total-usage
          rawValue: "8210"
          normalizedValue: "0.824"
        - name: peak-concurrency
          rawValue: "96"
          normalizedValue: "1.0"
      ranking:
        strategy: weightedSum
        terms:
          - signal: total-usage
            weight: "0.7"
            contribution: "0.5768"
          - signal: peak-concurrency
            weight: "0.3"
            contribution: "0.3"
```

Status output should support debugging:

- query failures,
- missing labels,
- missing signals,
- normalization values,
- ranking contributions,
- final selected images.

---

# Validation Plan

## Query Tests

- Prometheus query results are normalized into `timestamp,image,value`.
- Loki query results are normalized into `timestamp,pod,image,reason,message`.
- Missing `image` labels are rejected or ignored according to defined behavior.
- Query failures are surfaced in status.

## Signal Tests

- `aggregate.sum`
- `aggregate.max`
- `aggregate.avg`
- `aggregate.count`
- `aggregate.min`
- `timeWeightedAggregate`
- `windowAggregate`
- `eventPullTime`

## Ranking Tests

- `signal`
- `weightedSum`
- `modelExposure`
- missing signal handling,
- normalization behavior,
- deterministic tie-breaking.
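
One possible tie-breaking rule, shown only as a sketch, is to order by score descending and then by image reference ascending before applying `maxImages`. The rule itself and the function name `select_images` are assumptions; this proposal does not yet define the tie-break.

```python
def select_images(scores, max_images):
    """Order by score (descending), then image name (ascending) for ties."""
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [image for image, _score in ordered[:max_images]]


# Two images tie on score; input order must not affect the result.
first = select_images({"ci/b:1": 1.0, "ci/a:1": 1.0, "ci/c:1": 2.0}, max_images=2)
second = select_images({"ci/c:1": 2.0, "ci/a:1": 1.0, "ci/b:1": 1.0}, max_images=2)
print(first, second)
```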

## Integration Tests

Use fake Prometheus and Loki responses to verify:

- one query can feed multiple signals,
- multiple signals can feed one ranking,
- selected image order is deterministic,
- status contains query, signal, and ranking details.

---

# Implementation Split

## Issue 1: CRD for Query, Signal, and Ranking Pipeline

Define the `queries`, `signals`, and `ranking` API.

## Issue 2: Prometheus Query Execution

Implement named Prometheus range queries and normalized sample output.

## Issue 3: Aggregate Signals

Implement:

```text
aggregate.sum
aggregate.max
aggregate.avg
aggregate.count
aggregate.min
```

## Issue 4: Basic Ranking

Implement `signal` ranking.

## Issue 5: Weighted Ranking

Implement `weightedSum` ranking with `minMax` normalization.

## Issue 6: Status Output

Expose query results, signal results, ranking contributions, and selected images.

## Issue 7: Time-Based Signals

Implement:

```text
timeWeightedAggregate
windowAggregate
```

## Issue 8: Loki Query Source

Implement Loki range query support.

## Issue 9: Event Pull-Time Signal

Implement `eventPullTime`.

## Issue 10: Model-Aware Exposure Ranking

Implement typed `modelExposure`.

## Issue 11: Documentation

Document:

- total usage,
- peak concurrency,
- developer-time usage,
- hybrid usage/concurrency,
- pull-time-aware ranking,
- model-aware exposure.

---

# Design Decisions to Resolve

## Missing signal behavior

Initial proposal:

```text
missingSignal: zero
```

Alternative:

```text
drop image from ranking if a required signal is missing
```

## Pull-time statistic

Initial proposal:

```text
p50-cold-pull-time
```

Alternative:

```text
p95-cold-pull-time
```

The choice should be configurable.

## Pull-time source

Two options:

1. Native Loki query and `eventPullTime`.
2. External `ImagePullCostProfile` produced by a separate analyzer.

A native Loki source is convenient. An external profile may keep the controller simpler.

---

# Recommendation

Adopt the `queries -> signals -> ranking` pipeline for Drop discovery.

This design supports:

- multiple signals from one query,
- true hybrid ranking,
- Prometheus and Loki inputs,
- pull-time-aware ranking,
- model-aware exposure scoring,
- explainable status output,
- a clean split into implementation issues.

The first production-ready strategies should be:

```text
signal(total-usage)
signal(peak-concurrency)
weightedSum(total-usage, peak-concurrency)
signal(developer-weighted-usage)
weightedSum(developer-weighted-usage, peak-concurrency)
```

The advanced strategy should be:

```text
modelExposure(pre-window-usage, target-window-usage, p50-cold-pull-time)
```
