Bug: fits_resource_constraints compares model memory against system RAM, not GPU VRAM #2015
Closed
Description
Problem
ModelInfo.fits_resource_constraints() in autobot-backend/utils/model_optimization/types.py compares the estimated model memory requirement against SystemResources.available_memory_gb (system RAM). However, LLM weights are loaded into GPU VRAM, not system RAM.
This means:
- A system with 64GB RAM but 8GB VRAM will allow loading a 30GB model that will OOM on the GPU
- A system with 8GB RAM but 24GB VRAM will reject models that would fit perfectly on the GPU
SystemResources has a gpu_vram_gb field (added in #1966) but it is never populated by callers, and fits_resource_constraints() doesn't use it.
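A minimal sketch of the current behavior. The field and method names below mirror the issue text, but everything else (dataclass shape, `estimated_memory_gb`) is an assumed reconstruction, not the actual code:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SystemResources:
    available_memory_gb: float           # system RAM
    gpu_vram_gb: Optional[float] = None  # added in #1966, never populated by callers

@dataclass
class ModelInfo:
    estimated_memory_gb: float  # assumed field name for the memory estimate

    def fits_resource_constraints(self, resources: SystemResources) -> bool:
        # Current (buggy) behavior: always compares against system RAM,
        # even though the model is loaded into GPU VRAM.
        return self.estimated_memory_gb <= resources.available_memory_gb

# 64 GB RAM, 8 GB VRAM: a 30 GB model "fits" but will OOM on the GPU.
rig = SystemResources(available_memory_gb=64.0, gpu_vram_gb=8.0)
print(ModelInfo(estimated_memory_gb=30.0).fits_resource_constraints(rig))  # True
```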
Discovered During
Working on #1966 (model memory estimation).
Expected Behavior
- When GPU is available, compare estimated model memory against GPU VRAM
- When CPU-only inference, compare against system RAM
- SystemResources callers should populate gpu_vram_gb from GPU detection (Bug: GPU detection only recognizes NVIDIA RTX cards, fails on AMD/Intel/non-RTX NVIDIA #1959)
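The expected behavior above could look like the following sketch. Field names beyond available_memory_gb and gpu_vram_gb are assumptions; treating `gpu_vram_gb is None` as "CPU-only inference" is also an assumed convention:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SystemResources:
    available_memory_gb: float           # system RAM
    gpu_vram_gb: Optional[float] = None  # None => no GPU detected / CPU-only

@dataclass
class ModelInfo:
    estimated_memory_gb: float  # assumed field name for the memory estimate

    def fits_resource_constraints(self, resources: SystemResources) -> bool:
        # Proposed fix: compare against GPU VRAM when a GPU is available,
        # fall back to system RAM only for CPU-only inference.
        if resources.gpu_vram_gb is not None:
            return self.estimated_memory_gb <= resources.gpu_vram_gb
        return self.estimated_memory_gb <= resources.available_memory_gb

# 8 GB RAM, 24 GB VRAM: a 20 GB model now correctly fits on the GPU.
rig = SystemResources(available_memory_gb=8.0, gpu_vram_gb=24.0)
print(ModelInfo(estimated_memory_gb=20.0).fits_resource_constraints(rig))  # True
```

With this change, the first example from the Problem section (64 GB RAM, 8 GB VRAM, 30 GB model) is correctly rejected instead of OOMing on the GPU.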
Impact
High — Models may be recommended that OOM on GPU, or rejected despite fitting in VRAM.