# VisionFlow: Advanced Real-Time Object Detection & Action Recognition System

## Features
- Real-time multi-object tracking with YOLOv8
- 3D pose estimation using MediaPipe
- Action recognition (SlowFast + ST-GCN)
- TensorRT optimization for edge deployment
- Docker support with GPU acceleration
- CI/CD pipeline with automated testing
## Requirements

- Python 3.8+ (3.10 recommended)
- CUDA 11.8+ (for GPU acceleration)
- OpenCV 4.7+
- At least 4 GB RAM
- Webcam or video files for testing
## Installation

- Clone the repository:

  ```bash
  git clone https://github.com/superuser303/VisionFlow
  cd VisionFlow
  ```

- Create a virtual environment:

  ```bash
  python3 -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install --upgrade pip
  pip install -r requirements.txt
  ```

- Set up the Python path:

  ```bash
  export PYTHONPATH="${PYTHONPATH}:$(pwd)"  # On Windows: set PYTHONPATH=%PYTHONPATH%;%cd%
  ```
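After installing, a quick sanity check can confirm that the core dependencies resolve. This sketch is not a repo script; it uses only the standard library, and the module names are the usual import names of the packages above:

```python
# Optional post-install sanity check (not part of VisionFlow): report which
# core dependencies are importable without actually importing them.
import importlib.util

def installed(module_name):
    """True if the module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None

for name in ("cv2", "torch", "ultralytics", "mediapipe"):
    print(f"{name}: {'OK' if installed(name) else 'MISSING'}")
```

Any `MISSING` entry usually means `requirements.txt` was not installed into the active virtual environment.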
## Docker

- Build the container:

  ```bash
  docker build -t visionflow:latest .
  ```

- Run with GPU support:

  ```bash
  docker run --gpus all -it --rm \
    -v /dev/video0:/dev/video0 \
    -v $(pwd)/data:/app/data \
    visionflow:latest
  ```

### VS Code Dev Containers

- Open the project in VS Code
- Install the "Dev Containers" extension
- Press `Ctrl+Shift+P` → "Dev Containers: Reopen in Container"
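The repository's actual devcontainer configuration may differ; for orientation, a minimal `.devcontainer/devcontainer.json` that builds from the project Dockerfile and enables GPU access could look like this (all values illustrative):

```json
{
  "name": "VisionFlow",
  "build": { "dockerfile": "../Dockerfile" },
  "runArgs": ["--gpus", "all"],
  "customizations": {
    "vscode": { "extensions": ["ms-python.python"] }
  }
}
```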
## Quick Start

Run on a webcam:

```bash
python scripts/run_webcam.py
```

Run on a video file:

```bash
python scripts/run_webcam.py --source path/to/video.mp4
```

Train a model:

```bash
python scripts/train.py --config configs/model_config.yaml --data data/dataset.yaml
```

Explore the demo notebook:

```bash
jupyter notebook notebooks/VisionFlow_Demo.ipynb
```

## Configuration

Edit `configs/model_config.yaml` to customize:
- Detection: Model type, confidence thresholds, input size
- Pose Estimation: MediaPipe complexity settings
- Tracking: DeepSORT parameters
- Training: Batch size, learning rate, epochs
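The exact schema of `configs/model_config.yaml` is not reproduced here; a hypothetical sketch covering the four areas above (key names are illustrative, not necessarily the repo's actual schema) might look like:

```yaml
# Illustrative sketch of configs/model_config.yaml; actual keys may differ.
detection:
  model: yolov8n.pt
  confidence_threshold: 0.5
  input_size: 640
pose:
  model_complexity: 1        # MediaPipe: 0 (lite) to 2 (heavy)
tracking:
  max_age: 30                # DeepSORT: frames to keep a lost track alive
  n_init: 3                  # DeepSORT: hits before a track is confirmed
training:
  batch_size: 16
  learning_rate: 0.001
  epochs: 100
```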
## Custom Dataset

- Organize your dataset:

  ```
  data/custom_dataset/
  ├── train/
  │   ├── images/
  │   └── labels/
  ├── val/
  │   ├── images/
  │   └── labels/
  └── dataset.yaml
  ```

- Update `data/dataset.yaml`:

  ```yaml
  path: ../data/custom_dataset
  train: train/images
  val: val/images
  names:
    0: person
    1: car
    2: custom_class
  ```

The system automatically downloads COCO sample images for testing, or creates synthetic test data if the download fails.
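A common source of training failures is an image with no matching label file. This small stdlib-only helper is hypothetical (not part of VisionFlow) and assumes the YOLO-style layout shown above:

```python
# Hypothetical helper (not part of VisionFlow): verify that every image in a
# YOLO-style dataset has a matching .txt file in the sibling labels/ folder.
import tempfile
from pathlib import Path

def find_unlabeled_images(dataset_root):
    """Return image paths under train/ and val/ that lack a label file."""
    root = Path(dataset_root)
    missing = []
    for split in ("train", "val"):
        images_dir = root / split / "images"
        labels_dir = root / split / "labels"
        for img in sorted(images_dir.glob("*")):
            if img.is_file() and not (labels_dir / f"{img.stem}.txt").exists():
                missing.append(img)
    return missing

# Demo on a throwaway directory tree:
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for split in ("train", "val"):
        (root / split / "images").mkdir(parents=True)
        (root / split / "labels").mkdir(parents=True)
    (root / "train" / "images" / "a.jpg").touch()
    (root / "train" / "labels" / "a.txt").touch()
    (root / "val" / "images" / "b.jpg").touch()  # deliberately unlabeled
    print([p.name for p in find_unlabeled_images(root)])  # ['b.jpg']
```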
## Python API

### Object Detection

```python
import cv2

from src.detection.yolo_wrapper import YOLODetector

detector = YOLODetector("yolov8n.pt")
frame = cv2.imread("test_image.jpg")
detections = detector.detect(frame)
```

### Pose Estimation

```python
import cv2

from src.pose.mediapipe_wrapper import PoseEstimator

estimator = PoseEstimator()
frame = cv2.imread("person_image.jpg")
landmarks = estimator.estimate(frame)
```

### Video Processing

```python
from src.utils.video_processor import VideoHandler

video = VideoHandler("input_video.mp4")
while True:
    ret, frame = video.read()
    if not ret:
        break
    # Process frame here
```
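Raw detector output typically contains overlapping boxes for the same object, which are deduplicated with non-max suppression. `YOLODetector.detect()` presumably handles this internally; for intuition, here is a minimal pure-Python sketch of IoU and greedy NMS (not the repo's implementation):

```python
# Minimal sketch of IoU and greedy non-max suppression (illustrative only;
# not VisionFlow's implementation). Boxes are (x1, y1, x2, y2, score).

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop heavy overlaps."""
    boxes = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    for box in boxes:
        if all(iou(box[:4], k[:4]) <= iou_threshold for k in kept):
            kept.append(box)
    return kept

detections = [(0, 0, 10, 10, 0.9), (1, 1, 11, 11, 0.8), (50, 50, 60, 60, 0.7)]
print(nms(detections))  # the 0.8 box overlaps the 0.9 box and is suppressed
```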
## Testing

```bash
# Install test dependencies
pip install pytest pytest-cov flake8

# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=src --cov-report=html
```
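For orientation, a unit test in the style pytest discovers might look like the following. The function and assertions are illustrative, not an actual file from `tests/`:

```python
# Illustrative pytest-style test; names are hypothetical, not from tests/.
import numpy as np

def make_blank_frame(width=640, height=480):
    """Build a black BGR frame shaped like those produced by OpenCV."""
    return np.zeros((height, width, 3), dtype=np.uint8)

def test_blank_frame_shape():
    frame = make_blank_frame()
    assert frame.shape == (480, 640, 3)  # OpenCV convention: (H, W, channels)
    assert frame.dtype == np.uint8
```

Save such functions in files matching `tests/test_*.py` and pytest will collect them automatically.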
## Code Quality

```bash
# Linting
flake8 src/ --count --select=E9,F63,F7,F82 --show-source --statistics

# Code formatting
black src/ tests/ scripts/

# Pre-commit hooks
pip install pre-commit
pre-commit install
```

## Edge Deployment

```bash
python scripts/deploy_edge.py  # Converts to ONNX/TensorRT
```

Manual export:

```bash
# Export to ONNX
python -c "from ultralytics import YOLO; YOLO('yolov8n.pt').export(format='onnx')"

# Export to TensorRT (requires a TensorRT installation)
python -c "from ultralytics import YOLO; YOLO('yolov8n.pt').export(format='engine')"
```

## Troubleshooting
- **Import errors**:

  ```bash
  export PYTHONPATH="${PYTHONPATH}:$(pwd)"
  ```

- **CUDA not found**:
  - Install CUDA 11.8+
  - Verify the driver with `nvidia-smi`
  - Install PyTorch with CUDA support:

    ```bash
    pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
    ```

- **Camera not working**:

  ```bash
  # Check available cameras
  python -c "import cv2; print([i for i in range(10) if cv2.VideoCapture(i).isOpened()])"
  ```

- **Memory issues**:
  - Reduce the batch size in `configs/model_config.yaml`
  - Use a smaller model: `yolov8n.pt` instead of `yolov8x.pt`

- **Missing test data**:

  ```bash
  python -c "from src.utils.coco_utils import setup_test_dataset; setup_test_dataset('data/samples')"
  ```
## Performance Tuning

For maximum speed:

- Use smaller models (`yolov8n.pt`)
- Reduce the input resolution (320×320)
- Enable TensorRT optimization
- Use GPU acceleration

For maximum accuracy:

- Use larger models (`yolov8x.pt`)
- Use a higher input resolution (640×640)
- Lower the confidence thresholds
- Train on domain-specific data
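The resolution advice above follows from a rough back-of-envelope rule: per-frame compute in a convolutional detector scales approximately with the number of input pixels, so halving the side length cuts the work by about a factor of four.

```python
# Back-of-envelope only: convolutional inference cost scales roughly with
# pixel count, so 640x640 costs about 4x more per frame than 320x320.
def relative_cost(side_a, side_b):
    """Ratio of pixel counts between two square input resolutions."""
    return (side_a * side_a) / (side_b * side_b)

print(relative_cost(640, 320))  # 4.0
```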
## API Reference

### YOLODetector

- `YOLODetector.detect(frame)` → returns bounding boxes
- `YOLODetector.class_names` → class name mapping

### PoseEstimator

- `PoseEstimator.estimate(frame)` → returns pose landmarks
- MediaPipe pose connections are available

### VideoHandler

- `VideoHandler(source)` → initializes video capture
- `VideoHandler.read()` → returns the next frame
- `VideoHandler.get_properties()` → returns video metadata
## Contributing

- Fork the repository
- Create a feature branch: `git checkout -b feature-name`
- Run the tests: `pytest tests/`
- Submit a pull request
## License

MIT License - see the LICENSE file.
## Changelog

- Initial release with YOLOv8 + MediaPipe
- Docker support
- CI/CD pipeline
- Test dataset generation
## Need Help?

- Check the Issues page
- Run the Jupyter notebook demo for examples
- Review the test files for usage patterns