ShamsRupak/README.md

[Stat badges: Records/sec (Event Streaming) · Bytecode Opcodes (Custom Compiler) · Tests Across All Projects · LRU Eviction (Cache Server) · Agent Orchestration Platform · Students Mentored in C++ Systems]

🧱 I don't use frameworks to learn — I build from scratch.

Compiler → lexer, Pratt parser, type inference, 28-opcode bytecode compiler, stack VM with mark-sweep GC (C++20)

Streaming Engine → commit log, TCP broker, wire protocol, producer/consumer SDKs, LZ4 compression (Rust)

Transformer → GPT architecture from scratch — RoPE, RMSNorm, BPE tokenizer, MQA ablation (PyTorch)

Cache Server → O(1) LRU eviction, TTL, sharded thread-safe storage, benchmarked latency (C++20)


 About Me

#include <string>
#include <vector>
using std::string;
using std::vector;

class ShamsRupak {
public:
    string role      = "Software Engineer | AI/ML Engineer";
    string location  = "New York, NY";
    string education = "M.S. Engineering AI @ Stony Brook";
    string current   = "Teaching Assistant — C++ OOP";
    
    vector<string> languages = {
        "C++20", "Rust", "Python", "Java",
        "SQL", "JavaScript", "C"
    };
    
    vector<string> systems = {
        "Compilers", "Event Streaming",
        "TCP Servers", "Bytecode VMs",
        "Concurrency", "Memory Mgmt (RAII)",
        "Mark-Sweep GC", "Wire Protocols"
    };
    
    vector<string> backend = {
        "FastAPI", "REST APIs", "JWT/OAuth2",
        "Redis", "PostgreSQL", "Docker",
        "Prometheus", "WebSockets", "CI/CD"
    };
    
    vector<string> ml_ai = {
        "PyTorch", "Transformers", "RAG",
        "LoRA Fine-Tuning", "LLM Agents",
        "Drift Detection", "NLP", "OCR"
    };

    string status() const {
        return "Open to SWE / ML roles — 2026 🚀";
    }
};

 Tech Stack

 ⚡ Languages

 🔧 Systems & Backend

 🧪 Testing & Quality

 🧠 AI / ML / Data Science

 🌐 Web & Tools

 Featured Projects

Kafka-inspired event streaming engine built from scratch in Rust. Commit log, TCP broker, custom wire protocol, producer/consumer SDKs, LZ4 compression, consumer groups. 304K records/sec throughput.

Programming language built from scratch in C++20. Lexer, Pratt parser, type inference, 28-opcode bytecode compiler, stack-based VM with mark-sweep garbage collector. 6 example programs.
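
To make the "bytecode compiler plus stack-based VM" idea concrete, here is a minimal sketch of a stack interpreter loop. The three-opcode set (`Push`, `Add`, `Mul`), the `Instr` struct, and the `run` function are hypothetical names chosen for illustration; they are not taken from the project's actual 28-opcode instruction set, and the sketch leaves out the garbage collector entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical opcode set, for illustration only.
enum class Op : std::uint8_t { Push, Add, Mul };

struct Instr {
    Op op;
    std::int64_t operand = 0;  // only meaningful for Push
};

// Runs the program and returns the value left on top of the stack.
inline std::int64_t run(const std::vector<Instr>& program) {
    std::vector<std::int64_t> stack;

    // Pops one value, rejecting malformed programs instead of reading garbage.
    auto pop = [&stack]() {
        if (stack.empty()) throw std::runtime_error("stack underflow");
        std::int64_t v = stack.back();
        stack.pop_back();
        return v;
    };

    for (const Instr& ins : program) {
        switch (ins.op) {
            case Op::Push:
                stack.push_back(ins.operand);
                break;
            case Op::Add: {
                std::int64_t b = pop();
                std::int64_t a = pop();
                stack.push_back(a + b);
                break;
            }
            case Op::Mul: {
                std::int64_t b = pop();
                std::int64_t a = pop();
                stack.push_back(a * b);
                break;
            }
        }
    }
    if (stack.empty()) throw std::runtime_error("program left no result");
    return stack.back();
}
```

Under these assumptions, the program `Push 2, Push 3, Add, Push 4, Mul` evaluates `(2 + 3) * 4` and leaves 20 on the stack.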

Enterprise AI agent orchestration with 6 modules (core/connect/train/eval/observe/api). LoRA fine-tuning for Qwen-2.5, structured evaluation pipelines. 207 tests, 34 commits.

Real-time ML monitoring with PSI drift detection, Prometheus metrics, WebSocket live updates. FastAPI + React.
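
For readers unfamiliar with PSI (Population Stability Index), the sketch below shows the usual formula over pre-binned data: the sum over bins of `(actual - expected) * ln(actual / expected)`, where both terms are bin proportions. The function name `psi`, the `eps` clamp, and the choice to take raw bin counts as input are assumptions made for this example; the project itself is described as FastAPI-based, so this C++ version only illustrates the arithmetic and is not the project's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Population Stability Index over pre-binned counts.
// expected: reference bin counts (for example, training data).
// actual:   live bin counts over the same bin edges.
inline double psi(const std::vector<double>& expected,
                  const std::vector<double>& actual,
                  double eps = 1e-6) {
    if (expected.empty() || expected.size() != actual.size())
        throw std::invalid_argument("bins must be non-empty and equal length");

    double e_total = 0.0, a_total = 0.0;
    for (std::size_t i = 0; i < expected.size(); ++i) {
        e_total += expected[i];
        a_total += actual[i];
    }
    if (e_total <= 0.0 || a_total <= 0.0)
        throw std::invalid_argument("bin counts must sum to a positive value");

    double score = 0.0;
    for (std::size_t i = 0; i < expected.size(); ++i) {
        // Clamp proportions away from zero so the logarithm stays finite.
        double e = std::max(expected[i] / e_total, eps);
        double a = std::max(actual[i] / a_total, eps);
        score += (a - e) * std::log(a / e);
    }
    return score;
}
```

Identical distributions score 0, and the score grows as the live distribution moves away from the reference. For instance, a reference split of 50/50 against a live split of 80/20 gives roughly 0.416.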

Redis-inspired TCP cache server with O(1) LRU eviction, TTL expiration, sharded thread-safe storage. Benchmarked p50/p95/p99 latency. Cross-platform CI with sanitizer builds.
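
The O(1) LRU eviction mentioned above is commonly built from a doubly linked list (recency order) plus a hash map (key to list node). The sketch below shows that pattern under stated assumptions: the `LruCache` class and its `get`/`put`/`size` members are hypothetical names, keys and values are plain strings, and it is single-threaded with no TTL or sharding, so it covers only the eviction part of what the project describes. The O(1) bound is average-case, since it relies on hash-map lookups.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Returns the value and marks the key as most recently used.
    std::optional<std::string> get(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return std::nullopt;
        // splice relinks the node in O(1) and keeps iterators valid.
        order_.splice(order_.begin(), order_, it->second);
        return it->second->second;
    }

    // Inserts or updates a key, evicting the least recently used entry if full.
    void put(const std::string& key, std::string value) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->second = std::move(value);
            order_.splice(order_.begin(), order_, it->second);
            return;
        }
        if (order_.size() == capacity_) {
            index_.erase(order_.back().first);  // back = least recently used
            order_.pop_back();
        }
        order_.emplace_front(key, std::move(value));
        index_[key] = order_.begin();
    }

    std::size_t size() const { return order_.size(); }

private:
    using Entry = std::pair<std::string, std::string>;
    std::size_t capacity_;
    std::list<Entry> order_;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

With a capacity of 2, inserting `a` and `b`, reading `a`, and then inserting `c` evicts `b`, because the read moved `a` to the front of the recency list.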

GPT transformer built from scratch in PyTorch. RoPE positional encoding, RMSNorm, BPE tokenizer, multi-query attention ablation study. 112 tests.

Autonomous research agent with DAG-based hierarchical planner, ReAct tool-use loop, evidence critic scoring, provider-agnostic LLM failover. 59 tests.

REST API with JWT auth, RBAC, PostgreSQL data modeling & migrations, Redis caching & rate limiting, health/metrics endpoints. Containerized with Docker Compose + automated CI.

End-to-end pipeline processing 1,000+ financial docs with intelligent classification, OCR extraction, and embedding-based semantic retrieval. 60% throughput increase.

Full ML pipeline: EDA → feature engineering → model training. Accurate temperature trend predictions for NYC metro.

 Experience

From retail floor to building compilers, streaming engines, and AI platforms.

📅 Full Timeline
═══════════════════════════════════════════════════════════════════════════════
 Jan 2026 – Now      🎓  Teaching Assistant — C++ OOP         @ Stony Brook
 Aug 2025 – Oct 2025 🍎  Sales Specialist                     @ Apple
 May 2025 – Jul 2025 🤖  AI Engineering & Automation Extern   @ Outamation
 Jul 2024 – May 2025 📱  Retail Mobile Expert                 @ T-Mobile
 Sep 2024 – Nov 2024 🔐  Web3 Security Data Analytics Extern  @ Webacy
═══════════════════════════════════════════════════════════════════════════════
🎓 Teaching Assistant — C++ OOP  |  Stony Brook University  Jan 2026 – Present

Mentoring 30+ students in systems-level C++ — covering OOP design patterns, STL containers, dynamic memory management, RAII, and pointer safety. Debugging segfaults, memory leaks, and logic errors. Conducting code reviews emphasizing correctness, modularity, and performance.

🍎 Sales Specialist  |  Apple  Aug 2025 – Oct 2025

Delivered the Apple retail experience — product consultation, hands-on demos, and technical guidance across the full Apple ecosystem. Consistent top performer in customer satisfaction and sales metrics.

🤖 AI Engineering & Automation Extern  |  Outamation  May 2025 – Jul 2025

  • Designed and deployed a modular AI document processing pipeline (Python, PyMuPDF, Tesseract OCR, NLP, LLM classification) across 1,000+ financial documents — cutting manual processing time by 60%
  • Built a RAG-powered retrieval system with LlamaIndex, contextual chunking, and embedding-based semantic search
  • Benchmarked transformer models across latency, context window, and precision trade-offs for production deployment

📱 Retail Mobile Expert  |  T-Mobile  Jul 2024 – May 2025

Consultative sales across T-Mobile's full product ecosystem — devices, plans, home internet, and accessories. Diagnosed technical issues, performed device troubleshooting, and delivered personalized solutions in a high-volume retail environment.

🔐 Web3 Security Data Analytics Extern  |  Webacy  Sep 2024 – Nov 2024

Applied unsupervised ML & clustering to detect anomalous blockchain transactions and smart contract vulnerabilities, achieving 95%+ accuracy in labeled dataset reliability for risk categorization.

 Education

Stony Brook University — M.S. Engineering Artificial Intelligence (Aug 2025 – Dec 2026)
DSA · OOP (C++) · Software Engineering · Machine Learning · Deep Learning · AI for Robotics
Stony Brook University — B.S. Applied Mathematics & Statistics (Aug 2021 – May 2025)
Applied Mathematics · Probability Theory · Linear Algebra · Statistics · OOP

 What I Build


Systems from Scratch
Compilers · Streaming Engines
VMs · Garbage Collectors

Production Backend
FastAPI · Docker · PostgreSQL
Redis · Prometheus · CI/CD

AI/ML Infrastructure
Agent Orchestration · LoRA
Drift Detection · MLOps

 Let's Connect

 ╔══════════════════════════════════════════════════════════════════╗
 ║                                                                  ║
 ║   🚀  Open to Software Engineering & ML Engineering roles        ║
 ║   📍  Based in New York — open to relocation & remote            ║
 ║   🔧  C++ · Rust · Python · Systems · ML · Databases             ║
 ║                                                                  ║
 ╚══════════════════════════════════════════════════════════════════╝

     



@ShamsRupak — shipping code, shipping systems, shipping results.

Pinned

  1. pulseapi (Public)

    FastAPI backend with JWT authentication, PostgreSQL, Redis caching, and rate limiting.

    Python 1

  2. insurance-risk-and-claims-modeling (Public)

    End-to-end data science project focused on insurance risk, claim frequency, and severity modeling using real-world datasets, with emphasis on statistical rigor, interpretability, and business impact.

    Jupyter Notebook 1

  3. cachecraft (Public)

    A Redis-inspired concurrent in-memory cache server built in C++20 with thread-safe storage, O(1) LRU eviction, and TTL expiration over TCP.

    C++ 1

  4. student-buddy-extension (Public)

    AI-powered learning companion Chrome extension that provides hints and step-by-step guidance

    JavaScript 1

  5. ai-doc-processing-suite (Public)

    End-to-end ML pipeline for financial document classification (100% accuracy), hybrid FAISS+BM25 retrieval (MRR 0.772), and SQL analytics tracking. Processes 1,200+ documents with TF-IDF classificat…

    Python 1 1

  6. shams-rupak-ai-engineer (Public)

    Personal portfolio — React, TypeScript, Tailwind CSS

    TypeScript 1