valtrof

Pinned

  1. pipeline-doctor (Public)

    LLM-powered data quality system: detects anomalies (nulls, zeros, outliers) in BigQuery datasets using statistical analysis, then calls Anthropic Claude to generate natural-language diagnosis and e…

    Jupyter Notebook

  2. curriculum-engine (Public)

    Hybrid RAG pipeline: Claude Haiku generates a structured learning plan with search queries; live YouTube Data API + Serper.dev retrieve and validate every resource link. Prompt caching cuts input t…

    Python

  3. customer-event-pipeline (Public)

    3-stage Kubeflow Pipelines (KFP v2) ML pipeline on Vertex AI: BigQuery extraction → feature engineering → scikit-learn model training with full artifact lineage tracking. Components run in isolated…

    Python

  4. snowflake-pipeline (Public)

    End-to-end ELT pipeline: Python → Snowflake → dbt (3-layer modeling) → Apache Airflow orchestration. Incremental loading, data quality tests, GitHub Actions CI.

    Python

  5. dbt-data-platform (Public)

    Production-grade dbt data warehouse on BigQuery: 3-layer modeling (staging → intermediate → mart), fact table partitioned by date and clustered by borough, automated data quality tests, dbt docs wi…

    Python

  6. customer-event-flink (Public)

    PyFlink streaming pipeline: Kafka → 5-minute tumbling event-time windows → per-customer spend aggregation. Handles late-arriving events with watermarks and BoundedOutOfOrderness strategy.

    Python
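
The customer-event-flink description mentions 5-minute tumbling event-time windows, watermarks, and a bounded out-of-orderness strategy. As a rough illustration of that idea only, here is a plain-Python sketch (not PyFlink, and not the repository's code). It assigns each event to a window, keeps a watermark equal to the largest event time seen minus an allowed delay, emits a window once the watermark passes its end, and drops events that arrive after that. The 30-second bound, the `aggregate_spend` name, and the tuple layout of events are assumptions made for the example; Flink's exact firing rules differ in small details.

```python
from collections import defaultdict

WINDOW_MS = 5 * 60 * 1000        # 5-minute tumbling windows, as in the description
MAX_OUT_OF_ORDER_MS = 30 * 1000  # assumed bound; the repository's value is not stated


def aggregate_spend(events):
    """Sum spend per (customer_id, window_start_ms) over a stream in arrival order.

    events: iterable of (customer_id, event_time_ms, amount) tuples.
    Returns (results, dropped): results maps (customer_id, window_start_ms) to
    total spend; dropped lists events that arrived after their window closed.
    """
    open_windows = defaultdict(float)  # windows still accepting events
    results = {}                       # windows already emitted
    dropped = []                       # events that were too late
    watermark = float("-inf")          # no event seen yet
    max_event_time = float("-inf")

    for customer_id, event_time, amount in events:
        # Tumbling windows are aligned to multiples of the window size.
        window_start = event_time - (event_time % WINDOW_MS)

        # An event is late if the watermark already passed its window's end.
        if window_start + WINDOW_MS <= watermark:
            dropped.append((customer_id, event_time, amount))
            continue

        open_windows[(customer_id, window_start)] += amount

        # Bounded out-of-orderness: watermark trails the max event time.
        max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - MAX_OUT_OF_ORDER_MS

        # Emit every window whose end is at or before the watermark.
        closed = [k for k in open_windows if k[1] + WINDOW_MS <= watermark]
        for key in closed:
            results[key] = open_windows.pop(key)

    # End of stream: flush whatever is still open.
    results.update(open_windows)
    return results, dropped


if __name__ == "__main__":
    sample = [
        ("a", 0, 10.0),
        ("a", 60_000, 5.0),
        ("b", 310_000, 7.0),
        ("a", 290_000, 2.5),   # out of order, but within the assumed bound
        ("b", 340_000, 1.0),   # pushes the watermark past the first window
        ("a", 100_000, 99.0),  # arrives after its window closed
    ]
    totals, late = aggregate_spend(sample)
    print(totals)
    print(late)
```

In this sketch an out-of-order event is still counted as long as the watermark has not yet passed the end of its window; a larger bound tolerates more disorder at the cost of emitting results later.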