- Bellevue, WA, USA
- https://linkedin.com/in/valtrof
Pinned
- pipeline-doctor (Public): LLM-powered data quality system: detects anomalies (nulls, zeros, outliers) in BigQuery datasets using statistical analysis, then calls Anthropic Claude to generate natural-language diagnosis and e…
Jupyter Notebook
- curriculum-engine (Public): Hybrid RAG pipeline: Claude Haiku generates a structured learning plan with search queries; live YouTube Data API + Serper.dev retrieve and validate every resource link. Prompt caching cuts input t…
Python
- customer-event-pipeline (Public): 3-stage Kubeflow Pipelines (KFP v2) ML pipeline on Vertex AI: BigQuery extraction → feature engineering → scikit-learn model training with full artifact lineage tracking. Components run in isolated…
Python
- snowflake-pipeline (Public): End-to-end ELT pipeline: Python → Snowflake → dbt (3-layer modeling) → Apache Airflow orchestration. Incremental loading, data quality tests, GitHub Actions CI.
Python
- dbt-data-platform (Public): Production-grade dbt data warehouse on BigQuery: 3-layer modeling (staging → intermediate → mart), fact table partitioned by date and clustered by borough, automated data quality tests, dbt docs wi…
Python
- customer-event-flink (Public): PyFlink streaming pipeline: Kafka → 5-minute tumbling event-time windows → per-customer spend aggregation. Handles late-arriving events with watermarks and a BoundedOutOfOrderness strategy.
Python

