This repository was archived by the owner on Mar 7, 2026. It is now read-only.

Disane87/docudigger


docudigger - ARCHIVED

This project has been archived and is no longer maintained.

The successor is Scrape Dojo — a much more powerful, flexible, and feature-rich web scraping platform.


Why Scrape Dojo?

Scrape Dojo is the next evolution of docudigger. While docudigger was limited to Amazon invoice scraping, Scrape Dojo is a full-featured, self-hosted web scraping & browser automation platform.

Key Features of Scrape Dojo

  • Declarative JSON/JSONC Workflows — Define scrapes as code, no more writing Puppeteer scripts manually
  • 25+ Built-in Actions — Navigate, click, type, extract, loop, download, screenshot, and more
  • Universal Scraping — Not limited to Amazon; scrape any website with customizable workflows
  • Cron Scheduling & Webhooks — Automate scrapes with cron patterns, webhooks, or startup triggers
  • Handlebars + JSONata Templates — Dynamic templates and powerful data transformations
  • Encrypted Secrets — AES-256-CBC at-rest encryption for credentials
  • Real-time Monitoring — SSE-powered live execution tracking with a modern Angular UI
  • Authentication & SSO — JWT, OIDC/SSO, MFA/TOTP, API keys
  • Multi-Database Support — SQLite (default), MySQL, PostgreSQL
  • Docker-Ready — Easy deployment with Docker Compose
  • Modern Tech Stack — Built with NestJS, Angular, Puppeteer, TypeScript, and Nx

Get Started with Scrape Dojo

docker compose up -d

Full documentation: scrape-dojo.com


Original Project

docudigger was a document scraper for getting invoices automatically as PDF (useful for taxes or DMS). It supported Amazon invoice scraping via CLI or Docker.

Author

Marco Franke

License

MIT
