A privacy-focused, in-browser OCR tool for the Mon language (mnw), built with Rust and WebAssembly.
MonOCR Web brings optical character recognition for the Mon script directly to the browser. By leveraging ONNX Runtime Web and a custom Wasm backend, it performs all processing locally on the user's device. Recognition therefore involves no network round trip, and images are not sent to a server unless the user opts in to dataset contribution.
- Local Processing: Runs entirely in the browser using WebAssembly.
- Privacy First: No data collection by default; all OCR processing is 100% local.
- Optional Cloud Sync: Secure, opt-in synchronization for users who wish to contribute their corrected scans to the Mon language dataset.
- High Performance: Optimized MobileNetV3 + BiLSTM OCR engine via ONNX Runtime (~6.6M parameters).
- Large File Support: Supports PDFs and images up to 50MB.
- Mon Language Support: Specialized for recognizing Mon script.
- Premium UX: High-fidelity skeleton loaders and synchronized design system (16px radii, 24px spacing).
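The Large File Support item above implies a size and type check before a file reaches the OCR engine. A minimal sketch of such a check follows. The names `validateUpload`, `MAX_UPLOAD_BYTES`, and `ACCEPTED_TYPES` are hypothetical, the list of accepted MIME types is an assumption, and "50MB" is read here as 50 MiB; the project's real values may differ.

```typescript
// Hypothetical constants: the README only states "PDFs and images up to 50MB".
const MAX_UPLOAD_BYTES = 50 * 1024 * 1024; // assumes 50MB means 50 MiB

// Assumed set of accepted MIME types; the real list may differ.
const ACCEPTED_TYPES: string[] = [
  "application/pdf",
  "image/png",
  "image/jpeg",
  "image/webp",
];

// Structural subset of the browser File object, so the check is easy to test.
interface UploadCandidate {
  name: string;
  size: number; // size in bytes
  type: string; // MIME type as reported by the browser
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

// Rejects unsupported types first, then files above the size limit.
function validateUpload(file: UploadCandidate): ValidationResult {
  if (ACCEPTED_TYPES.indexOf(file.type) === -1) {
    return { ok: false, reason: `Unsupported file type: ${file.type || "unknown"}` };
  }
  if (file.size > MAX_UPLOAD_BYTES) {
    return { ok: false, reason: `File exceeds the ${MAX_UPLOAD_BYTES} byte limit` };
  }
  return { ok: true };
}
```

Because a browser `File` already has `name`, `size`, and `type`, it can be passed to `validateUpload` directly. Running the check before loading the model avoids spending memory on a file that would be rejected anyway.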
MonOCR is a cross-platform ecosystem designed for parity and performance:
- MonOCR Web: (This Repository) Privacy-first in-browser OCR.
- MonOCR Android: Native Jetpack Compose app with Material 3.
- MonOCR iOS: Native SwiftUI app with SwiftData persistence.
This project is certified Production Ready and strictly adheres to:
- Product Quality Constitution: Compact, Calm, Modern.
- Privacy-First Engineering: 100% on-device processing.
- Design System Convergence: Identical corner radii, spacing, and typography across all screens.
- Real-World Feedback: Integrated unified feedback bridges for model improvement.
- Compliance: Designed for GDPR and CCPA alignment with transparent opt-in data contribution.
- Hugging Face Models (CKPT, ONNX, ML, RTen)
Install dependencies:

    pnpm install

To run locally, we need to copy the pre-built ONNX Runtime WASM files from node_modules to static/wasm/:

    pnpm run copy-wasm

Then start the development server:

    pnpm dev
    # Note: This automatically runs copy-wasm before starting

To create a production build (static site):

    pnpm build

Note: The build script automatically removes the large monocr.onnx model from the output to comply with Cloudflare's 25MB asset limit. In production, the model is fetched directly from Hugging Face.
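Since the model is removed from the production output and fetched from Hugging Face instead, the app needs to pick a model URL per environment. The following is only a sketch of that idea, not the contents of src/lib/config.ts. `ModelConfig`, `EXAMPLE_CONFIG`, `resolveModelUrl`, and the local path are hypothetical, and the `<org>/<repo>` segments of the remote URL are placeholders.

```typescript
// All names and URLs below are placeholders, not the project's actual configuration.
interface ModelConfig {
  localPath: string; // where a development server would serve the model from (assumed)
  remoteUrl: string; // where a production build would fetch the model from
}

const EXAMPLE_CONFIG: ModelConfig = {
  localPath: "/models/monocr.onnx",
  // Hugging Face serves repository files under /resolve/<revision>/<file>;
  // <org> and <repo> are intentionally left as placeholders.
  remoteUrl: "https://huggingface.co/<org>/<repo>/resolve/main/monocr.onnx",
};

// Returns the remote URL in production and the local path otherwise.
function resolveModelUrl(
  isProduction: boolean,
  config: ModelConfig = EXAMPLE_CONFIG,
): string {
  return isProduction ? config.remoteUrl : config.localPath;
}
```

Keeping the choice in one function means the deployed bundle never needs the model file itself, which is what keeps it within the asset limit.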
This project is optimized for Cloudflare Pages.
- Build Command: pnpm build
- Output Directory: build
- WASM Assets: Included automatically via static/wasm/ (ensure these are committed to git).
- Model: Served from Hugging Face (configured in src/lib/config.ts).
If you have wrangler installed/configured:

    npx wrangler deploy

This uses the wrangler.json configuration to deploy the build folder.
To enable the dataset contribution feature, you must configure the following Cloudflare Environment Variables:
- R2_ACCESS_KEY_ID: Cloudflare R2 Access Key.
- R2_SECRET_ACCESS_KEY: Cloudflare R2 Secret Key.
- R2_ACCOUNT_ID: Your Cloudflare Account ID.
- R2_BUCKET_NAME: The name of your R2 bucket (default: monocr-dataset).
These can be set in the Cloudflare Pages Dashboard under Settings > Functions > Variables.
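A server-side function that uses these variables could validate them once at startup, so that a missing value fails early rather than partway through an upload. The sketch below shows one way to do that. `readR2Settings` and `R2Settings` are hypothetical names, and treating the three credentials as required while R2_BUCKET_NAME falls back to its documented default is an assumption drawn from the list above. The endpoint follows the `https://<account-id>.r2.cloudflarestorage.com` form that R2 uses for its S3-compatible API.

```typescript
// Variables assumed to be mandatory; R2_BUCKET_NAME has a documented default.
const REQUIRED_R2_VARS = [
  "R2_ACCESS_KEY_ID",
  "R2_SECRET_ACCESS_KEY",
  "R2_ACCOUNT_ID",
] as const;

// Hypothetical settings shape for illustration only.
interface R2Settings {
  accessKeyId: string;
  secretAccessKey: string;
  accountId: string;
  bucketName: string;
  endpoint: string;
}

// Reads R2 settings from an environment-like object and throws if any
// required variable is missing or empty.
function readR2Settings(env: Record<string, string | undefined>): R2Settings {
  const missing = REQUIRED_R2_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const accountId = env["R2_ACCOUNT_ID"] as string;
  return {
    accessKeyId: env["R2_ACCESS_KEY_ID"] as string,
    secretAccessKey: env["R2_SECRET_ACCESS_KEY"] as string,
    accountId,
    bucketName: env["R2_BUCKET_NAME"] || "monocr-dataset",
    endpoint: `https://${accountId}.r2.cloudflarestorage.com`,
  };
}
```

Passing the environment in as a plain object, rather than reading a global, keeps the function usable both in a Pages Function handler and in unit tests.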
MIT