AlanBarber/bitcheck

BitCheck - Data Integrity Monitor


Monitor your data for silent corruption (bitrot) with automated file integrity checking.

BitCheck is a fast, cross-platform CLI tool that detects file corruption by tracking file hashes over time. Perfect for monitoring important documents, photos, backups, and archives for gradual data degradation.

Why BitCheck?

  • 🛡️ Detect corruption early - Find bitrot before it's too late
  • ⚡ Lightning fast - Processes thousands of files in seconds
  • 🎯 Simple to use - Just three commands: add, check, update
  • 🧠 Smart checking - Automatically distinguishes intentional edits from corruption
  • 🔒 Safe & reliable - Gracefully handles locked files and permission issues
  • 📁 Per-directory tracking - Each folder maintains its own database
  • 🌍 Cross-platform - Works on Windows, Linux, and macOS

Quick Start

1. Download

Get the latest release for your platform from the Releases page:

Platform               Download
Windows                bitcheck-win-x64.exe
Linux                  bitcheck-linux-x64
macOS (Intel)          bitcheck-osx-x64
macOS (Apple Silicon)  bitcheck-osx-arm64

2. Make Executable (Linux/macOS only)

chmod +x bitcheck-linux-x64  # or bitcheck-osx-x64 or bitcheck-osx-arm64
# The examples below call the tool as "bitcheck"; rename or symlink the downloaded binary to match

3. Start Monitoring Your Files

# Add all files in the current directory and its subdirectories to the database
bitcheck --add --recursive

# Check for corruption
bitcheck --check --recursive

That's it! BitCheck will create a .bitcheck.db file in each directory to track file integrity.

How It Works

BitCheck creates a .bitcheck.db file in each directory containing hash fingerprints of your files. When you run a check, it recomputes the hashes and compares them to detect any changes or corruption.
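
The hash-and-compare step can be sketched in a few lines of Python. This is an illustration only, not BitCheck's code: XXHash64 is not in the Python standard library, so a 64-bit BLAKE2b digest stands in for it, and the function names `file_hash` and `matches` are hypothetical.

```python
import hashlib


def file_hash(path: str, chunk_size: int = 1 << 20) -> str:
    """Return a 16-character hex fingerprint of the file at `path`."""
    # Stand-in hash: 64-bit BLAKE2b, NOT the XXHash64 that BitCheck uses.
    digest = hashlib.blake2b(digest_size=8)
    with open(path, "rb") as handle:
        # Read in chunks so large files are never loaded fully into memory.
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest().upper()


def matches(stored_hash: str, path: str) -> bool:
    """True when the file still hashes to the stored fingerprint."""
    return file_hash(path) == stored_hash
```

A check run amounts to calling something like `matches` for every tracked file and reporting the ones that return False.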

Smart Check Mode (Default)

BitCheck uses smart checking by default to distinguish between intentional file changes and corruption:

  • Intentional changes: If a file's hash changes AND its modification date changed, BitCheck treats it as an intentional edit and automatically updates the hash
  • Corruption detected: If a file's hash changes BUT its modification date is unchanged, BitCheck reports it as possible corruption (bitrot)

This makes BitCheck practical for real-world use where files are frequently edited, while still catching true corruption.

Use --strict mode if you want to report all hash mismatches as corruption, regardless of modification date.
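
The decision rule above can be written as a small Python function. It is a sketch of the rule as described here, not BitCheck's source: the name `classify` and its parameters are hypothetical, and it assumes the modification date seen at the last hash is available for comparison (this README does not show where that value is stored).

```python
from datetime import datetime, timezone


def classify(stored_hash, current_hash, stored_mtime, current_mtime, strict=False):
    """Return "OK", "UPDATED", or "MISMATCH" for one tracked file."""
    if current_hash == stored_hash:
        return "OK"
    # Hash differs. In smart mode a changed modification date means an edit.
    if not strict and current_mtime != stored_mtime:
        return "UPDATED"
    # Same modification date (or --strict): report possible corruption.
    return "MISMATCH"


# Example: same modification date, different hash -> possible bitrot.
when = datetime(2025, 11, 5, 12, 0, tzinfo=timezone.utc)
result = classify("AAAA", "BBBB", when, when)
```

With `strict=True` the modification date is ignored, so every hash difference is classified as a mismatch.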

Basic Commands

Command                 Purpose
bitcheck --add          Add new files to the database
bitcheck --check        Check files for corruption
bitcheck --update       Update hashes for intentionally modified files
bitcheck --add --check  Add new files AND check existing ones

Command Options

  • -a, --add - Add new files to the database
  • -c, --check - Check files against stored hashes (smart mode by default)
  • -u, --update - Update hashes for files that have changed
  • -r, --recursive - Process subdirectories
  • -v, --verbose - Show detailed output
  • -s, --strict - Strict mode: report all hash mismatches as corruption
  • --help - Show help information

Usage Examples

Monitor Your Files (First Time)

# Add all files in current directory
bitcheck --add

# Add all files recursively
bitcheck --add --recursive

Output (single directory):

BitCheck - Data Integrity Monitor
Mode: Add 
Recursive: False

[ADD] document.pdf
[ADD] photo.jpg
[ADD] data.xlsx

=== Summary ===
Files processed: 3
Files added: 3
Files skipped: 0
Time elapsed: 0.15s

Output (recursive with multiple directories):

BitCheck - Data Integrity Monitor
Mode: Add 
Recursive: True

Directory: /home/user/documents
[ADD] report.pdf
[ADD] notes.txt

Directory: /home/user/documents/photos
[ADD] vacation.jpg
[ADD] family.png

=== Summary ===
Files processed: 4
Files added: 4
Files skipped: 0
Time elapsed: 0.18s

Check for Corruption (Regular Use)

# Check all files in current directory
bitcheck --check

# Check all files recursively
bitcheck --check --recursive

Output (all OK):

BitCheck - Data Integrity Monitor
Mode: Check 
Recursive: False

=== Summary ===
Files processed: 3
Files checked: 3
Mismatches: 0
Files skipped: 0
Time elapsed: 0.12s

Output (intentional file change - smart mode):

BitCheck - Data Integrity Monitor
Mode: Check 
Recursive: False

[UPDATED] document.pdf - File was modified (2025-11-07 04:36:26 UTC)

=== Summary ===
Files processed: 3
Files checked: 3
Mismatches: 0
Files skipped: 0
Time elapsed: 0.12s

Output (recursive mode with changes in subdirectories):

BitCheck - Data Integrity Monitor
Mode: Check 
Recursive: True

Directory: /home/user/documents/reports
[UPDATED] quarterly.pdf - File was modified (2025-11-07 04:36:26 UTC)

Directory: /home/user/documents/archives
[UPDATED] backup.zip - File was modified (2025-11-07 04:38:15 UTC)

=== Summary ===
Files processed: 15
Files checked: 15
Mismatches: 0
Files skipped: 0
Time elapsed: 0.45s

Output (corruption detected - modification date unchanged):

BitCheck - Data Integrity Monitor
Mode: Check 
Recursive: False

[MISMATCH] data.xlsx
  Expected: A1B2C3D4E5F6G7H8
  Got:      X9Y8Z7W6V5U4T3S2
  File modification date unchanged: 2025-11-05 12:00:00 UTC
  Possible corruption detected!

=== Summary ===
Files processed: 3
Files checked: 3
Mismatches: 1
Files skipped: 0
Time elapsed: 0.12s

WARNING: 1 file(s) failed integrity check!

Strict Mode (Report All Changes as Corruption)

# Use strict mode to report all hash mismatches, even if file was modified
bitcheck --check --strict

# Useful for read-only media or when you want maximum sensitivity

Output:

BitCheck - Data Integrity Monitor
Mode: Check 
Recursive: False

[MISMATCH] document.pdf
  Expected: A1B2C3D4E5F6G7H8
  Got:      F1E2D3C4B5A69788
  Last successful check: 2025-11-07 04:36:31 UTC

=== Summary ===
Files processed: 3
Files checked: 3
Mismatches: 1
Files skipped: 0
Time elapsed: 0.13s

WARNING: 1 file(s) failed integrity check!

Manual Update (When Needed)

# Manually update hashes after checking
bitcheck --check --update

# Useful in strict mode or for batch updates

Add New Files

# Add new files without checking existing ones
bitcheck --add --verbose

Output:

BitCheck - Data Integrity Monitor
Mode: Add 
Recursive: False

Processing: C:\MyFolder
[ADD] newfile.txt
[SKIP] document.pdf - Already in database
[SKIP] photo.jpg - Already in database

=== Summary ===
Files processed: 3
Files added: 1
Files skipped: 2
Time elapsed: 0.08s

Maintenance Mode

# Add new files AND check existing ones
bitcheck --add --check --recursive

# Most comprehensive: add, check, and update
bitcheck --add --check --update --recursive

Best Practices

  1. Run checks regularly - Schedule weekly or monthly integrity checks
  2. Use --recursive - Process entire directory trees at once
  3. Keep databases with data - The .bitcheck.db files should stay with their folders
  4. Back up databases - Include .bitcheck.db in backups to preserve history
  5. Use --verbose for troubleshooting - See exactly what's being processed

What Gets Checked?

BitCheck automatically processes all regular files and skips:

  • Hidden files (files starting with . on Unix/Linux/macOS, or with Hidden attribute on Windows)
  • Database files (.bitcheck.db)
  • Inaccessible files (locked, permission denied, I/O errors)

Files that cannot be accessed are gracefully skipped and counted in the summary.
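
The skip rules listed above could look like this in Python. This is a sketch under assumptions: `should_skip` is a hypothetical name, and the real tool may apply its checks in a different order or with extra cases.

```python
import os
import stat

DB_NAME = ".bitcheck.db"  # database file name taken from this README


def should_skip(path: str) -> bool:
    """True when a file would be skipped under the rules listed above."""
    name = os.path.basename(path)
    # Database files and Unix-style dot-files are never processed.
    if name == DB_NAME or name.startswith("."):
        return True
    try:
        info = os.stat(path)
    except OSError:
        # Locked, permission denied, missing, or other I/O error.
        return True
    # st_file_attributes exists only on Windows; default to 0 elsewhere.
    attributes = getattr(info, "st_file_attributes", 0)
    return bool(attributes & stat.FILE_ATTRIBUTE_HIDDEN)
```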

Missing File Detection

BitCheck automatically detects files that are in the database but no longer exist:

  • Check mode (--check): Reports missing files with [MISSING] tag
  • Update mode (--update): Removes missing files from the database with [REMOVED] tag
  • Summary: Shows count of missing/removed files

This helps you identify deleted files and keep your database clean.
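
Conceptually, missing-file detection is a set difference between the names in the database and the files currently on disk. A minimal Python sketch, with `find_missing` as a hypothetical helper name:

```python
import os


def find_missing(tracked_names, directory):
    """Return tracked file names that no longer exist in `directory`."""
    # Collect the regular files actually present (subdirectories ignored).
    present = {entry.name for entry in os.scandir(directory) if entry.is_file()}
    return sorted(set(tracked_names) - present)
```

In check mode each returned name would be reported as [MISSING]; in update mode the matching database entries would be removed.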

Automation Examples

Windows Task Scheduler

# Check all files weekly (use this command as the action of a weekly scheduled task)
bitcheck.exe --check --recursive

Linux Cron

# Check all files daily at 2 AM
0 2 * * * cd /data && /usr/local/bin/bitcheck --check --recursive

Backup Verification Script

#!/bin/bash
# Verify backup integrity (use --strict since backups shouldn't change)
cd /backup/location || exit 1  # stop if the backup folder is unavailable
bitcheck --check --recursive --strict
if [ $? -ne 0 ]; then
    echo "Backup integrity check FAILED!" | mail -s "Backup Alert" [email protected]
fi

FAQ

Q: How often should I run checks?
A: Weekly or monthly checks are recommended for important data. For critical systems, run checks daily.

Q: What happens if corruption is detected?
A: BitCheck reports the corrupted files. You should restore them from backups immediately.

Q: How does smart check mode work?
A: By default, BitCheck distinguishes intentional file edits from corruption by checking the file's modification date. If the hash changes but the modification date also changed, it's treated as an intentional edit and auto-updated. If the hash changes but the modification date is unchanged, it's reported as possible corruption.

Q: When should I use strict mode?
A: Use --strict for read-only media (like archived backups or media libraries) where files should never change, or when you want maximum sensitivity to any changes.

Q: Can I use this for backups?
A: Yes! Run bitcheck --add --recursive after creating a backup, then check it regularly. Use --strict mode for backup verification since backup files shouldn't change.

Q: Does it modify my files?
A: No. BitCheck only reads your files to compute hashes and never modifies them. The only files it writes are its own .bitcheck.db databases.

Q: What's the performance impact?
A: Minimal. XXHash64 is roughly 10x faster than MD5 and roughly 20x faster than SHA-256, with very low memory usage. In practice, run time is limited mainly by disk read speed.

Q: What happens to deleted files?
A: During --check, deleted files are reported as [MISSING]. Use --update to remove them from the database.


For Developers

Technical Details

  • Hash Algorithm: XXHash64 (fast, non-cryptographic)
  • Database: JSON with in-memory Dictionary cache
  • Concurrency: Thread-safe with lock-based synchronization
  • Platform: Cross-platform (.NET 9.0)
  • Testing: 62+ unit tests with MSTest framework

Build from Source

Requires .NET 9.0 SDK:

# Clone repository
git clone https://github.com/alanbarber/bitcheck.git
cd bitcheck

# Build
dotnet build -c Release src/BitCheck.sln

# Run tests
dotnet test src/BitCheck.sln

# Publish self-contained executable
dotnet publish src/BitCheck/BitCheck.csproj -c Release -r win-x64 --self-contained

Test Coverage

The project includes 62+ comprehensive unit tests covering:

  • Database operations (CRUD, persistence, caching)
  • File hashing (XXHash64 consistency and accuracy)
  • Hidden file and directory filtering (cross-platform)
  • File access and error handling (locked files, permissions, I/O errors)
  • Missing file detection and removal
  • Data models and validation

Performance Characteristics

  • Hashing speed: XXHash64 is ~10x faster than MD5 and ~20x faster than SHA-256
  • Memory usage: Minimal per-file overhead
  • Lookup time: O(1) dictionary lookups
  • Startup time: Instant (lazy loading)
  • Disk-bound: Performance primarily limited by disk read speed, not CPU

Operation Logic

Add Mode (--add)

  • New files: Added to database with current hash
  • Existing files: Skipped (unless combined with other modes)

Check Mode (--check)

  • New files: Skipped (use --add to include them)
  • Existing files: Hash computed and compared
    • Match: Updates LastCheckDate (silent unless --verbose)
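    • Mismatch: In smart mode, auto-updates the hash when the modification date has changed; otherwise (or with --strict) reports an error with both hashes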

Update Mode (--update)

  • Standalone: Updates hash for any file that differs from database
  • With --check: Only updates after reporting mismatch
  • New files: Skipped (use --add to include them)
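
The way the three modes combine can be sketched as a single per-file function. This is an illustration of the rules listed above, not BitCheck's implementation: the name `process` is hypothetical, the database is modeled as a plain dict of file name to hash, and the smart-mode modification-date rule is left out for brevity.

```python
def process(db, name, current_hash, *, add=False, check=False, update=False):
    """Apply the mode rules to one file; returns tags and may mutate `db`."""
    if name not in db:
        # New file: only --add brings it into the database.
        if add:
            db[name] = current_hash
            return ["ADD"]
        return ["SKIP"]
    if not (check or update):
        # Add-only run: files already in the database are skipped.
        return ["SKIP"]
    if db[name] == current_hash:
        return ["OK"]
    tags = []
    if check:
        tags.append("MISMATCH")  # report first
    if update:
        db[name] = current_hash  # then store the new hash
        tags.append("UPDATED")
    return tags
```

For example, `process(db, "a.txt", h, check=True, update=True)` on a changed file returns both tags, which mirrors "only updates after reporting mismatch".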

Database Format

  • File: .bitcheck.db (JSON format, hidden on Unix-like systems)
  • Location: One per directory
  • Auto-flush: Changes saved automatically
  • Crash safety: Atomic writes with temp file + rename

Entry structure:

{
  "FileName": "document.pdf",
  "Hash": "A1B2C3D4E5F6G7H8",
  "HashDate": "2025-11-05T12:00:00Z",
  "LastCheckDate": "2025-11-05T12:30:00Z"
}
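
The "temp file + rename" pattern mentioned under crash safety can be sketched in Python as follows. This is an assumption-laden illustration rather than the tool's code: `save_db` is a hypothetical name, and storing the entries as a JSON list is a guess about the file layout, which this README does not spell out.

```python
import json
import os
import tempfile


def save_db(path: str, entries: list) -> None:
    """Write `entries` to `path` so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem)
    # for the final rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".bitcheck-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(entries, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # atomically swap in the new database
    except BaseException:
        # On any failure, remove the temp file and leave the old database intact.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies before `os.replace`, the previous .bitcheck.db is still complete; if it dies after, the new one is.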

Documentation

Additional documentation is available in the docs/ folder.

Contributing

Contributions are welcome! Please ensure:

  • All tests pass (dotnet test)
  • Code follows existing style
  • New features include tests
  • Documentation is updated

License

ISC License - see LICENSE file for details.