lemonsaurus/blackbox

blackbox

A simple service which magically backs up all your databases to all your favorite cloud storage providers, and then notifies you.

Simply create a config file, fill in some connection strings for your favorite services, and schedule blackbox to run however often you want using something like cron, or a Kubernetes CronJob.

Setup

This service can be set up as a cron job (on UNIX systems), as a Kubernetes CronJob, or scheduled in your favorite alternative scheduler.

Quick start

Requires Python 3.9 or newer

# Install the CLI tool
pip install blackbox-cli

# Create a configuration file
blackbox --init

# Run blackbox with a specific config file
blackbox --config=/path/to/blackbox.yaml

To run Blackbox manually in the Poetry environment, run:

poetry run python -m blackbox

Setting up as a cron job

All you need to do to set it up as a cron job is clone this repo, create a config file (see below), and trigger blackbox to run automatically however often you want.

crontab -e

# Run the backup every hour
0 */1 * * * blackbox --config path/to/blackbox.yml

Setting it up as a Kubernetes CronJob

To set this up as a Kubernetes CronJob, you'll want a ConfigMap manifest, a CronJob manifest, and a secret.

Before we start, you'll probably want to create a secret named blackbox-secrets that exposes things like database passwords, storage credentials, and webhooks as environment variables. We'll be interpolating those into the config file.

Next, we'll need a ConfigMap for the blackbox.yaml config file. See the Configuration section below for more information on what to put inside this file.

# blackbox-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: blackbox-config
data:
  blackbox.yaml: |
    databases:
      mongodb:
        main_mongodb:
          connection_string: mongodb://{{ MONGO_INITDB_ROOT_USERNAME }}:{{ MONGO_INITDB_ROOT_PASSWORD }}@mongodb.default.svc.cluster.local:27017

    storage:
      s3:
        main_s3:
          bucket: blackbox
          endpoint: my.s3.com
          aws_access_key_id: {{ AWS_ACCESS_KEY_ID }}
          aws_secret_access_key: {{ AWS_SECRET_ACCESS_KEY }}

    notifiers:
      discord:
        main_discord:
          webhook: {{ DISCORD_WEBHOOK }}

    retention_days: 7

Finally, we need the CronJob itself. This one is configured to run once a day, at midnight.

# cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: blackbox
spec:
  schedule: "@daily"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: blackbox
              image: lemonsaurus/blackbox
              imagePullPolicy: Always
              envFrom:
                - secretRef:
                    name: blackbox-secrets
              # Tell blackbox where to find the config file.
              env:
              - name: BLACKBOX_CONFIG_PATH
                value: "/blackbox/config_file/blackbox.yaml"
              volumeMounts:
                # Take care not to mount this in the root folder!
                # That will replace everything in the root folder with
                # the contents of this volume, which sucks.
                - mountPath: /blackbox/config_file
                  name: blackbox-config
          volumes:
            - name: blackbox-config
              configMap:
                name: blackbox-config
          restartPolicy: OnFailure
      backoffLimit: 3

Configuration

blackbox configuration is easy. You simply create a YAML file, blackbox.yaml, which contains something like the example below. See the later sections for specific configuration information for each handler.

databases:
  postgres: # Database type 
    main_postgres: # Database identifier
      # Configuration (see below for further information on specific databases)
      username: username
      password: password
      host: host
      port: port

      # Optionally specify storage and notifiers to use
      # You can specify them by type or identifier
      # Use a string for a single specifier, a list for multiple specifiers
      storage_providers:
        - s3
        - secondary_dropbox
      notifiers: slack

  redis:
    main_redis:
      password: password
      host: host
      port: port
      # No specified storage and notifiers, so all storage and notifiers are used

storage:
  s3: # Storage type
    main_s3: # Storage identifier
      bucket: bucket
      endpoint: s3.endpoint.com
    secondary_s3:
      bucket: bucket
      endpoint: s3.another_endpoint.com
  dropbox:
    main_dropbox:
      access_token: XXXXXXXXXXX
    secondary_dropbox:
      access_token: XXXXXXXXXXX

notifiers:
  discord: # Notifier type
    main_discord: # Notifier identifier
      webhook: https://discord.com/api/webhooks/797541821394714674/lzRM9DFggtfHZXGJTz3yE-MrYJ-4O-0AbdQg3uV2x4vFbu7HTHY2Njq8cx8oyMg0T3Wk
  slack:
    main_slack:
      webhook: https://hooks.slack.com/services/XXXXXXXXXXX/XXXXXXXXXXX/XXXXXXXXXXXXXXXXXXX

retention_days: 7

# Optional filename format configuration
filename_format: "{database_id}_blackbox_{date}"  # Default format
date_format: "%d_%m_%Y"  # Default: DD_MM_YYYY

Filename Format Configuration

You can customize the format of backup filenames using the filename_format and date_format configuration options:

  • filename_format: Template for the backup filename. Use {database_id} and {date} as placeholders.
  • date_format: Python strftime format for the date portion of the filename.

Examples

# Default format: mydb_blackbox_25_12_2024.sql
filename_format: "{database_id}_blackbox_{date}"
date_format: "%d_%m_%Y"

# Custom format: backup_mydb_2024-12-25.sql  
filename_format: "backup_{database_id}_{date}"
date_format: "%Y-%m-%d"

# Enterprise format: mydb_backup_20241225_daily.sql
filename_format: "{database_id}_backup_{date}_daily"
date_format: "%Y%m%d"

Note: The rotation system supports both legacy and custom filename formats to ensure existing backups continue to be managed properly during the transition.
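
As a rough illustration of how the two options combine, here is a minimal Python sketch. The function name build_backup_name and its today parameter are hypothetical and not part of blackbox; the sketch only assumes that {database_id} and {date} are substituted as described above and that the date is rendered with strftime.

```python
from datetime import date


def build_backup_name(database_id,
                      filename_format="{database_id}_blackbox_{date}",
                      date_format="%d_%m_%Y",
                      today=None):
    """Render a backup filename (without extension) from the two options.

    Hypothetical helper for illustration; blackbox's own code may differ.
    """
    today = today or date.today()
    # strftime renders the date portion, str.format fills the placeholders.
    return filename_format.format(
        database_id=database_id,
        date=today.strftime(date_format),
    )


print(build_backup_name("mydb", today=date(2024, 12, 25)))
# prints: mydb_blackbox_25_12_2024
```

The file extension (for example .sql) depends on the database handler and is left out of the sketch.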

Blackbox will look for blackbox.yaml in the root folder by default. However, you can provide an alternative config file path by creating an environment variable called BLACKBOX_CONFIG_PATH and setting it to the absolute path of the file.

export BLACKBOX_CONFIG_PATH=/var/my/favorite/fruit/blackbox.yaml

You can also specify the location of this file when using the blackbox cli command.

blackbox --config=/path/to/blackbox.yaml

Environment Variables

The blackbox.yaml will ✨ magically interpolate ✨ any environment variables that exist in the environment where blackbox is running. This is very useful if you want to keep your secrets in environment variables, instead of keeping them in the config file in plaintext.

Example

Imagine your current config looks like this, but you want to move the username and password into environment variables.

databases:
  postgres:
    main_postgres:
      username: lemonsaurus
      password: security-is-overrated
      host: localhost
      port: 5432

So we'll create two environment variables like these:

export POSTGRES_USERNAME=lemonsaurus
export POSTGRES_PASSWORD=security-is-overrated

And now we can make use of these environment variables by using double curly brackets, like this:

databases:
  postgres:
    main_postgres:
      username: {{ POSTGRES_USERNAME }}
      password: {{ POSTGRES_PASSWORD }}
      host: localhost
      port: 5432
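
To make the substitution concrete, here is a small Python sketch of the idea using only the standard library. It is not blackbox's actual implementation; the function name interpolate, the regular expression, and the choice to raise an error for unset variables are all assumptions made for illustration.

```python
import os
import re

# Matches {{ NAME }} with optional whitespace inside the brackets.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def interpolate(text, env=None):
    """Replace {{ NAME }} placeholders with values from the environment.

    Hypothetical helper; raising on a missing variable is an assumption.
    """
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name!r} is not set")
        return env[name]

    return PLACEHOLDER.sub(replace, text)


config_text = "username: {{ POSTGRES_USERNAME }}"
print(interpolate(config_text, {"POSTGRES_USERNAME": "lemonsaurus"}))
# prints: username: lemonsaurus
```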

Databases

Right now, this app supports MongoDB, PostgreSQL 7.0 or higher, MariaDB, MySQL, Redis and local storage archiving. If you need support for an additional database, consider opening a pull request to add a new database handler.

To configure databases, add a section with this format:

databases:
  database_type:
    # More than one of each database type can be configured
    identifier_1:
      field: value
    identifier_2:
      field: value
  database_type:
    ...

See below for the specific database types available and fields required. Identifiers can be any string of your choosing.

MongoDB

  • Database Type: mongodb
  • Required fields: connection_string
  • The connection_string field is in the format mongodb://username:password@host:port
  • To restore from the backup, use mongorestore --gzip --archive=/path/to/backup.archive
  mongodb:
    main_mongo:
      connection_string: "mongodb://blackbox:blackbox@mongo:27017"

PostgreSQL

  • Database Type: postgres
  • Required fields: username, password, host, port
  • To restore from the backup, use psql -f /path/to/backup.sql
  postgres:
    main_postgres:
      username: blackbox
      password: blackbox
      host: postgres
      port: "5432"

MariaDB

  • Database Type: mariadb
  • Required fields: username, password, host, port
  • To restore from the backup, use mysql -u <user> -p < db_backup.sql
  mariadb:
    main_mariadb:
      username: root
      password: example
      host: maria
      port: "3306"

MySQL

  • Database Type: mysql
  • Required fields: username, password, host, port
  • To restore from the backup, use mysql -u <user> -p < db_backup.sql
  mysql:
    main_mysql:
      username: root
      password: example
      host: mysql
      port: "3306"

Redis

  • Database Type: redis
  • Required fields: password, host, port
  redis:
    main_redis:
      password: blackbox
      host: redis
      port: "6379"

Local storage

  • Database type: localstorage
  • Required field: path
  • Optional field: compression_level
  • The compression level must be an integer between 0 and 9.
  • The archive will contain the full structure, starting from the root folder.
  localstorage:
    main_localstorage:
      path: /path/to/folder
      compression_level: 7

To restore from a Redis backup

  • Stop the Redis server.
  • Turn off appendonly mode in the Redis configuration (set it to no).
  • Copy the backup file to the Redis working directory (dir in the configuration), using the file name defined by the dbfilename configuration key.
  • Set the backup file's permissions:
sudo chown redis:redis <path-to-redis-dump-file>
sudo chmod 660 <path-to-redis-dump-file>
  • Start Redis server.

If you want to re-enable appendonly:

  • Login with redis-cli.
  • Run BGREWRITEAOF.
  • Exit from Redis CLI (with exit).
  • Stop Redis server.
  • Set appendonly to yes in Redis configuration.
  • Start Redis server.

Specify Storage providers and Notifiers for each Database

To choose specific storage providers or notifiers for a database, add the fields storage_providers and notifiers under that database entry. Each field can be a list or a string.

databases:
  postgres: # Database type 
    main_postgres: # Database identifier
      username: username
      password: password
      host: host
      port: port

      storage_providers:
        - s3
        - secondary_dropbox
      notifiers: slack

The above example will back up main_postgres to every s3 storage provider configured, as well as to the storage provider with the identifier secondary_dropbox. Then, only the slack notifier gets notified.

These fields are optional. If not given, all storage providers and all notifiers will be used.
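
The matching rule (by type or by identifier, with everything selected when the field is absent) can be sketched in Python as follows. The function resolve_targets and the shape of the configured mapping are hypothetical and only illustrate the rule described above.

```python
def resolve_targets(specifiers, configured):
    """Return the identifiers selected by a storage_providers/notifiers value.

    specifiers: None (field absent), a single string, or a list of strings.
    configured: mapping of type name -> list of identifiers of that type.
    Hypothetical helper for illustration only.
    """
    if specifiers is None:
        # Field not given: every configured handler is used.
        return [i for ids in configured.values() for i in ids]
    if isinstance(specifiers, str):
        specifiers = [specifiers]
    selected = []
    for type_name, identifiers in configured.items():
        for identifier in identifiers:
            # A specifier may name a whole type or a single identifier.
            if type_name in specifiers or identifier in specifiers:
                selected.append(identifier)
    return selected


storage = {
    "s3": ["main_s3", "secondary_s3"],
    "dropbox": ["main_dropbox", "secondary_dropbox"],
}
print(resolve_targets(["s3", "secondary_dropbox"], storage))
# prints: ['main_s3', 'secondary_s3', 'secondary_dropbox']
```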

Storage providers

Blackbox can work with different storage providers to save your logs and backups - usually so that you can automatically store them in the cloud. Right now we support S3, Dropbox, and Google Drive.

To configure storage providers, add a section with this format:

storage:
  storage_type:
    # More than one of each storage provider type can be configured
    identifier_1:
      field: value
    identifier_2:
      field: value
  storage_type:
    ...

S3

We support any S3 object storage bucket, whether it's from AWS, Linode, DigitalOcean, Scaleway, or another S3-compatible object storage provider.

Blackbox will respect the retention_days configuration setting and delete older files from the S3 storage. Please note that if you have a bucket expiration policy on your storage, blackbox will not do anything to disable it. So, for example, if your bucket expiration policy is 12 hours and blackbox is set to 7 retention_days, then your backups are all gonna be deleted after 12 hours unless you disable your policy.

S3 configuration

  • Storage Type: s3
  • Required fields: bucket, endpoint
  • Optional fields: aws_access_key_id, aws_secret_access_key, client_config
  • The endpoint field can look something like this: s3.eu-west-1.amazonaws.com

Credentials

To upload stuff to S3, you'll need credentials. Your AWS credentials can be provided in several ways. This is the order in which blackbox looks for them:

  • First, we look for the optional fields in the s3 configuration, called aws_access_key_id and aws_secret_access_key.
  • If these are not found, we'll check if the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables are declared in the local environment where Blackbox is running.
  • If we can't find these, we'll look for an .aws/config file in the local environment.
  • NOTE: If the bucket is public, no credentials are necessary.
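
The lookup order above can be sketched in Python like this. The function find_s3_credentials and the source labels it returns are hypothetical; the final fallback (the .aws/config file or anonymous access for public buckets) is simply reported here rather than implemented, on the assumption that the underlying AWS SDK handles it.

```python
import os


def find_s3_credentials(storage_config, env=None):
    """Return (access_key_id, secret_access_key, source) following the order above.

    Hypothetical helper; blackbox's real lookup may be structured differently.
    """
    env = os.environ if env is None else env

    # 1. Optional fields in the s3 configuration.
    key_id = storage_config.get("aws_access_key_id")
    secret = storage_config.get("aws_secret_access_key")
    if key_id and secret:
        return key_id, secret, "config"

    # 2. Environment variables where blackbox is running.
    key_id = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    if key_id and secret:
        return key_id, secret, "environment"

    # 3. Nothing explicit: left to the .aws/config file, or no credentials
    #    at all if the bucket is public.
    return None, None, "aws-config-file-or-anonymous"


print(find_s3_credentials({"bucket": "blackbox"}, {"AWS_ACCESS_KEY_ID": "id", "AWS_SECRET_ACCESS_KEY": "key"}))
# prints: ('id', 'key', 'environment')
```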

Backblaze B2 Compatibility

Backblaze B2 users on boto3 version 1.35.99 or higher may encounter compatibility issues due to newer data integrity protection headers. To resolve this, use the client_config option:

storage:
  s3:
    backblaze_b2:
      bucket: my-b2-bucket
      endpoint: s3.us-west-004.backblazeb2.com
      aws_access_key_id: {{ B2_ACCESS_KEY_ID }}
      aws_secret_access_key: {{ B2_SECRET_KEY }}
      client_config:
        request_checksum_calculation: when_required
        response_checksum_validation: when_required

This configuration disables the problematic checksum headers that Backblaze B2 doesn't support, while maintaining compatibility with other S3-compatible services.

Dropbox

  • Storage Type: dropbox
  • Required fields: access_token
  • Optional fields: upload_directory

The Dropbox storage handler needs a user access token in order to work. To get one, do the following:

  • Create a Dropbox account (if you don't already have one).
  • Go to https://dropbox.com/developers
  • Create a new application with App Folder access. Do not give it full access, as this may have dangerous, destructive consequences if configured incorrectly.

You can also define a custom location (the root is the App Folder) using the optional upload_directory parameter. It must begin and end with a slash. The default is the root.

Google Drive

  • Storage Type: googledrive
  • Required fields: refresh_token, client_id, client_secret
  • Optional fields: upload_directory

The Google Drive storage handler needs a refresh token, client ID, and client secret in order to work. To get these, do the following:

  • Create a Google account (if you don't already have one).
  • Go to https://console.cloud.google.com
  • Create a project
  • In the OAuth Overview page, click Get Started
  • Follow the prompts and fill out the presented forms
  • When you're finished, go to the Clients tab, and click Create Client
  • Select Web Application for Application Type
  • The required scopes are: /auth/drive.file and /auth/drive.appdata
  • Click your newly created client to view the client ID and secret
  • Use your client ID and secret to obtain a refresh token using the tool of your choice, such as Postman
  • Make sure you add your own email as a user in the Audience tab

If you decide to use Postman to obtain your refresh token:

  • Ensure you set the authorized redirect URI to https://oauth.pstmn.io/v1/callback in your Google client configuration
  • Set the Auth URL to https://accounts.google.com/o/oauth2/v2/auth?access_type=offline&prompt=consent in Postman to ensure Google responds with a refresh token, and not only an access token
  • Read the Postman authorization docs

You can define a custom location in which to store backups (the default is the root folder) using the upload_directory optional parameter. This should be in the format Cool/Example. Any folders in the path that do not already exist will be created for you.

Notifiers

blackbox also implements different notifiers, which is how it reports the result of one of its jobs to you. Right now we only support the below listed notifiers, but if you need a specific notifier, feel free to open an issue.

To configure notifiers, add a section with this format:

notifiers:
  notifier_type:
    # More than one of each notifier type can be configured
    identifier_1:
      field: value
    identifier_2:
      field: value
  notifier_type:
    ...

Discord

  • Notifier Type: discord
  • Required fields: webhook
  • The webhook field usually looks like https://discord.com/api/webhooks/797541821394714674/lzRM9DFggtfHZXGJTz3yE-MrYJ-4O-0AbdQg3uV2x4vFbu7HTHY2Njq8cx8oyMg0T3Wk
  • We also support ptb.discord.com and canary.discord.com webhooks.

Slack

  • Notifier Type: slack
  • Required fields: webhook
  • The webhook field usually looks like https://hooks.slack.com/services/XXXXXXXXXXX/XXXXXXXXXXX/XXXXXXXXXXXXXXXXXXX

Slack notifiers have two styles: the legacy attachment style (default) and the modern Block Kit style. To enable the Block Kit style, set the optional field use_block_kit to any value.

Telegram

  • Notifier Type: telegram
  • Required fields: token, chat_id
  • YAML will look like this:
  telegram:
    telegram_1:
      token: {{ TELEGRAM_TOKEN }}
      chat_id: {{ TELEGRAM_CHAT_ID }}
  • You can create a bot and get a bot token using the BotFather account in Telegram. Follow these instructions.
  • You can find your chat_id by using the userinfobot account in Telegram. Just /start the bot.
  • Do not forget to /start your own bot to grant sending permissions.

JSON

  • Notifier Type: json
  • Required fields: url
  • YAML will look like this:
  json:
    json_1:
      url: https://mydomain.com/api/blackbox-notifications

Note: All notifications are sent as a POST request because the data will live within the request's body.

The body of the HTTP request will look like this:

{
    "backup-data": [
        {
            "source": "main_postgres",
            "success": true,
            "output": "",
            "backup": [
                {
                    "name": "main_dropbox",
                    "success": true
                }
            ]
        }
    ]
}
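
On the receiving side, a payload like this could be inspected with a few lines of Python. This is only a sketch: the function failed_items is hypothetical, and reading each entry of "backup" as the result of one storage upload is an assumption based on the example body above.

```python
import json


def failed_items(body):
    """List the failures reported in a blackbox JSON notification body.

    Hypothetical helper; field meanings are inferred from the example payload.
    """
    payload = json.loads(body)
    failures = []
    for entry in payload["backup-data"]:
        if not entry["success"]:
            # The database backup itself failed.
            failures.append(entry["source"])
            continue
        for upload in entry["backup"]:
            if not upload["success"]:
                # The backup succeeded but one storage upload did not.
                failures.append(f'{entry["source"]} -> {upload["name"]}')
    return failures


body = json.dumps({
    "backup-data": [{
        "source": "main_postgres",
        "success": True,
        "output": "",
        "backup": [{"name": "main_dropbox", "success": False}],
    }]
})
print(failed_items(body))
# prints: ['main_postgres -> main_dropbox']
```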

Rotation

Blackbox supports multiple rotation strategies using cron expressions, as well as rotation via the legacy retention_days configuration. To determine if something is a backup file or not, it will use a regex pattern that corresponds with the default file it saves, for example blackbox-postgres-backup-11-12-2020.sql.

You can configure the number of days before rotating by altering the retention_days parameter in blackbox.yaml. Or, you can configure rotation_strategies for any or each storage handler. The rotation_strategies can be different for different storage handlers.

If both retention_days and rotation_strategies are configured, then any backups made within the retention_days window will be retained, regardless of the configured rotation_strategies. Any backups made outside this window will adhere to the corresponding rotation_strategies configuration.

If neither rotation_strategies nor retention_days is configured, Blackbox will retain all backups.
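
The interaction between the two settings can be summarized with a short Python sketch. The function should_retain and its parameters are hypothetical, and treating a backup exactly retention_days old as still inside the window is an assumption; the source does not state the boundary.

```python
def should_retain(age_days, retention_days=None, kept_by_strategy=None):
    """Decide whether a backup is kept, following the rules described above.

    age_days: age of the backup in days.
    retention_days: the configured value, or None if not configured.
    kept_by_strategy: None if no rotation_strategies are configured,
        otherwise whether the strategies would keep this backup.
    Hypothetical helper for illustration only.
    """
    if retention_days is None and kept_by_strategy is None:
        # Neither option configured: all backups are retained.
        return True
    if retention_days is not None and age_days <= retention_days:
        # Inside the retention window: kept regardless of strategies.
        return True
    if kept_by_strategy is not None:
        # Outside the window: the rotation strategies decide.
        return kept_by_strategy
    return False


print(should_retain(3, retention_days=7, kept_by_strategy=False))   # prints: True
print(should_retain(10, retention_days=7, kept_by_strategy=False))  # prints: False
```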

This cron expression generator tool may be useful for configuring rotation strategies if you are unfamiliar with cron expressions. Blackbox supports the comma (,), hyphen (-), and slash (/) notations, and uses digits 1-7 to represent weekdays, with 1 representing Monday and 7 (or 0) representing Sunday.

The optional sixth parameter

Blackbox will accept an optional sixth parameter for each cron expression, representing the number of matching backups to retain. These will be the most recent backups. For example:

* * * * 1 5        ---   Retain 5 backups made on a Monday
30 0-12 * * * 10   ---   Retain 10 backups made on the 30th minute of any hour from 0-12
* * * 5 * 3        ---   Retain 3 backups made during May
* * * 6 * 0        ---   Don't retain any backups made in June
11 12/15 * 1,2,3 * ---   Retain ALL backups made at 12:11 or 15:11 in Jan, Feb, or Mar
* * * * 2 *        ---   Retain ALL backups made on a Tuesday

If two strategies overlap, Blackbox will use the higher configured maximum (sixth parameter) when determining whether to retain or remove the backup. Using the examples above, if a backup were created at 11:30 on any day in May, and if fewer than 10 backups created on the 30th minute of any hour from 0-12 were already retained, this May backup would also be retained, even if more than three May backups had previously been retained.
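
A minimal Python sketch of how the sixth parameter might be separated from the cron fields, and how overlapping limits combine, is shown below. Both function names are hypothetical. Representing "retain ALL" as None, and letting it win over any number when strategies overlap, are assumptions that extend the "higher maximum wins" rule; the sketch does not evaluate the cron fields themselves.

```python
def split_strategy(expression):
    """Split a rotation strategy into its five cron fields and a retain limit.

    A limit of None means "retain all matching backups".
    Hypothetical helper; blackbox's parser may behave differently.
    """
    fields = expression.split()
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 fields, got {len(fields)}")
    limit = None
    if len(fields) == 6 and fields[5] != "*":
        limit = int(fields[5])
    return fields[:5], limit


def effective_limit(limits):
    """Combine the limits of overlapping strategies: the higher maximum wins."""
    if any(limit is None for limit in limits):
        return None  # assumption: "retain all" outranks any number
    return max(limits)


print(split_strategy("* * * * 1 5"))
# prints: (['*', '*', '*', '*', '1'], 5)
print(effective_limit([10, 3]))
# prints: 10
```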

Syntax

The rotation_strategies configuration is added to the storage handler options like so:

storage:
  googledrive:
    main_gdrive:
      rotation_strategies:
        - "* * 1 * * 1"
        - "* * * * 7 3"
      refresh_token: 123
      client_id: 123
      client_secret: 123
      upload_directory: Blackbox

Encryption

Blackbox supports password-based encryption of backup files for enhanced security. Encrypted backups are compressed and then encrypted using Fernet symmetric encryption (AES 128 in CBC mode with HMAC) with PBKDF2 key derivation.

Configuration

Add encryption configuration to your blackbox.yaml:

# Global encryption (applies to all storage providers)
encryption:
  method: password
  password: YourVeryStrongPassword123

# Or per-storage provider
storage:
  s3:
    main_s3:
      bucket: my-bucket
      endpoint: s3.amazonaws.com
      encryption:
        method: password
        password: YourVeryStrongPassword123

Password Requirements

  • Minimum 14 characters
  • Must contain at least 2 of: uppercase letters, lowercase letters, numbers
  • Keep passwords secure - lost passwords result in permanently inaccessible backups
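
If you want to check a candidate password before putting it in your config, the two rules above can be expressed in a few lines of Python. The function check_password is a hypothetical helper, not part of the blackbox CLI, and blackbox's own validation may apply additional checks.

```python
def check_password(password):
    """Return a list of problems with the password; empty means it passes.

    Implements only the two rules listed above. Hypothetical helper.
    """
    problems = []
    if len(password) < 14:
        problems.append("shorter than 14 characters")
    # Count how many of the three character classes are present.
    classes = sum([
        any(c.isupper() for c in password),
        any(c.islower() for c in password),
        any(c.isdigit() for c in password),
    ])
    if classes < 2:
        problems.append("needs at least 2 of: uppercase, lowercase, numbers")
    return problems


print(check_password("YourVeryStrongPassword123"))
# prints: []
```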

Decrypting Backups

Using Blackbox CLI

# Decrypt an encrypted backup file
blackbox decrypt /path/to/backup.sql.enc

# Specify custom output path
blackbox decrypt /path/to/backup.sql.enc --output /path/to/restored.sql

# The CLI will prompt for the password

Manual Decryption with Python

If you need to decrypt backups manually without the blackbox CLI:

import base64
import gzip
from pathlib import Path
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

def decrypt_backup(encrypted_file_path, password, output_path=None):
    """Manually decrypt a blackbox encrypted backup file."""
    
    # Derive the same key that blackbox uses
    salt = b'blackbox_backup_salt_v1'
    kdf = PBKDF2HMAC(
        algorithm=hashes.SHA256(),
        length=32,
        salt=salt,
        iterations=100000,
    )
    key = base64.urlsafe_b64encode(kdf.derive(password.encode()))
    fernet = Fernet(key)
    
    # Read and decrypt the file
    with open(encrypted_file_path, 'rb') as f:
        encrypted_data = f.read()
    
    decrypted_compressed_data = fernet.decrypt(encrypted_data)
    
    # Decompress the data
    decrypted_data = gzip.decompress(decrypted_compressed_data)
    
    # Determine output path
    if output_path is None:
        output_path = Path(encrypted_file_path).with_suffix('')
    
    # Write decrypted data
    with open(output_path, 'wb') as f:
        f.write(decrypted_data)
    
    return output_path

# Example usage
decrypt_backup('/path/to/backup.sql.enc', 'YourPassword123')

Security Notes:

  • Encryption uses a fixed salt for consistency across environments
  • Temporary files are securely overwritten during cleanup
  • Always use strong passwords and keep them secure

Cooldown

By default, blackbox sends all notifications on every backup attempt. You can specify a cooldown period in blackbox.yaml during which notifications are muted. This option does not mute notifications about failed backups.

Example usage

cooldown: 120s
cooldown: 3 hours
cooldown: 2 days 4 hours
cooldown: 4h 32M 16s
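
To show how values like these could be read, here is a rough Python sketch that converts a cooldown string to seconds. It is not blackbox's parser: the function parse_cooldown, the list of accepted unit names, and the decision to treat units case-insensitively (so that both m and M mean minutes) are assumptions made for illustration.

```python
import re

# Accepted unit spellings (assumption) mapped to their length in seconds.
UNIT_SECONDS = {}
for seconds, names in [
    (1, "s sec second seconds"),
    (60, "m min minute minutes"),
    (3600, "h hour hours"),
    (86400, "d day days"),
]:
    for name in names.split():
        UNIT_SECONDS[name] = seconds

TOKEN = re.compile(r"(\d+)\s*([a-zA-Z]+)")


def parse_cooldown(text):
    """Convert a cooldown string such as '2 days 4 hours' to seconds.

    Hypothetical helper; unit handling is an assumption, see above.
    """
    matches = TOKEN.findall(text)
    if not matches:
        raise ValueError(f"no duration found in {text!r}")
    total = 0
    for amount, unit in matches:
        unit = unit.lower()
        if unit not in UNIT_SECONDS:
            raise ValueError(f"unknown unit {unit!r}")
        total += int(amount) * UNIT_SECONDS[unit]
    return total


print(parse_cooldown("2 days 4 hours"))
# prints: 187200
```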
