A lightweight progress tracking system for parallel tasks. Implemented in pure Julia with a Tachikoma.jl-based dashboard and SQLite persistence.
- 📊 Tachikoma Dashboard: Clean 2-tab terminal UI for monitoring experiments
- 💾 SQLite Persistence: Current state stored in SQLite
- 🔄 Multi-Task Support: Track parallel sub-tasks within a single experiment
- ⚡ Simple API: update!, finish!, fail!
- 🔀 Distributed & Threads: Single DB writer on the master; workers get a ProgressTask via get_task(manager, id, :remote) or :local and send updates over a channel
Package (for use in Julia scripts/REPL):
using Pkg
Pkg.add("MultiProgressManagers")
App / CLI (optional, for the mpm dashboard from the shell):
using Pkg
Pkg.Apps.add("MultiProgressManagers")
Ensure ~/.julia/bin is on your PATH. Then run mpm <folder> (the directory that contains your .db files) or mpm --help.
You can also open the dashboard from Julia without the app: view_dashboard(folder) (see Quick Start).
using MultiProgressManagers
# Create a multi-task experiment with 5 parallel tasks
N_tasks = 5
parameter_values = rand(N_tasks)
manager = ProgressManager(
    "Training Run", N_tasks;
    description = "Epoch 1-10 of ResNet training",
    db_path = "./progresslogs/experiment1.db",
    task_descriptions = ["parameter=$(val)" for val in parameter_values],
)
# Update progress for each task as it runs
for (task_num, param_val) in enumerate(parameter_values)
    steps = rand([100, 200, 300])
    # Set the total once up front; later updates can omit it
    update!(manager, task_num; total_steps = steps, message = "Starting run with $steps steps")
    for step in 1:steps
        do_work(task_num, param_val, step)  # placeholder for your own workload
        update!(manager, task_num; step = step, message = "step $step")
    end
    finish!(manager, task_num)
end
# Mark entire experiment as complete
finish!(manager)
When work runs on other threads or processes, only the master touches the DB. Workers get a ProgressTask and send updates over a channel. Load Distributed before requesting :remote tasks so the remote-worker extension is available:
using MultiProgressManagers
using Distributed # for pmap
@everywhere using MultiProgressManagers
# Master: create experiment and get task handles
manager = ProgressManager("Distributed Run", 8; db_path = "./progresslogs/dist.db")
tasks = [get_task(manager, i, :remote) for i in 1:8] # :local for @spawn threads
# Workers: send progress via the task handle (no DB access)
@everywhere function run_worker(task::ProgressTask, total_steps::Int)
    for step in 1:total_steps
        do_work(step)
        update!(task; step = step, total_steps = total_steps, message = "step $step")
    end
    finish!(task)
end
# Run and finish
pmap(i -> run_worker(tasks[i], 100), 1:8)
finish!(manager)
See examples/multithreading.jl (threads + ProgressTask via get_task(..., :local)) and examples/distributed_pmap.jl (Distributed + ProgressTask via get_task(..., :remote)).
Add Drill.jl to your environment, load it with using Drill, then build a training callback from a progress task (often a :remote task for distributed workers):
using MultiProgressManagers
using Drill
using Distributed # when using :remote tasks
num_parallel_tasks = 8
manager = ProgressManager("my_study", num_parallel_tasks; db_path = default_db_path("my_study"))
task_index = 1
task = get_task(manager, task_index, :remote)
callback = create_drill_callback(task)
From the shell: mpm ./progresslogs opens the dashboard for every .db file in that folder (requires the app; see Installation).
From Julia: using MultiProgressManagers; view_dashboard("./progresslogs") does the same.
The Runs tab shows all experiments in the database:
- Experiment name and description
- Status (running/completed/failed)
- Overall progress across all tasks
- Start time and duration
- Automatically selects the newest experiment at the top of the list
The Details tab shows a detailed view of the selected experiment:
- Individual task progress bars
- Task status (pending/running/completed/failed)
- Current step / total steps for each task
- Message column (from update!(...; message=...))
- Speed calculation (steps per second)
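As a rough illustration of what a steps-per-second figure means, the sketch below computes an average speed between two progress samples. The helper name steps_per_second and its arguments are hypothetical; the README does not say how the dashboard derives its speed value, so this is only one plausible definition.

```julia
# Hypothetical helper: average speed between two progress samples.
# Timestamps are in seconds; this is not the package's own implementation.
function steps_per_second(step_then::Int, step_now::Int, t_then::Float64, t_now::Float64)
    elapsed = t_now - t_then
    elapsed > 0 || return 0.0   # avoid dividing by zero on identical timestamps
    return (step_now - step_then) / elapsed
end

# 150 steps completed over 5 seconds
speed = steps_per_second(100, 250, 10.0, 15.0)
```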
ProgressManager(name::String, num_tasks::Int;
    description::String = "",
    db_path::Union{String,Nothing} = nothing,
    task_descriptions::Vector{String} = String[])
Parameters:
- name: Human-readable experiment name (shown in dashboard)
- num_tasks: Number of parallel sub-tasks in this experiment
- description: Optional longer description
- db_path: Optional path to SQLite database file. When omitted, the package derives a default path from the experiment name.
- task_descriptions: Optional vector of per-task labels (length must equal num_tasks).
Returns: A ProgressManager instance for tracking this experiment.
update!(
    manager::ProgressManager, task_number::Int;
    step::Union{Int,Nothing} = nothing,
    total_steps::Union{Int,Nothing} = nothing,
    message::String = ""
)
Records progress for a specific task. Pass step to report the current progress, total_steps to set or change the task's total, and message to set the task's current message, which is shown in the dashboard Details tab. When total_steps is omitted, the previously stored total is reused, so it is recommended to set total_steps once at the beginning of a task. If total_steps is never supplied, it is updated automatically as step is updated.
Parameters:
- manager: The ProgressManager for this experiment
- task_number: 1-indexed task number (1 to num_tasks)
- step: Current progress step for this task
- total_steps: Optional; set it once and later updates may omit it
- message: Optional; shown in the "Message" column (e.g. phase or status)
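The "set total_steps once, then omit it" behaviour can be pictured with a small stand-alone sketch. TaskState and apply_update! are hypothetical names used only for illustration; the package keeps this state in SQLite, and the sketch does not model the automatic total tracking mentioned above.

```julia
# Hypothetical in-memory record standing in for one task row in the DB.
mutable struct TaskState
    step::Int
    total_steps::Int
end

# Only overwrite the fields the caller actually supplied;
# `nothing` means "keep the stored value".
function apply_update!(state::TaskState;
                       step::Union{Int,Nothing} = nothing,
                       total_steps::Union{Int,Nothing} = nothing)
    total_steps === nothing || (state.total_steps = total_steps)
    step === nothing || (state.step = step)
    return state
end

state = TaskState(0, 0)
apply_update!(state; total_steps = 200)  # establish the total up front
apply_update!(state; step = 10)          # later updates omit total_steps
```

After the second call the stored total is still 200, which mirrors how later update! calls can leave total_steps out.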
When work runs on other threads or processes, workers must not call the manager-side update! or touch the DB. Instead they use a ProgressTask:
get_task(manager::ProgressManager, task_number::Int, type = :local) -> ProgressTask
Returns a handle for one task. type:
- :local, a plain Channel (same process, e.g. @spawn or @threads)
- :remote, a RemoteChannel (for Distributed / pmap)
update!(task::ProgressTask;
    step::Union{Int,Nothing} = nothing,
    total_steps::Union{Int,Nothing} = nothing,
    message::String = ""
)
finish!(task::ProgressTask)
fail!(task::ProgressTask; message::String = "Task failed")
Workers call update! during the loop and finish!(task) when the task is done. A listener on the master process applies these to the DB. As with manager-side update!, total_steps only needs to be supplied when it changes or is first established.
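The single-writer arrangement described above (workers send messages, one listener applies them) can be sketched with nothing but Base Julia. This is an illustration of the general pattern, not the package's internals: ProgressMsg and run_demo are hypothetical names, and a Dict stands in for the SQLite database.

```julia
# Hypothetical message type; the package's real message format is not shown in this README.
struct ProgressMsg
    task_id::Int
    step::Int
end

function run_demo(num_tasks::Int, steps::Int)
    updates = Channel{ProgressMsg}(64)  # buffered so workers rarely block
    store = Dict{Int,Int}()             # stand-in for the SQLite table

    # Single writer: the only task that ever mutates `store`.
    listener = Threads.@spawn for msg in updates
        store[msg.task_id] = msg.step
    end

    # Workers only put! messages on the channel; they never touch `store`.
    workers = map(1:num_tasks) do id
        Threads.@spawn for s in 1:steps
            put!(updates, ProgressMsg(id, s))
        end
    end

    foreach(wait, workers)
    close(updates)   # the listener loop ends once the channel is drained
    wait(listener)
    return store
end

result = run_demo(4, 50)
```

Because every write goes through one consumer, the store needs no locking, which is the same reason a single DB writer keeps the SQLite side simple.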
create_drill_callback(task::ProgressTask)
Returns a Drill callback that reports training progress through task. Load Drill.jl first with using Drill.
finish!(manager::ProgressManager, task_number::Int)
Explicitly mark a task as completed. When doing multithreaded or distributed tasks, use finish!(task) on the ProgressTask instead.
finish!(manager::ProgressManager; message::String = "Completed successfully")
Mark the entire experiment as completed. This sets the experiment status to "completed" and marks all remaining tasks as done. Optional message is stored as the experiment's final message.
fail!(manager::ProgressManager, task_number::Int; message::String = "Task failed")
fail!(manager::ProgressManager; message::String = "Experiment failed")
Mark either a specific task or the entire experiment as failed with a message. The code also provides overloads that take an Exception or a positional error_message::String; these ultimately set the same message used in the DB.
Global
- q / Q: Quit
- r: Refresh (reload data from database)
Tabs
- 1 or F1: Runs tab
- 2 or F2: Details (Running) tab
- ← / h: Previous tab
- → / l: Next tab
Runs tab
- ↑ / ↓ or Ctrl+k / Ctrl+j: Move selection; the highlighted experiment is the one shown in the Details tab
- f: Mark selected running experiment as failed (opens confirmation modal)
Confirmation modal (after pressing f)
- Enter: Confirm (mark as failed) or cancel, depending on highlighted option
- ← / h or → / l: Switch between Cancel and Confirm
- Escape or c: Cancel and close modal
Details (Running) tab
- ↑ / ↓ or Ctrl+k / Ctrl+j: Scroll the task list
- a / d: Shrink or grow the message column width
If you omit db_path, the package stores the experiment under the default progress-log directory using a filename derived from the experiment name. Because each experiment gets its own DB file, duplicate names at the default path are rejected; pass an explicit db_path if you want to reopen an existing experiment file.
# Example: Create database in specific directory
manager = ProgressManager("My Experiment", 3; db_path = "./logs/run1.db")
License: MIT
Contributions welcome! Please open an issue to discuss changes before submitting PRs with new features.
