05. Optimization Strategies
This guide covers the settings that control the optimization process itself. sd-optim uses Optuna as its primary, feature-rich optimization framework. Choosing the right strategy can significantly impact the efficiency and effectiveness of finding optimal merge parameters.
One of Optuna's most powerful features is its real-time dashboard. If launch_dashboard: True is set in your config.yaml, it will start automatically. You can access it at http://localhost:8080 (or your configured port).
The dashboard allows you to:
- Monitor the best score over time.
- See which trials are running, completed, or pruned.
- View plots like parameter importance and slice plots as the study progresses.
- Analyze relationships between parameters and scores.
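As a minimal sketch, this is how the dashboard setting might look in `config.yaml`. Only `launch_dashboard` is documented above; the exact placement of the keys and the port key's name (`dashboard_port`) are assumptions for illustration:

```yaml
launch_dashboard: True   # documented above: start the Optuna dashboard automatically
dashboard_port: 8080     # assumed key name for the "configured port" mentioned above
```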
The sampler is the algorithm that chooses which parameter values to try in the next iteration. You configure it in config.yaml under optimizer.optuna_config.sampler.
- `tpe` (Tree-structured Parzen Estimator): (Default & Recommended) A sophisticated algorithm that uses the history of past trials to predict promising new areas to explore. It provides an excellent balance between exploration and exploitation.
- `cmaes` (Covariance Matrix Adaptation Evolution Strategy): A powerful evolutionary algorithm that works well for continuous, non-linear, and complex search spaces. It can be very effective but may require more `init_points` to get started.
- `qmc` (Quasi-Monte Carlo): Uses a low-discrepancy sequence (like `sobol` or `halton`) to sample points more uniformly than pure random sampling. Excellent for ensuring even coverage of the search space, especially in high dimensions.
- `random`: Purely random sampling. Simple and effective for very short runs or as a baseline.
- `grid`: Exhaustively tries all combinations of the parameters defined in the `search_space`. Only practical for a very small number of parameters with few options.
You can fine-tune the behavior of each sampler:
```yaml
# Example for TPE (the default)
sampler:
  type: tpe
  multivariate: True  # Can capture correlations between parameters.
  group: True         # Group parameters together for optimization.
```

```yaml
# Example for QMC
sampler:
  type: qmc
  qmc_type: 'sobol'  # Type of sequence: 'sobol', 'halton', or 'lhs'.
  scramble: True     # If True, scrambles the sequence points.
```

```yaml
# Example for CMA-ES
sampler:
  type: cmaes
  restart_strategy: 'ipop'  # Strategy for restarting: 'ipop', 'bipop', or null.
  sigma0: 0.1               # Initial step-size.
```

Early stopping saves time by automatically halting an entire study once it is no longer making progress.
Note
Trial Pruning is not applicable in the current version. Pruning requires intermediate results within a single trial to stop it early. Since a trial's final score is an average of the scores of all generated images, there is no intermediate value to report to the pruner.
Future Possibilities: A future update might explore a custom pruning callback. For example, if the first image generated in a batch receives a very low score (e.g., 0.0), the system could be configured to abort the rest of the generations for that trial, saving significant time. This would allow the pruner to discard obviously broken merges early.
- Early Stopping (`early_stopping: True`): Stops the entire study if no improvement has been seen for a certain number of trials (a configuration sketch follows this list).
  - `patience`: The number of trials to wait for an improvement before stopping.
  - `min_improvement`: The minimum score increase required to count as an "improvement" and reset the patience counter.
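A rough sketch of how these options might appear in `config.yaml`, assuming they sit alongside the sampler settings under `optimizer.optuna_config`. Only the key names `early_stopping`, `patience`, and `min_improvement` come from this guide; the nesting and the values are assumptions:

```yaml
optimizer:
  optuna_config:
    early_stopping: True
    patience: 10            # example value: wait 10 trials for an improvement
    min_improvement: 0.001  # example value: smaller gains do not reset the patience counter
```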
Practical recommendations:

- For most users: Stick with the default `tpe` sampler. It's robust and effective. Set `init_points` to 10-20 and `n_iters` to 20-50 for a solid run (see the example after this list).
- For complex merges with many parameters (>15): Consider the `qmc` sampler with `qmc_type: sobol` to ensure the entire search space is explored evenly during the initial `init_points` phase.
- For long runs (50+ iterations): Enabling `early_stopping: True` can prevent wasting time on a study that has already converged.
- If you are refining a known good area: The `cmaes` sampler can be very efficient at finding the precise local optimum if you have a good starting point.
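Combining the first and third recommendations, a typical run might be configured roughly as follows. Where `init_points` and `n_iters` live in `config.yaml` is an assumption; the values simply fall within the ranges suggested above:

```yaml
optimizer:
  init_points: 15           # 10-20 startup trials before the sampler takes over
  n_iters: 40               # 20-50 sampler-guided trials
  optuna_config:
    sampler:
      type: tpe
    early_stopping: True
    patience: 10            # example value
```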
sd-optim also includes a backend using the bayesian-optimization library. As noted in the README.md, this backend is currently untested. The following information is provided for reference.
The acquisition function guides the exploitation phase by deciding which point looks most promising to evaluate next.
- `ucb` (Upper Confidence Bound): Encourages exploration by balancing the predicted score with the uncertainty of that prediction. Good for thorough searches.
- `ei` (Expected Improvement): Offers a good balance between exploration and exploitation. Often a good default choice.
- `poi` (Probability of Improvement): Tends to be exploitative, focusing heavily on areas already known to be good. Can get stuck in local optima.
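For reference only, here is a hedged sketch of how an acquisition function might be selected for this backend. Because the backend is untested, every key name below (`bayes_config`, `acquisition_function`, `kappa`, `xi`) is an assumption, not confirmed sd-optim configuration:

```yaml
optimizer:
  bayes_config:                # assumed section name for the bayesian-optimization backend
    acquisition_function: ei   # one of 'ucb', 'ei', or 'poi' as described above
    kappa: 2.5                 # assumed key: exploration weight used by 'ucb'
    xi: 0.01                   # assumed key: exploration margin used by 'ei'/'poi'
```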