05. Optimization Strategies

imi edited this page Aug 16, 2025 · 1 revision

This guide covers the settings that control the optimization process itself. sd-optim uses Optuna as its primary, feature-rich optimization framework. Choosing the right strategy can significantly impact the efficiency and effectiveness of finding optimal merge parameters.

Optuna Dashboard

One of Optuna's most powerful features is its real-time dashboard. If launch_dashboard: True is set in your config.yaml, it will start automatically. You can access it at http://localhost:8080 (or your configured port).

The dashboard allows you to:

  • Monitor the best score over time.
  • See which trials are running, completed, or pruned.
  • View plots like parameter importance and slice plots as the study progresses.
  • Analyze relationships between parameters and scores.

Optuna Samplers

The sampler is the algorithm that chooses which parameter values to try in the next iteration. You configure it in config.yaml under optimizer.optuna_config.sampler.

  • tpe (Tree-structured Parzen Estimator): (Default & Recommended) A sophisticated algorithm that uses the history of past trials to predict promising new areas to explore. It provides an excellent balance between exploration and exploitation.
  • cmaes (Covariance Matrix Adaptation Evolution Strategy): A powerful evolutionary algorithm that works well for continuous, non-linear, and complex search spaces. It can be very effective but may require more init_points to get started.
  • qmc (Quasi-Monte Carlo): Uses a low-discrepancy sequence (like sobol or halton) to sample points more uniformly than pure random. Excellent for ensuring even coverage of the search space, especially in high dimensions.
  • random: Purely random sampling. Simple and effective for very short runs or as a baseline.
  • grid: Exhaustively tries all combinations of parameters defined in the search_space. Only practical for a very small number of parameters with few options.

Sampler-Specific Configuration

You can fine-tune the behavior of each sampler:

# Example for TPE (the default)
sampler:
  type: tpe
  multivariate: True # Can capture correlations between parameters.
  group: True        # Group parameters together for optimization.

# Example for QMC
sampler:
  type: qmc
  qmc_type: 'sobol' # Type of sequence: 'sobol', 'halton', or 'lhs'.
  scramble: True    # If True, scrambles the sequence points.

# Example for CMA-ES
sampler:
  type: cmaes
  restart_strategy: 'ipop' # Strategy for restarting: 'ipop', 'bipop', or null.
  sigma0: 0.1              # Initial step-size.

Early Stopping

This feature helps save time by automatically stopping an entire study if it's no longer making progress.

Note

Trial Pruning is not applicable in the current version. Pruning requires intermediate results within a single trial to stop it early. Since a trial's final score is an average of all generated images, there is no intermediate value to report to the pruner.

Future Possibilities: A future update might explore a custom pruning callback. For example, if the first image generated in a batch receives a very low score (e.g., 0.0), the system could be configured to abort the rest of the generations for that trial, saving significant time. This would allow the pruner to discard obviously broken merges early.

  • Early Stopping (early_stopping: True): This stops the entire study if no improvement has been seen for a certain number of trials.
    • patience: The number of trials to wait for an improvement before stopping.
    • min_improvement: The minimum score increase required to be considered an "improvement" and reset the patience counter.

General Guidelines

  • For most users: Stick with the default tpe sampler. It's robust and effective. Set init_points to 10-20 and n_iters to 20-50 for a solid run.
  • For complex merges with many parameters (>15): Consider using the qmc sampler with qmc_type: sobol to ensure the entire search space is explored evenly during the initial init_points phase.
  • For long runs (50+ iterations): Enabling early_stopping: True can prevent wasting time on a study that has already converged.
  • If you are refining a known good area: The cmaes sampler can be very efficient at finding the precise local optimum if you have a good starting point.

Bayesian Optimization Backend (Experimental)

sd-optim also includes a backend using the bayesian-optimization library. As noted in the README.md, this backend is currently untested. The following information is provided for reference.

Acquisition Functions (Bayesian Optimization Only)

The acquisition function decides which point to evaluate next by trading off exploration (reducing uncertainty) against exploitation (refining known good areas).
  • ucb (Upper Confidence Bound): Encourages exploration by balancing the predicted score with the uncertainty of that prediction. Good for thorough searches.
  • ei (Expected Improvement): Offers a good balance between exploration and exploitation. Often a good default choice.
  • poi (Probability of Improvement): Tends to be exploitative, focusing heavily on areas already known to be good. Can get stuck in local optima.
