Checkpointing
Checkpointing allows you to save the state of your quantum algorithm optimization at any point and resume from that exact state later. This is essential for long-running optimizations, debugging, and resuming interrupted computations.
Overview
Divi’s checkpointing system saves both the program state (parameters, losses, iteration count) and optimizer state (internal optimizer data) to disk. This allows you to:
Resume interrupted runs - Continue optimization from where it stopped
Debug optimization - Inspect intermediate states and parameters
Manage long runs - Break up very long optimizations into manageable chunks
Adjust iteration targets - Change max_iterations after loading to continue beyond the original target
Checkpointing is supported for all VariationalQuantumAlgorithm subclasses (VQE, QAOA) and works with checkpointing-capable optimizers:
MonteCarloOptimizerPymooOptimizer(CMAES and DE methods)
Note
ScipyOptimizer does not support checkpointing due to limitations in the underlying scipy optimization methods.
Basic Usage
Saving Checkpoints
To enable checkpointing, pass a CheckpointConfig object to the run() method:
from pathlib import Path
from divi.qprog import VQE, HartreeFockAnsatz
from divi.qprog.checkpointing import CheckpointConfig
from divi.backends import ParallelSimulator
import pennylane as qml
# Create a molecule
mol = qml.qchem.Molecule(
symbols=["H", "H"],
coordinates=np.array([[0.0, 0.0, -0.6614], [0.0, 0.0, 0.6614]])
)
# Create VQE program
vqe = VQE(
molecule=mol,
ansatz=HartreeFockAnsatz(),
n_layers=1,
max_iterations=10,
backend=ParallelSimulator(),
)
# Run with checkpointing enabled
checkpoint_dir = Path("my_checkpoints")
vqe.run(checkpoint_config=CheckpointConfig(checkpoint_dir=checkpoint_dir))
By default, checkpoints are saved every iteration. Each checkpoint is stored in a subdirectory named checkpoint_{iteration:03d} (e.g., checkpoint_001, checkpoint_002).
Checkpoint Interval
To save checkpoints less frequently, set the checkpoint_interval parameter:
# Save checkpoint every 5 iterations
vqe.run(
checkpoint_config=CheckpointConfig(
checkpoint_dir=checkpoint_dir,
checkpoint_interval=5
)
)
Auto-Generated Checkpoint Directories
You can automatically generate a timestamped checkpoint directory:
# Creates a directory like "checkpoint_20250115_143022"
config = CheckpointConfig.with_timestamped_dir()
vqe.run(checkpoint_config=config)
# Or with a checkpoint interval
config = CheckpointConfig.with_timestamped_dir(checkpoint_interval=5)
vqe.run(checkpoint_config=config)
Loading and Resuming
To resume from a checkpoint, use the load_state() class method:
from divi.qprog import VQE
# Load the latest checkpoint
vqe_resumed = VQE.load_state(
checkpoint_dir="my_checkpoints",
backend=ParallelSimulator(),
molecule=mol, # Must provide original problem configuration
ansatz=HartreeFockAnsatz(),
n_layers=1,
)
# Continue optimization
vqe_resumed.max_iterations = 20 # Set new target
vqe_resumed.run()
Important: When loading from a checkpoint, you must provide all the original constructor arguments (problem definition, ansatz, etc.) because checkpoints only store runtime state, not the problem configuration.
Loading Specific Checkpoints
By default, load_state() loads the latest checkpoint. To load a specific checkpoint:
# Load checkpoint from iteration 5
vqe_resumed = VQE.load_state(
checkpoint_dir="my_checkpoints",
backend=ParallelSimulator(),
subdirectory="checkpoint_005", # Specific checkpoint subdirectory
molecule=mol,
ansatz=HartreeFockAnsatz(),
n_layers=1,
)
Complete Example: QAOA with Checkpointing
Here’s a complete example showing checkpointing with QAOA:
import networkx as nx
from pathlib import Path
from divi.qprog import QAOA, GraphProblem
from divi.qprog.checkpointing import CheckpointConfig
from divi.qprog.optimizers import PymooOptimizer, PymooMethod
from divi.backends import ParallelSimulator
# Create problem
G = nx.bull_graph()
checkpoint_dir = Path("qaoa_checkpoints")
# Initial run - first half
qaoa1 = QAOA(
problem=G,
graph_problem=GraphProblem.MAX_CLIQUE,
n_layers=1,
optimizer=PymooOptimizer(method=PymooMethod.CMAES, population_size=10),
max_iterations=5,
backend=ParallelSimulator(),
)
# Run with checkpointing
qaoa1.run(checkpoint_config=CheckpointConfig(checkpoint_dir=checkpoint_dir))
# Later: Resume from checkpoint
qaoa2 = QAOA.load_state(
checkpoint_dir=checkpoint_dir,
backend=ParallelSimulator(),
problem=G, # Must provide original problem
graph_problem=GraphProblem.MAX_CLIQUE,
n_layers=1,
)
# Continue optimization
qaoa2.max_iterations = 10
qaoa2.run()
# Access results
print(f"Best loss: {qaoa2.best_loss}")
print(f"Solution: {qaoa2.solution}")
Managing Checkpoints
Listing Checkpoints
You can list all checkpoints in a directory:
from divi.qprog.checkpointing import list_checkpoints
checkpoints = list_checkpoints(Path("my_checkpoints"))
for checkpoint in checkpoints:
print(f"Iteration {checkpoint.iteration}: {checkpoint.path}")
print(f" Size: {checkpoint.size_bytes / 1024:.2f} KB")
print(f" Valid: {checkpoint.is_valid}")
Getting Checkpoint Information
Get detailed information about a specific checkpoint:
from divi.qprog.checkpointing import get_checkpoint_info
info = get_checkpoint_info(Path("my_checkpoints/checkpoint_005"))
print(f"Iteration: {info.iteration}")
print(f"Timestamp: {info.timestamp}")
print(f"Size: {info.size_bytes} bytes")
print(f"Valid: {info.is_valid}")
Finding the Latest Checkpoint
Get the path to the latest checkpoint:
from divi.qprog.checkpointing import get_latest_checkpoint
latest = get_latest_checkpoint(Path("my_checkpoints"))
if latest:
print(f"Latest checkpoint: {latest}")
Cleaning Up Old Checkpoints
Remove old checkpoints, keeping only the most recent N:
from divi.qprog.checkpointing import cleanup_old_checkpoints
# Keep only the 5 most recent checkpoints
cleanup_old_checkpoints(Path("my_checkpoints"), keep_last_n=5)
Checkpoint Structure
Each checkpoint is stored in a subdirectory with the following structure:
checkpoint_dir/
├── checkpoint_001/
│ ├── program_state.json # Program state (parameters, losses, etc.)
│ └── optimizer_state.json # Optimizer internal state
├── checkpoint_002/
│ ├── program_state.json
│ └── optimizer_state.json
└── ...
The program_state.json file contains:
Current iteration number
Loss history
Best parameters found so far
Current parameters
Random number generator state
Algorithm-specific state (e.g., eigenstate for VQE, solution nodes for QAOA)
The optimizer_state.json file contains optimizer-specific data:
For
MonteCarloOptimizer: Population, evaluated population, losses, RNG stateFor
PymooOptimizer: Serialized algorithm object and population
Best Practices
Use meaningful checkpoint directory names - Include experiment identifiers or timestamps
Set appropriate checkpoint intervals - For long runs, checkpoint every N iterations to save disk space
Always provide problem configuration when loading - Checkpoints don’t store problem definitions
Clean up old checkpoints - Use
cleanup_old_checkpoints()to manage disk spaceVerify checkpoint validity - Check
is_validbefore resuming from a checkpointUse auto-generated directories -
CheckpointConfig.with_timestamped_dir()prevents accidental overwrites
Error Handling
Checkpointing operations can raise several exceptions:
CheckpointNotFoundError- Checkpoint directory or file not foundCheckpointCorruptedError- Checkpoint file is invalid or corruptedRuntimeError- Attempting to save checkpoint before any iterations completeValueError- Invalid checkpoint configuration
Always handle these exceptions appropriately:
from divi.qprog.checkpointing import (
CheckpointNotFoundError,
CheckpointCorruptedError,
)
try:
vqe = VQE.load_state(checkpoint_dir, backend=backend, molecule=mol)
except CheckpointNotFoundError as e:
print(f"Checkpoint not found: {e}")
except CheckpointCorruptedError as e:
print(f"Checkpoint corrupted: {e}")
Limitations
ScipyOptimizer does not support checkpointing
Checkpoints are not portable across different Python versions or library versions
Problem configuration must be manually provided when loading (not stored in checkpoint)
Checkpoint files can be large for population-based optimizers (MonteCarlo, Pymoo)