Madupite Options#

Madupite options are set with MDP.setOption(). The function takes either two strings (the option name and its value) or, for boolean options, a single string (the option name). Numeric values must be passed as strings, e.g., "20", "0.1", "1e-10". Madupite options are built on top of PETSc options, so any PETSc option can be passed the same way.
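Because every value has to reach setOption() as a string, a small conversion helper can keep call sites tidy. The sketch below is illustrative only: `to_option_args` is a hypothetical helper, not part of Madupite, and it assumes that a boolean option is enabled by passing just its name.

```python
def to_option_args(name, value=True):
    """Return the string arguments for one setOption() call (hypothetical helper)."""
    if isinstance(value, bool):  # checked first because bool is a subclass of int
        if not value:
            raise ValueError(f"{name}: omit a boolean option instead of passing False")
        return (name,)  # boolean flag: a single string
    if isinstance(value, (int, float)):
        return (name, repr(value))  # repr keeps forms such as 1e-10 intact
    return (name, str(value))


# Build the arguments; each tuple could then be forwarded as mdp.setOption(*args).
for args in (
    to_option_args("-max_iter_pi", 20),
    to_option_args("-alpha", 0.1),
    to_option_args("-atol_pi", 1e-10),
    to_option_args("-log_view"),
):
    print(args)
```

Keeping the conversion in one place avoids accidentally passing a raw int or float where a string is expected.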

Required Options#

-mode <STRING>#

Specifies the optimization mode for the Markov Decision Process.

Accepted values: "MAXREWARD" or "MINCOST"

This option determines whether the algorithm will maximize rewards or minimize costs.

-discount_factor <DOUBLE>#

Sets the discount factor for future rewards or costs.

Value range: \((0, 1)\)

The discount factor determines the present value of future rewards or costs. A value closer to 1 gives more weight to the future, while a value closer to 0 emphasizes immediate rewards or costs.
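The effect of the discount factor is easy to see numerically. The following sketch is plain Python and independent of Madupite. The function name `discounted_sum` and the reward stream are made up for illustration. It weights the reward at stage t by gamma**t and compares the result with the closed-form limit 1 / (1 - gamma) for a constant unit stream.

```python
def discounted_sum(values, gamma):
    """Sum of gamma**t * values[t] over the stage index t."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("discount factor must lie strictly between 0 and 1")
    return sum(gamma**t * v for t, v in enumerate(values))


stream = [1.0] * 1000  # one unit of reward at each of 1000 stages
for gamma in (0.5, 0.9, 0.999):
    # For a constant unit stream the infinite-horizon limit is 1 / (1 - gamma).
    print(gamma, discounted_sum(stream, gamma), 1.0 / (1.0 - gamma))
```

With gamma close to 1, many more stages contribute noticeably to the total, which is why such problems generally take longer to solve.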

Optional Options#

-max_iter_pi <INT>#

Specifies the maximum number of iterations for the inexact policy iteration algorithm.

Default: 1000

The algorithm will terminate after this many iterations, even if convergence has not been achieved. Must be a positive integer.

-max_iter_ksp <INT>#

Sets the maximum number of iterations for the Krylov subspace method.

Default: 1000

This option limits the iterations in the approximate policy evaluation step. The method will terminate after this many iterations, even without convergence. Must be a positive integer.

-atol_pi <DOUBLE>#

Defines the absolute tolerance for the inexact policy iteration algorithm.

Default: 1e-8

The algorithm terminates once the infinity norm of the Bellman residual falls below this value. Must be a positive double.
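To make the stopping criterion concrete, the sketch below computes a Bellman residual in the infinity norm for a toy cost-minimizing problem. It uses the textbook definition, max over states of |V(s) - (TV)(s)|, where T is the Bellman optimality operator. Whether Madupite evaluates the residual in exactly this form is an assumption, and the function name, data layout, and numbers are invented for the example.

```python
def bellman_residual_inf_norm(V, P, g, gamma):
    """Return max_s |V[s] - (TV)[s]| for a cost-minimizing MDP.

    P[a][s][t] is the probability of moving from state s to t under action a,
    g[s][a] is the stage cost, and T is the Bellman optimality operator.
    """
    n_states, n_actions = len(V), len(P)
    residual = 0.0
    for s in range(n_states):
        tv = min(
            g[s][a] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
            for a in range(n_actions)
        )
        residual = max(residual, abs(V[s] - tv))
    return residual


# Toy 2-state, 2-action problem (made-up numbers).
P = [[[1.0, 0.0], [0.0, 1.0]],   # action 0: stay in the current state
     [[0.0, 1.0], [1.0, 0.0]]]   # action 1: switch to the other state
g = [[1.0, 2.0], [0.0, 3.0]]
gamma, atol_pi = 0.5, 1e-8

for V in ([0.0, 0.0], [2.0, 0.0]):
    # The check mirrors the role of -atol_pi: stop once the residual is small enough.
    print(V, bellman_residual_inf_norm(V, P, g, gamma) < atol_pi)
```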

-alpha <DOUBLE>#

Sets the forcing sequence parameter for the approximate policy evaluation step.

Default: 1e-4

This parameter influences the accuracy of the policy evaluation step. Must be a positive double.
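In inexact policy iteration schemes, a forcing parameter typically ties the stopping tolerance of the inner linear solve to the current outer residual. The sketch below assumes that rule, with the inner tolerance equal to alpha times the Bellman residual norm. The text above does not spell out Madupite's exact formula, so both the function name and the formula are assumptions.

```python
def inner_tolerance(alpha, bellman_residual_norm):
    """Assumed rule: stop the inner solve once its residual is below alpha * outer residual."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    return alpha * bellman_residual_norm


# As the outer residual shrinks, the inner solve is asked for more accuracy.
for outer in (1.0, 1e-2, 1e-6):
    print(outer, inner_tolerance(1e-4, outer))
```

Under this assumption, a smaller alpha means more accurate (and more expensive) policy evaluation per outer iteration.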

-file_stats <STRING>#

Specifies a file to write convergence and runtime information.

This option enables writing detailed statistics about the algorithm’s performance, useful for plotting and benchmarking.

-file_policy <STRING>#

Designates a file to write the optimal policy.

The optimal policy will be written in ASCII format, with entries separated by line breaks.

-file_cost <STRING>#

Specifies a file to write the optimal cost-to-go (or reward-to-go) function.

The function values will be written in ASCII format, with entries separated by line breaks.
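Files with one entry per line are straightforward to load for plotting or post-processing. The sketch below is an illustration, not part of Madupite: `read_vector` is a hypothetical helper, the sample contents are made up, and it assumes that policy entries are integer action indices.

```python
import io


def read_vector(stream, cast=float):
    """Parse one value per non-empty line."""
    return [cast(line) for line in stream if line.strip()]


# In-memory stand-ins for files written via -file_policy and -file_cost.
policy = read_vector(io.StringIO("0\n2\n1\n"), cast=int)
cost = read_vector(io.StringIO("1.5\n0.25\n3.0\n"))
print(policy, cost)
```

For real output, replace the `io.StringIO` objects with `open(path)` on the file names passed to the options.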

-export_optimal_transition_probabilities <STRING>#

Defines a file to write the optimal transition probabilities matrix.

Exports the \(n \times n\)-matrix of optimal transition probabilities in ASCII and COO format. The file header contains num_rows, num_cols, num_nonzeros. Subsequent lines contain the row, column, and value of non-zero entries.
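A reader for this layout might look like the sketch below. It assumes that fields are separated by whitespace or commas and uses zero-based indices in the made-up sample. Neither detail is stated above, so check an actual export before relying on it. `read_coo` is a hypothetical helper.

```python
import io


def read_coo(stream):
    """Parse a COO text file: header 'num_rows num_cols num_nonzeros', then 'row col value' lines."""
    lines = [ln.replace(",", " ").split() for ln in stream if ln.strip()]
    n_rows, n_cols, nnz = (int(x) for x in lines[0])
    entries = {(int(r), int(c)): float(v) for r, c, v in lines[1:]}
    if len(entries) != nnz:
        raise ValueError(f"expected {nnz} nonzeros, found {len(entries)}")
    return n_rows, n_cols, entries


sample = io.StringIO("2 2 3\n0 0 1.0\n1 0 0.25\n1 1 0.75\n")
n_rows, n_cols, entries = read_coo(sample)

# Each row of a transition probability matrix should sum to 1.
row_sums = [sum(v for (r, _), v in entries.items() if r == i) for i in range(n_rows)]
print(row_sums)
```

Checking the row sums is a cheap sanity test that the file was parsed with the right index convention.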

-export_optimal_stage_costs <STRING>#

Specifies a file to write the optimal stage costs (or rewards) vector.

Exports the \(n\)-dimensional vector of optimal stage costs (or rewards) in ASCII format, with entries separated by line breaks.

Useful PETSc Options#

-ksp_type <STRING>#

Selects the Krylov subspace method for the inner solver of inexact policy iteration.

Default: "gmres"

For a list of available algorithms, refer to the PETSc documentation: https://petsc.org/release/manualpages/KSP/KSPType/

-pc_type <STRING>#

Chooses the preconditioner to use before applying the inner solver.

Default: "none"

Only preconditioners that rely on the (transposed) matrix-vector product are supported. For the standard (exact) policy iteration algorithm, set this to "svd" (available only for sequential execution, not recommended for large-scale problems).

For a list of available preconditioners, see: https://petsc.org/release/manualpages/PC/PCType/

-log_view#

Enables output of a detailed algorithm log to the console.

This option is useful for debugging and benchmarking purposes.

Usage Example#

Command line usage:

./pendulum -discount_factor 0.999 -mode MINCOST -max_iter_pi 500

Using options file:

./pendulum -options options_file

Where options_file contains:

-discount_factor 0.999
-mode MINCOST
-max_iter_pi 500

Hard-coded options:

MDP mdp;
mdp.setOption("-discount_factor", "0.999");
mdp.setOption("-mode", "MINCOST");
mdp.setOption("-max_iter_pi", "500");

Warning

The Python interface does not support command-line options. Options must instead be set through the API:

mdp.setOption("-mode", "MINCOST")
mdp.setOption("-discount_factor", "0.999")
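If options are kept in a file with one option per line, as in the options file shown earlier, a few lines of Python can forward them through setOption(). This is a sketch under assumptions: `parse_options` is a hypothetical helper, and `RecordingMDP` is a stand-in that only records calls so the example runs without Madupite installed.

```python
def parse_options(text):
    """Split options-file text into argument tuples, one per non-empty line."""
    return [tuple(line.split()) for line in text.splitlines() if line.strip()]


class RecordingMDP:
    """Stand-in for a Madupite MDP object; it only records setOption() calls."""

    def __init__(self):
        self.calls = []

    def setOption(self, *args):
        self.calls.append(args)


text = "-discount_factor 0.999\n-mode MINCOST\n-max_iter_pi 500\n"
mdp = RecordingMDP()
for args in parse_options(text):
    mdp.setOption(*args)  # two strings, or one string for a boolean flag
print(mdp.calls)
```

With a real MDP object, the same loop would apply each line of the file as one setOption() call.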

For more information on available KSP types and preconditioners, refer to the PETSc documentation linked in the -ksp_type and -pc_type entries above.