diff --git a/.claude/skills/generate-scripts/references/aposmm.md b/.claude/skills/generate-scripts/references/aposmm.md index 892007b87a..bac7c2d8c3 100644 --- a/.claude/skills/generate-scripts/references/aposmm.md +++ b/.claude/skills/generate-scripts/references/aposmm.md @@ -62,10 +62,6 @@ When using a SciPy method, must also supply `opt_return_codes` — e.g. [0] for | `lhs_divisions` | int | Latin hypercube partitions (0 or 1 = uniform) | | `rk_const` | float | Multiplier for r_k value | -## Worker Configuration - -With `gen_on_manager=True`, the persistent generator runs on the manager process and all `nworkers` are available for simulations. - ## Local Optimizer Methods ### SciPy (no extra install) diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index dc82efef8e..69c11918b9 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -37,4 +37,4 @@ repos: rev: v1.19.1 hooks: - id: mypy - exclude: ^libensemble/utils/(launcher|loc_stack|runners|pydantic|output_directory)\.py$|libensemble/tests/(regression_tests|functionality_tests|unit_tests|scaling_tests)/.* + exclude: ^docs/conf\.py$|libensemble/utils/(launcher|loc_stack|runners|pydantic|output_directory)\.py$|libensemble/tests/(regression_tests|functionality_tests|unit_tests|scaling_tests)/.* diff --git a/AGENTS.md b/AGENTS.md index 75086f46c6..f5673f64a7 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -43,10 +43,10 @@ Information about Generators Its fields match ``sim_specs/gen_specs["out"]`` or ``vocs`` attributes, plus additional reserved fields for metadata. - Prior to libEnsemble v1.6.0, generators were plain functions. They often ran in "persistent" mode, meaning they executed in a long-running loop, sending and receiving points to and from the manager until the ensemble was complete. -- A ``gest-api`` or "standardized" generator is a class that at a minimum implements ``suggest`` and ``ingest`` methods, and is parameterized by a ``vocs``. 
-- See ``libensemble/generators.py`` for more information about the ``gest-api`` standard. +- A ``gest-api`` or "standardized" generator is a class that inherits from ``gest_api.Generator``, implements ``suggest`` and ``ingest`` methods (which process lists of dictionaries, not NumPy arrays), and is parameterized by a ``vocs``. +- See ``libensemble/gen_classes/external/sampling.py`` for simple examples of the pure ``gest-api`` interface. (Note: ``libensemble.generators.LibensembleGenerator`` exists to wrap legacy NumPy-based workflows, but pure ``gest_api.Generator`` is preferred.) - Generators are often used for simple sampling, optimization, calibration, uncertainty quantification, and other simulation-based tasks. -- **Automatic Variable Mapping**: Subclasses of ``LibensembleGenerator`` (like ``UniformSample``) automatically map all ``VOCS`` variables to a single multi-dimensional ``"x"`` field in the History array if no explicit ``variables_mapping`` is provided. +- **Automatic Variable Mapping**: ``LibensembleGenerator`` subclasses automatically map all ``VOCS`` variables to a single multi-dimensional ``"x"`` field in the History array if no explicit ``variables_mapping`` is provided. Pure ``gest_api.Generator`` classes handle variables natively. - **Mandatory Input Fields**: Even for simple generators that don't ingest data, ``gen_specs["in"]`` or ``gen_specs["persis_in"]`` must be defined if using an allocation function like ``only_persistent_gens`` that attempts to send rows. If these are empty, the manager will raise an ``AssertionError`` stating that no fields were requested to be sent. - **Default Allocator**: ``only_persistent_gens`` is the default allocator for standardized ``gest-api`` generators. It treats these generators as persistent entities that communicate throughout the run. 
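The ``suggest``/``ingest`` shape described in the AGENTS.md notes above can be sketched as a plain Python class. This is an illustrative stand-in only: it does not inherit from the real ``gest_api.Generator`` base class, and the dict-of-bounds ``vocs`` argument here is a simplification of the actual ``VOCS`` object.

```python
import random


class RandomSuggestGenerator:
    """Illustrative generator following the suggest/ingest shape:
    lists of dictionaries in, lists of dictionaries out (no NumPy arrays).

    Hypothetical stand-in for a gest-api generator, for demonstration only.
    """

    def __init__(self, vocs):
        # Simplified "vocs": a plain dict mapping variable names to [lower, upper] bounds
        self.vocs = vocs
        self.history = []

    def suggest(self, num_points):
        # Return candidate points as a list of dicts, one dict per point
        return [
            {name: random.uniform(lo, hi) for name, (lo, hi) in self.vocs.items()}
            for _ in range(num_points)
        ]

    def ingest(self, results):
        # Accept evaluated points (also a list of dicts) back from the manager
        self.history.extend(results)


gen = RandomSuggestGenerator({"x": [-3.0, 3.0], "y": [-2.0, 2.0]})
points = gen.suggest(3)  # three dicts, each with "x" and "y" keys
gen.ingest([{**p, "f": 0.0} for p in points])  # feed back dummy objective values
```

The key property this sketch demonstrates is the data contract: both directions use lists of dictionaries keyed by ``vocs`` names, which is what distinguishes pure gest-api classes from legacy NumPy-based ``LibensembleGenerator`` wrappers.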
@@ -97,3 +97,5 @@ When modernizing existing libEnsemble scripts (functionality tests, regression t - **Remove Explicit `AllocSpecs`**: In libEnsemble 2.0, `only_persistent_gens` is the default allocator. Scripts that previously used `give_sim_work_first` or other simple allocators can often remove `alloc_specs` entirely when switching to standardized generators. - **Generator Placement**: By default, generators run on the manager thread (Worker 0). This means all allocated workers are available for simulation tasks unless `gen_on_worker` is explicitly set to `True` in `libE_specs`. - **Mandatory Fields**: Ensure `gen_specs["in"]` or `gen_specs["persis_in"]` includes at least one field (e.g., `["sim_id"]`) if feedback is sent back to the generator, to satisfy the allocator's requirements. +- **gest-api Simulators**: The gest-api pattern also applies to simulators. Set `SimSpecs.simulator` to a callable with signature `(input_dict: dict, **kwargs) -> dict` instead of providing a `sim_f`. libEnsemble automatically wraps it with `gest_api_sim` from `libensemble.sim_funcs.gest_api_wrapper` and handles all NumPy conversions. `SimSpecs.inputs` and `SimSpecs.outputs` can be derived automatically when `SimSpecs.vocs` is provided. +- **`safe_mode` is opt-in**: `libE_specs["safe_mode"]` defaults to `False`, meaning protected History fields (`gen_worker`, `gen_started_time`, `gen_ended_time`, `sim_worker`, `sim_started`, `sim_started_time`, `sim_ended`, `sim_ended_time`, `gen_informed`, `gen_informed_time`, `kill_sent`) are freely overwritable by default. Set `safe_mode=True` to enable protection. Overwriting these fields without understanding their purpose may crash libEnsemble. diff --git a/README.rst b/README.rst index 5f4935f941..06bc0d94b0 100644 --- a/README.rst +++ b/README.rst @@ -41,39 +41,49 @@ Basic Usage =========== Create an ``Ensemble``, then customize it with general settings, simulation and generator parameters, -and an exit condition. 
Run the following four-worker example via ``python this_file.py``: +and an exit condition. .. code-block:: python import numpy as np + from gest_api.vocs import VOCS from libensemble import Ensemble - from libensemble.gen_funcs.sampling import uniform_random_sample + from libensemble.gen_classes.sampling import UniformSample from libensemble.sim_funcs.six_hump_camel import six_hump_camel from libensemble.specs import ExitCriteria, GenSpecs, LibeSpecs, SimSpecs if __name__ == "__main__": + # Define problem using VOCS + vocs = VOCS( + variables={"x": [-3, 3], "y": [-2, 2]}, + objectives={"f": "MINIMIZE"}, + ) + # General settings libE_specs = LibeSpecs(nworkers=4) + # Simulation parameters sim_specs = SimSpecs( sim_f=six_hump_camel, inputs=["x"], outputs=[("f", float)], ) + # Generator parameters (standardized generator) gen_specs = GenSpecs( - gen_f=uniform_random_sample, + generator=UniformSample(vocs), + inputs=["sim_id"], + persis_in=["x", "f"], outputs=[("x", float, 2)], - user={ - "gen_batch_size": 50, - "lb": np.array([-3, -2]), - "ub": np.array([3, 2]), - }, + vocs=vocs, + user={"gen_batch_size": 50}, ) + # Exit criteria exit_criteria = ExitCriteria(sim_max=100) + # Create and run ensemble sampling = Ensemble( libE_specs=libE_specs, sim_specs=sim_specs, diff --git a/SUPPORT.rst b/SUPPORT.rst index e86b6a1e6a..bf6f68d9cb 100644 --- a/SUPPORT.rst +++ b/SUPPORT.rst @@ -1,6 +1,10 @@ Support ------- +Open issues on Github at: + +* https://github.com/Libensemble/libensemble/issues + Join the libEnsemble mailing list at: * https://lists.mcs.anl.gov/mailman/listinfo/libensemble diff --git a/docs/_static/libE_logo.png b/docs/_static/libE_logo.png new file mode 100755 index 0000000000..17f051faab Binary files /dev/null and b/docs/_static/libE_logo.png differ diff --git a/docs/_static/libE_logo_white.png b/docs/_static/libE_logo_white.png new file mode 100644 index 0000000000..220de8766e Binary files /dev/null and b/docs/_static/libE_logo_white.png differ diff --git 
a/docs/advanced_installation.rst b/docs/advanced_installation.rst deleted file mode 100644 index 060435b564..0000000000 --- a/docs/advanced_installation.rst +++ /dev/null @@ -1,199 +0,0 @@ -Advanced Installation -===================== - -libEnsemble can be installed from ``pip``, ``uv``, ``Conda``, or ``Spack``. - -libEnsemble requires the following dependencies, which are typically -automatically installed alongside libEnsemble: - -* Python_ ``>= 3.11`` -* NumPy_ ``>= 1.21`` -* psutil_ ``>= 5.9.4`` -* `pydantic`_ ``>= 2`` -* pyyaml_ ``>= v6.0`` -* tomli_ ``>= 1.2.1`` -* gest-api_ ``>= 0.1,<0.2`` - -We recommend installing in a virtual environment from ``uv``, ``conda`` or another source. - -Further recommendations for selected HPC systems are given in the -:ref:`HPC platform guides`. - -.. tab-set:: - - .. tab-item:: pip - - To install the latest PyPI_ release:: - - pip install libensemble - - To pip install libEnsemble from the latest develop branch:: - - python -m pip install --upgrade git+https://github.com/Libensemble/libensemble.git@develop - - **Installing with mpi4py** - - If you wish to use ``mpi4py`` with libEnsemble (choosing MPI out of the three - :doc:`communications options`), then this should - be installed to work with the existing MPI on your system. For example, - the following line:: - - pip install mpi4py - - will use the ``mpicc`` compiler wrapper on your PATH to identify the MPI library. - To specify a different compiler wrapper, add the ``MPICC`` option. - You also may wish to avoid existing binary builds; for example,:: - - MPICC=mpiicc pip install mpi4py --no-binary mpi4py - - On Summit, the following line is recommended (with gcc compilers):: - - CC=mpicc MPICC=mpicc pip install mpi4py --no-binary mpi4py - - .. tab-item:: uv - - To install the latest PyPI_ release via uv_:: - - uv pip install libensemble - - .. 
tab-item:: conda - - Install libEnsemble with Conda_ from the conda-forge channel:: - - conda config --add channels conda-forge - conda install -c conda-forge libensemble - - This package comes with some useful optional dependencies, including - optimizers and will install quickly as ready binary packages. - - **Installing with mpi4py with Conda** - - If you wish to use ``mpi4py`` with libEnsemble (choosing MPI out of the three - :doc:`communications options`), you can use the - following. - - .. note:: - For clusters and HPC systems, always install ``mpi4py`` to use the - system MPI library (see pip instructions above). - - For a standalone build that comes with an MPI implementation, you can install - libEnsemble using one of the following variants. - - To install libEnsemble with MPICH_:: - - conda install -c conda-forge libensemble=*=mpi_mpich* - - To install libEnsemble with `Open MPI`_:: - - conda install -c conda-forge libensemble=*=mpi_openmpi* - - The asterisks will pick up the latest version and build. - - .. note:: - This syntax may not work without adjustments on macOS or any non-bash - shell. In these cases, try:: - - conda install -c conda-forge libensemble='*'=mpi_mpich'*' - - For a complete list of builds for libEnsemble on Conda:: - - conda search libensemble --channel conda-forge - - .. tab-item:: Spack - - Install libEnsemble using the Spack_ distribution:: - - spack install py-libensemble - - The above command will install the latest release of libEnsemble with - the required dependencies only. Other optional - dependencies can be specified through variants. The following - line installs libEnsemble version 0.7.2 with some common variants - (e.g., using :doc:`APOSMM<../examples/aposmm>`): - - .. 
code-block:: bash - - spack install py-libensemble @0.7.2 +mpi +scipy +mpmath +petsc4py +nlopt - - The list of variants can be found by running:: - - spack info py-libensemble - - On some platforms you may wish to run libEnsemble without ``mpi4py``, - using a serial PETSc build. This is often preferable if running on - the launch nodes of a three-tier system (e.g., Summit):: - - spack install py-libensemble +scipy +mpmath +petsc4py ^py-petsc4py~mpi ^petsc~mpi~hdf5~hypre~superlu-dist - - The installation will create modules for libEnsemble and the dependent - packages. These can be loaded by running:: - - spack load -r py-libensemble - - Any Python packages will be added to the PYTHONPATH when the modules are loaded. If you do not have - modules on your system you may need to install ``lmod`` (also available in Spack):: - - spack install lmod - . $(spack location -i lmod)/lmod/lmod/init/bash - spack load lmod - - Alternatively, Spack could be used to build the serial ``petsc4py``, and Conda could use this by loading - the ``py-petsc4py`` module thus created. - - **Hint**: When combining Spack and Conda, you can access your Conda Python and packages in your - ``~/.spack/packages.yaml`` while your Conda environment is activated, using ``CONDA_PREFIX`` - For example, if you have an activated Conda environment with Python 3.11 and SciPy installed: - - .. code-block:: yaml - - packages: - python: - externals: - - spec: "python" - prefix: $CONDA_PREFIX - buildable: False - py-numpy: - externals: - - spec: "py-numpy" - prefix: $CONDA_PREFIX/lib/python3.11/site-packages/numpy - buildable: False - py-scipy: - externals: - - spec: "py-scipy" - prefix: $CONDA_PREFIX/lib/python3.11/site-packages/scipy - buildable: True - - For more information on Spack builds and any particular considerations - for specific systems, see the spack_libe_ repository. In particular, this - includes some example ``packages.yaml`` files (which go in ``~/.spack/``). 
- These files are used to specify dependencies that Spack must obtain from - the given system (rather than building from scratch). This may include - ``Python`` and the packages distributed with it (e.g., ``numpy``), and will - often include the system MPI library. - -Optional Dependencies for Additional Features ---------------------------------------------- - -The following packages may be installed separately to enable additional features: - -* pyyaml_ and tomli_ - Parameterize libEnsemble via yaml or toml -* `Globus Compute`_ - Submit simulation or generator function instances to remote Globus Compute endpoints - -.. _conda-forge: https://conda-forge.org/ -.. _Conda: https://docs.conda.io/en/latest/ -.. _gest-api: https://github.com/campa-consortium/gest-api -.. _GitHub: https://github.com/Libensemble/libensemble -.. _Globus Compute: https://www.globus.org/compute -.. _MPICH: https://www.mpich.org/ -.. _NumPy: http://www.numpy.org -.. _Open MPI: https://www.open-mpi.org/ -.. _psutil: https://pypi.org/project/psutil/ -.. _pydantic: https://docs.pydantic.dev/1.10/ -.. _PyPI: https://pypi.org -.. _Python: http://www.python.org -.. _pyyaml: https://pyyaml.org/ -.. _Spack: https://spack.readthedocs.io/en/latest -.. _spack_libe: https://github.com/Libensemble/spack_libe -.. _tomli: https://pypi.org/project/tomli/ -.. _tqdm: https://tqdm.github.io/ -.. _uv: https://docs.astral.sh/uv/ diff --git a/docs/advanced_installation/advanced_installation.rst b/docs/advanced_installation/advanced_installation.rst new file mode 100644 index 0000000000..fc2e8546a4 --- /dev/null +++ b/docs/advanced_installation/advanced_installation.rst @@ -0,0 +1,41 @@ +Advanced Installation +===================== + +`pip `__ \|\| `uv `__ \|\| `pixi `__ \|\| `conda `__ \|\| `Spack `__ + +libEnsemble can be installed from ``pip``, ``uv``, ``pixi``, ``Conda``, or ``Spack``. 
+ +libEnsemble requires the following dependencies, which are typically +automatically installed alongside libEnsemble: + +* Python_ ``>= 3.11`` +* NumPy_ ``>= 1.21`` +* psutil_ ``>= 5.9.4`` +* `pydantic`_ ``>= 2`` +* gest-api_ ``>= 0.1,<0.2`` + +We recommend installing in a virtual environment from ``uv``, ``conda`` or another source. + +Further recommendations for selected HPC systems are given in the +:ref:`HPC platform guides`. + +.. toctree:: + :hidden: + + advanced_installation_pip + advanced_installation_uv + advanced_installation_pixi + advanced_installation_conda + advanced_installation_spack + +Globus Compute +-------------- + +`Globus Compute`_ may be installed optionally to submit simulation function instances to remote Globus Compute endpoints. + +.. _Globus Compute: https://www.globus.org/compute +.. _Python: http://www.python.org +.. _NumPy: http://www.numpy.org +.. _psutil: https://pypi.org/project/psutil/ +.. _pydantic: https://docs.pydantic.dev/1.10/ +.. _gest-api: https://github.com/campa-consortium/gest-api diff --git a/docs/advanced_installation/advanced_installation_conda.rst b/docs/advanced_installation/advanced_installation_conda.rst new file mode 100644 index 0000000000..c34ce25b1a --- /dev/null +++ b/docs/advanced_installation/advanced_installation_conda.rst @@ -0,0 +1,49 @@ +conda +===== + +`Advanced Installation `__ \|\| `pip `__ \|\| `uv `__ \|\| `pixi `__ \|\| **conda** \|\| `Spack `__ + +Install libEnsemble with Conda_ from the conda-forge channel:: + + conda config --add channels conda-forge + conda install -c conda-forge libensemble + +This package comes with some useful optional dependencies, including +optimizers and will install quickly as ready binary packages. + +**Installing with mpi4py with Conda** + +If you wish to use ``mpi4py`` with libEnsemble (choosing MPI out of the three +:doc:`communications options<../running_libE>`), you can use the +following. + +.. 
note:: + For clusters and HPC systems, always install ``mpi4py`` to use the + system MPI library (see pip instructions above). + +For a standalone build that comes with an MPI implementation, you can install +libEnsemble using one of the following variants. + +To install libEnsemble with MPICH_:: + + conda install -c conda-forge libensemble=*=mpi_mpich* + +To install libEnsemble with `Open MPI`_:: + + conda install -c conda-forge libensemble=*=mpi_openmpi* + +The asterisks will pick up the latest version and build. + +.. note:: + This syntax may not work without adjustments on macOS or any non-bash + shell. In these cases, try:: + + conda install -c conda-forge libensemble='*'=mpi_mpich'*' + +For a complete list of builds for libEnsemble on Conda:: + + conda search libensemble --channel conda-forge + +.. _Conda: https://docs.conda.io/en/latest/ +.. _MPICH: https://www.mpich.org/ +.. _Open MPI: https://www.open-mpi.org/ diff --git a/docs/advanced_installation/advanced_installation_pip.rst b/docs/advanced_installation/advanced_installation_pip.rst new file mode 100644 index 0000000000..9416765b1c --- /dev/null +++ b/docs/advanced_installation/advanced_installation_pip.rst @@ -0,0 +1,29 @@ +pip +=== + +`Advanced Installation `__ \|\| **pip** \|\| `uv `__ \|\| `pixi `__ \|\| `conda `__ \|\| `Spack `__ + +To install the latest PyPI_ release:: + + pip install libensemble + +To pip install libEnsemble from the latest develop branch:: + + python -m pip install --upgrade git+https://github.com/Libensemble/libensemble.git@develop + +**Installing with mpi4py** + +If you wish to use ``mpi4py`` with libEnsemble (choosing MPI out of the three +:doc:`communications options<../running_libE>`), then this should +be installed to work with the existing MPI on your system. For example, +the following line:: + + pip install mpi4py + +will use the ``mpicc`` compiler wrapper on your PATH to identify the MPI library. +To specify a different compiler wrapper, add the ``MPICC`` option. 
+You also may wish to avoid existing binary builds; for example,:: + + MPICC=mpiicc pip install mpi4py --no-binary mpi4py + +.. _PyPI: https://pypi.org diff --git a/docs/advanced_installation/advanced_installation_pixi.rst b/docs/advanced_installation/advanced_installation_pixi.rst new file mode 100644 index 0000000000..8227fcbd87 --- /dev/null +++ b/docs/advanced_installation/advanced_installation_pixi.rst @@ -0,0 +1,20 @@ +pixi +==== + +`Advanced Installation `__ \|\| `pip `__ \|\| `uv `__ \|\| **pixi** \|\| `conda `__ \|\| `Spack `__ + +Add to your pixi_ environment:: + + pixi add libensemble + +libEnsemble is also distributed with locked pixi environments for different versions of Python +and various dependency sets, primarily for testing but also useful for guaranteed working environments. +See a list with:: + + pixi workspace environment list + +and activate with:: + + pixi shell -e + +.. _pixi: https://pixi.prefix.dev/latest/ diff --git a/docs/advanced_installation/advanced_installation_spack.rst b/docs/advanced_installation/advanced_installation_spack.rst new file mode 100644 index 0000000000..3e9b1132e3 --- /dev/null +++ b/docs/advanced_installation/advanced_installation_spack.rst @@ -0,0 +1,77 @@ +Spack +===== + +`Advanced Installation `__ \|\| `pip `__ \|\| `uv `__ \|\| `pixi `__ \|\| `conda `__ \|\| **Spack** + +Install libEnsemble using the Spack_ distribution:: + + spack install py-libensemble + +The above command will install the latest release of libEnsemble with +the required dependencies only. Other optional +dependencies can be specified through variants. The following +line installs libEnsemble version 1.5.0 with some common variants +(e.g., using :doc:`APOSMM<../examples/gest_api/aposmm>`): + +.. 
code-block:: bash + + spack install py-libensemble @1.5.0 +mpi +scipy +mpmath +petsc4py +nlopt + +The list of variants can be found by running:: + + spack info py-libensemble + +On some platforms you may wish to run libEnsemble without ``mpi4py``, +using a serial PETSc build. This is often preferable if running on +the launch nodes of a three-tier system:: + + spack install py-libensemble +scipy +mpmath +petsc4py ^py-petsc4py~mpi ^petsc~mpi~hdf5~hypre~superlu-dist + +The installation will create modules for libEnsemble and the dependent +packages. These can be loaded by running:: + + spack load -r py-libensemble + +Any Python packages will be added to the PYTHONPATH when the modules are loaded. If you do not have +modules on your system you may need to install ``lmod`` (also available in Spack):: + + spack install lmod + . $(spack location -i lmod)/lmod/lmod/init/bash + spack load lmod + +Alternatively, Spack could be used to build the serial ``petsc4py``, and Conda could use this by loading +the ``py-petsc4py`` module thus created. + +**Hint**: When combining Spack and Conda, you can access your Conda Python and packages in your +``~/.spack/packages.yaml`` while your Conda environment is activated, using ``CONDA_PREFIX`` +For example, if you have an activated Conda environment with Python 3.11 and SciPy installed: + +.. code-block:: yaml + + packages: + python: + externals: + - spec: "python" + prefix: $CONDA_PREFIX + buildable: False + py-numpy: + externals: + - spec: "py-numpy" + prefix: $CONDA_PREFIX/lib/python3.11/site-packages/numpy + buildable: False + py-scipy: + externals: + - spec: "py-scipy" + prefix: $CONDA_PREFIX/lib/python3.11/site-packages/scipy + buildable: True + +For more information on Spack builds and any particular considerations +for specific systems, see the spack_libe_ repository. In particular, this +includes some example ``packages.yaml`` files (which go in ``~/.spack/``). 
+These files are used to specify dependencies that Spack must obtain from +the given system (rather than building from scratch). This may include +``Python`` and the packages distributed with it (e.g., ``numpy``), and will +often include the system MPI library. + +.. _Spack: https://spack.readthedocs.io/en/latest +.. _spack_libe: https://github.com/Libensemble/spack_libe diff --git a/docs/advanced_installation/advanced_installation_uv.rst b/docs/advanced_installation/advanced_installation_uv.rst new file mode 100644 index 0000000000..b10b64bfa5 --- /dev/null +++ b/docs/advanced_installation/advanced_installation_uv.rst @@ -0,0 +1,11 @@ +uv +== + +`Advanced Installation `__ \|\| `pip `__ \|\| **uv** \|\| `pixi `__ \|\| `conda `__ \|\| `Spack `__ + +To install the latest PyPI_ release via uv_:: + + uv pip install libensemble + +.. _PyPI: https://pypi.org +.. _uv: https://docs.astral.sh/uv/ diff --git a/docs/conf.py b/docs/conf.py index 64349a4fe0..ab82bc4292 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -59,11 +59,6 @@ class AxParameterWarning(Warning): # Ensure it's a real warning subclass sys.modules["ax.exceptions.core"] = MagicMock() sys.modules["ax.exceptions.core"].AxParameterWarning = AxParameterWarning -# from libensemble import * -# from libensemble.alloc_funcs import * -# from libensemble.gen_funcs import * -# from libensemble.sim_funcs import * - # sys.path.insert(0, os.path.abspath('.')) sys.path.append(os.path.abspath("../libensemble")) @@ -112,13 +107,7 @@ class AxParameterWarning(Warning): # Ensure it's a real warning subclass bibtex_bibfiles = ["references.bib"] bibtex_default_style = "unsrt" -# autosectionlabel_prefix_document = True -# extensions = ['sphinx.ext.autodoc', 'sphinx.ext.napoleon', 'sphinx.ext.imgconverter'] -# breathe_projects = { "libEnsemble": "../code/src/xml/" } -# breathe_default_project = "libEnsemble" -##breathe_projects_source = {"libEnsemble" : ( "../code/src/", ["libE.py", "test.cpp"] )} -# breathe_projects_source = 
{"libEnsemble" : ( "../code/src/", ["test.cpp","test2.cpp"] )} autodoc_member_order = "bysource" model_show_field_summary = "bysource" @@ -185,6 +174,7 @@ class AxParameterWarning(Warning): # Ensure it's a real warning subclass # The name of the Pygments (syntax highlighting) style to use. pygments_style = "sphinx" +pygments_dark_style = "monokai" # If true, `todo` and `todoList` produce output, else they produce nothing. todo_include_todos = False @@ -210,9 +200,9 @@ class AxParameterWarning(Warning): # Ensure it's a real warning subclass # html_theme = 'sphinxdoc' # html_theme = "sphinx_book_theme" -html_theme = "sphinx_rtd_theme" +html_theme = "furo" -html_logo = "./images/libE_logo_white.png" +# html_logo = "./images/libE_logo_white.png" html_favicon = "./images/libE_logo_circle.png" html_title = "libEnsemble" @@ -221,7 +211,12 @@ class AxParameterWarning(Warning): # Ensure it's a real warning subclass # documentation. # html_theme_options = { - "logo_only": True, + "announcement": "libEnsemble v2.0 is released, with many new features and changes.", + "source_repository": "https://github.com/Libensemble/libensemble/", + "source_branch": "main", + "source_directory": "docs/", + "light_logo": "libE_logo.png", + "dark_logo": "libE_logo_white.png", } # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, @@ -240,22 +235,6 @@ def setup(app): app.connect("autodoc-process-docstring", remove_noqa) -# Custom sidebar templates, must be a dictionary that maps document names -# to template names. 
-# -# This is required for the alabaster theme -# refs: http://alabaster.readthedocs.io/en/latest/installation.html#sidebars -# html_sidebars = { -# '**': [ -# 'about.html', -# 'navigation.html', -# 'relations.html', # needs 'show_related': True theme option to display -# 'searchbox.html', -# 'donate.html', -# ] -# } - - # -- Options for HTMLHelp output ------------------------------------------ # Output file base name for HTML help builder. diff --git a/docs/data_structures/alloc_specs.rst b/docs/data_structures/alloc_specs.rst index 159b9eacaf..f29c0e5a2d 100644 --- a/docs/data_structures/alloc_specs.rst +++ b/docs/data_structures/alloc_specs.rst @@ -19,7 +19,7 @@ Can be constructed and passed to libEnsemble as a Python class or a dictionary. * libEnsemble uses the following defaults if the user doesn't provide their own ``alloc_specs``: .. literalinclude:: ../../libensemble/specs.py - :start-at: alloc_f: Callable = start_only_persistent + :start-at: alloc_f: object = only_persistent_gens :end-before: end_alloc_tag :caption: Default settings for alloc_specs @@ -31,14 +31,4 @@ Can be constructed and passed to libEnsemble as a Python class or a dictionary. my_new_alloc = AllocSpecs() my_new_alloc.alloc_f = another_function -.. seealso:: - - `test_uniform_sampling_one_residual_at_a_time.py`_ specifies fields - to be used by the allocation function ``give_sim_work_first`` from - fast_alloc_and_pausing.py_. - - .. literalinclude:: ../../libensemble/tests/functionality_tests/test_uniform_sampling_one_residual_at_a_time.py - :start-at: alloc_specs - :end-before: end_alloc_specs_rst_tag - -.. _fast_alloc_and_pausing.py: https://github.com/Libensemble/libensemble/blob/develop/libensemble/alloc_funcs/fast_alloc_and_pausing.py .. 
_test_uniform_sampling_one_residual_at_a_time.py: https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_uniform_sampling_one_residual_at_a_time.py diff --git a/docs/data_structures/data_structures.rst b/docs/data_structures/data_structures.rst index 35a5ba0158..a5a71862e8 100644 --- a/docs/data_structures/data_structures.rst +++ b/docs/data_structures/data_structures.rst @@ -8,10 +8,10 @@ See :ref:`here` for instruction on constructing a complete workflow :maxdepth: 2 :caption: libEnsemble Specifications: - libE_specs + libE_specs/libE_specs gen_specs sim_specs + exit_criteria alloc_specs platform_specs persis_info - exit_criteria diff --git a/docs/data_structures/gen_specs.rst b/docs/data_structures/gen_specs.rst index b3364e53f7..e95950731e 100644 --- a/docs/data_structures/gen_specs.rst +++ b/docs/data_structures/gen_specs.rst @@ -5,16 +5,37 @@ Generator Specs Used to specify the generator, its inputs and outputs, and user data. +Standardized (gest-api) +----------------------- + .. code-block:: python :linenos: + from libensemble import GenSpecs + from libensemble.gen_classes import UniformSample + from gest_api.vocs import VOCS + + vocs = VOCS( + variables={"x": [-3.0, 3.0]}, + objectives={"y": "MINIMIZE"}, + ) + + gen_specs = GenSpecs( + generator=UniformSample(vocs), + vocs=vocs, + ) ... + +Classic (gen_f) +--------------- + +.. code-block:: python + :linenos: + import numpy as np from libensemble import GenSpecs from generator import gen_random_sample - ... - gen_specs = GenSpecs( gen_f=gen_random_sample, outputs=[("x", float, (1,))], diff --git a/docs/data_structures/libE_specs.rst b/docs/data_structures/libE_specs.rst deleted file mode 100644 index 7afe0c5bb5..0000000000 --- a/docs/data_structures/libE_specs.rst +++ /dev/null @@ -1,361 +0,0 @@ -.. _datastruct-libe-specs: - -LibE Specs -========== - -libEnsemble is primarily customized by setting options within a ``LibeSpecs`` instance. - -.. 
code-block:: python - - from libensemble.specs import LibeSpecs - - specs = LibeSpecs(save_every_k_gens=100, sim_dirs_make=True, nworkers=4) - -.. dropdown:: Settings by Category - :open: - - .. tab-set:: - - .. tab-item:: General - - **comms** [str] = ``"mpi"``: - Manager/Worker communications mode: ``'mpi'``, ``'local'``, or ``'tcp'``. - If ``nworkers`` is specified, then ``local`` comms will be used unless a - parallel MPI environment is detected. - - **nworkers** [int]: - Number of worker processes in ``"local"``, ``"threads"``, or ``"tcp"``. - - **gen_on_worker** [bool] = False - Instructs Worker process to run generator instead of Manager. - - **mpi_comm** [MPI communicator] = ``MPI.COMM_WORLD``: - libEnsemble MPI communicator. - - **dry_run** [bool] = ``False``: - Whether libEnsemble should immediately exit after validating all inputs. - - **abort_on_exception** [bool] = ``True``: - In MPI mode, whether to call ``MPI_ABORT`` on an exception. - If ``False``, an exception will be raised by the manager. - - **worker_timeout** [int] = ``1``: - On libEnsemble shutdown, number of seconds after which workers considered timed out, - then terminated. - - **kill_canceled_sims** [bool] = ``False``: - Try to kill sims with ``cancel_requested`` set to ``True``. - If ``False``, the manager avoids this moderate overhead. - - **disable_log_files** [bool] = ``False``: - Disable ``ensemble.log`` and ``libE_stats.txt`` log files. - - **gen_workers** [list of ints]: - List of workers that should run only generators. All other workers will run - only simulator functions. - - .. tab-item:: Directories - - .. tab-set:: - - .. tab-item:: General - - **use_workflow_dir** [bool] = ``False``: - Whether to place *all* log files, dumped arrays, and default ensemble-directories in a - separate ``workflow`` directory. Each run is suffixed with a hash. - If copying back an ensemble directory from another location, the copy is placed here. 
- - **workflow_dir_path** [str]: - Optional path to the workflow directory. - - **ensemble_dir_path** [str] = ``"./ensemble"``: - Path to main ensemble directory. Can serve - as single working directory for workers, or contain calculation directories. - - .. code-block:: python - - LibeSpecs.ensemble_dir_path = "/scratch/my_ensemble" - - **ensemble_copy_back** [bool] = ``False``: - Whether to copy back contents of ``ensemble_dir_path`` to launch - location. Useful if ``ensemble_dir_path`` is located on node-local storage. - - **reuse_output_dir** [bool] = ``False``: - Whether to allow overwrites and access to previous ensemble and workflow directories in subsequent runs. - ``False`` by default to protect results. - - **calc_dir_id_width** [int] = ``4``: - The width of the numerical ID component of a calculation directory name. Leading - zeros are padded to the sim/gen ID. - - **use_worker_dirs** [bool] = ``False``: - Whether to organize calculation directories under worker-specific directories: - - .. tab-set:: - - .. tab-item:: False - - .. code-block:: - - - /ensemble_dir - - /sim0000 - - /gen0001 - - /sim0001 - ... - - .. tab-item:: True - - .. code-block:: - - - /ensemble_dir - - /worker1 - - /sim0000 - - /gen0001 - - /sim0004 - ... - - /worker2 - ... - - .. tab-item:: Sims - - **sim_dirs_make** [bool] = ``False``: - Whether to make calculation directories for each simulation function call. - - **sim_dir_copy_files** [list]: - Paths to files or directories to copy into each sim directory, or ensemble directory. - List of strings or ``pathlib.Path`` objects. - - **sim_dir_symlink_files** [list]: - Paths to files or directories to symlink into each sim directory, or ensemble directory. - List of strings or ``pathlib.Path`` objects. - - **sim_input_dir** [str]: - Copy this directory's contents into the working directory upon calling the simulation function. - Forms the base of a simulation directory. - - .. 
tab-item:: Gens - - **gen_dirs_make** [bool] = ``False``: - Whether to make generator-specific calculation directories for each generator function call. - *Each persistent generator creates a single directory*. - - **gen_dir_copy_files** [list]: - Paths to copy into the working directory upon calling the generator function. - List of strings or ``pathlib.Path`` objects - - **gen_dir_symlink_files** [list]: - Paths to files or directories to symlink into each gen directory. - List of strings or ``pathlib.Path`` objects - - **gen_input_dir** [str]: - Copy this directory's contents into the working directory upon calling the generator function. - Forms the base of a generator directory. - - .. tab-item:: Profiling - - **profile** [bool] = ``False``: - Profile manager and worker logic using ``cProfile``. - - **safe_mode** [bool] = ``True``: - Prevents user functions from overwriting internal fields, but requires moderate overhead. - - **stats_fmt** [dict]: - A dictionary of options for formatting ``"libE_stats.txt"``. - See "Formatting Options for libE_stats.txt". - - **live_data** [LiveData] = None: - Add a live data capture object (e.g., for plotting). - - .. tab-item:: TCP - - **workers** [list]: - TCP Only: A list of worker hostnames. - - **ip** [str]: - TCP Only: IP address for Manager's system. - - **port** [int]: - TCP Only: Port number for Manager's system. - - **authkey** [str]: - TCP Only: Authkey for Manager's system. - - **workerID** [int]: - TCP Only: Worker ID number assigned to the new process. - - **worker_cmd** [list]: - TCP Only: Split string corresponding to worker/client Python process invocation. Contains - a local Python path, calling script, and manager/server format-fields for ``manager_ip``, - ``manager_port``, ``authkey``, and ``workerID``. ``nworkers`` is specified normally. - - .. tab-item:: History - - **save_every_k_sims** [int]: - Save history array to file after every k simulated points. 
- - **save_every_k_gens** [int]: - Save history array to file after every k generated points. - - **save_H_and_persis_on_abort** [bool] = ``True``: - Save states of ``H`` and ``persis_info`` to file on aborting after an exception. - - **save_H_on_completion** bool | None = ``False`` - Save state of ``H`` to file upon completing a workflow. Also enabled when either ``save_every_k_sims`` - or ``save_every_k_gens`` is set. - - **save_H_with_date** bool | None = ``False`` - Save ``H`` filename contains date and timestamp. - - **H_file_prefix** str | None = ``"libE_history"`` - Prefix for ``H`` filename. - - **final_gen_send** [bool] = ``False``: - Send final simulation results to persistent generators before shutdown. - The results will be sent along with the ``PERSIS_STOP`` tag. - - .. tab-item:: Resources - - **disable_resource_manager** [bool] = ``False``: - Disable the built-in resource manager, including automatic resource detection - and/or assignment of resources to workers. ``"resource_info"`` will be ignored. - - **platform** [str]: - Name of a :ref:`known platform`, e.g., ``LibeSpecs.platform = "perlmutter_g"`` - Alternatively set the ``LIBE_PLATFORM`` environment variable. - - **platform_specs** [Platform|dict]: - A ``Platform`` object (or dictionary) specifying :ref:`settings for a platform.`. - Fields not provided will be auto-detected. Can be set to a :ref:`known platform object`. - - **num_resource_sets** [int]: - The total number of resource sets into which resources will be divided. - By default resources will be divided by workers (excluding - ``zero_resource_workers``). - - **gen_num_procs** [int] = ``0``: - The default number of processors (MPI ranks) required by generators. Unless - overridden by equivalent ``persis_info`` settings, generators will be allocated - this many processors for applications launched via the MPIExecutor. - - **gen_num_gpus** [int] = ``0``: - The default number of GPUs required by generators. 
Unless overridden by - the equivalent ``persis_info`` settings, generators will be allocated this - many GPUs. - - **gpus_per_group** [int]: - Number of GPUs for each group in the scheduler. This can be used when - running on nodes with different numbers of GPUs. In effect a - block of this many GPUs will be treated as a virtual node. - By default the GPUs on each node are treated as a group. - - **use_tiles_as_gpus** [bool] = ``False``: - If ``True`` then treat a GPU tile as one GPU, assuming - ``tiles_per_GPU`` is provided in ``platform_specs`` or detected. - - **enforce_worker_core_bounds** [bool] = ``False``: - Permit submission of tasks with a - higher processor count than the CPUs available to the worker. - Larger node counts are not allowed. Ignored when - ``disable_resource_manager`` is set. - - **dedicated_mode** [bool] = ``False``: - Instructs libEnsemble’s MPI executor not to run applications on nodes where - libEnsemble processes (manager and workers) are running. - - **zero_resource_workers** [list of ints]: - List of workers (by IDs) that require no resources. For when a fixed mapping of workers - to resources is required. Otherwise, use ``num_resource_sets``. - For use with supported allocation functions. - - **resource_info** [dict]: - Provide resource information that will override automatically detected resources. - The allowable fields are given below in "Overriding Resource Auto-Detection" - Ignored if ``disable_resource_manager`` is set. - - **scheduler_opts** [dict]: - Options for the resource scheduler. - See "Scheduler Options" for more options. - -.. dropdown:: Complete Class API - - .. 
autopydantic_model:: libensemble.specs.LibeSpecs - :model-show-json: False - :model-show-config-member: False - :model-show-config-summary: False - :model-show-validator-members: False - :model-show-validator-summary: False - :field-list-validators: False - :model-show-field-summary: False - -Scheduler Options ------------------ - -See options for :ref:`built-in scheduler`. - -.. _resource_info: - -Overriding Resource Auto-Detection ----------------------------------- - -Note that ``"cores_on_node"`` and ``"gpus_on_node"`` are supported for backward -compatibility, but use of :ref:`Platform specification` is -recommended for these settings. - -.. dropdown:: Resource Info Fields - - The allowable ``libE_specs["resource_info"]`` fields are:: - - "cores_on_node" [tuple (int, int)]: - Tuple (physical cores, logical cores) on nodes. - - "gpus_on_node" [int]: - Number of GPUs on each node. - - "node_file" [str]: - Name of file containing a node-list. Default is "node_list". - - "nodelist_env_slurm" [str]: - The environment variable giving a node list in Slurm format - (Default: Uses ``SLURM_NODELIST``). Queried only if - a ``node_list`` file is not provided and the resource manager is - enabled. - - "nodelist_env_cobalt" [str]: - The environment variable giving a node list in Cobalt format - (Default: Uses ``COBALT_PARTNAME``) Queried only - if a ``node_list`` file is not provided and the resource manager - is enabled. - - "nodelist_env_lsf" [str]: - The environment variable giving a node list in LSF format - (Default: Uses ``LSB_HOSTS``) Queried only - if a ``node_list`` file is not provided and the resource manager - is enabled. - - "nodelist_env_lsf_shortform" [str]: - The environment variable giving a node list in LSF short-form - format (Default: Uses ``LSB_MCPU_HOSTS``) Queried only - if a ``node_list`` file is not provided and the resource manager is - enabled. 
- - For example:: - - customizer = {cores_on_node": (16, 64), - "node_file": "libe_nodes"} - - libE_specs["resource_info"] = customizer - -Formatting Options for libE_stats File --------------------------------------- - -The allowable ``libE_specs["stats_fmt"]`` fields are:: - - "task_timing" [bool] = ``False``: - Outputs elapsed time for each task launched by the executor. - - "task_datetime" [bool] = ``False``: - Outputs the elapsed time and start and end time for each task launched by the executor. - Can be used with the ``"plot_libe_tasks_util_v_time.py"`` to give task utilization plots. - - "show_resource_sets" [bool] = ``False``: - Shows the resource set IDs assigned to each worker for each call of the user function. diff --git a/docs/data_structures/libE_specs/libE_specs.rst b/docs/data_structures/libE_specs/libE_specs.rst new file mode 100644 index 0000000000..a219109851 --- /dev/null +++ b/docs/data_structures/libE_specs/libE_specs.rst @@ -0,0 +1,108 @@ +.. _datastruct-libe-specs: + +**Introduction** \|\| `General `__ \|\| `Directories `__ \|\| `Profiling `__ \|\| `TCP `__ \|\| `History `__ \|\| `Resources `__ + +LibE Specs +========== + +libEnsemble is primarily customized by setting options within a ``LibeSpecs`` instance. + +.. code-block:: python + + from libensemble.specs import LibeSpecs + + specs = LibeSpecs(save_every_k_gens=100, sim_dirs_make=True, nworkers=4) + +.. toctree:: + :hidden: + + libE_specs_general + libE_specs_directories + libE_specs_profiling + libE_specs_tcp + libE_specs_history + libE_specs_resources + +.. dropdown:: Complete Class API + + .. autopydantic_model:: libensemble.specs.LibeSpecs + :model-show-json: False + :model-show-config-member: False + :model-show-config-summary: False + :model-show-validator-members: False + :model-show-validator-summary: False + :field-list-validators: False + :model-show-field-summary: False + +Scheduler Options +----------------- + +See options for :ref:`built-in scheduler`. + +.. 
_resource_info: + +Overriding Resource Auto-Detection +---------------------------------- + +Note that ``"cores_on_node"`` and ``"gpus_on_node"`` are supported for backward +compatibility, but use of :ref:`Platform specification` is +recommended for these settings. + +.. dropdown:: Resource Info Fields + + The allowable ``libE_specs["resource_info"]`` fields are:: + + "cores_on_node" [tuple (int, int)]: + Tuple (physical cores, logical cores) on nodes. + + "gpus_on_node" [int]: + Number of GPUs on each node. + + "node_file" [str]: + Name of file containing a node-list. Default is "node_list". + + "nodelist_env_slurm" [str]: + The environment variable giving a node list in Slurm format + (Default: Uses ``SLURM_NODELIST``). Queried only if + a ``node_list`` file is not provided and the resource manager is + enabled. + + "nodelist_env_cobalt" [str]: + The environment variable giving a node list in Cobalt format + (Default: Uses ``COBALT_PARTNAME``). Queried only + if a ``node_list`` file is not provided and the resource manager + is enabled. + + "nodelist_env_lsf" [str]: + The environment variable giving a node list in LSF format + (Default: Uses ``LSB_HOSTS``). Queried only + if a ``node_list`` file is not provided and the resource manager + is enabled. + + "nodelist_env_lsf_shortform" [str]: + The environment variable giving a node list in LSF short-form + format (Default: Uses ``LSB_MCPU_HOSTS``). Queried only + if a ``node_list`` file is not provided and the resource manager is + enabled. + + For example:: + + customizer = {"cores_on_node": (16, 64), + "node_file": "libe_nodes"} + + libE_specs["resource_info"] = customizer + +Formatting Options for libE_stats File +-------------------------------------- + +The allowable ``libE_specs["stats_fmt"]`` fields are:: + + "task_timing" [bool] = ``False``: + Outputs elapsed time for each task launched by the executor.
+ + "task_datetime" [bool] = ``False``: + Outputs the elapsed time and start and end time for each task launched by the executor. + Can be used with the ``"plot_libe_tasks_util_v_time.py"`` to give task utilization plots. + + "show_resource_sets" [bool] = ``False``: + Shows the resource set IDs assigned to each worker for each call of the user function. diff --git a/docs/data_structures/libE_specs/libE_specs_directories.rst b/docs/data_structures/libE_specs/libE_specs_directories.rst new file mode 100644 index 0000000000..76c848da05 --- /dev/null +++ b/docs/data_structures/libE_specs/libE_specs_directories.rst @@ -0,0 +1,99 @@ +Directories +=========== + +`Introduction `__ \|\| `General `__ \|\| **Directories** \|\| `Profiling `__ \|\| `TCP `__ \|\| `History `__ \|\| `Resources `__ + +.. tab-set:: + + .. tab-item:: General + + **use_workflow_dir** [bool] = ``False``: + Whether to place *all* log files, dumped arrays, and default ensemble-directories in a + separate ``workflow`` directory. Each run is suffixed with a hash. + If copying back an ensemble directory from another location, the copy is placed here. + + **workflow_dir_path** [str]: + Optional path to the workflow directory. + + **ensemble_dir_path** [str] = ``"./ensemble"``: + Path to main ensemble directory. Can serve + as single working directory for workers, or contain calculation directories. + + .. code-block:: python + + LibeSpecs.ensemble_dir_path = "/scratch/my_ensemble" + + **ensemble_copy_back** [bool] = ``False``: + Whether to copy back contents of ``ensemble_dir_path`` to launch + location. Useful if ``ensemble_dir_path`` is located on node-local storage. + + **reuse_output_dir** [bool] = ``False``: + Whether to allow overwrites and access to previous ensemble and workflow directories in subsequent runs. + ``False`` by default to protect results. + + **calc_dir_id_width** [int] = ``4``: + The width of the numerical ID component of a calculation directory name. 
Leading + zeros are padded to the sim/gen ID. + + **use_worker_dirs** [bool] = ``False``: + Whether to organize calculation directories under worker-specific directories: + + .. tab-set:: + + .. tab-item:: False + + .. code-block:: + + - /ensemble_dir + - /sim0000 + - /gen0001 + - /sim0001 + ... + + .. tab-item:: True + + .. code-block:: + + - /ensemble_dir + - /worker1 + - /sim0000 + - /gen0001 + - /sim0004 + ... + - /worker2 + ... + + .. tab-item:: Sims + + **sim_dirs_make** [bool] = ``False``: + Whether to make calculation directories for each simulation function call. + + **sim_dir_copy_files** [list]: + Paths to files or directories to copy into each sim directory, or ensemble directory. + List of strings or ``pathlib.Path`` objects. + + **sim_dir_symlink_files** [list]: + Paths to files or directories to symlink into each sim directory, or ensemble directory. + List of strings or ``pathlib.Path`` objects. + + **sim_input_dir** [str]: + Copy this directory's contents into the working directory upon calling the simulation function. + Forms the base of a simulation directory. + + .. tab-item:: Gens + + **gen_dirs_make** [bool] = ``False``: + Whether to make generator-specific calculation directories for each generator function call. + *Each persistent generator creates a single directory*. + + **gen_dir_copy_files** [list]: + Paths to copy into the working directory upon calling the generator function. + List of strings or ``pathlib.Path`` objects + + **gen_dir_symlink_files** [list]: + Paths to files or directories to symlink into each gen directory. + List of strings or ``pathlib.Path`` objects + + **gen_input_dir** [str]: + Copy this directory's contents into the working directory upon calling the generator function. + Forms the base of a generator directory. 
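To make the directory-naming options above concrete, here is a small sketch of how ``calc_dir_id_width`` zero-pads the sim/gen ID in calculation-directory names such as ``sim0004``. The helper name ``calc_dir_name`` is hypothetical — only the padding behavior is taken from the description above.

```python
# Sketch of calculation-directory naming; "calc_dir_name" is an assumed
# helper, not a libEnsemble internal. calc_dir_id_width controls how many
# leading zeros pad the sim/gen ID (default width is 4).
def calc_dir_name(kind: str, calc_id: int, calc_dir_id_width: int = 4) -> str:
    """Return a padded calculation-directory name, e.g. 'sim0004'."""
    return f"{kind}{calc_id:0{calc_dir_id_width}d}"

print(calc_dir_name("sim", 4))     # -> sim0004
print(calc_dir_name("gen", 1, 6))  # -> gen000001
```

With ``use_worker_dirs=True``, these names would simply be nested under a per-worker directory (e.g. ``worker1/sim0004``).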
diff --git a/docs/data_structures/libE_specs/libE_specs_general.rst b/docs/data_structures/libE_specs/libE_specs_general.rst new file mode 100644 index 0000000000..f7f07f75fa --- /dev/null +++ b/docs/data_structures/libE_specs/libE_specs_general.rst @@ -0,0 +1,40 @@ +General +======= + +`Introduction `__ \|\| **General** \|\| `Directories `__ \|\| `Profiling `__ \|\| `TCP `__ \|\| `History `__ \|\| `Resources `__ + +**comms** [str] = ``"mpi"``: + Manager/Worker communications mode: ``'mpi'``, ``'local'``, ``'threads'``, or ``'tcp'``. + If ``nworkers`` is specified, then ``local`` comms will be used unless a + parallel MPI environment is detected. + +**nworkers** [int]: + Number of worker processes in ``"local"``, ``"threads"``, or ``"tcp"``. + +**gen_on_worker** [bool] = ``False``: + Instructs a Worker process to run the generator instead of the Manager. + +**mpi_comm** [MPI communicator] = ``MPI.COMM_WORLD``: + libEnsemble MPI communicator. + +**dry_run** [bool] = ``False``: + Whether libEnsemble should immediately exit after validating all inputs. + +**abort_on_exception** [bool] = ``True``: + In MPI mode, whether to call ``MPI_ABORT`` on an exception. + If ``False``, an exception will be raised by the manager. + +**worker_timeout** [int] = ``1``: + On libEnsemble shutdown, number of seconds after which workers are considered timed out, + then terminated. + +**kill_canceled_sims** [bool] = ``False``: + Try to kill sims with ``cancel_requested`` set to ``True``. + If ``False``, the manager avoids this moderate overhead. + +**disable_log_files** [bool] = ``False``: + Disable ``ensemble.log`` and ``libE_stats.txt`` log files. + +**gen_workers** [list of ints]: + List of workers that should run only generators. All other workers will run + only simulator functions.
diff --git a/docs/data_structures/libE_specs/libE_specs_history.rst b/docs/data_structures/libE_specs/libE_specs_history.rst new file mode 100644 index 0000000000..55e9089696 --- /dev/null +++ b/docs/data_structures/libE_specs/libE_specs_history.rst @@ -0,0 +1,27 @@ +History +======= + +`Introduction `__ \|\| `General `__ \|\| `Directories `__ \|\| `Profiling `__ \|\| `TCP `__ \|\| **History** \|\| `Resources `__ + +**save_every_k_sims** [int]: + Save history array to file after every k simulated points. + +**save_every_k_gens** [int]: + Save history array to file after every k generated points. + +**save_H_and_persis_on_abort** [bool] = ``True``: + Save states of ``H`` and ``persis_info`` to file on aborting after an exception. + +**save_H_on_completion** [bool] = ``False``: + Save state of ``H`` to file upon completing a workflow. Also enabled when either ``save_every_k_sims`` + or ``save_every_k_gens`` is set. + +**save_H_with_date** [bool] = ``False``: + ``H`` filename contains date and timestamp. + +**H_file_prefix** [str] = ``"libE_history"``: + Prefix for ``H`` filename. + +**final_gen_send** [bool] = ``False``: + Send final simulation results to persistent generators before shutdown. + The results will be sent along with the ``PERSIS_STOP`` tag. diff --git a/docs/data_structures/libE_specs/libE_specs_profiling.rst b/docs/data_structures/libE_specs/libE_specs_profiling.rst new file mode 100644 index 0000000000..6a855c8ce6 --- /dev/null +++ b/docs/data_structures/libE_specs/libE_specs_profiling.rst @@ -0,0 +1,17 @@ +Profiling +========= + +`Introduction `__ \|\| `General `__ \|\| `Directories `__ \|\| **Profiling** \|\| `TCP `__ \|\| `History `__ \|\| `Resources `__ + +**profile** [bool] = ``False``: + Profile manager and worker logic using ``cProfile``. + +**safe_mode** [bool] = ``False``: + Prevents user functions from overwriting protected History fields, but requires moderate overhead. 
+ +**stats_fmt** [dict]: + A dictionary of options for formatting ``"libE_stats.txt"``. + See "Formatting Options for libE_stats.txt". + +**live_data** [LiveData] = None: + Add a live data capture object (e.g., for plotting). diff --git a/docs/data_structures/libE_specs/libE_specs_resources.rst b/docs/data_structures/libE_specs/libE_specs_resources.rst new file mode 100644 index 0000000000..6b6118d663 --- /dev/null +++ b/docs/data_structures/libE_specs/libE_specs_resources.rst @@ -0,0 +1,60 @@ +Resources +========= + +`Introduction `__ \|\| `General `__ \|\| `Directories `__ \|\| `Profiling `__ \|\| `TCP `__ \|\| `History `__ \|\| **Resources** + +**disable_resource_manager** [bool] = ``False``: + Disable the built-in resource manager, including automatic resource detection + and/or assignment of resources to workers. ``"resource_info"`` will be ignored. + +**platform** [str]: + Name of a :ref:`known platform`, e.g., ``LibeSpecs.platform = "perlmutter_g"`` + Alternatively set the ``LIBE_PLATFORM`` environment variable. + +**platform_specs** [Platform|dict]: + A ``Platform`` object (or dictionary) specifying :ref:`settings for a platform.`. + Fields not provided will be auto-detected. Can be set to a :ref:`known platform object`. + +**num_resource_sets** [int]: + The total number of resource sets into which resources will be divided. + By default resources will be divided by workers (excluding + ``zero_resource_workers``). + +**gen_num_procs** [int] = ``0``: + The default number of processors (MPI ranks) required by generators. Unless + overridden by equivalent ``persis_info`` settings, generators will be allocated + this many processors for applications launched via the MPIExecutor. + +**gen_num_gpus** [int] = ``0``: + The default number of GPUs required by generators. Unless overridden by + the equivalent ``persis_info`` settings, generators will be allocated this + many GPUs. + +**gpus_per_group** [int]: + Number of GPUs for each group in the scheduler. 
This can be used when + running on nodes with different numbers of GPUs. In effect a + block of this many GPUs will be treated as a virtual node. + By default the GPUs on each node are treated as a group. + +**use_tiles_as_gpus** [bool] = ``False``: + If ``True`` then treat a GPU tile as one GPU when GPU tiles + are provided in ``platform_specs`` or auto-detected. + +**enforce_worker_core_bounds** [bool] = ``False``: + Permit submission of tasks with a + higher processor count than the CPUs available to the worker. + Larger node counts are not allowed. Ignored when + ``disable_resource_manager`` is set. + +**dedicated_mode** [bool] = ``False``: + Instructs libEnsemble’s MPI executor not to run applications on nodes where + libEnsemble processes (manager and workers) are running. + +**resource_info** [dict]: + Provide resource information that will override automatically detected resources. + The allowable fields are given below in "Overriding Resource Auto-Detection" + Ignored if ``disable_resource_manager`` is set. + +**scheduler_opts** [dict]: + Options for the resource scheduler. + See "Scheduler Options" for more options. diff --git a/docs/data_structures/libE_specs/libE_specs_tcp.rst b/docs/data_structures/libE_specs/libE_specs_tcp.rst new file mode 100644 index 0000000000..d0d2a05655 --- /dev/null +++ b/docs/data_structures/libE_specs/libE_specs_tcp.rst @@ -0,0 +1,24 @@ +TCP +=== + +`Introduction `__ \|\| `General `__ \|\| `Directories `__ \|\| `Profiling `__ \|\| **TCP** \|\| `History `__ \|\| `Resources `__ + +**workers** [list]: + TCP Only: A list of worker hostnames. + +**ip** [str]: + TCP Only: IP address for Manager's system. + +**port** [int]: + TCP Only: Port number for Manager's system. + +**authkey** [str]: + TCP Only: Authkey for Manager's system. + +**workerID** [int]: + TCP Only: Worker ID number assigned to the new process. + +**worker_cmd** [list]: + TCP Only: Split string corresponding to worker/client Python process invocation. 
Contains + a local Python path, calling script, and manager/server format-fields for ``manager_ip``, + ``manager_port``, ``authkey``, and ``workerID``. ``nworkers`` is specified normally. diff --git a/docs/data_structures/persis_info.rst b/docs/data_structures/persis_info.rst index d5327241f5..6e620f886c 100644 --- a/docs/data_structures/persis_info.rst +++ b/docs/data_structures/persis_info.rst @@ -21,42 +21,44 @@ between ensemble invocations, or in the allocation function. Examples: -.. tab-set:: - - .. tab-item:: RNG or reusable structures - - .. literalinclude:: ../../libensemble/gen_funcs/sampling.py - :linenos: - :start-at: def uniform_random_sample(_, persis_info, gen_specs): - :end-before: def uniform_random_sample_with_variable_resources(_, persis_info, gen_specs): - :emphasize-lines: 17 - :caption: libensemble/libensemble/gen_funcs/sampling.py - - .. tab-item:: Incrementing indexes or process counts - - .. literalinclude:: ../../libensemble/alloc_funcs/fast_alloc.py - :linenos: - :start-at: for wid in support.avail_worker_ids(gen_workers=False): - :end-before: # Give gen work if possible - :caption: libensemble/alloc_funcs/fast_alloc.py - - .. tab-item:: Tracking running generators - - .. literalinclude:: ../../libensemble/alloc_funcs/start_only_persistent.py - :linenos: - :start-at: avail_workers = support.avail_worker_ids(persistent=False, zero_resource_workers=True, gen_workers=True) - :end-before: return Work, persis_info, 0 - :emphasize-lines: 18 - :caption: libensemble/alloc_funcs/start_only_persistent.py - - .. tab-item:: Allocation function triggers shutdown - - .. literalinclude:: ../../libensemble/alloc_funcs/start_only_persistent.py - :linenos: - :start-at: if gen_count < persis_info.get("num_gens_started", 0): - :end-before: # Give evaluated results back to a running persistent gen - :emphasize-lines: 1 - :caption: libensemble/alloc_funcs/start_only_persistent.py +RNG or reusable structures +-------------------------- + +.. 
literalinclude:: ../../libensemble/gen_funcs/sampling.py + :linenos: + :start-at: def uniform_random_sample(_, persis_info, gen_specs, libE_info): + :end-before: def uniform_random_sample_with_variable_resources(_, persis_info, gen_specs, libE_info): + :emphasize-lines: 10 + :caption: libensemble/libensemble/gen_funcs/sampling.py + +Incrementing indexes or process counts +-------------------------------------- + +.. literalinclude:: ../../libensemble/alloc_funcs/fast_alloc.py + :linenos: + :start-at: for wid in support.avail_worker_ids(gen_workers=False): + :end-before: # Give gen work if possible + :caption: libensemble/alloc_funcs/fast_alloc.py + +Tracking running generators +--------------------------- + +.. literalinclude:: ../../libensemble/alloc_funcs/start_only_persistent.py + :linenos: + :start-at: avail_workers = support.avail_worker_ids(persistent=False, gen_workers=True) + :end-before: return Work, persis_info, 0 + :emphasize-lines: 18 + :caption: libensemble/alloc_funcs/start_only_persistent.py + +Allocation function triggers shutdown +------------------------------------- + +.. literalinclude:: ../../libensemble/alloc_funcs/start_only_persistent.py + :linenos: + :start-at: if gen_count < persis_info.get("num_gens_started", 0): + :end-before: # Give evaluated results back to a running persistent gen + :emphasize-lines: 1 + :caption: libensemble/alloc_funcs/start_only_persistent.py .. - Random number generators or other structures for use on consecutive calls .. - Incrementing array row indexes or process counts diff --git a/docs/data_structures/platform_specs.rst b/docs/data_structures/platform_specs.rst index 35198535f1..bfc4104059 100644 --- a/docs/data_structures/platform_specs.rst +++ b/docs/data_structures/platform_specs.rst @@ -15,37 +15,37 @@ A ``Platform`` object or dictionary specifying settings for a platform. To define a platform (in calling script): -.. tab-set:: +Platform Object +^^^^^^^^^^^^^^^ - .. tab-item:: Platform Object - - .. 
code-block:: python +.. code-block:: python - from libensemble.resources.platforms import Platform + from libensemble.resources.platforms import Platform - libE_specs["platform_specs"] = Platform( - mpi_runner="srun", - cores_per_node=64, - logical_cores_per_node=128, - gpus_per_node=8, - gpu_setting_type="runner_default", - gpu_env_fallback="ROCR_VISIBLE_DEVICES", - scheduler_match_slots=False, - ) + libE_specs["platform_specs"] = Platform( + mpi_runner="srun", + cores_per_node=64, + logical_cores_per_node=128, + gpus_per_node=8, + gpu_setting_type="runner_default", + gpu_env_fallback="ROCR_VISIBLE_DEVICES", + scheduler_match_slots=False, + ) - .. tab-item:: Dictionary +Dictionary +^^^^^^^^^^ - .. code-block:: python +.. code-block:: python - libE_specs["platform_specs"] = { - "mpi_runner": "srun", - "cores_per_node": 64, - "logical_cores_per_node": 128, - "gpus_per_node": 8, - "gpu_setting_type": "runner_default", - "gpu_env_fallback": "ROCR_VISIBLE_DEVICES", - "scheduler_match_slots": False, - } + libE_specs["platform_specs"] = { + "mpi_runner": "srun", + "cores_per_node": 64, + "logical_cores_per_node": 128, + "gpus_per_node": 8, + "gpu_setting_type": "runner_default", + "gpu_env_fallback": "ROCR_VISIBLE_DEVICES", + "scheduler_match_slots": False, + } The list of platform fields is given below. Any fields not given will be auto-detected by libEnsemble. diff --git a/docs/data_structures/sim_specs.rst b/docs/data_structures/sim_specs.rst index 9a023f5491..0c937c5e82 100644 --- a/docs/data_structures/sim_specs.rst +++ b/docs/data_structures/sim_specs.rst @@ -3,17 +3,38 @@ Simulation Specs ================ -Used to specify the simulation, its inputs and outputs, and user data. +Used to specify the simulation function, its inputs and outputs, and user data. + +Standardized (gest-api) +----------------------- .. code-block:: python :linenos: - ... 
from libensemble import SimSpecs - from simulator import sim_find_sine + from gest_api.vocs import VOCS + from my_package import my_sim_callable + vocs = VOCS( + variables={"x": [-3.0, 3.0]}, + objectives={"y": "MINIMIZE"}, + ) + + sim_specs = SimSpecs( + simulator=my_sim_callable, + vocs=vocs, + ) ... +Classic (sim_f) +--------------- + +.. code-block:: python + :linenos: + + from libensemble import SimSpecs + from simulator import sim_find_sine + sim_specs = SimSpecs( sim_f=sim_find_sine, inputs=["x"], diff --git a/docs/dev_guide/dev_API/developer_API.rst b/docs/dev_guide/dev_API/developer_API.rst index c09647db46..6774cbd629 100644 --- a/docs/dev_guide/dev_API/developer_API.rst +++ b/docs/dev_guide/dev_API/developer_API.rst @@ -17,3 +17,5 @@ This section documents the internal modules of libEnsemble. node_resources_module mpi_resources_module scheduler_module + work_dict + worker_array diff --git a/docs/function_guides/work_dict.rst b/docs/dev_guide/dev_API/work_dict.rst similarity index 96% rename from docs/function_guides/work_dict.rst rename to docs/dev_guide/dev_API/work_dict.rst index 4252919de0..0afeebabfb 100644 --- a/docs/function_guides/work_dict.rst +++ b/docs/dev_guide/dev_API/work_dict.rst @@ -21,7 +21,7 @@ the data given to worker ``i``. Populated in the allocation function. ``Work[i]` "persistent" [bool]: True if worker i will enter persistent mode (Default: False) The work dictionary is typically set using the ``gen_work`` or ``sim_work`` -:doc:`helper functions<../function_guides/allocator>` in the allocation function. +:doc:`helper functions<../../function_guides/allocator>` in the allocation function. ``H_fields``, for example, is usually packed from either ``sim_specs["in"]``, ``gen_specs["in"]`` or the equivalent "persis_in" variants. 
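The Work-dictionary fields listed in the ``work_dict`` page above ("H_fields", "persis_info", "tag", "libE_info") can be illustrated with a minimal stand-alone sketch. The tag value and the ``make_sim_work`` helper below are assumptions for illustration — the real helpers are the ``sim_work``/``gen_work`` functions mentioned in the text.

```python
# Sketch of one Work-dictionary entry, using the fields described above.
# EVAL_SIM_TAG is a placeholder value, not libEnsemble's actual constant.
EVAL_SIM_TAG = 1

def make_sim_work(H_fields, H_rows, persis_info, persistent=False):
    """Build a Work entry telling a worker which H rows to evaluate."""
    return {
        "H_fields": H_fields,             # names of H fields sent to the worker
        "persis_info": persis_info,       # worker-specific persistent state
        "tag": EVAL_SIM_TAG,              # run the simulation function
        "libE_info": {"H_rows": H_rows, "persistent": persistent},
    }

# Work[i] holds the data given to worker i, populated in the allocation function.
Work = {1: make_sim_work(["x"], [0, 1], {"rand_stream": None})}
```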
diff --git a/docs/function_guides/worker_array.rst b/docs/dev_guide/dev_API/worker_array.rst similarity index 100% rename from docs/function_guides/worker_array.rst rename to docs/dev_guide/dev_API/worker_array.rst diff --git a/docs/examples/alloc_funcs.rst b/docs/examples/alloc_funcs.rst deleted file mode 100644 index 3734d7cb0d..0000000000 --- a/docs/examples/alloc_funcs.rst +++ /dev/null @@ -1,104 +0,0 @@ -.. _examples-alloc: - -Allocation Functions -==================== - -Below are example allocation functions available in libEnsemble. - -Many users use these unmodified. - -.. IMPORTANT:: - See the API for allocation functions :ref:`here`. - - **The default allocation function changed in libEnsemble v2.0 from `give_sim_work_first` to `start_only_persistent `.** - -.. note:: - - The default allocation function for persistent generators is :ref:`start_only_persistent`. - - The most commonly used allocation function for non-persistent generators is :ref:`give_sim_work_first`. - -.. role:: underline - :class: underline - -.. _start_only_persistent_label: - -start_only_persistent ---------------------- -.. automodule:: start_only_persistent - :members: - :undoc-members: - -.. dropdown:: :underline:`start_only_persistent.py` - - .. literalinclude:: ../../libensemble/alloc_funcs/start_only_persistent.py - :language: python - :linenos: - -.. _gswf_label: - -give_sim_work_first -------------------- -.. automodule:: give_sim_work_first - :members: - :undoc-members: - -.. dropdown:: :underline:`give_sim_work_first.py` - - .. literalinclude:: ../../libensemble/alloc_funcs/give_sim_work_first.py - :language: python - :linenos: - -fast_alloc ----------- -.. automodule:: fast_alloc - :members: - :undoc-members: - -.. dropdown:: :underline:`fast_alloc.py` - - .. literalinclude:: ../../libensemble/alloc_funcs/fast_alloc.py - :language: python - :linenos: - -start_persistent_local_opt_gens -------------------------------- -.. 
automodule:: start_persistent_local_opt_gens - :members: - :undoc-members: - -fast_alloc_and_pausing ----------------------- -.. automodule:: fast_alloc_and_pausing - :members: - :undoc-members: - -only_one_gen_alloc ------------------- -.. automodule:: only_one_gen_alloc - :members: - :undoc-members: - -start_fd_persistent -------------------- -.. automodule:: start_fd_persistent - :members: - :undoc-members: - -persistent_aposmm_alloc ------------------------ -.. automodule:: persistent_aposmm_alloc - :members: - :undoc-members: - -give_pregenerated_work ----------------------- -.. automodule:: give_pregenerated_work - :members: - :undoc-members: - -inverse_bayes_allocf --------------------- -.. automodule:: inverse_bayes_allocf - :members: - :undoc-members: diff --git a/docs/examples/calling_scripts.rst b/docs/examples/calling_scripts.rst index 708a9d1280..394f3946c9 100644 --- a/docs/examples/calling_scripts.rst +++ b/docs/examples/calling_scripts.rst @@ -1,19 +1,12 @@ -Calling Scripts -=============== +Top-Level Scripts +================= -Below are example calling scripts used to populate specifications for each user -function and libEnsemble before initiating libEnsemble via the primary ``libE()`` -call. The primary libEnsemble-relevant portions have been highlighted in each -example. Non-highlighted portions may include setup routines, compilation steps -for user applications, or output processing. The first two scripts correspond to -random sampling calculations, while the third corresponds to an optimization routine. - -Many other examples of calling scripts can be found in libEnsemble's `regression tests`_. +Many other examples of top-level scripts can be found in libEnsemble's `regression tests`_. 
Local Sine Tutorial ------------------- -This example is from the Local Sine :doc:`Tutorial<../tutorials/local_sine_tutorial>`, +This example is from the Local Sine :doc:`Tutorial<../tutorials/local_sine_tutorial/local_sine_tutorial>`, meant to run with Python's multiprocessing as the primary ``comms`` method. .. literalinclude:: ../../examples/tutorials/simple_sine/test_local_sine_tutorial.py @@ -45,15 +38,18 @@ One worker runs a persistent generator and the other four run the forces simulat :caption: tests/scaling_tests/forces/forces_simple/run_libe_forces.py :linenos: -Persistent APOSMM with Gradients --------------------------------- +APOSMM with a Standardized Generator +-------------------------------------- -This example is also from the regression tests and demonstrates configuring a -persistent run via a custom allocation function. +This example from the regression tests demonstrates the v2.0 gest-api interface: +a standardized ``APOSMM`` generator class parameterized by a ``VOCS`` object, +paired with a gest-api ``simulator`` callable. The generator runs on the manager +thread by default, leaving all workers available for simulations. -.. literalinclude:: ../../libensemble/tests/regression_tests/test_persistent_aposmm_with_grad.py +.. literalinclude:: ../../libensemble/tests/regression_tests/test_asktell_aposmm_nlopt.py :language: python - :caption: tests/regression_tests/test_persistent_aposmm_with_grad.py + :caption: tests/regression_tests/test_asktell_aposmm_nlopt.py :linenos: + :end-at: workflow.exit_criteria = ExitCriteria(sim_max=2000) .. 
_regression tests: https://github.com/Libensemble/libensemble/tree/develop/libensemble/tests/regression_tests diff --git a/docs/examples/examples_index.rst b/docs/examples/examples_index.rst index 1e92e21c03..c1d6abfb28 100644 --- a/docs/examples/examples_index.rst +++ b/docs/examples/examples_index.rst @@ -2,7 +2,7 @@ Overview of Examples ==================== Here we give example generation, simulation, and allocation functions for -libEnsemble, as well as example calling scripts. +libEnsemble, as well as example top-level scripts. The examples come from the libEnsemble repository and the `libEnsemble Community Repository`_. @@ -12,7 +12,6 @@ The examples come from the libEnsemble repository and the `libEnsemble Community gen_funcs sim_funcs - alloc_funcs calling_scripts .. _libEnsemble Community Repository: https://github.com/Libensemble/libe-community-examples diff --git a/docs/examples/gen_funcs.rst b/docs/examples/gen_funcs.rst index c475cefe53..0bae6f7642 100644 --- a/docs/examples/gen_funcs.rst +++ b/docs/examples/gen_funcs.rst @@ -4,7 +4,7 @@ Generator Functions Here we list many generator functions included with libEnsemble. .. IMPORTANT:: - See the API for generator functions :ref:`here`. + See the API for generator functions :ref:`here`. Sampling -------- diff --git a/docs/examples/gest_api/aposmm.rst b/docs/examples/gest_api/aposmm.rst index a472ced11d..dbbd4f7ad1 100644 --- a/docs/examples/gest_api/aposmm.rst +++ b/docs/examples/gest_api/aposmm.rst @@ -7,20 +7,18 @@ APOSMM :show-inheritance: -.. seealso:: +APOSMM with libEnsemble +^^^^^^^^^^^^^^^^^^^^^^^ - .. tab-set:: +.. literalinclude:: ../../../libensemble/tests/regression_tests/test_asktell_aposmm_nlopt.py + :linenos: + :start-at: workflow = Ensemble(parse_args=True) + :end-before: # Perform the run - .. tab-item:: APOSMM with libEnsemble +APOSMM standalone +^^^^^^^^^^^^^^^^^ - .. 
literalinclude:: ../../../libensemble/tests/regression_tests/test_asktell_aposmm_nlopt.py - :linenos: - :start-at: workflow.libE_specs.gen_on_manager = True - :end-before: # Perform the run - - .. tab-item:: APOSMM standalone - - .. literalinclude:: ../../../libensemble/tests/unit_tests/test_persistent_aposmm.py - :linenos: - :start-at: def test_asktell_ingest_first(): - :end-before: assert persis_info.get("run_order"), "Standalone persistent_aposmm didn't do any localopt runs" +.. literalinclude:: ../../../libensemble/tests/unit_tests/test_persistent_aposmm.py + :linenos: + :start-at: def test_asktell_ingest_first(): + :end-before: assert persis_info.get("run_order"), "Standalone persistent_aposmm didn't do any localopt runs" diff --git a/docs/examples/persistent_sampling.rst b/docs/examples/persistent_sampling.rst index 7f778a8e8c..cf33eaa554 100644 --- a/docs/examples/persistent_sampling.rst +++ b/docs/examples/persistent_sampling.rst @@ -1,6 +1,9 @@ persistent_sampling ------------------- +.. role:: underline + :class: underline + .. automodule:: persistent_sampling :members: :undoc-members: diff --git a/docs/examples/sim_funcs.rst b/docs/examples/sim_funcs.rst index be4374d884..37fb6ecf14 100644 --- a/docs/examples/sim_funcs.rst +++ b/docs/examples/sim_funcs.rst @@ -8,7 +8,7 @@ function launching tasks, see the :doc:`Electrostatic Forces tutorial <../tutorials/executor_forces_tutorial>`. .. IMPORTANT:: - See the API for simulation functions :ref:`here`. + See the API for simulation functions :ref:`here`. .. role:: underline :class: underline @@ -60,5 +60,6 @@ Special simulation functions :maxdepth: 1 sim_funcs/mock_sim + sim_funcs/surmise_test_function .. 
_build_forces.sh: https://github.com/Libensemble/libensemble/blob/main/libensemble/tests/scaling_tests/forces/forces_app/build_forces.sh diff --git a/docs/executor/ex_base.rst b/docs/executor/ex_base.rst new file mode 100644 index 0000000000..1a4d3cf31d --- /dev/null +++ b/docs/executor/ex_base.rst @@ -0,0 +1,62 @@ +Base Executor +============= + +`Overview `__ \|\| **Base Executor** \|\| `MPI Executor `__ + +.. automodule:: executor + :no-undoc-members: + +Only for running local serial-launched applications. +To run MPI applications and use detected resources, use the `MPI Executor `__ tab. + +.. tab-set:: + + .. tab-item:: Base Executor + + .. autoclass:: libensemble.executors.executor.Executor + :members: + :exclude-members: serial_setup, sim_default_app, gen_default_app, get_app, default_app, set_resources, get_task, set_workerID, set_worker_info, new_tasks_timing, add_platform_info, set_gen_procs_gpus, kill, poll + + .. automethod:: __init__ + + .. tab-item:: Task + + .. _task_tag: + + Tasks are created and returned by the Executor's ``submit()``. Tasks + can be polled, killed, and waited on with the respective ``poll``, ``kill``, and ``wait`` functions. + Task information can be queried through instance attributes and query functions. + + .. autoclass:: libensemble.executors.executor.Task + :members: + :exclude-members: calc_task_timing, check_poll + + .. tab-item:: Task Attributes + + .. note:: + These should not be set directly. Tasks are launched by the Executor, + and task information can be queried through the task attributes + below and the query functions. + + :task.state: (string) The task status. One of + ("UNKNOWN"|"CREATED"|"WAITING"|"RUNNING"|"FINISHED"|"USER_KILLED"|"FAILED"|"FAILED_TO_START") + + :task.process: (process obj) The process object used by the underlying process + manager (e.g., return value of subprocess.Popen). + :task.errcode: (int) The error code (or return code) used by the underlying process manager. 
+ :task.finished: (boolean) True means task has finished running - not whether it was successful. + :task.success: (boolean) Did task complete successfully (e.g., the return code is zero)? + :task.runtime: (int) Time in seconds that task has been running. + :task.submit_time: (int) Time since epoch that task was submitted. + :task.total_time: (int) Total time from task submission to completion (only available when task is finished). + + Run configuration attributes - some will be autogenerated: + + :task.workdir: (string) Work directory for the task + :task.name: (string) Name of task - autogenerated + :task.app: (app obj) Use application/executable, registered using exctr.register_app + :task.app_args: (string) Application arguments as a string + :task.stdout: (string) Name of file where the standard output of the task is written (in task.workdir) + :task.stderr: (string) Name of file where the standard error of the task is written (in task.workdir) + :task.dry_run: (boolean) True if task corresponds to dry run (no actual submission) + :task.runline: (string) Complete, parameterized command to be subprocessed to launch app diff --git a/docs/executor/ex_index.rst b/docs/executor/ex_index.rst index ee4698c21f..a4f33cb39a 100644 --- a/docs/executor/ex_index.rst +++ b/docs/executor/ex_index.rst @@ -1,5 +1,7 @@ .. _executor_index: +**Overview** \|\| `Base Executor `__ \|\| `MPI Executor `__ + Executors ========= @@ -7,10 +9,13 @@ libEnsemble's Executors can be used within user functions to provide a simple, portable interface for running and managing user applications. .. toctree:: - :maxdepth: 2 - :titlesonly: - :caption: libEnsemble Executors: + :hidden: + + ex_overview + ex_base + ex_mpi + +The **Executor** provides a portable interface for running applications on any system and +any number of compute resources. - overview - executor - mpi_executor +Please select from the sections above or the sidebar navigation to read more. 
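The ``Task`` attribute list in the new ``ex_base.rst`` above states that attributes such as ``state``, ``finished``, ``success``, and ``errcode`` are updated by polling or waiting rather than set directly. A self-contained stand-in (not libEnsemble's actual ``Task`` class) illustrating those transitions:

```python
import subprocess
import sys
import time


class MiniTask:
    """Illustrative stand-in for the Executor's Task: attributes are only
    refreshed when the task is polled or waited on, mirroring the
    semantics documented above."""

    def __init__(self, cmd):
        self.process = subprocess.Popen(cmd)  # underlying process manager object
        self.submit_time = time.time()
        self.state = "RUNNING"
        self.finished = False
        self.success = False
        self.errcode = None

    def poll(self):
        rc = self.process.poll()
        if rc is not None and not self.finished:
            self.finished = True  # finished, not necessarily successful
            self.errcode = rc
            self.success = rc == 0
            self.state = "FINISHED" if rc == 0 else "FAILED"

    def wait(self, timeout=None):
        self.process.wait(timeout=timeout)
        self.poll()


task = MiniTask([sys.executable, "-c", "print('hello')"])
task.wait(timeout=30)
```

In the real API, ``submit()`` creates and returns these objects; this sketch only shows why ``finished`` and ``success`` are distinct flags.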
diff --git a/docs/executor/mpi_executor.rst b/docs/executor/ex_mpi.rst similarity index 60% rename from docs/executor/mpi_executor.rst rename to docs/executor/ex_mpi.rst index 13773f5ad5..59a36f9e52 100644 --- a/docs/executor/mpi_executor.rst +++ b/docs/executor/ex_mpi.rst @@ -1,30 +1,24 @@ -MPI Executor - MPI apps -======================= +MPI Executor +============ -.. automodule:: mpi_executor - :no-undoc-members: +`Overview `__ \|\| `Base Executor `__ \|\| **MPI Executor** -See this :doc:`example` for usage. +.. automodule:: mpi_executor + :no-undoc-members: .. autoclass:: libensemble.executors.mpi_executor.MPIExecutor - :show-inheritance: - :inherited-members: - :exclude-members: serial_setup, sim_default_app, gen_default_app, get_app, default_app, set_resources, get_task, set_workerID, set_worker_info, new_tasks_timing, add_platform_info, set_gen_procs_gpus, kill, poll - -.. .. automethod:: __init__ - -.. :member-order: bysource -.. :members: __init__, register_app, submit, manager_poll + :show-inheritance: + :inherited-members: + :exclude-members: serial_setup, sim_default_app, gen_default_app, get_app, default_app, set_resources, get_task, set_workerID, set_worker_info, new_tasks_timing, add_platform_info, set_gen_procs_gpus, kill, poll -Class-specific Attributes -------------------------- +**Class-specific Attributes** Class-specific attributes can be set directly to alter the behavior of the MPI Executor. However, they should be used with caution, because they may not be implemented in other executors. :max_submit_attempts: (int) Maximum number of launch attempts for a given - task. *Default: 5*. + task. *Default: 5*. :fail_time: (int or float) *Only if wait_on_start is set.* Maximum run time to failure in seconds that results in relaunch. *Default: 2*. :retry_delay_incr: (int or float) Delay increment between launch attempts in seconds. 
diff --git a/docs/executor/overview.rst b/docs/executor/ex_overview.rst similarity index 79% rename from docs/executor/overview.rst rename to docs/executor/ex_overview.rst index 196ba38b8b..f53510b733 100644 --- a/docs/executor/overview.rst +++ b/docs/executor/ex_overview.rst @@ -1,11 +1,10 @@ -Executor Overview -================= +Overview +======== -Most computationally expensive libEnsemble workflows involve launching applications -from a :ref:`sim_f` or :ref:`gen_f` running on a worker to the -compute nodes of a supercomputer, cluster, or other compute resource. +**Overview** \|\| `Base Executor `__ \|\| `MPI Executor `__ -The **Executor** provides a portable interface for running applications on any system. +The **Executor** provides a portable interface for running applications on any system and +any number of compute resources. .. dropdown:: Detailed description @@ -37,10 +36,7 @@ The **Executor** provides a portable interface for running applications on any s ``app_name`` from registration in the calling script alongside other optional parameters described in the API. -Basic usage ------------ - -**In calling script** +**Basic usage** To set up an MPI executor, register an MPI application, and add to the ensemble object. @@ -54,10 +50,6 @@ to the ensemble object. exctr.register_app(full_path="/path/to/my/exe", app_name="sim1") ensemble = Ensemble(executor=exctr) -If using the ``libE()`` call, the Executor in the calling script does **not** -have to be passed to the ``libE()`` function. It is transferred via the -``Executor.executor`` class variable. - **In user simulation function**:: def sim_func(H, persis_info, sim_specs, libE_info): @@ -82,15 +74,11 @@ Example use-cases: * :doc:`Forces example with GPUs <../tutorials/forces_gpu_tutorial>`: Auto-assigns GPUs via executor. -See the :doc:`Executor` or :doc:`MPIExecutor` interface -for the complete API. 
- See :doc:`Running on HPC Systems<../platforms/platforms_index>` for illustrations of how common options such as ``libE_specs["dedicated_mode"]`` affect the run configuration on clusters and supercomputers. -Advanced Features ------------------ +**Advanced Features** **Example of polling output and killing application:** @@ -136,16 +124,6 @@ In simulation function (sim_f). print(task.state) # state may be finished/failed/killed -.. The Executor can also be retrieved using Python's ``with`` context switching statement, -.. although this is effectively syntactical sugar to above:: -.. -.. from libensemble.executors import Executor -.. -.. with Executor.executor as exctr: -.. task = exctr.submit(app_name="sim1", num_procs=8, app_args="input.txt", -.. stdout="out.txt", stderr="err.txt") -.. ... - Users who wish to poll only for manager kill signals and timeouts don't necessarily need to construct a polling loop like above, but can instead use the ``Executor`` built-in ``polling_loop()`` method. An alternative to the above simulation function @@ -178,10 +156,4 @@ which partitions resources among workers, ensuring that runs utilize different resources (e.g., nodes). Furthermore, the ``MPIExecutor`` offers resilience via the feature of re-launching tasks that fail to start because of system factors. -Various back-end mechanisms may be used by the Executor to best interact -with each system, including proxy launchers or task management systems. -Currently, these Executors launch at the application level within -an existing resource pool. However, submissions to a batch scheduler may be -supported in future Executors. - .. _concurrent futures: https://docs.python.org/library/concurrent.futures.html diff --git a/docs/executor/executor.rst b/docs/executor/executor.rst deleted file mode 100644 index 6784134a05..0000000000 --- a/docs/executor/executor.rst +++ /dev/null @@ -1,62 +0,0 @@ -Base Executor - Local apps -========================== - -.. 
automodule:: executor - :no-undoc-members: - -See the Executor APIs for optional arguments. - -.. tab-set:: - - .. tab-item:: Base Executor - - Only for running local serial-launched applications. - To run MPI applications and use detected resources, use the :doc:`MPIExecutor<../executor/mpi_executor>` - - .. autoclass:: libensemble.executors.executor.Executor - :members: - :exclude-members: serial_setup, sim_default_app, gen_default_app, get_app, default_app, set_resources, get_task, set_workerID, set_worker_info, new_tasks_timing, add_platform_info, set_gen_procs_gpus, kill, poll - - .. automethod:: __init__ - - .. tab-item:: Task - - .. _task_tag: - - Tasks are created and returned by the Executor's ``submit()``. Tasks - can be polled, killed, and waited on with the respective ``poll``, ``kill``, and ``wait`` functions. - Task information can be queried through instance attributes and query functions. - - .. autoclass:: libensemble.executors.executor.Task - :members: - :exclude-members: calc_task_timing, check_poll - - .. tab-item:: Task Attributes - - .. note:: - These should not be set directly. Tasks are launched by the Executor, - and task information can be queried through the task attributes - below and the query functions. - - :task.state: (string) The task status. One of - ("UNKNOWN"|"CREATED"|"WAITING"|"RUNNING"|"FINISHED"|"USER_KILLED"|"FAILED"|"FAILED_TO_START") - - :task.process: (process obj) The process object used by the underlying process - manager (e.g., return value of subprocess.Popen). - :task.errcode: (int) The error code (or return code) used by the underlying process manager. - :task.finished: (boolean) True means task has finished running - not whether it was successful. - :task.success: (boolean) Did task complete successfully (e.g., the return code is zero)? - :task.runtime: (int) Time in seconds that task has been running. - :task.submit_time: (int) Time since epoch that task was submitted. 
- :task.total_time: (int) Total time from task submission to completion (only available when task is finished). - - Run configuration attributes - some will be autogenerated: - - :task.workdir: (string) Work directory for the task - :task.name: (string) Name of task - autogenerated - :task.app: (app obj) Use application/executable, registered using exctr.register_app - :task.app_args: (string) Application arguments as a string - :task.stdout: (string) Name of file where the standard output of the task is written (in task.workdir) - :task.stderr: (string) Name of file where the standard error of the task is written (in task.workdir) - :task.dry_run: (boolean) True if task corresponds to dry run (no actual submission) - :task.runline: (string) Complete, parameterized command to be subprocessed to launch app diff --git a/docs/function_guides/allocator.rst b/docs/function_guides/allocator.rst index 0620105825..ec1189e84a 100644 --- a/docs/function_guides/allocator.rst +++ b/docs/function_guides/allocator.rst @@ -4,23 +4,21 @@ Allocation Functions ==================== Although the included allocation functions are sufficient for -most users, those who want to fine-tune how data or resources are allocated to their generator or simulator can write their own. +most users, those who want to fine-tune how data or resources +may be allocated to their generator or simulator can write their own. -The ``alloc_f`` is unique since it is called by libEnsemble's manager instead of a worker. +We encourage experimenting with: -For allocation functions, as with the other user functions, the level of complexity can -vary widely. We encourage experimenting with: - - 1. Prioritization of simulations - 2. Sending results immediately or in batch - 3. Assigning varying resources to evaluations +1. Prioritization of simulations +2. Sending results immediately or in batch +3. Assigning varying resources to evaluations .. dropdown:: Example .. 
literalinclude:: ../../libensemble/alloc_funcs/fast_alloc.py :caption: libensemble.alloc_funcs.fast_alloc.give_sim_work_first -Most ``alloc_f`` function definitions written by users resemble:: +The ``alloc_f`` function definition resembles:: def my_allocator(W, H, sim_specs, gen_specs, alloc_specs, persis_info, libE_info): @@ -35,14 +33,14 @@ Most users first check that it is appropriate to allocate work:: if libE_info["sim_max_given"] or not libE_info["any_idle_workers"]: return {}, persis_info -If the allocation is to continue, a support class is instantiated and a -:ref:`Work dictionary` is initialized:: +If the allocation is to continue, instantiate a support class to assist with the +:ref:`Work dictionary` construction:: manage_resources = "resource_sets" in H.dtype.names or libE_info["use_resource_sets"] support = AllocSupport(W, manage_resources, persis_info, libE_info) Work = {} -This Work dictionary is populated with integer keys ``wid`` for each worker and +The Work dictionary is populated with integer keys ``wid`` for each worker and dictionary values to give to those workers: .. dropdown:: Example ``Work`` @@ -126,10 +124,110 @@ or mark points for cancellation. The remaining values above are useful for efficient filtering of H values (e.g., ``sim_ended_count`` saves filtering by an entire column of H.) -Descriptions of included allocation functions can be found :doc:`here<../examples/alloc_funcs>`. The default allocation function is ``start_only_persistent``. During its worker ID loop, it checks if there's unallocated work and assigns simulations for that work. Otherwise, it initializes generators for up to ``"num_active_gens"`` instances. Other settings like ``batch_mode`` are also supported. See :ref:`here` for more information. + +.. _examples-alloc: + +Examples +======== + +Below are example allocation functions available in libEnsemble. + +Many users use these unmodified. + +.. 
IMPORTANT:: + The default allocation function changed in libEnsemble v2.0 from ``give_sim_work_first`` to ``start_only_persistent``. + +.. note:: + + The most commonly used allocation function for non-persistent generators is :ref:`give_sim_work_first`. + +.. role:: underline + :class: underline + +.. _start_only_persistent_label: + +start_only_persistent +--------------------- +.. automodule:: start_only_persistent + :members: + :undoc-members: + +.. dropdown:: :underline:`start_only_persistent.py` + + .. literalinclude:: ../../libensemble/alloc_funcs/start_only_persistent.py + :language: python + :linenos: + +.. _gswf_label: + +give_sim_work_first +------------------- +.. automodule:: give_sim_work_first + :members: + :undoc-members: + +.. dropdown:: :underline:`give_sim_work_first.py` + + .. literalinclude:: ../../libensemble/alloc_funcs/give_sim_work_first.py + :language: python + :linenos: + +fast_alloc +---------- +.. automodule:: fast_alloc + :members: + :undoc-members: + +.. dropdown:: :underline:`fast_alloc.py` + + .. literalinclude:: ../../libensemble/alloc_funcs/fast_alloc.py + :language: python + :linenos: + +start_persistent_local_opt_gens +------------------------------- +.. automodule:: start_persistent_local_opt_gens + :members: + :undoc-members: + +fast_alloc_and_pausing +---------------------- +.. automodule:: fast_alloc_and_pausing + :members: + :undoc-members: + +only_one_gen_alloc +------------------ +.. automodule:: only_one_gen_alloc + :members: + :undoc-members: + +start_fd_persistent +------------------- +.. automodule:: start_fd_persistent + :members: + :undoc-members: + +persistent_aposmm_alloc +----------------------- +.. automodule:: persistent_aposmm_alloc + :members: + :undoc-members: + +give_pregenerated_work +---------------------- +.. automodule:: give_pregenerated_work + :members: + :undoc-members: + +inverse_bayes_allocf +-------------------- +.. 
automodule:: inverse_bayes_allocf + :members: + :undoc-members: diff --git a/docs/function_guides/calc_status.rst b/docs/function_guides/calc_status.rst index fc1038a36f..93384bc2ae 100644 --- a/docs/function_guides/calc_status.rst +++ b/docs/function_guides/calc_status.rst @@ -19,81 +19,81 @@ user-specified string. They are the third optional return value from a user func Built-in codes are available in the ``libensemble.message_numbers`` module, but users are also free to return any custom string. -.. tab-set:: - - .. tab-item:: calc_status with :ref:`Executor` - - .. code-block:: python - :linenos: - :emphasize-lines: 4,16,19,22,30 - - from libensemble.message_numbers import WORKER_DONE, WORKER_KILL, TASK_FAILED - - task = exctr.submit(calc_type="sim", num_procs=cores, wait_on_start=True) - calc_status = UNSET_TAG - poll_interval = 1 # secs - while not task.finished: - if task.runtime > time_limit: - task.kill() # Timeout - else: - time.sleep(poll_interval) - task.poll() - - if task.finished: - if task.state == "FINISHED": - print("Task {} completed".format(task.name)) - calc_status = WORKER_DONE - elif task.state == "FAILED": - print("Warning: Task {} failed: Error code {}".format(task.name, task.errcode)) - calc_status = TASK_FAILED - elif task.state == "USER_KILLED": - print("Warning: Task {} has been killed".format(task.name)) - calc_status = WORKER_KILL - else: - print("Warning: Task {} in unknown state {}. Error code {}".format(task.name, task.state, task.errcode)) - - outspecs = sim_specs["out"] - output = np.zeros(1, dtype=outspecs) - output["energy"][0] = final_energy - - return output, persis_info, calc_status - - .. tab-item:: Custom calc_status - - .. 
code-block:: python - :linenos: - - from libensemble.message_numbers import WORKER_DONE, TASK_FAILED - - task = exctr.submit(calc_type="sim", num_procs=cores, wait_on_start=True) - - task.wait(timeout=60) - - file_output = read_task_output(task) - if task.errcode == 0: - if "fail" in file_output: - calc_status = "Task failed successfully?" - else: - calc_status = WORKER_DONE - else: - calc_status = TASK_FAILED - - outspecs = sim_specs["out"] - output = np.zeros(1, dtype=outspecs) - output["energy"][0] = final_energy - - return output, persis_info, calc_status - -.. tab-set:: - - .. tab-item:: Available values - - .. literalinclude:: ../../libensemble/message_numbers.py - :start-after: first_calc_status_rst_tag - :end-before: last_calc_status_rst_tag - - .. tab-item:: Corresponding messages - - .. literalinclude:: ../../libensemble/message_numbers.py - :start-at: calc_status_strings - :end-before: last_calc_status_string_rst_tag +calc_status with Executor +--------------------------- + +.. code-block:: python + :linenos: + :emphasize-lines: 4,16,19,22,30 + + from libensemble.message_numbers import WORKER_DONE, WORKER_KILL, TASK_FAILED + + task = exctr.submit(calc_type="sim", num_procs=cores, wait_on_start=True) + calc_status = UNSET_TAG + poll_interval = 1 # secs + while not task.finished: + if task.runtime > time_limit: + task.kill() # Timeout + else: + time.sleep(poll_interval) + task.poll() + + if task.finished: + if task.state == "FINISHED": + print("Task {} completed".format(task.name)) + calc_status = WORKER_DONE + elif task.state == "FAILED": + print("Warning: Task {} failed: Error code {}".format(task.name, task.errcode)) + calc_status = TASK_FAILED + elif task.state == "USER_KILLED": + print("Warning: Task {} has been killed".format(task.name)) + calc_status = WORKER_KILL + else: + print("Warning: Task {} in unknown state {}. 
Error code {}".format(task.name, task.state, task.errcode)) + + outspecs = sim_specs["out"] + output = np.zeros(1, dtype=outspecs) + output["energy"][0] = final_energy + + return output, persis_info, calc_status + +Custom calc_status +------------------ + +.. code-block:: python + :linenos: + + from libensemble.message_numbers import WORKER_DONE, TASK_FAILED + + task = exctr.submit(calc_type="sim", num_procs=cores, wait_on_start=True) + + task.wait(timeout=60) + + file_output = read_task_output(task) + if task.errcode == 0: + if "fail" in file_output: + calc_status = "Task failed successfully?" + else: + calc_status = WORKER_DONE + else: + calc_status = TASK_FAILED + + outspecs = sim_specs["out"] + output = np.zeros(1, dtype=outspecs) + output["energy"][0] = final_energy + + return output, persis_info, calc_status + +Available values +---------------- + +.. literalinclude:: ../../libensemble/message_numbers.py + :start-after: first_calc_status_rst_tag + :end-before: last_calc_status_rst_tag + +Corresponding messages +---------------------- + +.. literalinclude:: ../../libensemble/message_numbers.py + :start-at: calc_status_strings + :end-before: last_calc_status_string_rst_tag diff --git a/docs/function_guides/function_guide_index.rst b/docs/function_guides/function_guide_index.rst index 621bf36d27..916a6fdd50 100644 --- a/docs/function_guides/function_guide_index.rst +++ b/docs/function_guides/function_guide_index.rst @@ -1,28 +1,19 @@ -====================== -Writing User Functions -====================== +===================== +Writing Gens and Sims +===================== -User functions typically require only some familiarity with NumPy_, but if they conform to -the :ref:`user function APIs`, they can incorporate methods from machine-learning, -mathematics, resource management, or other libraries/applications. 
- -These guides describe common development patterns and optional components: +These guides describe common development patterns and optional components +for users writing generators and simulators for libEnsemble. .. toctree:: :maxdepth: 2 - :caption: Writing User Functions + :caption: Writing Gens and Sims generator simulator - allocator - sim_gen_alloc_api .. toctree:: :maxdepth: 2 :caption: Useful Data Structures calc_status - work_dict - worker_array - -.. _NumPy: http://www.numpy.org diff --git a/docs/function_guides/generator.rst b/docs/function_guides/generator.rst index ad0484fbad..c560ce3934 100644 --- a/docs/function_guides/generator.rst +++ b/docs/function_guides/generator.rst @@ -1,212 +1,26 @@ .. _funcguides-gen: -Generator Functions -=================== +Generators +========== -Generator and :ref:`Simulator functions` have relatively similar interfaces. +**Introduction** \|\| `Standardized Generator (gest-api) `__ \|\| `Legacy Generator Function `__ Writing a Generator ------------------- -.. code-block:: python - - def my_generator(Input, persis_info, gen_specs, libE_info): - batch_size = gen_specs["user"]["batch_size"] - - Output = np.zeros(batch_size, gen_specs["out"]) - # ... - Output["x"], persis_info = generate_next_simulation_inputs(Input["f"], persis_info) - - return Output, persis_info - -Most ``gen_f`` function definitions written by users resemble:: - - def my_generator(Input, persis_info, gen_specs, libE_info): - -where: - - * ``Input`` is a selection of the :ref:`History array`, a NumPy structured array. - * :ref:`persis_info` is a dictionary containing state information. - * :ref:`gen_specs` is a dictionary of generator parameters. - * ``libE_info`` is a dictionary containing miscellaneous entries. - -Valid generator functions can accept a subset of the above parameters. So a very simple generator can start:: - - def my_generator(Input): - -If ``gen_specs`` was initially defined: - -.. 
code-block:: python - - gen_specs = GenSpecs( - gen_f=my_generator, - inputs=["f"], - outputs=["x", float, (1,)], - user={"batch_size": 128}, - ) - -Then user parameters and a *local* array of outputs may be obtained/initialized like:: - - batch_size = gen_specs["user"]["batch_size"] - Output = np.zeros(batch_size, dtype=gen_specs["out"]) - -This array should be populated by whatever values are generated within -the function:: - - Output["x"], persis_info = generate_next_simulation_inputs(Input["f"], persis_info) - -Then return the array and ``persis_info`` to libEnsemble:: - - return Output, persis_info - -Between the ``Output`` definition and the ``return``, any computation can be performed. -Users can try an :doc:`executor<../executor/overview>` to submit applications to parallel -resources, or plug in components from other libraries to serve their needs. - .. note:: + The `gest-api` generator interface is the recommended approach for new libEnsemble projects. + The "Legacy Generator Function" interface is supported for backward compatibility but may be deprecated in a future release. - State ``gen_f`` information like checkpointing should be - appended to ``persis_info``. - -.. _persistent-gens: - -Persistent Generators ---------------------- - -While non-persistent generators return after completing their calculation, persistent -generators do the following in a loop: - - 1. Receive simulation results and metadata; exit if metadata instructs. - 2. Perform analysis. - 3. Send subsequent simulation parameters. - -Persistent generators don't need to be re-initialized on each call, but are typically -more complicated. The persistent :doc:`APOSMM<../examples/aposmm>` -optimization generator function included with libEnsemble maintains -local optimization subprocesses based on results from complete simulations. - -Use ``GenSpecs.persis_in`` to specify fields to send back to the generator throughout the run. 
-``GenSpecs.inputs`` only describes the input fields when the function is **first called**. - -Functions for a persistent generator to communicate directly with the manager -are available in the :ref:`libensemble.tools.persistent_support` class. - -Sending/receiving data is supported by the :ref:`PersistentSupport` class:: - - from libensemble.tools import PersistentSupport - from libensemble.message_numbers import STOP_TAG, PERSIS_STOP, EVAL_GEN_TAG, FINISHED_PERSISTENT_GEN_TAG - - my_support = PersistentSupport(libE_info, EVAL_GEN_TAG) - -Implementing functions from the above class is relatively simple: - -.. tab-set:: - - .. tab-item:: send - - .. currentmodule:: libensemble.tools.persistent_support.PersistentSupport - .. autofunction:: send - - This function call typically resembles:: - - my_support.send(local_H_out[selected_IDs]) - - Note that this function has no return. - - .. tab-item:: recv - - .. currentmodule:: libensemble.tools.persistent_support.PersistentSupport - .. autofunction:: recv - - This function call typically resembles:: - - tag, Work, calc_in = my_support.recv() - - if tag in [STOP_TAG, PERSIS_STOP]: - cleanup() - break - - The logic following the function call is typically used to break the persistent - generator's main loop and return. - - .. tab-item:: send_recv - - .. currentmodule:: libensemble.tools.persistent_support.PersistentSupport - .. autofunction:: send_recv - - This function performs both of the previous functions in a single statement. Its - usage typically resembles:: - - tag, Work, calc_in = my_support.send_recv(local_H_out[selected_IDs]) - if tag in [STOP_TAG, PERSIS_STOP]: - cleanup() - break - - Once the persistent generator's loop has been broken because of - the tag from the manager, it should return with an additional tag:: - - return local_H_out, persis_info, FINISHED_PERSISTENT_GEN_TAG - -See :ref:`calc_status` for more information about -the message tags. - -.. 
_gen_active_recv: - -Active receive mode -------------------- - -By default, a persistent worker is expected to -receive and send data in a *ping pong* fashion. Alternatively, -a worker can be initiated in *active receive* mode by the allocation -function (see :ref:`start_only_persistent`). -The persistent worker can then send and receive from the manager at any time. - -Ensure there are no communication deadlocks in this mode. In manager-worker message exchanges, only the worker-side -receive is blocking by default (a non-blocking option is available). - -Cancelling Simulations ----------------------- - -Previously submitted simulations can be cancelled by sending a message to the manager: - -.. currentmodule:: libensemble.tools.persistent_support.PersistentSupport -.. autofunction:: request_cancel_sim_ids - -- If a generated point is cancelled by the generator **before sending** to another worker for simulation, then it won't be sent. -- If that point has **already been evaluated** by a simulation, the ``cancel_requested`` field will remain ``True``. -- If that point is **currently being evaluated**, a kill signal will be sent to the corresponding worker; it must be manually processed in the simulation function. - -The :doc:`Borehole Calibration tutorial<../tutorials/calib_cancel_tutorial>` gives an example -of the capability to cancel pending simulations. - -Modification of existing points -------------------------------- - -To change existing fields of the History array, create a NumPy structured array where the ``dtype`` contains -the ``sim_id`` and the fields to be modified. Send this array with ``keep_state=True`` to the manager. -This will overwrite the manager's History array. - -For example, the cancellation function ``request_cancel_sim_ids`` could be replicated by -the following (where ``sim_ids_to_cancel`` is a list of integers): - -.. code-block:: python - - # Send only these fields to existing H rows and libEnsemble will slot in the change. 
- H_o = np.zeros(len(sim_ids_to_cancel), dtype=[("sim_id", int), ("cancel_requested", bool)]) - H_o["sim_id"] = sim_ids_to_cancel - H_o["cancel_requested"] = True - ps.send(H_o, keep_state=True) - -Generator initiated shutdown ----------------------------- +Tutorial sections +----------------- -If using a supporting allocation function, the generator can prompt the ensemble to shutdown -by simply exiting the function (e.g., on a test for a converged value). For example, the -allocation function :ref:`start_only_persistent` closes down -the ensemble as soon as a persistent generator returns. The usual return values should be given. +1. Introduction (this page) +2. :doc:`Standardized Generator (gest-api) ` +3. :doc:`Legacy Generator Function ` -Examples --------- +.. toctree:: + :hidden: -Examples of non-persistent and persistent generator functions -can be found :doc:`here<../examples/gen_funcs>`. + generator_standardized + generator_legacy diff --git a/docs/function_guides/generator_legacy.rst b/docs/function_guides/generator_legacy.rst new file mode 100644 index 0000000000..c8c155a363 --- /dev/null +++ b/docs/function_guides/generator_legacy.rst @@ -0,0 +1,202 @@ +Legacy Generator Function +========================= + +`Introduction `__ \|\| `Standardized Generator (gest-api) `__ \|\| **Legacy Generator Function** + +.. code-block:: python + + def my_generator(Input, persis_info, gen_specs, libE_info): + batch_size = gen_specs["user"]["batch_size"] + + Output = np.zeros(batch_size, gen_specs["out"]) + # ... + Output["x"], persis_info = generate_next_simulation_inputs(Input["f"], persis_info) + + return Output, persis_info + +Most ``gen_f`` function definitions written by users resemble:: + + def my_generator(Input, persis_info, gen_specs, libE_info): + +where: + + * ``Input`` is a selection of the :ref:`History array`, a NumPy structured array. + * :ref:`persis_info` is a dictionary containing state information. 
+ * :ref:`gen_specs` is a dictionary of generator parameters.
+ * ``libE_info`` is a dictionary containing miscellaneous entries.
+
+Valid generator functions can accept a subset of the above parameters. So a very simple generator can start::
+
+    def my_generator(Input):
+
+If ``gen_specs`` was initially defined:
+
+.. code-block:: python
+
+    gen_specs = GenSpecs(
+        gen_f=my_generator,
+        inputs=["f"],
+        outputs=[("x", float, (1,))],
+        user={"batch_size": 128},
+    )
+
+Then user parameters and a *local* array of outputs may be obtained/initialized like::
+
+    batch_size = gen_specs["user"]["batch_size"]
+    Output = np.zeros(batch_size, dtype=gen_specs["out"])
+
+This array should be populated by whatever values are generated within
+the function::
+
+    Output["x"], persis_info = generate_next_simulation_inputs(Input["f"], persis_info)
+
+Then return the array and ``persis_info`` to libEnsemble::
+
+    return Output, persis_info
+
+Between the ``Output`` definition and the ``return``, any computation can be performed.
+Users can try an :doc:`executor<../executor/ex_index>` to submit applications to parallel
+resources, or plug in components from other libraries to serve their needs.
+
+.. note::
+
+    State ``gen_f`` information like checkpointing should be
+    appended to ``persis_info``.
+
+.. _persistent-gens:
+
+**Persistent Generators**
+
+While non-persistent generators return after completing their calculation, persistent
+generators do the following in a loop:
+
+ 1. Receive simulation results and metadata; exit if metadata instructs.
+ 2. Perform analysis.
+ 3. Send subsequent simulation parameters.
+
+Persistent generators don't need to be re-initialized on each call, but are typically
+more complicated. The persistent :doc:`APOSMM<../examples/aposmm>`
+optimization generator function included with libEnsemble maintains
+local optimization subprocesses based on results from complete simulations.
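The receive/analyze/send loop above can be sketched without any libEnsemble machinery. Everything here is illustrative: ``FakeSupport`` is a toy stand-in for ``PersistentSupport``, the tag values are placeholders for the constants in ``libensemble.message_numbers``, and the "analysis" step is a made-up update rule.

```python
STOP_TAG = 0  # illustrative stand-ins; real tags live in libensemble.message_numbers
EVAL_GEN_TAG = 1


class FakeSupport:
    """Toy stand-in for PersistentSupport that signals a stop on the third exchange."""

    def __init__(self):
        self.calls = 0

    def send_recv(self, points):
        self.calls += 1
        tag = STOP_TAG if self.calls >= 3 else EVAL_GEN_TAG
        # Pretend the manager returns one simulation result per point
        results = [{"f": sum(p["x"])} for p in points]
        return tag, {}, results


def persistent_gen(support, batch_size=2):
    """Send parameters, receive results, analyze, repeat until told to stop."""
    points = [{"x": [0.5, 0.5]} for _ in range(batch_size)]
    exchanges = 0
    while True:
        tag, Work, results = support.send_recv(points)  # steps 3 and 1 of the loop
        exchanges += 1
        if tag == STOP_TAG:  # exit when the manager's metadata instructs
            break
        best = min(r["f"] for r in results)  # step 2: a toy "analysis"
        points = [{"x": [best, best]} for _ in range(batch_size)]
    return exchanges


n_exchanges = persistent_gen(FakeSupport())
```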
+ +Use ``GenSpecs.persis_in`` to specify fields to send back to the generator throughout the run. +``GenSpecs.inputs`` only describes the input fields when the function is **first called**. + +Functions for a persistent generator to communicate directly with the manager +are available in the :ref:`libensemble.tools.persistent_support` class. + +Sending/receiving data is supported by the :ref:`PersistentSupport` class:: + + from libensemble.tools import PersistentSupport + from libensemble.message_numbers import STOP_TAG, PERSIS_STOP, EVAL_GEN_TAG, FINISHED_PERSISTENT_GEN_TAG + + my_support = PersistentSupport(libE_info, EVAL_GEN_TAG) + +Implementing functions from the above class is relatively simple: + +send +^^^^ + +.. currentmodule:: libensemble.tools.persistent_support.PersistentSupport +.. autofunction:: send + +This function call typically resembles:: + + my_support.send(local_H_out[selected_IDs]) + +Note that this function has no return. + +recv +^^^^ + +.. currentmodule:: libensemble.tools.persistent_support.PersistentSupport +.. autofunction:: recv + +This function call typically resembles:: + + tag, Work, calc_in = my_support.recv() + + if tag in [STOP_TAG, PERSIS_STOP]: + cleanup() + break + +The logic following the function call is typically used to break the persistent +generator's main loop and return. + +send_recv +^^^^^^^^^ + +.. currentmodule:: libensemble.tools.persistent_support.PersistentSupport +.. autofunction:: send_recv + +This function performs both of the previous functions in a single statement. Its +usage typically resembles:: + + tag, Work, calc_in = my_support.send_recv(local_H_out[selected_IDs]) + if tag in [STOP_TAG, PERSIS_STOP]: + cleanup() + break + +Once the persistent generator's loop has been broken because of +the tag from the manager, it should return with an additional tag:: + + return local_H_out, persis_info, FINISHED_PERSISTENT_GEN_TAG + +See :ref:`calc_status` for more information about +the message tags. + +.. 
_gen_active_recv: + +**Active receive mode** + +By default, a persistent worker is expected to +receive and send data in a *ping pong* fashion. Alternatively, +a worker can be initiated in *active receive* mode by the allocation +function (see :ref:`start_only_persistent`). +The persistent worker can then send and receive from the manager at any time. + +Ensure there are no communication deadlocks in this mode. In manager-worker message exchanges, only the worker-side +receive is blocking by default (a non-blocking option is available). + +**Cancelling Simulations** + +Previously submitted simulations can be cancelled by sending a message to the manager: + +.. currentmodule:: libensemble.tools.persistent_support.PersistentSupport +.. autofunction:: request_cancel_sim_ids + +- If a generated point is cancelled by the generator **before sending** to another worker for simulation, then it won't be sent. +- If that point has **already been evaluated** by a simulation, the ``cancel_requested`` field will remain ``True``. +- If that point is **currently being evaluated**, a kill signal will be sent to the corresponding worker; it must be manually processed in the simulation function. + +The :doc:`Borehole Calibration tutorial<../tutorials/calib_cancel_tutorial>` gives an example +of the capability to cancel pending simulations. + +**Modification of existing points** + +To change existing fields of the History array, create a NumPy structured array where the ``dtype`` contains +the ``sim_id`` and the fields to be modified. Send this array with ``keep_state=True`` to the manager. +This will overwrite the manager's History array. + +For example, the cancellation function ``request_cancel_sim_ids`` could be replicated by +the following (where ``sim_ids_to_cancel`` is a list of integers): + +.. code-block:: python + + # Send only these fields to existing H rows and libEnsemble will slot in the change. 
+    H_o = np.zeros(len(sim_ids_to_cancel), dtype=[("sim_id", int), ("cancel_requested", bool)])
+    H_o["sim_id"] = sim_ids_to_cancel
+    H_o["cancel_requested"] = True
+    ps.send(H_o, keep_state=True)
+
+**Generator initiated shutdown**
+
+If using a supporting allocation function, the generator can prompt the ensemble to shutdown
+by simply exiting the function (e.g., on a test for a converged value). For example, the
+allocation function :ref:`start_only_persistent` closes down
+the ensemble as soon as a persistent generator returns. The usual return values should be given.
+
+**Examples**
+
+Examples of non-persistent and persistent generator functions
+can be found :doc:`here<../examples/gen_funcs>`.
diff --git a/docs/function_guides/generator_standardized.rst b/docs/function_guides/generator_standardized.rst
new file mode 100644
index 0000000000..d02c0619f7
--- /dev/null
+++ b/docs/function_guides/generator_standardized.rst
@@ -0,0 +1,60 @@
+Standardized Generator (gest-api)
+=================================
+
+`Introduction `__ \|\| **Standardized Generator (gest-api)** \|\| `Legacy Generator Function `__
+
+Standardized generators are classes that inherit from ``gest_api.Generator``.
+They adhere to the ``gest-api`` standard and are parameterized by a ``VOCS``
+object defining the problem's variables and objectives.
+
+A basic generator implements the ``suggest()`` and ``ingest()`` methods, which
+operate on lists of dictionaries:
+
+.. code-block:: python
+    :linenos:
+
+    import numpy as np
+    from gest_api import Generator
+    from gest_api.vocs import VOCS
+
+
+    class UniformSample(Generator):
+        """Samples over the domain specified in the VOCS."""
+
+        def __init__(self, vocs: VOCS):
+            self.vocs = vocs
+            self.rng = np.random.default_rng(1)
+            super().__init__(vocs)
+
+        def _validate_vocs(self, vocs):
+            assert len(self.vocs.variable_names), "VOCS must contain variables."
+
+    def suggest(self, n_trials):
+        output = []
+        for _ in range(n_trials):
+            trial = {}
+            for key in self.vocs.variables:
+                trial[key] = self.rng.uniform(self.vocs.variables[key].domain[0], self.vocs.variables[key].domain[1])
+            output.append(trial)
+        return output
+
+    def ingest(self, calc_in):
+        pass  # random sample so nothing to ingest
+
+libEnsemble's handling of standardized generators is specified using ``GenSpecs``:
+
+.. code-block:: python
+
+    gen_specs = GenSpecs(
+        generator=UniformSample(vocs),
+        inputs=["sim_id"],
+        persis_in=["x", "f"],
+        outputs=[("x", float, 2)],
+        vocs=vocs,
+        user={"batch_size": 128},
+    )
+
+.. note::
+    Ensure that ``gen_specs.inputs`` or ``gen_specs.persis_in`` requests at least one field
+    (like ``"sim_id"`` or ``"f"``) to be sent back, even if the generator does not
+    process them.
diff --git a/docs/function_guides/history_array.rst b/docs/function_guides/history_array.rst
index 6820b6faec..03ded946d9 100644
--- a/docs/function_guides/history_array.rst
+++ b/docs/function_guides/history_array.rst
@@ -15,25 +15,25 @@ libEnsemble uses a NumPy structured array to store information about each point
 The manager maintains a global copy. Each row contains:
 
- 1. Data generated by the :ref:`gen_f`
- 2. Resultant output from the :ref:`sim_f`
+ 1. Data generated by the :ref:`generator`
+ 2. Resultant output from the :ref:`simulator function`
  3. :ref:`Reserved fields` containing metadata
 
-When the history array is initialized, it creates fields for each
-``gen_specs["out"]`` and ``sim_specs["out"]`` entry. These entries may resemble::
+**Simulator functions** (``sim_f``) must return their data as arrays with the same
+dtype as ``sim_specs["out"]``. Alternatively, a ``simulator``
+callable in gest-api format (accepting and returning a ``dict``) can be provided via
+``SimSpecs.simulator``; libEnsemble wraps it automatically and handles the dtype
+conversion.
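The dict-to-NumPy casting described above can be sketched with plain NumPy. The field names, values, and the conversion code itself are illustrative (libEnsemble performs its own conversion internally); this only shows the *kind* of mapping between ``suggest()``'s list-of-dicts and structured History rows.

```python
import numpy as np

# Output of a hypothetical suggest() call: a list of dicts keyed by variable name
suggested = [
    {"x1": 0.2, "x2": 3.1},
    {"x1": 0.9, "x2": 7.4},
]

# Cast to a structured array: one field per key, one row per trial
dtype = [(name, float) for name in suggested[0]]
H_rows = np.zeros(len(suggested), dtype=dtype)
for i, trial in enumerate(suggested):
    for name, value in trial.items():
        H_rows[name][i] = value

# And back: structured rows to the list-of-dicts form that ingest() receives
round_tripped = [dict(zip(H_rows.dtype.names, row)) for row in H_rows]
```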
- gen_specs["out"] = [("x", float, 2), ("theta", int)] - sim_specs["out"] = [("f", float)] +**Generators** that adhere to the ``gest_api`` standard implement ``suggest()`` and +``ingest()`` methods that operate on lists of Python dictionaries. libEnsemble +automatically casts their ``dict`` outputs to NumPy for inclusion in the History array. -.. In this example, ``x`` is a two-dimensional coordinate, ``theta`` represents some -.. integer input parameter, and ``f`` is a scalar output of the simulation to be -.. run with the generated ``x`` and ``theta`` values. - -Therefore, the ``gen_f`` and ``sim_f`` must return output as NumPy -structured arrays for slotting into these fields. - -.. (The manager's history array will update any fields -.. returned to it.) +When using a ``VOCS`` object (from ``gest_api.vocs``) to parameterize ``GenSpecs`` or +``SimSpecs``, field names in the History array are derived automatically from the VOCS +variable, objective, and constraint keys. ``LibensembleGenerator`` subclasses optionally +collapse all VOCS variables into a single ``"x"`` array field (and objectives into +``"f"``) unless an explicit ``variables_mapping`` is provided. Ensure input/output field names for a function match each other or a :ref:`reserved field`:: @@ -48,45 +48,12 @@ Reserved Fields User fields and reserved fields are combined together in the final History array returned by libEnsemble. -.. Automatically tracked fields within the History array include: - -.. 1. ``sim_id``, to globally identify the point. Assigned by manager if the generator doesn't provide. -.. 2. ``cancel_requested``, - -.. The manager's history array also contains several reserved fields. These -.. include a ``sim_id`` to globally identify the point (on the manager this is -.. usually the same as the array index). The ``sim_id`` can be provided by the -.. user from the ``gen_f``, but is otherwise assigned by the manager as generated -.. points are received. - -.. 
The reserved boolean field ``cancel_requested`` can also be set in a user -.. function to request that libEnsemble cancels the evaluation of the point. - -.. The remaining reserved fields are protected (populated by libEnsemble), and -.. store information about each entry. These include boolean fields for the -.. current scheduling status of the point (``sim_started`` when the sim evaluation -.. has started out, ``sim_ended`` when sim evaluation has completed, and -.. ``gen_informed`` when the sim output has been passed back to the generator). -.. Timing fields give the time (since the epoch) corresponding to each state, and -.. when the point was generated. Other protected fields include the worker IDs on -.. which points were generated or evaluated. - -.. The user fields and the reserved fields together make up the final history array -.. returned by libEnsemble. - These reserved fields can be modified to adjust how/when a point is evaluated: * ``sim_id`` [int]: Each unit of work must have a ``sim_id``. This can be set by the generator or by the manager by default. Users should ensure these IDs are sequential and unique when running multiple generators. -.. * The generator can assign this, but users must be -.. careful to ensure that points are added in order. For example, if ``alloc_f`` -.. allows for two ``gen_f`` instances to be running simultaneously, ``alloc_f`` -.. should ensure that both don't generate points with the same ``sim_id``. -.. If the generator does not provide, then a ``sim_id`` will be assigned by the -.. manager as generated points are received. - * ``cancel_requested`` [bool]: Can be set ``True`` in a generator to request attempted cancellation of the corresponding simulation. 
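The ``cancel_requested`` array described above can be built standalone. The ``sim_ids_to_cancel`` values are illustrative; the commented-out ``ps.send`` call is the ``PersistentSupport`` usage shown elsewhere in these docs and is not executed here.

```python
import numpy as np

sim_ids_to_cancel = [3, 7, 11]  # illustrative pending sim_ids

# The dtype holds only sim_id plus the field being modified
H_o = np.zeros(len(sim_ids_to_cancel), dtype=[("sim_id", int), ("cancel_requested", bool)])
H_o["sim_id"] = sim_ids_to_cancel
H_o["cancel_requested"] = True

# ps.send(H_o, keep_state=True)  # would update the matching manager rows in place
```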
@@ -114,11 +81,9 @@ The following fields are automatically populated by libEnsemble: ``kill_sent`` [bool]: ``True`` if a kill signal was sent to worker for this entry -Other than ``"sim_id"`` and ``cancel_requested``, these fields cannot be -overwritten by user functions unless ``libE_specs["safe_mode"]`` is set to ``False``. - -.. warning:: - Adjusting values in protected fields may crash libEnsemble. +Other than ``"sim_id"`` and ``"cancel_requested"``, these fields cannot be +overwritten by user functions when ``libE_specs["safe_mode"]`` is set to ``True`` +(protection is opt-in; the default value of ``safe_mode`` is ``False``). Example Workflow updating History --------------------------------- @@ -136,10 +101,13 @@ reserved fields: ``sim_id``, ``sim_started``, and ``sim_ended`` are shown for br | -:ref:`gen_f` and :ref:`sim_f` functions accept a local history -array as the first argument that contains only the rows and fields specified. -For new function calls these will be specified by either ``gen_specs["in"]`` or -``sim_specs["in"]``. For generators this may be empty. +For legacy generator functions (``gen_f``), the function accepts a local history +array slice as the first argument containing only the rows and fields specified by +``gen_specs["in"]`` (may be empty). It returns a NumPy structured array that +libEnsemble writes into H. + +For gest-api generators, ``suggest(n)`` returns a list of dicts and ``ingest(results)`` +receives a list of dicts; libEnsemble handles all conversions to and from NumPy. | diff --git a/docs/function_guides/sim_gen_alloc_api.rst b/docs/function_guides/sim_gen_alloc_api.rst deleted file mode 100644 index 546806edef..0000000000 --- a/docs/function_guides/sim_gen_alloc_api.rst +++ /dev/null @@ -1,140 +0,0 @@ -User Function API ------------------ -.. _user_api: - -libEnsemble requires functions for generation, simulation, and allocation. 
- -While libEnsemble provides a default allocation function, the simulator and generator functions -must be specified. The required API and example arguments are given here. -:doc:`Example sim and gen functions<../examples/examples_index>` are provided in the -libEnsemble package. - -:doc:`See here for more in-depth guides to writing user functions` - -As of v0.10.0, valid simulator and generator functions -can *accept and return a smaller subset of the listed parameters and return values*. For instance, -a ``def my_simulation(one_Input) -> one_Output`` function is now accepted, -as is ``def my_generator(Input, persis_info) -> Output, persis_info``. - -sim_f API -~~~~~~~~~ -.. _api_sim_f: - -The simulator function will be called by libEnsemble's workers with *up to* the following arguments and returns:: - - Out, persis_info, calc_status = sim_f(H[sim_specs["in"]][sim_ids_from_allocf], persis_info, sim_specs, libE_info) - -Parameters: -*********** - - **H**: ``numpy structured array`` - :ref:`(example)` - - **persis_info**: :obj:`dict` - :ref:`(example)` - - **sim_specs**: :obj:`dict` - :ref:`(example)` - - **libE_info**: :obj:`dict` - :ref:`(example)` - -Returns: -******** - - **H**: ``numpy structured array`` - with keys/value-sizes matching those in sim_specs["out"] - :ref:`(example)` - - **persis_info**: :obj:`dict` - :ref:`(example)` - - **calc_status**: :obj:`int`, optional - Provides a task status to the manager and the libE_stats.txt file - :ref:`(example)` - -gen_f API -~~~~~~~~~ -.. 
_api_gen_f: - -The generator function will be called by libEnsemble's workers with *up to* the following arguments and returns:: - - Out, persis_info, calc_status = gen_f(H[gen_specs["in"]][sim_ids_from_allocf], persis_info, gen_specs, libE_info) - -Parameters: -*********** - - **H**: ``numpy structured array`` - :ref:`(example)` - - **persis_info**: :obj:`dict` - :ref:`(example)` - - **gen_specs**: :obj:`dict` - :ref:`(example)` - - **libE_info**: :obj:`dict` - :ref:`(example)` - -Returns: -******** - - **H**: ``numpy structured array`` - with keys/value-sizes matching those in gen_specs["out"] - :ref:`(example)` - - **persis_info**: :obj:`dict` - :ref:`(example)` - - **calc_status**: :obj:`int`, optional - Provides a task status to the manager and the libE_stats.txt file - :ref:`(example)` - -alloc_f API -~~~~~~~~~~~ -.. _api_alloc_f: - -The allocation function will be called by libEnsemble's manager with the following API:: - - Work, persis_info, stop_flag = alloc_f(W, H, sim_specs, gen_specs, alloc_specs, persis_info, libE_info) - -Parameters: -*********** - - **W**: ``numpy structured array`` - :doc:`(example)` - - **H**: ``numpy structured array`` - :ref:`(example)` - - **sim_specs**: :obj:`dict` - :ref:`(example)` - - **gen_specs**: :obj:`dict` - :ref:`(example)` - - **alloc_specs**: :obj:`dict` - :ref:`(example)` - - **persis_info**: :obj:`dict` - :ref:`(example)` - - **libE_info**: :obj:`dict` - Various statistics useful to the allocation function for determining how much - work has been evaluated, or if the routine should prepare to complete. See - the :doc:`allocation function guide` for more - information. - -Returns: -******** - - **Work**: :obj:`dict` - Dictionary with integer keys ``i`` for work to be sent to worker ``i``. 
- :ref:`(example)` - - **persis_info**: :obj:`dict` - :doc:`(example)<../data_structures/persis_info>` - - **stop_flag**: :obj:`int`, optional - Set to 1 to request libEnsemble manager to stop giving additional work after - receiving existing work diff --git a/docs/function_guides/simulator.rst b/docs/function_guides/simulator.rst index 46c625488d..5d69a4f79b 100644 --- a/docs/function_guides/simulator.rst +++ b/docs/function_guides/simulator.rst @@ -3,71 +3,30 @@ Simulator Functions =================== +**Introduction** \|\| `Standardized Simulator (gest-api) `__ \|\| `Legacy Simulator Function `__ + Simulator and :ref:`Generator functions` have relatively similar interfaces. Writing a Simulator ------------------- -.. code-block:: python - - def my_simulation(Input, persis_info, sim_specs, libE_info): - batch_size = sim_specs["user"]["batch_size"] - - Output = np.zeros(batch_size, sim_specs["out"]) - # ... - Output["f"], persis_info = do_a_simulation(Input["x"], persis_info) - - return Output, persis_info - -Most ``sim_f`` function definitions written by users resemble:: - - def my_simulation(Input, persis_info, sim_specs, libE_info): - -where: - - * ``Input`` is a selection of the :ref:`History array`, a NumPy structured array. - * :ref:`persis_info` is a dictionary containing state information. - * :ref:`sim_specs` is a dictionary of simulation parameters. - * ``libE_info`` is a dictionary containing libEnsemble-specific entries. - -Valid simulator functions can accept a subset of the above parameters. So a very simple simulator function can start:: - - def my_simulation(Input): - -If ``sim_specs`` was initially defined: - -.. 
code-block:: python - - sim_specs = SimSpecs( - sim_f=my_simulation, - inputs=["x"], - outputs=["f", float, (1,)], - user={"batch_size": 128}, - ) - -Then user parameters and a *local* array of outputs may be obtained/initialized like:: - - batch_size = sim_specs["user"]["batch_size"] - Output = np.zeros(batch_size, dtype=sim_specs["out"]) - -This array should be populated with output values from the simulation:: - - Output["f"], persis_info = do_a_simulation(Input["x"], persis_info) - -Then return the array and ``persis_info`` to libEnsemble:: +.. note:: + The `gest-api` simulator interface is the recommended approach for new libEnsemble projects. + The "Legacy Simulator Function" interface is supported for backward compatibility but may be deprecated in a future release. - return Output, persis_info +Tutorial sections +----------------- -Between the ``Output`` definition and the ``return``, any computation can be performed. -Users can try an :doc:`executor<../executor/overview>` to submit applications to parallel -resources, or plug in components from other libraries to serve their needs. +1. Introduction (this page) +2. :doc:`Standardized Simulator (gest-api) ` +3. :doc:`Legacy Simulator Function ` Executor -------- libEnsemble's Executors are commonly used within simulator functions to launch and monitor applications. An excellent overview is already available -:doc:`here<../executor/overview>`. +:doc:`here<../executor/ex_index>`. See the :doc:`Ensemble with an MPI Application tutorial<../tutorials/executor_forces_tutorial>` for an additional example to try out. @@ -86,3 +45,9 @@ function returns. An example routine using a persistent simulator can be found in test_persistent_sim_uniform_sampling_. .. _test_persistent_sim_uniform_sampling: https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_persistent_sim_uniform_sampling.py + +.. 
toctree:: + :hidden: + + simulator_standardized + simulator_legacy diff --git a/docs/function_guides/simulator_legacy.rst b/docs/function_guides/simulator_legacy.rst new file mode 100644 index 0000000000..3f65096abc --- /dev/null +++ b/docs/function_guides/simulator_legacy.rst @@ -0,0 +1,58 @@ +Legacy Simulator Function +========================= + +`Introduction `__ \|\| `Standardized Simulator (gest-api) `__ \|\| **Legacy Simulator Function** + +.. code-block:: python + + def my_simulation(Input, persis_info, sim_specs, libE_info): + batch_size = sim_specs["user"]["batch_size"] + + Output = np.zeros(batch_size, sim_specs["out"]) + # ... + Output["f"], persis_info = do_a_simulation(Input["x"], persis_info) + + return Output, persis_info + +Most ``sim_f`` function definitions written by users resemble:: + + def my_simulation(Input, persis_info, sim_specs, libE_info): + +where: + + * ``Input`` is a selection of the :ref:`History array`, a NumPy structured array. + * :ref:`persis_info` is a dictionary containing state information. + * :ref:`sim_specs` is a dictionary of simulation parameters. + * ``libE_info`` is a dictionary containing libEnsemble-specific entries. + +Valid simulator functions can accept a subset of the above parameters. So a very simple simulator function can start:: + + def my_simulation(Input): + +If ``sim_specs`` was initially defined: + +.. 
code-block:: python
+
+    sim_specs = SimSpecs(
+        sim_f=my_simulation,
+        inputs=["x"],
+        outputs=[("f", float, (1,))],
+        user={"batch_size": 128},
+    )
+
+Then user parameters and a *local* array of outputs may be obtained/initialized like::
+
+    batch_size = sim_specs["user"]["batch_size"]
+    Output = np.zeros(batch_size, dtype=sim_specs["out"])
+
+This array should be populated with output values from the simulation::
+
+    Output["f"], persis_info = do_a_simulation(Input["x"], persis_info)
+
+Then return the array and ``persis_info`` to libEnsemble::
+
+    return Output, persis_info
+
+Between the ``Output`` definition and the ``return``, any computation can be performed.
+Users can try an :doc:`executor<../executor/ex_index>` to submit applications to parallel
+resources, or plug in components from other libraries to serve their needs.
diff --git a/docs/function_guides/simulator_standardized.rst b/docs/function_guides/simulator_standardized.rst
new file mode 100644
index 0000000000..27b72deb5f
--- /dev/null
+++ b/docs/function_guides/simulator_standardized.rst
@@ -0,0 +1,43 @@
+Standardized Simulator (gest-api)
+=================================
+
+`Introduction `__ \|\| **Standardized Simulator (gest-api)** \|\| `Legacy Simulator Function `__
+
+Standardized simulators are plain callables (no base class required) with the signature::
+
+    def my_simulation(input_dict: dict, **kwargs) -> dict:
+
+They receive a single point as a Python dictionary (keyed by VOCS variable and constant
+names) and return a dictionary of outputs (keyed by VOCS objective, observable, and
+constraint names).
+
+.. code-block:: python
+
+    def my_simulation(input_dict: dict, **kwargs) -> dict:
+        x1 = input_dict["x1"]
+        x2 = input_dict["x2"]
+        f = (x1 - 1) ** 2 + (x2 - 2) ** 2
+        return {"f": f}
+
+Configure it with ``SimSpecs`` using a ``VOCS`` object. ``inputs`` and ``outputs``
+are derived automatically from the VOCS when not set explicitly:
+
+..
code-block:: python + + from gest_api.vocs import VOCS + from libensemble.specs import SimSpecs + + vocs = VOCS( + variables={"x1": [0, 1.0], "x2": [0, 10.0]}, + objectives={"f": "MINIMIZE"}, + ) + + sim_specs = SimSpecs( + simulator=my_simulation, + vocs=vocs, + ) + +If ``libE_info`` is needed (e.g., to access the :doc:`executor<../executor/ex_index>`), +declare it as a keyword argument and libEnsemble will pass it automatically:: + + def my_simulation(input_dict: dict, libE_info=None, **kwargs) -> dict: diff --git a/docs/index.rst b/docs/index.rst index 2a2c40075e..49a06cdc6c 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -10,7 +10,7 @@ :caption: User Guide: Quickstart - advanced_installation + advanced_installation/advanced_installation overview_usecases programming_libE running_libE @@ -20,7 +20,7 @@ :maxdepth: 1 :caption: Tutorials: - tutorials/local_sine_tutorial + tutorials/local_sine_tutorial/local_sine_tutorial tutorials/executor_forces_tutorial tutorials/forces_gpu_tutorial tutorials/gpcam_tutorial @@ -35,7 +35,6 @@ examples/gest_api examples/gen_funcs examples/sim_funcs - examples/alloc_funcs examples/calling_scripts Submission Scripts @@ -43,11 +42,13 @@ :maxdepth: 1 :caption: Additional References: + function_guides/history_array + resource_manager/resources_index + function_guides/allocator FAQ known_issues release_notes contributing - posters .. toctree:: :maxdepth: 1 @@ -55,3 +56,4 @@ dev_guide/release_management/release_index dev_guide/dev_API/developer_API + bibliography diff --git a/docs/introduction.rst b/docs/introduction.rst index 4b36943398..87ccac72f6 100644 --- a/docs/introduction.rst +++ b/docs/introduction.rst @@ -1,7 +1,7 @@ .. include:: ../README.rst :start-after: after_badges_rst_tag -See the :doc:`tutorial` for a step-by-step beginners guide. +See the :doc:`tutorial` for a step-by-step beginner's guide. See the `user guide`_ for more information. 
diff --git a/docs/introduction_latex.rst b/docs/introduction_latex.rst index e7750bac5f..512282dbfe 100644 --- a/docs/introduction_latex.rst +++ b/docs/introduction_latex.rst @@ -39,7 +39,6 @@ .. _pytest-timeout: https://pypi.org/project/pytest-timeout/ .. _pytest: https://pypi.org/project/pytest/ .. _Python: http://www.python.org -.. _pyyaml: https://pyyaml.org/ .. _Quickstart: https://libensemble.readthedocs.io/en/main/introduction.html .. _ReadtheDocs: http://libensemble.readthedocs.org/ .. _SciPy: http://www.scipy.org @@ -51,7 +50,6 @@ .. _SWIG: http://swig.org/ .. _tarball: https://github.com/Libensemble/libensemble/releases/latest .. _Tasmanian: https://github.com/ORNL/Tasmanian -.. _tomli: https://pypi.org/project/tomli/ .. _tqdm: https://tqdm.github.io/ .. _user guide: https://libensemble.readthedocs.io/en/latest/programming_libE.html .. _VTMOP: https://github.com/Libensemble/libe-community-examples#vtmop diff --git a/docs/known_issues.rst b/docs/known_issues.rst index 89c596aae5..a68f1bcf47 100644 --- a/docs/known_issues.rst +++ b/docs/known_issues.rst @@ -19,8 +19,6 @@ may occur when using libEnsemble. * Local comms mode (multiprocessing) may fail if MPI is initialized before forking processors. This is thought to be responsible for issues combining multiprocessing with PETSc on some platforms. -* Remote detection of logical cores via ``LSB_HOSTS`` (e.g., Summit) returns the - number of physical cores as SMT info not available. * TCP mode does not support (1) more than one libEnsemble call in a given script or (2) the auto-resources option to the Executor. diff --git a/docs/latex_index.rst b/docs/latex_index.rst index 556a421a24..e2fd0ffb90 100644 --- a/docs/latex_index.rst +++ b/docs/latex_index.rst @@ -34,7 +34,7 @@ other libEnsemble information. .. 
toctree:: :maxdepth: 3 - advanced_installation + advanced_installation/advanced_installation tutorials/tutorials FAQ known_issues diff --git a/docs/nitpicky b/docs/nitpicky index e43a0760bb..5f46003f50 100644 --- a/docs/nitpicky +++ b/docs/nitpicky @@ -47,7 +47,6 @@ py:class libensemble.resources.platforms.Perlmutter py:class libensemble.resources.platforms.PerlmutterCPU py:class libensemble.resources.platforms.PerlmutterGPU py:class libensemble.resources.platforms.Polaris -py:class libensemble.resources.platforms.Summit py:class libensemble.resources.rset_resources.RSetResources py:class libensemble.resources.env_resources.EnvResources py:class libensemble.resources.resources.Resources @@ -57,3 +56,15 @@ py:meth libensemble.tools.save_libE_output # Types specifying objects that can dramatically vary py:class comm py:class communicator + +# Additional nitpicky targets from recent Sphinx warnings +py:class libensemble.resources.platforms.Lumi +py:class libensemble.resources.platforms.LumiGPU +py:class numpy._typing._array_like._ScalarT +py:class Comm +py:class npt.DTypeLike +py:class libensemble.generators.PersistentGenInterfacer +py:class gest_api.vocs.VOCS +py:class libensemble.generators.LibensembleGenerator +py:class ~_ScalarT +py:class numpy.random._generator.Generator diff --git a/docs/overview_usecases.rst b/docs/overview_usecases.rst index 5467bab3eb..04ebb5e14f 100644 --- a/docs/overview_usecases.rst +++ b/docs/overview_usecases.rst @@ -5,11 +5,11 @@ Manager, Workers, Generators, and Simulators ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. 
begin_overview_rst_tag -libEnsemble's **manager** allocates work to **workers**, -which perform computations via **generators** and **simulators**: +libEnsemble's **manager** allocates work from **generators** to **workers**, +which perform computations via **simulators**: -* :ref:`generator`: Generates inputs for the *simulator* -* :ref:`simulator`: Performs an evaluation using parameters from the *generator* +* :ref:`generator`: Generates inputs for the *simulator* +* :ref:`simulator`: Performs an evaluation using parameters from the *generator* .. figure:: images/adaptiveloop.png :alt: Adaptive loops @@ -18,87 +18,57 @@ which perform computations via **generators** and **simulators**: | -.. figure:: images/diagram_with_persis.png - :alt: libE component diagram - :align: center - :scale: 40 - -| - -An :doc:`executor` interface is available so generators and simulators +An :doc:`executor` interface is available so generators and simulators can launch and monitor external applications. -libEnsemble uses a NumPy structured array called the :ref:`history array` -to record all simulations and generated values. - -Allocator Function -~~~~~~~~~~~~~~~~~~ - -* :ref:`allocator`: Decides whether a simulator or generator should be - invoked (and with what inputs/resources) as workers become available - -The default allocator (``alloc_f``) prompts workers to run the highest-priority simulator work. -If a worker is idle and no simulator work is available, that worker is prompted to query the generator. - -The default allocator is appropriate for the majority of use cases but can be customized -for users interested in more advanced allocation strategies. +All simulations and generated values are recorded in a NumPy +structured array called the :ref:`history array`. Example Use Cases ~~~~~~~~~~~~~~~~~ .. begin_usecases_rst_tag -Below are some expected libEnsemble use cases that we support (or are working -to support): - .. 
dropdown:: **Click Here for Use-Cases** * A user wants to optimize a simulation calculation. The simulation may already be using parallel resources but not a large fraction of a computer. libEnsemble can coordinate concurrent evaluations of the - simulation ``sim_f`` at multiple parameter values based on candidate parameter - values produced by ``gen_f`` (possibly after each ``sim_f`` output). + simulator at multiple parameter values based on candidate parameter + values produced by the generator (possibly after each simulator output). - * A user has a ``gen_f`` that produces meshes for a - ``sim_f``. Based on the ``sim_f`` output, the ``gen_f`` can refine a mesh or + * A user has a generator that produces meshes for a + simulator. Based on the simulator output, the generator can refine a mesh or produce a new mesh. libEnsemble ensures that generated meshes can be reused by multiple simulations without requiring data movement. - * A user wants to evaluate a simulation ``sim_f`` with different sets of + * A user wants to evaluate a simulation with different sets of parameters, each drawn from a set of possible values. Some parameter values are known to cause the simulation to fail. libEnsemble can stop unresponsive evaluations and recover computational resources for future - evaluations. The ``gen_f`` can update the sampling strategy after discovering - regions where evaluations of ``sim_f`` fail. + evaluations. The generator can update the sampling strategy after discovering + regions where evaluations of the simulator fail. - * A user has a simulation ``sim_f`` that requires calculating multiple - expensive quantities, some of which depend on other quantities. The ``sim_f`` + * A user has a simulation that requires calculating multiple + expensive quantities, some of which depend on other quantities. The simulator can monitor intermediate quantities to stop related calculations early and preempt future calculations associated with poor parameter values. 
- * A user has a ``sim_f`` with multiple fidelities, where higher-fidelity - evaluations require more computational resources. A ``gen_f``/``alloc_f`` - pair decides which parameters should be evaluated and - at what fidelity level. libEnsemble coordinates these evaluations without - requiring the user to write parallel code. + * A user has a simulation with multiple fidelities, where higher-fidelity + evaluations require more computational resources. The generator and allocator + decide which parameters should be evaluated and at what fidelity level. libEnsemble + coordinates these evaluations without requiring the user to write parallel code. - * A user wishes to identify multiple local optima for a ``sim_f``. In addition, + * A user wishes to identify multiple local optima for a simulation. In addition, sensitivity analysis is desired at each identified optimum. libEnsemble can - use points from the APOSMM ``gen_f`` to identify optima. After a point is - determined to be an optimum, a different ``gen_f`` can generate the - parameter sets required for sensitivity analysis of ``sim_f``. + use points from the APOSMM generator to identify optima. After a point is + determined to be an optimum, a different generator can generate the + parameter sets required for sensitivity analysis of the simulation. - Combinations of these use cases are also supported. For example, libEnsemble - can be used to solve optimization problems where simulations fail - frequently. + Combinations of these use cases are also supported. Glossary ~~~~~~~~ -Here we define some terms used throughout libEnsemble's code and documentation. -Although many of these terms seem straightforward, defining them helps reduce -confusion when communicating about libEnsemble and -its capabilities. - .. dropdown:: **Click Here for Glossary** :open: @@ -107,46 +77,26 @@ its capabilities. workers and collects their output. 
* **Worker**: libEnsemble processes responsible for performing units of work, - which may include executing tasks or submitting external jobs. Workers run - generation and simulation routines and return results to the manager. + which may include executing tasks or submitting external jobs. Workers typically + run simulators and return results to the manager. - * **Calling Script**: libEnsemble is typically imported, parameterized, and - initiated in a single Python file referred to as a *calling script*. ``sim_f`` - and ``gen_f`` functions are commonly configured and parameterized here. - - * **User function**: A generator, simulator, or allocation function. These - Python functions govern the libEnsemble workflow. They - must conform to the libEnsemble API for each respective user function, but otherwise can - be created or modified by the user. - libEnsemble includes many examples of each type. - - * **Executor**: The executor provides a simple, portable interface for - launching and managing user tasks (applications). Multiple executors are + * **Executor**: A simple, portable interface for + launching and managing tasks (applications). Multiple executors are available, including the base ``Executor`` and ``MPIExecutor``. - * **Submit**: To enqueue or indicate that one or more jobs or tasks should be - launched. When using the libEnsemble Executor, a *submitted* task is either executed + * **Submit**: A *submitted* task is either executed immediately or queued for execution. - * **Tasks**: Subprocesses or independent units of work. Workers perform - tasks as directed by the manager. Tasks may include launching external - programs for execution using the Executor. - - * **Persistent**: Typically, a worker communicates with the manager - before and after initiating a user ``gen_f`` or ``sim_f`` calculation. 
Persistent user - functions instead communicate directly with the manager during execution, - allowing them to maintain and update data structures efficiently. These - calculations and their assigned workers are referred to as *persistent*. + * **Tasks**: Subprocesses or independent units of work. Tasks result from + launching external programs for execution using the Executor. - * **Resource Manager**: libEnsemble includes a built-in resource manager that can detect - (or be provided with) available resources (e.g., a node list). Resources are - divided among workers using *resource sets* and can be dynamically - reassigned. + * **Resource Manager**: libEnsemble module that detects + (or is provided with) available resources (e.g., a list of nodes). *Resource sets* are + divided among workers and can be dynamically reassigned. * **Resource Set**: The smallest unit of resources that can be assigned (and dynamically reassigned) to workers. By default this is the provisioned resources - divided by the number of workers. It can also be set - explicitly using the ``num_resource_sets`` ``libE_specs`` option. + divided by the number of workers. It can also be set explicitly using the ``num_resource_sets`` ``libE_specs`` option. * **Slot**: Resource sets enumerated on a node (starting from zero). If a resource set spans multiple nodes, each node is considered to have slot diff --git a/docs/platforms/aurora.rst b/docs/platforms/aurora.rst index 4865ba0c18..c29ed0bc08 100644 --- a/docs/platforms/aurora.rst +++ b/docs/platforms/aurora.rst @@ -27,7 +27,7 @@ To obtain libEnsemble:: pip install libensemble -See :doc:`here<../advanced_installation>` for more information on advanced +See :doc:`here<../advanced_installation/advanced_installation>` for more information on advanced options for installing libEnsemble, including using Spack. 
Example diff --git a/docs/platforms/bebop.rst b/docs/platforms/bebop.rst index e57172c1b3..2682a54863 100644 --- a/docs/platforms/bebop.rst +++ b/docs/platforms/bebop.rst @@ -46,7 +46,7 @@ To install via ``conda``: conda config --add channels conda-forge conda install -c conda-forge libensemble -See :doc:`here<../advanced_installation>` for more information on advanced options +See :doc:`here<../advanced_installation/advanced_installation>` for more information on advanced options for installing libEnsemble. Job Submission @@ -75,7 +75,7 @@ Now run your script with four workers (one for generator and three for simulatio **three** workers to one allocated compute node, with three nodes available for the workers to launch calculations with the Executor or a launch command. This is an example of running in :doc:`centralized` mode, and, -if using the :doc:`Executor<../executor/mpi_executor>`, libEnsemble should +if using the :doc:`Executor<../executor/ex_index>`, libEnsemble should be initiated with ``libE_specs["dedicated_mode"]=True`` .. note:: diff --git a/docs/platforms/example_scripts.rst b/docs/platforms/example_scripts.rst index d534f0c662..d6d7892abd 100644 --- a/docs/platforms/example_scripts.rst +++ b/docs/platforms/example_scripts.rst @@ -95,10 +95,3 @@ SLURM - MPI / Distributed Mode (co-locate workers & MPI applications) .. literalinclude:: ../../examples/libE_submission_scripts/submit_distrib_mpi4py.sh :caption: /examples/libE_submission_scripts/submit_distrib_mpi4py.sh :language: bash - -Summit (Decommissioned) - On Launch Nodes with Multiprocessing -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. 
literalinclude:: ../../examples/libE_submission_scripts/summit_submit_mproc.sh - :caption: /examples/libE_submission_scripts/summit_submit_mproc.sh - :language: bash diff --git a/docs/platforms/frontier.rst b/docs/platforms/frontier.rst index 4fdc7a0b36..a57ffadd97 100644 --- a/docs/platforms/frontier.rst +++ b/docs/platforms/frontier.rst @@ -33,7 +33,7 @@ libEnsemble can be installed via pip:: pip install libensemble -See :doc:`advanced installation<../advanced_installation>` for other installation options. +See :doc:`advanced installation<../advanced_installation/advanced_installation>` for other installation options. Example ------- diff --git a/docs/platforms/improv.rst b/docs/platforms/improv.rst index bdb2269a85..dfe40da138 100644 --- a/docs/platforms/improv.rst +++ b/docs/platforms/improv.rst @@ -15,7 +15,7 @@ To create a conda environment and install libEnsemble:: conda activate improv_libe_env pip install libensemble -See :doc:`here<../advanced_installation>` for more information on advanced +See :doc:`here<../advanced_installation/advanced_installation>` for more information on advanced options for installing libEnsemble, including using Spack. Job Submission diff --git a/docs/platforms/perlmutter.rst b/docs/platforms/perlmutter.rst index a2768e1d26..a1c79703f8 100644 --- a/docs/platforms/perlmutter.rst +++ b/docs/platforms/perlmutter.rst @@ -50,7 +50,7 @@ by one of the following ways. conda config --add channels conda-forge conda install -c conda-forge libensemble -See :doc:`advanced installation<../advanced_installation>` for other installation options. +See :doc:`advanced installation<../advanced_installation/advanced_installation>` for other installation options. Job Submission -------------- @@ -161,14 +161,14 @@ Some FAQs specific to Perlmutter. See more on the :doc:`FAQ<../FAQ>` page. #SBATCH --gpus-per-task=1 Instead provide these to sub-tasks via the ``extra_args`` option to - the :doc:`MPIExecutor<../executor/mpi_executor>` ``submit`` function. 
+ the :doc:`MPIExecutor<../executor/ex_index>` ``submit`` function. .. dropdown:: **GTL_DEBUG: [0] cudaHostRegister: no CUDA-capable device is detected** If using the environment variable ``MPICH_GPU_SUPPORT_ENABLED``, then ``srun`` commands, at time of writing, expect an option for allocating GPUs (e.g.~ ``--gpus-per-task=1`` would allocate one GPU to each MPI task of the MPI run). It is recommended that tasks submitted - via the :doc:`MPIExecutor<../executor/mpi_executor>` specify this in the ``extra_args`` + via the :doc:`MPIExecutor<../executor/ex_index>` specify this in the ``extra_args`` option to the ``submit`` function (rather than using an ``#SBATCH`` command). This is needed even when using setting ``CUDA_VISIBLE_DEVICES`` or other options. diff --git a/docs/platforms/platforms_index.rst b/docs/platforms/platforms_index.rst index 591ed445db..d1de30c45d 100644 --- a/docs/platforms/platforms_index.rst +++ b/docs/platforms/platforms_index.rst @@ -19,7 +19,7 @@ Centralized Running ------------------- The default communications scheme places the manager and workers on the first node. -The :doc:`MPI Executor<../executor/mpi_executor>` can then be invoked by each +The :doc:`MPI Executor<../executor/ex_index>` can then be invoked by each simulation worker, and libEnsemble will distribute user applications across the node allocation. This is the **most common approach** where each simulation runs an MPI application. @@ -103,7 +103,7 @@ the nodes within that allocation. *How does libEnsemble know where to run tasks (user applications)?* -The libEnsemble :doc:`MPI Executor<../executor/mpi_executor>` can be initialized from the user calling +The libEnsemble :doc:`MPI Executor<../executor/ex_index>` can be initialized from the user calling script, and then used by workers to run tasks. The Executor will automatically detect the nodes available on most systems. Alternatively, the user can provide a file called **node_list** in the run directory. 
By default, the Executor will divide up the nodes evenly to each worker. @@ -113,16 +113,9 @@ Mapping Tasks to Resources The :ref:`resource manager` detects node lists from :ref:`common batch schedulers`, -and partitions these to workers. The :doc:`MPI Executor<../executor/mpi_executor>` +and partitions these to workers. The :doc:`MPI Executor<../executor/ex_index>` accesses the resources available to the current worker when launching tasks. -Zero-resource workers ---------------------- - -Users with persistent ``gen_f`` functions may notice that the persistent workers -are still automatically assigned system resources. This can be resolved by -:ref:`fixing the number of resource sets`. - Assigning GPUs -------------- @@ -145,7 +138,7 @@ System detection for resources can be overridden using the :ref:`resource_info` for more. +`custom_info` argument. See the :doc:`MPI Executor<../executor/ex_index>` for more. Systems with Launch/MOM Nodes ----------------------------- @@ -153,8 +146,7 @@ Systems with Launch/MOM Nodes Some large systems have a 3-tier node setup. That is, they have a separate set of launch nodes (known as MOM nodes on Cray Systems). User batch jobs or interactive sessions run on a launch node. Most such systems supply a special MPI runner that has some application-level scheduling -capability (e.g., ``aprun``, ``jsrun``). MPI applications can only be submitted from these nodes. Examples -of these systems include Summit and Sierra. +capability (e.g., ``aprun``, ``jsrun``). MPI applications can only be submitted from these nodes. There are two ways of running libEnsemble on these kinds of systems. The first, and simplest, is to run libEnsemble on the launch nodes. This is often sufficient if the worker's simulation @@ -238,7 +230,6 @@ libEnsemble on specific HPC systems. 
improv perlmutter polaris - summit srun example_scripts diff --git a/docs/platforms/polaris.rst b/docs/platforms/polaris.rst index 5fdf82aaae..21518ccf42 100644 --- a/docs/platforms/polaris.rst +++ b/docs/platforms/polaris.rst @@ -36,7 +36,7 @@ environment (if you need ``conda install``). More details at `Python for Polaris pip install libensemble -See :doc:`here<../advanced_installation>` for more information on advanced options +See :doc:`here<../advanced_installation/advanced_installation>` for more information on advanced options for installing libEnsemble, including using Spack. Job Submission diff --git a/docs/platforms/srun.rst b/docs/platforms/srun.rst index 5ec8a64839..101b441bc5 100644 --- a/docs/platforms/srun.rst +++ b/docs/platforms/srun.rst @@ -11,7 +11,7 @@ Example SLURM submission scripts for various systems are given in the :doc:`examples`. Further examples are given in some of the specific platform guides (e.g., :doc:`Perlmutter guide`) -By default, the :doc:`MPIExecutor<../executor/mpi_executor>` uses ``mpirun`` +By default, the :doc:`MPIExecutor<../executor/ex_index>` uses ``mpirun`` as a priority over ``srun`` as it works better in some cases. If ``mpirun`` does not work well, then try telling the MPIExecutor to use ``srun`` when it is initiated in the calling script:: @@ -45,14 +45,14 @@ when assigning more than one worker to any given node. #SBATCH --gpus-per-task=1 Instead provide these to sub-tasks via the ``extra_args`` option to the - :doc:`MPIExecutor<../executor/mpi_executor>` ``submit`` function. + :doc:`MPIExecutor<../executor/ex_index>` ``submit`` function. .. dropdown:: **GTL_DEBUG: [0] cudaHostRegister: no CUDA-capable device is detected** If using the environment variable ``MPICH_GPU_SUPPORT_ENABLED``, then ``srun`` commands may expect an option for allocating GPUs (e.g., ``--gpus-per-task=1`` would allocate one GPU to each MPI task of the MPI run). 
It is recommended that tasks submitted - via the :doc:`MPIExecutor<../executor/mpi_executor>` specify this in the ``extra_args`` + via the :doc:`MPIExecutor<../executor/ex_index>` specify this in the ``extra_args`` option to the ``submit`` function (rather than using an ``#SBATCH`` command). If running the libEnsemble calling script with ``srun``, then it is recommended that diff --git a/docs/platforms/summit.rst b/docs/platforms/summit.rst deleted file mode 100644 index aed321f8e2..0000000000 --- a/docs/platforms/summit.rst +++ /dev/null @@ -1,206 +0,0 @@ -======================= -Summit (Decommissioned) -======================= - -Summit_ was an IBM AC922 system located at the Oak Ridge Leadership Computing -Facility (OLCF). Each of the approximately 4,600 compute nodes on Summit contained two -IBM POWER9 processors and six NVIDIA Volta V100 accelerators. - -Summit featured three tiers of nodes: login, launch, and compute nodes. - -Users on login nodes submit batch runs to the launch nodes. -Batch scripts and interactive sessions run on the launch nodes. Only the launch -nodes can submit MPI runs to the compute nodes via ``jsrun``. - -These docs are maintained to guide libEnsemble's usage on three-tier systems and/or -`jsrun` systems similar to Summit. - -Configuring Python ------------------- - -Begin by loading the Python 3 Anaconda module:: - - $ module load python - -You can now create and activate your own custom conda_ environment:: - - conda create --name myenv python=3.11 - export PYTHONNOUSERSITE=1 # Make sure get python from conda env - . activate myenv - -If you are installing any packages with extensions, ensure that the correct compiler -module is loaded. If using mpi4py_, this must be installed from source, -referencing the compiler. 
Currently, mpi4py must be built with gcc:: - - module load gcc - -With your environment activated, run :: - - CC=mpicc MPICC=mpicc pip install mpi4py --no-binary mpi4py - -Installing libEnsemble ----------------------- - -Obtaining libEnsemble is now as simple as ``pip install libensemble``. -Your prompt should be similar to the following line: - -.. code-block:: console - - (my_env) user@login5:~$ pip install libensemble - -.. note:: - If you encounter pip errors, run ``python -m pip install --upgrade pip`` first - -Or, you can install via ``conda``: - -.. code-block:: console - - (my_env) user@login5:~$ conda config --add channels conda-forge - (my_env) user@login5:~$ conda install -c conda-forge libensemble - -See :doc:`here<../advanced_installation>` for more information on advanced options -for installing libEnsemble. -Special note on resource sets and Executor submit options - ---------------------------------------------------------- - -When using the portable MPI run configuration options (e.g., num_nodes) to the -:doc:`MPIExecutor<../executor/mpi_executor>` ``submit`` function, it is important -to note that, due to the resource sets used on Summit, the options refer to -resource sets as follows: - -- num_procs (int, optional) – The total number resource sets for this run. - -- num_nodes (int, optional) – The number of nodes on which to submit the run. - -- procs_per_node (int, optional) – The number of resource sets per node. - -It is recommended that the user defines a resource set as the minimal configuration -of CPU cores/processes and GPUs. These can be added to the ``extra_args`` option -of the *submit* function. Alternatively, the portable options can be ignored and -everything expressed in ``extra_args``. 
- -For example, the following *jsrun* line would run three resource sets, -each having one core (with one process), and one GPU, along with some extra options:: - - jsrun -n 3 -a 1 -g 1 -c 1 --bind=packed:1 --smpiargs="-gpu" - -To express this line in the ``submit`` function may look -something like the following:: - - exctr = Executor.executor - task = exctr.submit(app_name="mycode", - num_procs=3, - extra_args="-a 1 -g 1 -c 1 --bind=packed:1 --smpiargs="-gpu"" - app_args="-i input") - -This would be equivalent to:: - - exctr = Executor.executor - task = exctr.submit(app_name="mycode", - extra_args="-n 3 -a 1 -g 1 -c 1 --bind=packed:1 --smpiargs="-gpu"" - app_args="-i input") - -The libEnsemble resource manager works out the resources available to each worker, -but unlike some other systems, ``jsrun`` on Summit dynamically schedules runs to -available slots across and within nodes. It can also queue tasks. This allows variable -size runs to easily be handled on Summit. If oversubscription to the `jsrun` system -is desired, then libEnsemble's resource manager can be disabled in the -calling script via:: - - libE_specs["disable_resource_manager"] = True - -In the above example, the task being submitted used three GPUs, which is half those -available on a Summit node, and thus two such tasks may be allocated to each node -(from different workers), if they were running at the same time. - -Job Submission --------------- - -Summit used LSF_ for job management and submission. For libEnsemble, the most -important command is ``bsub`` for submitting batch scripts from the login nodes -to execute on the launch nodes. - -It is recommended to run libEnsemble on the launch nodes (assuming workers are -submitting MPI applications) using the ``local`` communications mode (multiprocessing). 
- -Interactive Runs -^^^^^^^^^^^^^^^^ - -You can run interactively with ``bsub`` by specifying the ``-Is`` flag, -similarly to the following:: - - $ bsub -W 30 -P [project] -nnodes 8 -Is - -This will place you on a launch node. - -.. note:: - You will need to reactivate your conda virtual environment. - -Batch Runs -^^^^^^^^^^ - -Batch scripts specify run settings using ``#BSUB`` statements. The following -simple example depicts configuring and launching libEnsemble to a launch node with -multiprocessing. This script also assumes the user is using the ``parse_args()`` -convenience function from libEnsemble's :doc:`tools module<../utilities>`. - -.. code-block:: bash - - #!/bin/bash -x - #BSUB -P - #BSUB -J libe_mproc - #BSUB -W 60 - #BSUB -nnodes 128 - #BSUB -alloc_flags "smt1" - - # --- Prepare Python --- - - # Load conda module and gcc. - module load python - module load gcc - - # Name of conda environment - export CONDA_ENV_NAME=my_env - - # Activate conda environment - export PYTHONNOUSERSITE=1 - source activate $CONDA_ENV_NAME - - # --- Prepare libEnsemble --- - - # Name of calling script - export EXE=calling_script.py - - # Communication Method - export COMMS="--comms local" - - # Number of workers. - export NWORKERS="--nworkers 128" - - hash -r # Check no commands hashed (pip/python...) - - # Launch libE - python $EXE $COMMS $NWORKERS > out.txt 2>&1 - -With this saved as ``myscript.sh``, allocating, configuring, and queueing -libEnsemble on Summit is achieved by running :: - - $ bsub myscript.sh - -Example submission scripts are also given in the :doc:`examples`. - -Launching User Applications from libEnsemble Workers ----------------------------------------------------- - -Only the launch nodes can submit MPI runs to the compute nodes via ``jsrun``. -This can be accomplished in user simulator functions directly. 
However, it is highly -recommended that the :doc:`Executor<../executor/ex_index>` interface -be used inside the simulator or generator, because this provides a portable interface -with many advantages including automatic resource detection, portability, -launch failure resilience, and ease of use. - -.. _conda: https://conda.io/en/latest/ -.. _LSF: https://www.olcf.ornl.gov/wp-content/uploads/2018/12/summit_workshop_fuson.pdf -.. _mpi4py: https://mpi4py.readthedocs.io/en/stable/ -.. _Summit: https://www.olcf.ornl.gov/olcf-resources/compute-systems/summit/ diff --git a/docs/posters.rst b/docs/posters.rst deleted file mode 100644 index 78c9af9117..0000000000 --- a/docs/posters.rst +++ /dev/null @@ -1,23 +0,0 @@ -Posters and Presentations -========================= - -Exascale Computing Project 2023 -------------------------------- - -.. raw:: html - - - -SciPy 2020 ----------- - -.. raw:: html - - - -CSE 2019 --------- - -.. raw:: html - - diff --git a/docs/programming_libE.rst b/docs/programming_libE.rst index f4ffaecac6..03e8e97fb6 100644 --- a/docs/programming_libE.rst +++ b/docs/programming_libE.rst @@ -1,8 +1,6 @@ Constructing Workflows ====================== -We now give greater detail in programming with libEnsemble. - .. toctree:: :maxdepth: 2 :caption: The Basics @@ -10,8 +8,6 @@ We now give greater detail in programming with libEnsemble. libe_module data_structures/data_structures history_output_logging - function_guides/history_array - resource_manager/resources_index .. toctree:: :caption: Writing User Functions: diff --git a/docs/resource_manager/overview.rst b/docs/resource_manager/overview.rst index 556e9c0f34..f980eca3b3 100644 --- a/docs/resource_manager/overview.rst +++ b/docs/resource_manager/overview.rst @@ -9,7 +9,7 @@ libEnsemble comes with built-in resource management. This entails the core counts, and GPUs), and the allocation of resources to workers. By default, the provisioned resources are divided by the number of workers. 
-libEnsemble's :doc:`MPI Executor<../executor/mpi_executor>` is aware of +libEnsemble's :doc:`MPI Executor<../executor/ex_index>` is aware of these supplied resources, and if not given any of ``num_nodes``, ``num_procs``, or ``procs_per_node`` in the submit function, it will try to use all nodes and CPU cores available to the worker. @@ -119,7 +119,7 @@ Accessing resources from the simulation function In the user's simulation function, the resources supplied to the worker can be :doc:`interrogated directly via the resources class attribute`. -libEnsemble's executors (e.g., the :doc:`MPI Executor<../executor/mpi_executor>`) are +libEnsemble's executors (e.g., the :doc:`MPI Executor<../executor/ex_index>`) are aware of these supplied resources, and if not given any of ``num_nodes``, ``num_procs``, or ``procs_per_node`` in the submit function, it will try to use all nodes and CPU cores available. diff --git a/docs/resource_manager/resource_detection.rst b/docs/resource_manager/resource_detection.rst index 2048eb2793..e294b82b9f 100644 --- a/docs/resource_manager/resource_detection.rst +++ b/docs/resource_manager/resource_detection.rst @@ -4,7 +4,7 @@ Resource Detection ================== The resource manager can detect system resources, and partition -these to workers. The :doc:`MPI Executor<../executor/mpi_executor>` +these to workers. The :doc:`MPI Executor<../executor/ex_index>` accesses the resources available to the current worker when launching tasks. Node-lists are detected by an environment variable on the following systems: diff --git a/docs/resource_manager/resources_index.rst b/docs/resource_manager/resources_index.rst index 5ab1f951b3..1802d13872 100644 --- a/docs/resource_manager/resources_index.rst +++ b/docs/resource_manager/resources_index.rst @@ -7,9 +7,7 @@ libEnsemble comes with built-in resource management. This entails the detection of available resources (e.g., nodelists, core counts, and GPUs), and the allocation of resources to workers. 
-Resource management can be disabled by setting -``libE_specs["disable_resource_manager"] = True``. This will prevent libEnsemble -from doing any resource detection or management. +It can be disabled by setting ``libE_specs["disable_resource_manager"] = True``. .. toctree:: :maxdepth: 2 @@ -19,4 +17,4 @@ from doing any resource detection or management. overview resource_detection scheduler_module - Worker Resources Module (query resources for current worker) + worker_resources diff --git a/docs/resource_manager/zero_resource_workers.rst b/docs/resource_manager/zero_resource_workers.rst deleted file mode 100644 index f60d854336..0000000000 --- a/docs/resource_manager/zero_resource_workers.rst +++ /dev/null @@ -1,69 +0,0 @@ -.. _zero_resource_workers: - -Zero-resource workers -~~~~~~~~~~~~~~~~~~~~~ - -Users with persistent ``gen_f`` functions may notice that the persistent workers -are still automatically assigned resources. This can be wasteful if those workers -only run ``gen_f`` functions in-place (i.e., they do not use the Executor -to submit applications to allocated nodes). Suppose the user is using the -:meth:`parse_args()` function and runs:: - - python run_ensemble_persistent_gen.py --nworkers 3 - -If three nodes are available in the node allocation, the result may look like the -following. - - .. image:: ../images/persis_wasted_node.png - :alt: persis_wasted_node - :scale: 40 - :align: center - -To avoid the the wasted node above, add an extra worker:: - - python run_ensemble_persistent_gen.py --nworkers 4 - -and in the calling script (*run_ensemble_persistent_gen.py*), explicitly set the -number of resource sets to the number of workers that will be running simulations. - -.. code-block:: python - - nworkers, is_manager, libE_specs, _ = parse_args() - libE_specs["num_resource_sets"] = nworkers - 1 - -When the ``num_resource_sets`` option is used, libEnsemble will use the dynamic -resource scheduler, and any worker may assign work to any node. 
This works well -for most users. - - .. image:: ../images/persis_add_worker.png - :alt: persis_add_worker - :scale: 40 - :align: center - -**Optional**: An alternative way to express the above would be to use the command -line:: - - python run_ensemble_persistent_gen.py --comms local --nsim_workers 3 - -This would automatically set the ``num_resource_sets`` option and add a single -worker for the persistent generator - a common use-case. - -In general, the number of resource sets should be set to enable the maximum -concurrency desired by the ensemble, taking into account generators and simulators. - -Users can set generator resources using the *libE_specs* options -``gen_num_procs`` and/or ``gen_num_gpus``, which take integer values. -If only ``gen_num_gpus`` is set, then the number of processors is set to match. - -To vary generator resources, ``persis_info`` settings can be used in allocation -functions before calling the ``gen_work`` support function. This takes the -same options (``gen_num_procs`` and ``gen_num_gpus``). - -Alternatively, the setting ``persis_info["gen_resources"]`` can also be set to -a number of resource sets. - -The available nodes are always divided by the number of resource sets, and there -may be multiple nodes or a partition of a node in each resource set. If the split -is uneven, resource sets are not split between nodes. For example, if there are -two nodes and five resource sets, one node will have three resource sets, and -the other will have two. diff --git a/docs/running_libE.rst b/docs/running_libE.rst index 50e58afbe5..6329e13e27 100644 --- a/docs/running_libE.rst +++ b/docs/running_libE.rst @@ -3,124 +3,97 @@ Running libEnsemble =================== -Introduction ------------- - -libEnsemble runs with one manager and multiple workers. Each worker may run either -a generator or simulator function (both are Python scripts). Generators -determine the parameters/inputs for simulations. 
Simulator functions run and -manage simulations, which often involve running a user application (see -:doc:`Executor`). - -To use libEnsemble, you will need a calling script, which in turn will specify -generator and simulator functions. Many :doc:`examples` -are available. - -There are currently three communication options for libEnsemble (determining how -the Manager and Workers communicate). These are ``local``, ``mpi``, ``tcp``. -The default is ``local`` if ``nworkers`` is specified, otherwise ``mpi``. - -Note that ``local`` comms can be used on multi-node systems, where -the :doc:`MPI executor` is used to distribute MPI applications -across the nodes. Indeed, this is the most commonly used option, even on large -supercomputers. - .. note:: You do not need the ``mpi`` communication mode to use the - :doc:`MPI Executor`. The communication modes described + :doc:`MPI Executor`. The communication modes described here only refer to how the libEnsemble manager and workers communicate. -.. tab-set:: - - .. tab-item:: Local Comms - - Uses Python's built-in multiprocessing_ module. - The ``comms`` type ``local`` and number of workers ``nworkers`` may - be provided in :ref:`libE_specs`. - - Then run:: - - python myscript.py +Local Comms +----------- - Or, if the script uses the :meth:`parse_args` function - or an :class:`Ensemble` object with ``Ensemble(parse_args=True)``, - you can specify these on the command line:: +Uses Python's built-in multiprocessing_ module. +The ``comms`` type ``local`` and number of workers ``nworkers`` for running simulators +may be provided in :ref:`libE_specs`. - python myscript.py --nworkers N +Run: - This will launch one manager and ``N`` workers. 
+ python myscript.py - The following abbreviated line is equivalent to the above:: +Or, if the script uses the :meth:`parse_args` function +or an :class:`Ensemble` object with ``Ensemble(parse_args=True)``, +this can be specified on the command line: - python myscript.py -n N + python myscript.py -n N - libEnsemble will run on **one node** in this scenario. To - :doc:`disallow this node` - from app-launches (if running libEnsemble on a compute node), - set ``libE_specs["dedicated_mode"] = True``. +libEnsemble will run on **one node** in this scenario. To +:doc:`disallow this node` +from app-launches (if running libEnsemble on a compute node), +set ``libE_specs["dedicated_mode"] = True``. - This mode can also be used to run on a **launch** node of a three-tier - system (e.g., Summit), ensuring the whole compute-node allocation is available for - launching apps. Make sure there are no imports of ``mpi4py`` in your Python scripts. +This mode can also be used to run on a **launch** node of a three-tier +system, ensuring the whole compute-node allocation is available for +launching apps. Make sure there are no imports of ``mpi4py`` in your Python scripts. - Note that on macOS (since Python 3.8) and Windows, the default multiprocessing method - is ``"spawn"`` instead of ``"fork"``; to resolve many related issues, we recommend placing - calling script code in an ``if __name__ == "__main__":`` block. +Note that on macOS and Windows, the default multiprocessing method is ``"spawn"`` +instead of ``"fork"``; to resolve many related issues, we recommend placing +calling script code in an ``if __name__ == "__main__":`` block. - **Limitations of local mode** +**Limitations of local mode** - - Workers cannot be :doc:`distributed` across nodes. - - In some scenarios, any import of ``mpi4py`` will cause this to break. - - Does not have the potential scaling of MPI mode, but is sufficient for most users. +- Workers cannot be :doc:`distributed` across nodes. 
+- In some scenarios, any import of ``mpi4py`` will cause this to break. +- Does not have the potential scaling of MPI mode, but is sufficient for most users. - .. tab-item:: MPI Comms +MPI Comms +--------- - This option uses mpi4py_ for the Manager/Worker communication. It is used automatically if - you run your libEnsemble calling script with an MPI runner such as:: +This option uses mpi4py_ for the Manager/Worker communication. It is used automatically if +you run your libEnsemble calling script with an MPI runner such as:: - mpirun -np N python myscript.py + mpirun -np N python myscript.py - where ``N`` is the number of processes. This will launch one manager and - ``N-1`` workers. +where ``N`` is the number of processes. This will launch one manager and +``N-1`` simulator workers. - This option requires ``mpi4py`` to be installed to interface with the MPI on your system. - It works on a standalone system, and with both - :doc:`central and distributed modes` of running libEnsemble on - multi-node systems. +This option requires ``mpi4py`` to be installed to interface with the MPI on your system. +It works on a standalone system, and with both +:doc:`central and distributed modes` of running libEnsemble on +multi-node systems. - It also potentially scales the best when running with many workers on HPC systems. +It also potentially scales the best when running with many workers on HPC systems. - **Limitations of MPI mode** +**Limitations of MPI mode** - If launching MPI applications from workers, then MPI is nested. **This is not - supported with Open MPI**. This can be overcome by using a proxy launcher. - This nesting does work with MPICH_ and its derivative MPI implementations. +If launching MPI applications from workers, then MPI is nested. **This is not +supported with Open MPI**. This can be overcome by using a proxy launcher. +This nesting does work with MPICH_ and its derivative MPI implementations. 
- It is also unsuitable to use this mode when running on the **launch** nodes of - three-tier systems (e.g., Summit). In that case ``local`` mode is recommended. +It is also unsuitable to use this mode when running on the **launch** nodes of +three-tier systems. In that case ``local`` mode is recommended. - .. tab-item:: TCP Comms +TCP Comms +--------- - Run the Manager on one system and launch workers to remote - systems or nodes over TCP. Configure through - :class:`libE_specs`, or on the command line - if using an :class:`Ensemble` object with - ``Ensemble(parse_args=True)``, +Run the Manager on one system and launch workers to remote +systems or nodes over TCP. Configure through +:class:`libE_specs`, or on the command line +if using an :class:`Ensemble` object with +``Ensemble(parse_args=True)``, - **Reverse-ssh interface** +**Reverse-ssh interface** - Set ``comms`` to ``ssh`` to launch workers on remote ssh-accessible systems. This - co-locates workers, functions, and any applications. User - functions can also be persistent, unlike when launching remote functions via - :ref:`Globus Compute`. +Set ``comms`` to ``ssh`` to launch workers on remote ssh-accessible systems. This +co-locates workers, functions, and any applications. User +functions can also be persistent, unlike when launching remote functions via +:ref:`Globus Compute`. - The remote working directory and Python need to be specified. This may resemble:: +The remote working directory and Python need to be specified. This may resemble:: - python myscript.py --comms ssh --workers machine1 machine2 --worker_pwd /home/workers --worker_python /home/.conda/.../python + python myscript.py --comms ssh --workers machine1 machine2 --worker_pwd /home/workers --worker_python /home/.conda/.../python - **Limitations of TCP mode** +**Limitations of TCP mode** - - There cannot be two calls to ``libE()`` or ``Ensemble.run()`` in the same script. 
+- There cannot be two calls to ``Ensemble.run()`` or ``libE()`` in the same script. Further Command Line Options ---------------------------- @@ -128,32 +101,6 @@ Further Command Line Options See the :meth:`parse_args` function in :doc:`Convenience Tools` for further command line options. -Persistent Workers ------------------- -.. _persis_worker: - -In a regular (non-persistent) worker, the user's generator or simulation function is called -whenever the worker receives work. A persistent worker is one that continues to run the -generator or simulation function between work units, maintaining the local data environment. - -A common use-case consists of a persistent generator (such as :doc:`persistent_aposmm`) -that maintains optimization data while generating new simulation inputs. The persistent generator runs -on a dedicated worker while in persistent mode. This requires an appropriate -:doc:`allocation function` that will run the generator as persistent. - -When running with a persistent generator, it is important to remember that a worker will be dedicated -to the generator and cannot run simulations. For example, the following run:: - - mpirun -np 3 python my_script.py - -starts one manager, one worker with a persistent generator, and one worker for running simulations. - -If this example was run as:: - - mpirun -np 2 python my_script.py - -No simulations will be able to run. - Environment Variables --------------------- @@ -164,10 +111,10 @@ For example:: set in your simulation script before the Executor *submit* command will export the setting to your run. For running a bash script in a sub environment when using the Executor, see -the ``env_script`` option to the :doc:`MPI Executor`. +the ``env_script`` option to the :doc:`MPI Executor`. 
-Further Run Information ------------------------ +Running on Multi-Node Systems +----------------------------- For running on multi-node platforms and supercomputers, there are alternative ways to configure libEnsemble to resources. See the :doc:`Running on HPC Systems` @@ -176,4 +123,3 @@ guide for more information, including some examples for specific systems. .. _mpi4py: https://mpi4py.readthedocs.io/en/stable/ .. _MPICH: https://www.mpich.org/ .. _multiprocessing: https://docs.python.org/3/library/multiprocessing.html -.. _PSI/J: https://exaworks.org/psij diff --git a/docs/tutorials/aposmm_tutorial.rst b/docs/tutorials/aposmm_tutorial.rst index 0837df276e..d5b3f4f04a 100644 --- a/docs/tutorials/aposmm_tutorial.rst +++ b/docs/tutorials/aposmm_tutorial.rst @@ -5,8 +5,8 @@ Optimization with APOSMM This tutorial demonstrates libEnsemble's capability to identify multiple minima of simulation output using the built-in :doc:`APOSMM<../examples/aposmm>` (Asynchronously Parallel Optimization Solver for finding Multiple Minima) -:ref:`gen_f`. In this tutorial, we'll create a simple -simulation :ref:`sim_f` that defines a function with +:ref:`gen_f`. In this tutorial, we'll create a simple +simulation :ref:`sim_f` that defines a function with multiple minima, then write a libEnsemble calling script that imports APOSMM and parameterizes it to check for minima over a domain of outputs from our ``sim_f``. @@ -26,35 +26,20 @@ below: :align: center Create a new Python file named ``six_hump_camel.py``. This will be our -``sim_f``, incorporating the above function. Write the following: +simulator callable, incorporating the above function. Write the following: .. code-block:: python :linenos: - import numpy as np - - - def six_hump_camel(H, _, sim_specs): - """Six-Hump Camel sim_f.""" - - batch = len(H["x"]) # Num evaluations each sim_f call. 
- H_o = np.zeros(batch, dtype=sim_specs["out"]) # Define output array H - - for i, x in enumerate(H["x"]): - H_o["f"][i] = six_hump_camel_func(x) # Function evaluations placed into H - - return H_o - - def six_hump_camel_func(x): """Six-Hump Camel function definition""" - x1 = x[0] - x2 = x[1] + x1 = x["x1"] + x2 = x["x2"] term1 = (4 - 2.1 * x1**2 + (x1**4) / 3) * x1**2 term2 = x1 * x2 term3 = (-4 + 4 * x2**2) * x2**2 - return term1 + term2 + term3 + return {"f": term1 + term2 + term3} APOSMM Operations ----------------- @@ -100,160 +85,83 @@ Throughout, generated and evaluated points are appended to the ``"local_pt"`` being ``True`` if the point is part of a local optimization run, and ``"local_min"`` being ``True`` if the point has been ruled a local minimum. -APOSMM Persistence ------------------- - -APOSMM is implemented as a Persistent generator. A single worker process initiates -APOSMM so that it "persists" the course of a given libEnsemble run. - -APOSMM begins its own concurrent optimization runs, each of which independently -produces a linear sequence of points trying to find a local minimum. These -points are given to workers and evaluated by simulation routines. - -If there are more workers than optimization runs at any iteration of the -generator, additional random sample points are generated to keep the workers -busy. - -In practice, since a single worker becomes "persistent" for APOSMM, users -should initiate one more worker than the number of parallel simulations:: - - python my_aposmm_routine.py --nworkers 4 - -results in three workers running simulations and one running APSOMM. - -If running libEnsemble using `mpi4py` communications, enough MPI ranks should be -given to support libEnsemble's manager, a persistent worker to run APOSMM, and -simulation routines. The following:: - - mpiexec -n 3 python my_aposmm_routine.py - -results in only one worker process to perform simulation evaluations. 
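Before wiring the new dict-in/dict-out simulator callable into a workflow, its shape can be sanity-checked standalone. A minimal sketch — the function body is copied from the tutorial file above so the snippet is self-contained, and the check uses the six-hump camel's well-known global minimum of roughly -1.0316 near (0.0898, -0.7126):

```python
def six_hump_camel_func(x):
    """Dict-in/dict-out Six-Hump Camel function (as defined in six_hump_camel.py)."""
    x1 = x["x1"]
    x2 = x["x2"]
    term1 = (4 - 2.1 * x1**2 + (x1**4) / 3) * x1**2
    term2 = x1 * x2
    term3 = (-4 + 4 * x2**2) * x2**2
    return {"f": term1 + term2 + term3}


# One of the six known minima lies near (0.0898, -0.7126), where f is about -1.0316
result = six_hump_camel_func({"x1": 0.0898, "x2": -0.7126})
assert abs(result["f"] + 1.0316) < 1e-3
```

The returned dictionary's ``"f"`` key matches the objective name that the ``VOCS`` object declares later in the tutorial.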
- Calling Script -------------- -Create a new Python file named ``my_first_aposmm.py``. Start by importing NumPy, -libEnsemble routines, APOSMM, our ``sim_f``, and a specialized allocation -function: +Create a new Python file named ``my_first_aposmm.py``. Start by importing +libEnsemble classes, APOSMM, and our simulator callable: .. code-block:: python :linenos: - import numpy as np + from six_hump_camel import six_hump_camel_func - from six_hump_camel import six_hump_camel + import libensemble.gen_funcs + + libensemble.gen_funcs.rc.aposmm_optimizers = "scipy" - from libensemble.libE import libE - from libensemble.gen_funcs.persistent_aposmm import aposmm - from libensemble.alloc_funcs.persistent_aposmm_alloc import persistent_aposmm_alloc - from libensemble.tools import parse_args + from libensemble import Ensemble + from libensemble.gen_classes import APOSMM + from gest_api.vocs import VOCS + from libensemble.specs import SimSpecs, GenSpecs, ExitCriteria -This allocation function starts a single Persistent APOSMM routine and provides -``sim_f`` output for points requested by APOSMM. Points can be sampled points -or points from local optimization runs. +APOSMM supports a wide variety of external optimizers. The ``rc.aposmm_optimizers`` +statement above indicates to APOSMM which optimization method package to use, +helping prevent unnecessary imports or package installations. -APOSMM supports a wide variety of external optimizers. The following statements -set optimizer settings to ``"scipy"`` to indicate to APOSMM which optimization -method to use, and help prevent unnecessary imports or package installations: +Next, initialize the ``Ensemble`` and define our variables and objectives using +a ``VOCS`` object: .. 
code-block:: python :linenos: - import libensemble.gen_funcs + if __name__ == "__main__": + workflow = Ensemble(parse_args=True) - libensemble.gen_funcs.rc.aposmm_optimizers = "scipy" + vocs = VOCS( + variables={"x1": [-2, 2], "x2": [-1, 1], "x1_on_cube": [-2, 2], "x2_on_cube": [-1, 1]}, + objectives={"f": "MINIMIZE"}, + ) + +Notice the addition of ``x1_on_cube`` and ``x2_on_cube``. APOSMM requires variables scaled to the unit cube internally. By defining both sets of variables, APOSMM can translate between our actual domain and its internal domain. -Set up :doc:`parse_args()<../utilities>`, -our :doc:`sim_specs<../data_structures/sim_specs>`, -:doc:`gen_specs<../data_structures/gen_specs>`, -and :doc:`alloc_specs<../data_structures/alloc_specs>`: +Now, configure APOSMM. Because APOSMM internally uses variables named ``x``, ``x_on_cube``, and an objective named ``f``, we must map our ``VOCS`` fields to these internal names using ``variables_mapping``: .. code-block:: python :linenos: - nworkers, is_manager, libE_specs, _ = parse_args() - - sim_specs = { - "sim_f": six_hump_camel, # Simulation function - "in": ["x"], # Accepts "x" values - "out": [("f", float)], # Returns f(x) values - } - - gen_out = [ - ("x", float, 2), # Produces "x" values - ("x_on_cube", float, 2), # "x" values scaled to unit cube - ("sim_id", int), # Produces sim_id's for History array indexing - ("local_min", bool), # Is a point a local minimum? - ("local_pt", bool), # Is a point from a local opt run? 
- ] - - gen_specs = { - "gen_f": aposmm, # APOSMM generator function - "persis_in": ["f"] + [n[0] for n in gen_out], - "out": gen_out, # Output defined like above dict - "user": { - "initial_sample_size": 100, # Random sample 100 points to start - "localopt_method": "scipy_Nelder-Mead", - "opt_return_codes": [0], # Status integers specific to localopt_method - "max_active_runs": 6, # Occur in parallel - "lb": np.array([-2, -1]), # Lower bound of search domain - "ub": np.array([2, 1]), # Upper bound of search domain - }, - } - - alloc_specs = {"alloc_f": persistent_aposmm_alloc} - -``gen_specs["user"]`` fields above that are required for APOSMM are: - - * ``"lb"`` - Search domain lower bound - * ``"ub"`` - Search domain upper bound - * ``"localopt_method"`` - Chosen local optimization method - * ``"initial_sample_size"`` - Number of uniformly sampled points generated - before local optimization runs. - * ``"opt_return_codes"`` - A list of integers that local optimization - methods return when a minimum is detected. SciPy's Nelder-Mead returns 0, - but other methods (not used in this tutorial) return 1. - -Also note the following: - - * ``gen_specs["in"]`` is empty. For other ``gen_f``'s this defines what - fields to give to the ``gen_f`` when called, but here APOSMM's - ``alloc_f`` defines those fields. - * ``"x_on_cube"`` in ``gen_specs["out"]``. APOSMM works internally on - ``"x"`` values scaled to the unit cube. To avoid back-and-forth scaling - issues, both types of ``"x"``'s are communicated back, even though the - simulation will likely use ``"x"`` values. (APOSMM performs handshake to - ensure that the ``x_on_cube`` that was given to be evaluated is the same - the one that is given back.) - * ``"sim_id"`` in ``gen_specs["out"]``. APOSMM produces points in its - local History array that it will need to update later, and can best - reference those points (and avoid a search) if APOSMM produces the IDs - itself, instead of libEnsemble. 
- -Other options and configurations for APOSMM can be found in the -APOSMM :doc:`API reference<../examples/aposmm>`. - -Set :ref:`exit_criteria` so libEnsemble knows -when to complete, and :ref:`persis_info` for -random sampling seeding: + aposmm = APOSMM( + vocs, + max_active_runs=workflow.nworkers, + variables_mapping={"x": ["x1", "x2"], "x_on_cube": ["x1_on_cube", "x2_on_cube"], "f": ["f"]}, + initial_sample_size=100, + localopt_method="scipy_Nelder-Mead", + opt_return_codes=[0], + ) -.. code-block:: python - :linenos: + workflow.gen_specs = GenSpecs( + generator=aposmm, + vocs=vocs, + batch_size=5, + initial_batch_size=10, + ) - exit_criteria = {"sim_max": 2000} - persis_info = {} +APOSMM is instantiated directly as a standardized generator. It handles its own required fields, simplifying our configurations. ``opt_return_codes`` is a list of integers that local optimization methods return when a minimum is detected. SciPy's Nelder-Mead returns 0. -Finally, add statements to :doc:`initiate libEnsemble<../libe_module>`, and quickly -check calculated minima: +Finally, we configure the simulation function, exit criteria, and run the workflow. We can also print out any points that APOSMM identified as local minima: .. code-block:: python :linenos: - if __name__ == "__main__": # required by multiprocessing on macOS and windows - H, persis_info, flag = libE(sim_specs, gen_specs, exit_criteria, persis_info, alloc_specs, libE_specs) + workflow.sim_specs = SimSpecs(simulator=six_hump_camel_func, vocs=vocs) + workflow.exit_criteria = ExitCriteria(sim_max=2000) + + H, _, _ = workflow.run() - if is_manager: - print("Minima:", H[np.where(H["local_min"])]["x"]) + if workflow.is_manager: + # We can map our variables back to an array for easy printing + minima = [[row["x1"], row["x2"]] for row in H if row["local_min"]] + print("Minima:", minima) Final Setup, Run, and Output ---------------------------- @@ -272,27 +180,10 @@ the routine. 
After a couple seconds, the output should resemble the following:: - [0] libensemble.libE (MANAGER_WARNING): - ******************************************************************************* - User generator script will be creating sim_id. - Take care to do this sequentially. - Also, any information given back for existing sim_id values will be overwritten! - So everything in gen_specs["out"] should be in gen_specs["in"]! - ******************************************************************************* - - Minima: [[ 0.08993295 -0.71265804] - [ 1.70360676 -0.79614982] - [-1.70368421 0.79606073] - [-0.08988064 0.71270945] - [-1.60699361 -0.56859108] - [ 1.60713962 0.56869567]] - -The first section labeled ``MANAGER_WARNING`` is a default libEnsemble warning -for generator functions that create ``sim_id``'s, like APOSMM. It does not -indicate a failure. + Minima: [[0.08988580227184285, -0.7126604246830723], [-0.08983226938927827, 0.7126622830878125], [-1.7036480556534283, 0.7960787201083437], [1.7035677028481488, -0.7961234727197022], [1.607106093246473, 0.5686524941018596], [-1.607102046898864, -0.568650772274404]] The local minima for the Six-Hump Camel simulation function as evaluated by -APOSMM with libEnsemble should be listed directly below the warning. +APOSMM with libEnsemble should be listed directly above. Please see the API reference :doc:`here<../examples/aposmm>` for more APOSMM configuration options and other information. @@ -304,7 +195,7 @@ Applications APOSMM is not limited to evaluating minima from pure Python simulation functions. Many common libEnsemble use-cases involve using -libEnsemble's :doc:`MPI Executor<../executor/overview>` to launch user +libEnsemble's :doc:`MPI Executor<../executor/ex_index>` to launch user applications with parameters requested by APOSMM, then evaluate their output using APOSMM, and repeat until minima are identified. A currently supported example can be found in libEnsemble's `WarpX Scaling Test`_. 
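The ``variables_mapping`` passed to APOSMM earlier in this tutorial groups named scalar variables (``x1``, ``x2``) into APOSMM's internal multi-dimensional fields (``x``, ``x_on_cube``). The idea behind that mapping can be sketched with toy ``pack``/``unpack`` helpers — these names and this implementation are illustrative only, not libEnsemble's actual internals:

```python
# Illustrative only: a toy pack/unpack mirroring the idea of variables_mapping.
variables_mapping = {
    "x": ["x1", "x2"],
    "x_on_cube": ["x1_on_cube", "x2_on_cube"],
    "f": ["f"],
}


def pack(point, mapping):
    """Group named scalar variables into multi-dimensional internal fields."""
    return {internal: [point[name] for name in names]
            for internal, names in mapping.items()}


def unpack(row, mapping):
    """Flatten internal fields back out to named scalar variables."""
    out = {}
    for internal, names in mapping.items():
        for i, name in enumerate(names):
            out[name] = row[internal][i]
    return out


point = {"x1": 0.5, "x2": -0.25, "x1_on_cube": 0.625, "x2_on_cube": 0.375, "f": 1.0}
row = pack(point, variables_mapping)
assert row["x"] == [0.5, -0.25]          # named variables grouped into "x"
assert unpack(row, variables_mapping) == point  # round-trips losslessly
```

This is also why the final printing step maps History rows back to ``[row["x1"], row["x2"]]`` pairs: the user-facing fields are the named ``VOCS`` variables, while APOSMM reasons internally over the grouped arrays.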
diff --git a/docs/tutorials/calib_cancel_tutorial.rst b/docs/tutorials/calib_cancel_tutorial.rst index c008100d73..316e56ba1d 100644 --- a/docs/tutorials/calib_cancel_tutorial.rst +++ b/docs/tutorials/calib_cancel_tutorial.rst @@ -12,7 +12,7 @@ compute resources may then be more effectively applied toward critical evaluatio For a somewhat different approach than libEnsemble's :doc:`other tutorials`, we'll emphasize the settings, functions, and data fields within the calling script, -:ref:`persistent generator`, manager, and :ref:`sim_f` +:ref:`persistent generator`, manager, and :ref:`sim_f` that make this capability possible, rather than outlining a step-by-step process. The libEnsemble regression test ``test_persistent_surmise_calib.py`` demonstrates @@ -36,7 +36,7 @@ gravitational constant, and the corresponding computer model could be the set of differential equations that govern the drop. In a case where the computation of the computer model is relatively expensive, we employ a fast surrogate model to approximate the model and to inform good parameters to test next. Here the computer -model :math:`f(\theta, x)` is accessible only through performing :ref:`sim_f` +model :math:`f(\theta, x)` is accessible only through performing :ref:`sim_f` evaluations. As a convenience for testing, the ``observed`` data values are modelled by calling the ``sim_f`` @@ -213,8 +213,8 @@ by a user function, otherwise it will be ignored. To demonstrate this, the test captures and processes this signal from the manager. In order to do this, a compiled version of the borehole function is launched by ``sim_funcs/borehole_kills.py`` -via the :doc:`Executor<../executor/overview>`. As the borehole application used here is serial, we use the -:doc:`Executor base class<../executor/executor>` rather than the commonly used :doc:`MPIExecutor<../executor/mpi_executor>` +via the :doc:`Executor<../executor/ex_index>`. 
As the borehole application used here is serial, we use the +:doc:`Executor base class<../executor/ex_index>` rather than the commonly used :doc:`MPIExecutor<../executor/ex_index>` class. The base Executor submit routine simply sub-processes a serial application in-place. After the initial sample batch of evaluations has been processed, an artificial delay is added to the sub-processed borehole to allow time to receive the kill signal and terminate the application. Killed simulations will be reported at diff --git a/docs/tutorials/executor_forces_tutorial.rst b/docs/tutorials/executor_forces_tutorial.rst index a083aa2a82..33142fb601 100644 --- a/docs/tutorials/executor_forces_tutorial.rst +++ b/docs/tutorials/executor_forces_tutorial.rst @@ -4,7 +4,7 @@ Ensemble with an MPI Application This tutorial highlights libEnsemble's capability to portably execute and monitor external scripts or user applications within simulation or generator -functions using the :doc:`executor<../executor/overview>`. +functions using the :doc:`executor<../executor/ex_index>`. |Open in Colab| @@ -13,7 +13,7 @@ electrostatic forces between a collection of particles. The simulator function launches instances of this executable and reads output files to determine the result. -This tutorial uses libEnsemble's :doc:`MPI Executor<../executor/mpi_executor>`, +This tutorial uses libEnsemble's :doc:`MPI Executor<../executor/ex_index>`, which automatically detects available MPI runners and resources. This example also uses a persistent generator. This generator runs on a @@ -49,7 +49,7 @@ generation functions and call libEnsemble. Create a Python file called :linenos: :end-at: ensemble = Ensemble -We first instantiate our :doc:`MPI Executor<../executor/mpi_executor>`. +We first instantiate our :doc:`MPI Executor<../executor/ex_index>`. Registering an application is as easy as providing the full file-path and giving it a memorable name. 
This Executor will later be used within our simulation function to launch the registered app. @@ -82,32 +82,15 @@ expect, and also to parameterize user functions: :end-at: gen_specs_end_tag :lineno-start: 37 -Next, configure an allocation function, which starts the one persistent -generator and farms out the simulations. We also tell it to wait for all -simulations to return their results, before generating more parameters. - -.. literalinclude:: ../../libensemble/tests/functionality_tests/test_executor_forces_tutorial.py - :language: python - :linenos: - :start-at: ensemble.alloc_specs = AllocSpecs - :end-at: ) - :lineno-start: 55 - -Now we set :ref:`exit_criteria` to -exit after running eight simulations. - -We also give each worker a seeded random stream, via the -:ref:`persis_info` option. -These can be used for random number generation if required. - -Finally we :doc:`run<../libe_module>` the ensemble. +Next, we set :ref:`exit_criteria` to +exit after running eight simulations, and finally we :doc:`run<../libe_module>` the ensemble. .. literalinclude:: ../../libensemble/tests/functionality_tests/test_executor_forces_tutorial.py :language: python :linenos: :start-at: Instruct libEnsemble :end-at: ensemble.run() - :lineno-start: 62 + :lineno-start: 55 Exercise ^^^^^^^^ diff --git a/docs/tutorials/local_sine_tutorial.rst b/docs/tutorials/local_sine_tutorial.rst deleted file mode 100644 index 49b36b015b..0000000000 --- a/docs/tutorials/local_sine_tutorial.rst +++ /dev/null @@ -1,275 +0,0 @@ -=================== -Simple Introduction -=================== - -This tutorial demonstrates the capability to perform ensembles of -calculations in parallel using :doc:`libEnsemble<../introduction>`. - -We recommend reading this brief :doc:`Overview<../overview_usecases>`. - -|Open in Colab| - -For this tutorial, our generator will produce uniform randomly sampled -values, and our simulator will calculate the sine of each. 
By default we don't -need to write a new allocation function. - -.. tab-set:: - - .. tab-item:: 1. Getting started - - libEnsemble is written entirely in Python_. Let's make sure - the correct version is installed. - - .. code-block:: bash - - python --version # This should be >= 3.11 - - .. _Python: https://www.python.org/ - - For this tutorial, you need NumPy_ and (optionally) - Matplotlib_ to visualize your results. Install libEnsemble and these other - libraries with - - .. code-block:: bash - - pip install libensemble - pip install matplotlib # Optional - - If your system doesn't allow you to perform these installations, try adding - ``--user`` to the end of each command. - - .. tab-item:: 2. Generator - - Let's begin the coding portion of this tutorial by writing our generator. - - An available libEnsemble worker will call this generator's ``.suggest()`` method to obtain - new values to evaluate. - - For now, create a new Python file named ``sine_gen.py``. Write the following: - - .. literalinclude:: ../../libensemble/tests/functionality_tests/sine_gen_std.py - :language: python - :linenos: - :caption: examples/tutorials/simple_sine/sine_gen_std.py - - libEnsemble accepts generators that implement the gest-api_ interface. These generators - accept a ``gest_api.VOCS`` object for configuration, and contain a ``.suggest(num_points)`` - method that returns ``num_points`` points. Points consist of a list of dictionaries - with keys that match the variable names from the ``gest_api.VOCS`` object. - - Our generator's ``suggest()`` method creates ``num_points`` dictionaries. For each key in - the generator's ``self.variables``, it creates a random number uniformly distributed - between the corresponding ``lower`` and ``upper`` bounds of its domain. - - Our generator must implement a ``_validate_vocs()`` method. Here, we implement a simple - check that ensures the ``VOCS`` object has at least one variable. - - .. tab-item:: 3. 
Simulator - - Next, we'll write our simulator function or :ref:`sim_f`. Simulator - functions perform calculations based on values from the generator. - :ref:`sim_specs` is a dictionary containing user-defined fields - and parameters. - - Create a new Python file named ``sine_sim.py``. Write the following: - - .. literalinclude:: ../../libensemble/tests/functionality_tests/sine_sim.py - :language: python - :linenos: - :caption: examples/tutorials/simple_sine/sine_sim.py - - Our simulator function is called by a worker for every work item produced by - the generator. This function calculates the sine of the passed value, - and then returns it so the worker can store the result. - - .. tab-item:: 4. Script - - Now lets write the script that configures our generator and simulator - functions and starts libEnsemble. - - Create an empty Python file named ``calling.py``. - In this file, we'll start by importing NumPy, libEnsemble's setup classes, the generator, - and simulator function. - - In a class called :ref:`LibeSpecs` we'll - specify the number of workers and the manager/worker intercommunication method. - ``"local"``, refers to Python's multiprocessing. - - .. literalinclude:: ../../libensemble/tests/functionality_tests/test_local_sine_tutorial.py - :language: python - :linenos: - :end-at: libE_specs = LibeSpecs - - We configure the settings and specifications for our ``sim_f`` and ``gen_f`` - functions in the :ref:`GenSpecs` and - :ref:`SimSpecs` classes, which we saw previously - being passed to our functions *as dictionaries*. - These classes also describe to libEnsemble what inputs and outputs from those - functions to expect. - - .. literalinclude:: ../../libensemble/tests/functionality_tests/test_local_sine_tutorial.py - :language: python - :linenos: - :lineno-start: 10 - :start-at: gen_specs = GenSpecs - :end-at: sim_specs_end_tag - - We then specify the circumstances where - libEnsemble should stop execution in :ref:`ExitCriteria`. - - .. 
literalinclude:: ../../libensemble/tests/functionality_tests/test_local_sine_tutorial.py - :language: python - :linenos: - :lineno-start: 26 - :start-at: exit_criteria = ExitCriteria - :end-at: exit_criteria = ExitCriteria - - Now we're ready to write our libEnsemble :doc:`libE<../programming_libE>` - function call. :ref:`ensemble.H` is the final version of - the history array. ``ensemble.flag`` should be zero if no errors occur. - - .. literalinclude:: ../../libensemble/tests/functionality_tests/test_local_sine_tutorial.py - :language: python - :linenos: - :lineno-start: 28 - :start-at: ensemble = Ensemble - :end-at: print(history) - - That's it! Now that these files are complete, we can run our simulation. - - .. code-block:: bash - - python calling.py - - If everything ran perfectly and you included the above print statements, you - should get something similar to the following output (although the - columns might be rearranged). - - .. code-block:: - - ["y", "sim_started_time", "gen_worker", "sim_worker", "sim_started", "sim_ended", "x", "allocated", "sim_id", "gen_ended_time"] - [(-0.37466051, 1.559+09, 2, 2, True, True, [-0.38403059], True, 0, 1.559+09) - (-0.29279634, 1.559+09, 2, 3, True, True, [-2.84444261], True, 1, 1.559+09) - ( 0.29358492, 1.559+09, 2, 4, True, True, [ 0.29797487], True, 2, 1.559+09) - (-0.3783986, 1.559+09, 2, 1, True, True, [-0.38806564], True, 3, 1.559+09) - (-0.45982062, 1.559+09, 2, 2, True, True, [-0.47779319], True, 4, 1.559+09) - ... - - In this arrangement, our output values are listed on the far left with the - generated values being the fourth column from the right. - - Two additional log files should also have been created. - ``ensemble.log`` contains debugging or informational logging output from - libEnsemble, while ``libE_stats.txt`` contains a quick summary of all - calculations performed. - - Here is graphed output using ``Matplotlib``, with entries colored by which - worker performed the simulation: - - .. 
image:: ../images/sinex.png - :alt: sine - :align: center - - If you want to verify your results through plotting and installed Matplotlib - earlier, copy and paste the following code into the bottom of your calling - script and run ``python calling.py`` again - - .. literalinclude:: ../../libensemble/tests/functionality_tests/test_local_sine_tutorial.py - :language: python - :linenos: - :lineno-start: 37 - :start-at: import matplotlib - :end-at: plt.savefig("tutorial_sines.png") - - Each of these example files can be found in the repository in `examples/tutorials/simple_sine`_. - - **Exercise** - - Write a Calling Script with the following specifications: - - 1. Set the generator function's lower and upper bounds to -6 and 6, respectively - 2. Increase the generator batch size to 10 - 3. Set libEnsemble to stop execution after 160 *generations* using the ``gen_max`` option - 4. Print an error message if any errors occurred while libEnsemble was running - - .. dropdown:: **Click Here for Solution** - - .. literalinclude:: ../../libensemble/tests/functionality_tests/test_local_sine_tutorial_2.py - :language: python - :linenos: - :emphasize-lines: 15,16,17,27,33,34 - - .. tab-item:: 5. Next steps - - **libEnsemble with MPI** - - MPI_ is a standard interface for parallel computing, implemented in libraries - such as MPICH_ and used at extreme scales. MPI potentially allows libEnsemble's - processes to be distributed over multiple nodes and works in some - circumstances where Python's multiprocessing does not. In this section, we'll - explore modifying the above code to use MPI instead of multiprocessing. - - We recommend the MPI distribution MPICH_ for this tutorial, which can be found - for a variety of systems here_. You also need mpi4py_, which can be installed - with ``pip install mpi4py``. 
If you'd like to use a specific version or - distribution of MPI instead of MPICH, configure mpi4py with that MPI at - installation with ``MPICC= pip install mpi4py`` If this - doesn't work, try appending ``--user`` to the end of the command. See the - mpi4py_ docs for more information. - - Verify that MPI has been installed correctly with ``mpirun --version``. - - **Modifying the script** - - Only a few changes are necessary to make our code MPI-compatible. For starters, - comment out the ``libE_specs`` definition: - - .. literalinclude:: ../../libensemble/tests/functionality_tests/test_local_sine_tutorial_3.py - :language: python - :start-at: # libE_specs = LibeSpecs - :end-at: # libE_specs = LibeSpecs - - We'll be parameterizing our MPI runtime with a ``parse_args=True`` argument to - the ``Ensemble`` class instead of ``libE_specs``. We'll also use an ``ensemble.is_manager`` - attribute so only the first MPI rank runs the data-processing code. - - The bottom of your calling script should now resemble: - - .. literalinclude:: ../../libensemble/tests/functionality_tests/test_local_sine_tutorial_3.py - :linenos: - :lineno-start: 28 - :language: python - :start-at: # replace libE_specs - - With these changes in place, our libEnsemble code can be run with MPI by - - .. code-block:: bash - - mpirun -n 5 python calling.py - - where ``-n 5`` tells ``mpirun`` to produce five processes, one of which will be - the manager process with the libEnsemble manager and the other four will run - libEnsemble workers. - - This tutorial is only a tiny demonstration of the parallelism capabilities of - libEnsemble. libEnsemble has been developed primarily to support research on - High-Performance computers, with potentially hundreds of workers performing - calculations simultaneously. Please read our - :doc:`platform guides <../platforms/platforms_index>` for introductions to using - libEnsemble on many such machines. 
- - libEnsemble's Executors can launch non-Python user applications and simulations across - allocated compute resources. Try out this feature with a more-complicated - libEnsemble use-case within our - :doc:`Electrostatic Forces tutorial <./executor_forces_tutorial>`. - -.. _gest-api: https://github.com/campa-consortium/gest-api -.. _Matplotlib: https://matplotlib.org/ -.. _MPI: https://en.wikipedia.org/wiki/Message_Passing_Interface -.. _MPICH: https://www.mpich.org/ -.. _mpi4py: https://mpi4py.readthedocs.io/en/stable/install.html -.. _NumPy: https://www.numpy.org/ -.. _here: https://www.mpich.org/downloads/ -.. _examples/tutorials/simple_sine: https://github.com/Libensemble/libensemble/tree/develop/examples/tutorials/simple_sine -.. |Open in Colab| image:: https://colab.research.google.com/assets/colab-badge.svg - :target: http://colab.research.google.com/github/Libensemble/libensemble/blob/develop/examples/tutorials/simple_sine/sine_tutorial_notebook.ipynb diff --git a/docs/tutorials/local_sine_tutorial/local_sine_tutorial.rst b/docs/tutorials/local_sine_tutorial/local_sine_tutorial.rst new file mode 100644 index 0000000000..d5e587a0f0 --- /dev/null +++ b/docs/tutorials/local_sine_tutorial/local_sine_tutorial.rst @@ -0,0 +1,28 @@ +=================== +Simple Introduction +=================== + +**Introduction** \|\| `1. Getting started `__ \|\| `2. Generator `__ \|\| `3. Simulator `__ \|\| `4. Script `__ \|\| `5. Next steps `__ + +This tutorial demonstrates the capability to perform ensembles of +calculations in parallel using :doc:`libEnsemble<../../introduction>`. + +We recommend reading this brief :doc:`Overview<../../overview_usecases>`. + +|Open in Colab| + +For this tutorial, our generator will produce uniform randomly sampled +values, and our simulator will calculate the sine of each. By default we don't +need to write a new allocation function. + +.. 
toctree:: + :hidden: + + local_sine_tutorial_1 + local_sine_tutorial_2 + local_sine_tutorial_3 + local_sine_tutorial_4 + local_sine_tutorial_5 + +.. |Open in Colab| image:: https://colab.research.google.com/assets/colab-badge.svg + :target: http://colab.research.google.com/github/Libensemble/libensemble/blob/develop/examples/tutorials/simple_sine/sine_tutorial_notebook.ipynb diff --git a/docs/tutorials/local_sine_tutorial/local_sine_tutorial_1.rst b/docs/tutorials/local_sine_tutorial/local_sine_tutorial_1.rst new file mode 100644 index 0000000000..5c5db2ec82 --- /dev/null +++ b/docs/tutorials/local_sine_tutorial/local_sine_tutorial_1.rst @@ -0,0 +1,28 @@ +1. Getting started +================== + +`Introduction `__ \|\| **1. Getting started** \|\| `2. Generator `__ \|\| `3. Simulator `__ \|\| `4. Script `__ \|\| `5. Next steps `__ + +libEnsemble is written entirely in Python_. Let's make sure +the correct version is installed. + +.. code-block:: bash + + python --version # This should be >= 3.11 + +.. _Python: https://www.python.org/ + +For this tutorial, you need NumPy_ and (optionally) +Matplotlib_ to visualize your results. Install libEnsemble and these other +libraries with + +.. code-block:: bash + + pip install libensemble + pip install matplotlib # Optional + +If your system doesn't allow you to perform these installations, try adding +``--user`` to the end of each command. + +.. _Matplotlib: https://matplotlib.org/ +.. _NumPy: https://www.numpy.org/ diff --git a/docs/tutorials/local_sine_tutorial/local_sine_tutorial_2.rst b/docs/tutorials/local_sine_tutorial/local_sine_tutorial_2.rst new file mode 100644 index 0000000000..024bb52d14 --- /dev/null +++ b/docs/tutorials/local_sine_tutorial/local_sine_tutorial_2.rst @@ -0,0 +1,30 @@ +2. Generator +============ + +`Introduction `__ \|\| `1. Getting started `__ \|\| **2. Generator** \|\| `3. Simulator `__ \|\| `4. Script `__ \|\| `5. 
Next steps `__ + +Let's begin the coding portion of this tutorial by writing our generator. + +An available libEnsemble worker will call this generator's ``.suggest()`` method to obtain +new values to evaluate. + +For now, create a new Python file named ``sine_gen.py``. Write the following: + +.. literalinclude:: ../../../libensemble/tests/functionality_tests/sine_gen_std.py + :language: python + :linenos: + :caption: examples/tutorials/simple_sine/sine_gen_std.py + +libEnsemble accepts generators that implement the gest-api_ interface. These generators +accept a ``gest_api.VOCS`` object for configuration, and contain a ``.suggest(num_points)`` +method that returns ``num_points`` points. Points consist of a list of dictionaries +with keys that match the variable names from the ``gest_api.VOCS`` object. + +Our generator's ``suggest()`` method creates ``num_points`` dictionaries. For each key in +the generator's ``self.variables``, it creates a random number uniformly distributed +between the corresponding ``lower`` and ``upper`` bounds of its domain. + +Our generator must implement a ``_validate_vocs()`` method. Here, we implement a simple +check that ensures the ``VOCS`` object has at least one variable. + +.. _gest-api: https://github.com/campa-consortium/gest-api diff --git a/docs/tutorials/local_sine_tutorial/local_sine_tutorial_3.rst b/docs/tutorials/local_sine_tutorial/local_sine_tutorial_3.rst new file mode 100644 index 0000000000..05836abf32 --- /dev/null +++ b/docs/tutorials/local_sine_tutorial/local_sine_tutorial_3.rst @@ -0,0 +1,20 @@ +3. Simulator +============ + +`Introduction `__ \|\| `1. Getting started `__ \|\| `2. Generator `__ \|\| **3. Simulator** \|\| `4. Script `__ \|\| `5. Next steps `__ + +Next, we'll write our simulator function or :ref:`sim_f`. Simulator +functions perform calculations based on values from the generator. +:ref:`sim_specs` is a dictionary containing user-defined fields +and parameters. 
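The gest-api generator contract described above — a ``suggest(num_points)`` method returning a list of dictionaries keyed by the ``VOCS`` variable names, plus a ``_validate_vocs()`` check — can be sketched in a dependency-free form. This is an illustrative, hypothetical sketch only: the real tutorial file ``sine_gen_std.py`` inherits from ``gest_api.Generator`` and is parameterized by a ``gest_api.VOCS`` object, whereas here ``variables`` is a plain dict mapping each name to its ``(lower, upper)`` bounds.

```python
import random


class SineSampleGen:
    """Standalone sketch of the tutorial's gest-api-style generator.

    Hypothetical simplification: ``variables`` is a plain dict of
    name -> (lower, upper) bounds rather than a ``gest_api.VOCS`` object.
    """

    def __init__(self, variables):
        self.variables = variables
        self._validate_vocs()

    def _validate_vocs(self):
        # The tutorial's simple check: at least one variable must be defined.
        assert len(self.variables) >= 1, "VOCS must define at least one variable"

    def suggest(self, num_points):
        # Return ``num_points`` dicts keyed by variable name, each value drawn
        # uniformly from that variable's [lower, upper] domain.
        return [
            {name: random.uniform(lo, hi) for name, (lo, hi) in self.variables.items()}
            for _ in range(num_points)
        ]


gen = SineSampleGen({"x": (-3.0, 3.0)})
points = gen.suggest(4)
```

A worker calling ``gen.suggest(4)`` here receives four points such as ``{"x": 1.7}``, each ready to be passed to a simulator.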
+ +Create a new Python file named ``sine_sim.py``. Write the following: + +.. literalinclude:: ../../../libensemble/tests/functionality_tests/sine_sim.py + :language: python + :linenos: + :caption: examples/tutorials/simple_sine/sine_sim.py + +Our simulator function is called by a worker for every work item produced by +the generator. This function calculates the sine of the passed value, +and then returns it so the worker can store the result. diff --git a/docs/tutorials/local_sine_tutorial/local_sine_tutorial_4.rst b/docs/tutorials/local_sine_tutorial/local_sine_tutorial_4.rst new file mode 100644 index 0000000000..92a5c8536b --- /dev/null +++ b/docs/tutorials/local_sine_tutorial/local_sine_tutorial_4.rst @@ -0,0 +1,121 @@ +4. Script +========= + +`Introduction `__ \|\| `1. Getting started `__ \|\| `2. Generator `__ \|\| `3. Simulator `__ \|\| **4. Script** \|\| `5. Next steps `__ + +Now let's write the script that configures our generator and simulator +functions and starts libEnsemble. + +Create an empty Python file named ``calling.py``. +In this file, we'll start by importing NumPy, libEnsemble's setup classes, the generator, +and simulator function. + +In a class called :ref:`LibeSpecs` we'll +specify the number of workers and the manager/worker intercommunication method. +``"local"`` refers to Python's multiprocessing. + +.. literalinclude:: ../../../libensemble/tests/functionality_tests/test_local_sine_tutorial.py + :language: python + :linenos: + :end-at: libE_specs = LibeSpecs + +We configure the settings and specifications for our ``sim_f`` and ``gen_f`` +functions in the :ref:`GenSpecs` and +:ref:`SimSpecs` classes, which we saw previously +being passed to our functions *as dictionaries*. +These classes also describe to libEnsemble what inputs and outputs from those +functions to expect. + +.. 
literalinclude:: ../../../libensemble/tests/functionality_tests/test_local_sine_tutorial.py + :language: python + :linenos: + :lineno-start: 10 + :start-at: gen_specs = GenSpecs + :end-at: sim_specs_end_tag + +We then specify the circumstances where +libEnsemble should stop execution in :ref:`ExitCriteria`. + +.. literalinclude:: ../../../libensemble/tests/functionality_tests/test_local_sine_tutorial.py + :language: python + :linenos: + :lineno-start: 26 + :start-at: exit_criteria = ExitCriteria + :end-at: exit_criteria = ExitCriteria + +Now we're ready to write our libEnsemble :doc:`libE<../../programming_libE>` +function call. :ref:`ensemble.H` is the final version of +the history array. ``ensemble.flag`` should be zero if no errors occur. + +.. literalinclude:: ../../../libensemble/tests/functionality_tests/test_local_sine_tutorial.py + :language: python + :linenos: + :lineno-start: 28 + :start-at: ensemble = Ensemble + :end-at: print(history) + +That's it! Now that these files are complete, we can run our simulation. + +.. code-block:: bash + + python calling.py + +If everything ran perfectly and you included the above print statements, you +should get something similar to the following output (although the +columns might be rearranged). + +.. code-block:: + + ["y", "sim_started_time", "gen_worker", "sim_worker", "sim_started", "sim_ended", "x", "allocated", "sim_id", "gen_ended_time"] + [(-0.37466051, 1.559+09, 2, 2, True, True, [-0.38403059], True, 0, 1.559+09) + (-0.29279634, 1.559+09, 2, 3, True, True, [-2.84444261], True, 1, 1.559+09) + ( 0.29358492, 1.559+09, 2, 4, True, True, [ 0.29797487], True, 2, 1.559+09) + (-0.3783986, 1.559+09, 2, 1, True, True, [-0.38806564], True, 3, 1.559+09) + (-0.45982062, 1.559+09, 2, 2, True, True, [-0.47779319], True, 4, 1.559+09) + ... + +In this arrangement, our output values are listed on the far left with the +generated values being the fourth column from the right. 
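The end-to-end flow that the script page describes — the generator suggests points, the simulator evaluates the sine of each, and results accumulate in a history — can be mimicked with a minimal stdlib-only loop. This is an illustrative stand-in, not libEnsemble's actual manager/worker machinery; the field names ``x``, ``y``, and ``sim_id`` simply mirror the history-array columns shown in the sample output above.

```python
import math
import random


def suggest(num_points, lower=-3.0, upper=3.0):
    # Stand-in for the generator's suggest(): uniform samples as dicts.
    return [{"x": random.uniform(lower, upper)} for _ in range(num_points)]


def sine_sim(point):
    # Stand-in for the simulator function: compute y = sin(x) for one point.
    return {"y": math.sin(point["x"])}


# Serial stand-in for the manager/worker loop: evaluate each suggested point
# and merge inputs and outputs into history rows, like the table above.
history = []
for sim_id, point in enumerate(suggest(5)):
    history.append({"sim_id": sim_id, **point, **sine_sim(point)})
```

In the real workflow these rows are assembled by the manager into the NumPy history array returned as ``ensemble.H``.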
+ +Two additional log files should also have been created. +``ensemble.log`` contains debugging or informational logging output from +libEnsemble, while ``libE_stats.txt`` contains a quick summary of all +calculations performed. + +Here is graphed output using ``Matplotlib``, with entries colored by which +worker performed the simulation: + +.. image:: ../../images/sinex.png + :alt: sine + :align: center + +If you installed Matplotlib earlier and want to verify your results +through plotting, copy and paste the following code into the bottom of your calling +script and run ``python calling.py`` again: + +.. literalinclude:: ../../../libensemble/tests/functionality_tests/test_local_sine_tutorial.py + :language: python + :linenos: + :lineno-start: 37 + :start-at: import matplotlib + :end-at: plt.savefig("tutorial_sines.png") + +Each of these example files can be found in the repository in `examples/tutorials/simple_sine`_. + +**Exercise** + +Write a calling script with the following specifications: + +1. Set the generator function's lower and upper bounds to -6 and 6, respectively +2. Increase the generator batch size to 10 +3. Set libEnsemble to stop execution after 160 *generations* using the ``gen_max`` option +4. Print an error message if any errors occurred while libEnsemble was running + +.. dropdown:: **Click Here for Solution** + + .. literalinclude:: ../../../libensemble/tests/functionality_tests/test_local_sine_tutorial_2.py + :language: python + :linenos: + :emphasize-lines: 15,16,17,27,33,34 + +.. _examples/tutorials/simple_sine: https://github.com/Libensemble/libensemble/tree/develop/examples/tutorials/simple_sine diff --git a/docs/tutorials/local_sine_tutorial/local_sine_tutorial_5.rst b/docs/tutorials/local_sine_tutorial/local_sine_tutorial_5.rst new file mode 100644 index 0000000000..5c67c73df7 --- /dev/null +++ b/docs/tutorials/local_sine_tutorial/local_sine_tutorial_5.rst @@ -0,0 +1,71 @@ +5. Next steps +============= + +`Introduction `__ \|\| `1. 
Getting started `__ \|\| `2. Generator `__ \|\| `3. Simulator `__ \|\| `4. Script `__ \|\| **5. Next steps** + +**libEnsemble with MPI** + +MPI_ is a standard interface for parallel computing, implemented in libraries +such as MPICH_ and used at extreme scales. MPI potentially allows libEnsemble's +processes to be distributed over multiple nodes and works in some +circumstances where Python's multiprocessing does not. In this section, we'll +explore modifying the above code to use MPI instead of multiprocessing. + +We recommend the MPI distribution MPICH_ for this tutorial, which can be found +for a variety of systems here_. You also need mpi4py_, which can be installed +with ``pip install mpi4py``. If you'd like to use a specific version or +distribution of MPI instead of MPICH, configure mpi4py with that MPI at +installation with ``MPICC= pip install mpi4py``. If this +doesn't work, try appending ``--user`` to the end of the command. See the +mpi4py_ docs for more information. + +Verify that MPI has been installed correctly with ``mpirun --version``. + +**Modifying the script** + +Only a few changes are necessary to make our code MPI-compatible. For starters, +comment out the ``libE_specs`` definition: + +.. literalinclude:: ../../../libensemble/tests/functionality_tests/test_local_sine_tutorial_3.py + :language: python + :start-at: # libE_specs = LibeSpecs + :end-at: # libE_specs = LibeSpecs + +We'll be parameterizing our MPI runtime with a ``parse_args=True`` argument to +the ``Ensemble`` class instead of ``libE_specs``. We'll also use an ``ensemble.is_manager`` +attribute so only the first MPI rank runs the data-processing code. + +The bottom of your calling script should now resemble: + +.. literalinclude:: ../../../libensemble/tests/functionality_tests/test_local_sine_tutorial_3.py + :linenos: + :lineno-start: 28 + :language: python + :start-at: # replace libE_specs + +With these changes in place, our libEnsemble code can be run with MPI by + +.. 
code-block:: bash + + mpirun -n 5 python calling.py + +where ``-n 5`` tells ``mpirun`` to produce five processes, one of which will be +the manager process with the libEnsemble manager and the other four will run +libEnsemble workers. + +This tutorial is only a tiny demonstration of the parallelism capabilities of +libEnsemble. libEnsemble has been developed primarily to support research on +High-Performance computers, with potentially hundreds of workers performing +calculations simultaneously. Please read our +:doc:`platform guides <../../platforms/platforms_index>` for introductions to using +libEnsemble on many such machines. + +libEnsemble's Executors can launch non-Python user applications and simulations across +allocated compute resources. Try out this feature with a more-complicated +libEnsemble use-case within our +:doc:`Electrostatic Forces tutorial <../executor_forces_tutorial>`. + +.. _MPI: https://en.wikipedia.org/wiki/Message_Passing_Interface +.. _MPICH: https://www.mpich.org/ +.. _here: https://www.mpich.org/downloads/ +.. _mpi4py: https://mpi4py.readthedocs.io/en/stable/install.html diff --git a/docs/tutorials/tutorials.rst b/docs/tutorials/tutorials.rst index cee04fe523..1ea0edc10e 100644 --- a/docs/tutorials/tutorials.rst +++ b/docs/tutorials/tutorials.rst @@ -3,7 +3,7 @@ Tutorials .. toctree:: - local_sine_tutorial + local_sine_tutorial/local_sine_tutorial executor_forces_tutorial forces_gpu_tutorial gpcam_tutorial diff --git a/docs/utilities.rst b/docs/utilities.rst index 3c75dc9703..dbdc2dcb22 100644 --- a/docs/utilities.rst +++ b/docs/utilities.rst @@ -1,47 +1,49 @@ Convenience Tools and Functions =============================== -.. tab-set:: +Setup Helpers +------------- - .. tab-item:: Setup Helpers +.. automodule:: tools + :members: + :no-undoc-members: - .. automodule:: tools - :members: - :no-undoc-members: +Persistent Helpers +------------------ - .. tab-item:: Persistent Helpers +.. _p_gen_routines: - .. 
_p_gen_routines: +These routines are commonly used within persistent generator functions +such as ``persistent_aposmm`` in ``libensemble/gen_funcs/`` for intermediate +communication with the manager. Persistent simulator functions are also supported. - These routines are commonly used within persistent generator functions - such as ``persistent_aposmm`` in ``libensemble/gen_funcs/`` for intermediate - communication with the manager. Persistent simulator functions are also supported. +.. automodule:: persistent_support + :members: + :no-undoc-members: - .. automodule:: persistent_support - :members: - :no-undoc-members: +Allocation Helpers +------------------ - .. tab-item:: Allocation Helpers +These routines are used within custom allocation functions to help prepare ``Work`` +structures for workers. See the routines within ``libensemble/alloc_funcs/`` for +examples. - These routines are used within custom allocation functions to help prepare ``Work`` - structures for workers. See the routines within ``libensemble/alloc_funcs/`` for - examples. +.. automodule:: alloc_support + :members: + :no-undoc-members: - .. automodule:: alloc_support - :members: - :no-undoc-members: +Live Data +--------- - .. tab-item:: Live Data +These classes provide a means to capture and display data during a workflow run. +Users may provide an initialized object via ``libE_specs["live_data"]``. For example:: - These classes provide a means to capture and display data during a workflow run. - Users may provide an initialized object via ``libE_specs["live_data"]``. For example:: + from libensemble.tools.live_data.plot2n import Plot2N + libE_specs["live_data"] = Plot2N(plot_type='2d') - from libensemble.tools.live_data.plot2n import Plot2N - libE_specs["live_data"] = Plot2N(plot_type='2d') +.. automodule:: libensemble.tools.live_data.live_data + :members: - .. automodule:: libensemble.tools.live_data.live_data - :members: - - .. automodule:: plot2n - :members: Plot2N - :show-inheritance: +.. 
automodule:: plot2n + :members: Plot2N + :show-inheritance: diff --git a/docs/welcome.rst b/docs/welcome.rst index 9498fab1ac..01fdef4425 100644 --- a/docs/welcome.rst +++ b/docs/welcome.rst @@ -40,7 +40,7 @@ libEnsemble A complete toolkit for dynamic ensembles of calculations - New to libEnsemble? :doc:`Start here`. - - Try out libEnsemble with a :doc:`tutorial`. + - Try out libEnsemble with a :doc:`tutorial`. - Go in depth by reading the :doc:`full overview`. - See the :doc:`FAQ` for common questions and answers, errors, and resolutions. - Check us out on `GitHub`_. diff --git a/docs/xSDK_policy_compatibility.md b/docs/xSDK_policy_compatibility.md deleted file mode 100644 index 0dc83520d0..0000000000 --- a/docs/xSDK_policy_compatibility.md +++ /dev/null @@ -1,82 +0,0 @@ -# xSDK Community Policy Compatibility for libEnsemble - -This document summarizes the efforts of libEnsemble -to achieve compatibility with the xSDK community policies. - -**Website:** https://github.com/Libensemble/libensemble - -### Mandatory Policies - -[General libEnsemble Note](#liben-note) - -| Policy |Support| Notes | -|------------------------|-------|-------------------------| -|**M1.** Support xSDK community GNU Autoconf or CMake options. |N/A| libEnsemble is a Python package and provides a `setup.py` file for installation. This is compatible with Python's built-in installation feature (`python setup.py install`) and with the ubiquitous `pip` installer. libEnsemble is also in the Spack repository and can be installed with `spack install py-libensemble`. GNU Autoconf or CMake are unsuitable for a Python package. -|**M2.** Provide a comprehensive test suite for correctness of installation verification. |Full| libEnsemble has a test suite that includes both unit tests and regression tests that are run on every push to GitHub via [Travis CI](https://travis-ci.org/Libensemble/libensemble). 
In addition to this test suite, further scaling tests are manually run on HPC platforms including Cori, Theta, and Summit. -|**M3.** Employ user-provided MPI communicator (no MPI_COMM_WORLD). |Full|libEnsemble takes an MPI communicator as an option; if libEnsemble is configured for MPI mode, this provided communicator will be employed. If no communicator is given, a duplicate of MPI_COMM_WORLD is taken as a default. | -|**M4.** Give best effort at portability to key architectures (standard Linux distributions, GNU, Clang, vendor compilers, and target machines at ALCF, NERSC, OLCF). |Full| libEnsemble is tested regularly, including prior to every release, on ALCF (Theta), OLCF (Summit) and NERSC (Cori) platforms. [M4 details](#m4-details)| -|**M5.** Provide a documented, reliable way to contact the development team. |Full| The libEnsemble team can be contacted through: 1) The public [issues page on GitHub](https://github.com/Libensemble/libensemble/issues). 2) [Slack](https://libensemble.slack.com). 3) The public email list libensemble@mcs.anl.gov. | -|**M6.** Respect system resources and settings made by other previously called packages (e.g., signal handling). |Full| libEnsemble does not modify system resources or settings. | -|**M7.** Come with an open source (BSD style) license. |Full| libEnsemble uses a 3-clause BSD license stated in the `LICENSE` file in the top level of the GitHub repository. | -|**M8.** Provide a runtime API to return the current version number of the software. |Full| The version can be returned within Python via: `libensemble.__version__`| -|**M9.** Use a limited and well-defined symbol, macro, library, and include file name space. |Full| All libEnsemble symbols (e.g., functions, variables, modules, packages) begin with the prefix `libensemble.`. This prevents any namespace conflicts.| -|**M10.** Provide an xSDK team accessible repository (not necessarily publicly available). 
|Full| The libEnsemble repository is public and can be found at https://github.com/Libensemble/libensemble. Gitflow is used, along with pull requests, whereby only those with administrator privileges can accept pull requests into the master or develop branches. The workflow guidelines are provided in a `CONTRIBUTING.rst` file at the top level of the repository and a release process is given in the documentation. | -|**M11.** Have no hardwired print or IO statements that cannot be turned off. |Full| All output from the libEnsemble core package, except for the raising of exceptions, is routed through a libEnsemble logger, which is isolated from the Python root logger. Log messages of type `MANAGER_WARNING` or above are duplicated to standard error by default to ensure they are not missed. This can be turned off through the API. The API also allows the user to change the logging verbosity level and the name of the log file. This would allow a user, for example, to append logging to an existing log file, or to keep it separate. libEnsemble contains no interactive input. libEnsemble creates the files `ensemble.log` and `libE_stats.txt`, but the creation of these files can be preempted. [M11 details](#m11-details)| -|**M12.** For external dependencies, allow installing, building, and linking against an outside copy of external software. |Full| libEnsemble does not contain any other package's source code within. Note that Python packages are imported using the conventional `sys.path` system. Alternative instances of a package can be used by, for example, including in the `PYTHONPATH` environment variable.| -|**M13.** Install headers and libraries under \/include and \/lib. |Full| The standard Python installation is used for Python dependencies. This installs external Python packages under `/lib/python/site-packages/` When installed through Spack, the `` is specific to each Python package. 
This is added to `PYTHONPATH` when the Spack module for that library is loaded.| -|**M14.** Be buildable using 64 bit pointers. 32 bit is optional. |Full| There is no explicit use of pointers in libEnsemble, as Python handles pointers internally and depends on the install of Python (e.g., CPython), which will generally be 64-bit on supported systems. | -|**M15.** All xSDK compatibility changes should be sustainable. |Full| The xSDK-compatible package is in the standard release path. All the changes here should be sustainable. | -|**M16.** The package must support production-quality installation compatible with the xSDK install tool and xSDK metapackage. |Full|libEnsemble configure and install has full support from Spack. | - -M4 details : libEnsemble is a Python code and so does -not directly use compilers. It does, however, use NumPy, SciPy and mpi4py which -use compiled extensions. The current CI tests of libEnsemble use the standard -CPython compatible builds of these extensions (which are built using the GNU -compilers). libEnsemble is also regularly tested using the Intel distribution -for Python. - -libEnsemble is supported on Linux platforms and macOS. Windows platforms are -currently not supported. - -M11 details : Note: The sub-packages in the libensemble -directory structure such as `sim_specs` and `gen_specs` may contain print -statements. These are considered examples for users, rather than core -libEnsemble packages. - -A special exception exists in the `node_resources.py` module; part of -libEnsemble's resource detection infrastructure. The routine -`_print_local_cpu_resources()` can be launched by libEnsemble to probe -resources on a target node, and the output of this independent program is -captured by libEnsemble. - -### Recommended Policies - -| Policy |Support| Notes | -|------------------------|-------|-------------------------| -|**R1.** Have a public repository. |Full| Yes (see M10 above). 
| -|**R2.** Possible to run test suite under valgrind in order to test for memory corruption issues. |Full| It is possible to run the test suite under Valgrind. While libEnsemble is Python code, this may be useful for compiled extensions that are imported. PYTHONMALLOC=malloc must be set on the run line. CPython also provides a suppression file.| -|**R3.** Adopt and document consistent system for error conditions/exceptions. |Full| libEnsemble defines and raises exceptions according to module. All exceptions on workers are passed to the manager for processing. Warnings are handled by the logger. [R3 details](#r3-details)|| -|**R4.** Free all system resources acquired as soon as they are no longer needed. |Full| Python has built-in garbage collection that frees memory when it becomes unreferenced. When opening files, wherever possible, `with` expressions or `try/finally` blocks are used to ensure file handles are closed, even in the case of an error.| -|**R5.** Provide a mechanism to export ordered list of library dependencies. |Full| The dependencies for libEnsemble are given in `setup.py` and when pip install or pip setup.py egg_info are run, a file is created `libensemble.egg-info/requires.txt` containing the list of required and optional dependencies. If installing through pip, these will automatically be installed if they do not exist (`pip install libensemble` installs req. dependencies, while `pip install libensemble[extras]` installs both required and optional dependencies.| -|**R6.** Document versions of packages that it works with or depends upon, preferably in machine-readable form. |Full| Dependencies are given in the documentation. In some cases, this includes a lower bound on the version number. These dependencies are also specified in the Spack package, and automatically resolved during installation.| -|**R7.** Have README, SUPPORT, LICENSE, and CHANGELOG files in top directory. 
|Full| These files are present in the top directory.| - -R3 details : libEnsemble catches all exceptions -(explicitly raised and unexpected) from the manager and worker processes at the -libEnsemble level, resulting in libEnsemble dumping the key ensemble state to -files. In `mpi4py` mode, the default is to then call MPI_ABORT to prevent a -hang. However, this can be turned off (via the `libE_specs` argument). In the -case it is turned off, or if other communication modes are used, the exception -is then raised. The user can in turn catch these exceptions from their calling -script. - -libEnsemble Note : The nature of libEnsemble's -interoperability with other libraries is different from typical xSDK libraries. -libEnsemble is a Python code and interaction with other libraries may take -several forms. These include: libEnsemble calling other libraries through -Python bindings, libEnsemble launching applications (possibly providing a -sub-communicator), libEnsemble being called from a Python level infrastructure, -libEnsemble being launched as part of a campaign level workflow, or libEnsemble -potentially being activated via a system call or embedded interpreter; a more -unconventional approach. This is, therefore, a good opportunity to consider -interoperability from a Python and broader workflow perspective. diff --git a/examples/libE_submission_scripts/summit_submit_mproc.sh b/examples/libE_submission_scripts/summit_submit_mproc.sh deleted file mode 100644 index ba565f6c82..0000000000 --- a/examples/libE_submission_scripts/summit_submit_mproc.sh +++ /dev/null @@ -1,44 +0,0 @@ -#!/bin/bash -x -#BSUB -P -#BSUB -J libe_mproc -#BSUB -W 30 -#BSUB -nnodes 4 -#BSUB -alloc_flags "smt1" - -# Script to run libEnsemble using multiprocessing on launch nodes. -# Assumes Conda environment is set up. - -# To be run with central job management -# - Manager and workers run on launch node. -# - Workers submit tasks to the compute nodes in the allocation. 
- -# Name of calling script- -export EXE=libE_calling_script.py - -# Communication Method -export COMMS="--comms local" - -# Number of workers. -export NWORKERS="--nworkers 4" - -# Wallclock for libE. (allow clean shutdown) -export LIBE_WALLCLOCK=25 # Optional if pass to script - -# Name of Conda environment -export CONDA_ENV_NAME= - -# Need these if not already loaded -# module load python -# module load gcc/4.8.5 - -# Activate conda environment -export PYTHONNOUSERSITE=1 -. activate $CONDA_ENV_NAME - -# hash -d python # Check pick up python in conda env -hash -r # Check no commands hashed (pip/python...) - -# Launch libE -# python $EXE $NUM_WORKERS > out.txt 2>&1 # No args. All defined in calling script -# python $EXE $COMMS $NWORKERS > out.txt 2>&1 # If calling script is using parse_args() -python $EXE $LIBE_WALLCLOCK $COMMS $NWORKERS > out.txt 2>&1 # If calling script takes wall-clock as positional arg. diff --git a/libensemble/ensemble.py b/libensemble/ensemble.py index 88ff29b6de..24b47d72b0 100644 --- a/libensemble/ensemble.py +++ b/libensemble/ensemble.py @@ -31,7 +31,7 @@ class Ensemble: """ The primary object for a libEnsemble workflow. - Parses and validates settings, sets up logging, and maintains output. + Parses and validates settings and maintains output. .. dropdown:: Example :open: @@ -39,28 +39,31 @@ class Ensemble: .. 
code-block:: python :linenos: - import numpy as np + from gest_api.vocs import VOCS from libensemble import Ensemble - from libensemble.gen_funcs.sampling import latin_hypercube_sample + from libensemble.gen_classes.sampling import UniformSample from libensemble.sim_funcs.simple_sim import norm_eval - from libensemble.specs import ExitCriteria, GenSpecs, LibeSpecs, SimSpecs + from libensemble.specs import ExitCriteria, GenSpecs, SimSpecs + + sampling = Ensemble(parse_args=True) - libE_specs = LibeSpecs(nworkers=4) - sampling = Ensemble(libE_specs=libE_specs) sampling.sim_specs = SimSpecs( sim_f=norm_eval, inputs=["x"], outputs=[("f", float)], ) + + vocs = VOCS( + variables={"x": [-3, 3]}, + objectives={"f": "EXPLORE"}, + ) + + generator = UniformSample(vocs=vocs) + sampling.gen_specs = GenSpecs( - gen_f=latin_hypercube_sample, - outputs=[("x", float, (1,))], - user={ - "gen_batch_size": 50, - "lb": np.array([-3]), - "ub": np.array([3]), - }, + generator=generator, + batch_size=50, ) sampling.exit_criteria = ExitCriteria(sim_max=100) @@ -69,13 +72,6 @@ class Ensemble: sampling.run() sampling.save_output(__file__) - - Run the above example via ``python this_file.py``. - - Instead of using the libE_specs line, you can also use ``sampling = Ensemble(parse_args=True)`` - and run via ``python this_file.py -n 4`` (4 workers). The ``parse_args=True`` parameter - instructs the Ensemble class to read command-line arguments. - Configure by: .. dropdown:: Option 1: Providing parameters on instantiation @@ -84,13 +80,14 @@ class Ensemble: :linenos: from libensemble import Ensemble + from libensemble.specs import SimSpecs from my_simulator import sim_find_energy - sim_specs = { - "sim_f": sim_find_energy, - "in": ["x"], - "out": [("y", float)], - } + sim_specs = SimSpecs( + sim_f=sim_find_energy, + inputs=["x"], + outputs=[("y", float)], + ) experiment = Ensemble(sim_specs=sim_specs) @@ -99,7 +96,8 @@ class Ensemble: .. 
code-block:: python :linenos: - from libensemble import Ensemble, SimSpecs + from libensemble import Ensemble + from libensemble.specs import SimSpecs from my_simulator import sim_find_energy sim_specs = SimSpecs( @@ -115,25 +113,25 @@ class Ensemble: Parameters ---------- - sim_specs: :obj:`dict` or :class:`SimSpecs` + sim_specs: :class:`SimSpecs` - Specifications for the simulation function + Specifications for the simulator function. - gen_specs: :obj:`dict` or :class:`GenSpecs`, Optional + gen_specs: :class:`GenSpecs`, Optional - Specifications for the generator function + Specifications for the generator. - exit_criteria: :obj:`dict` or :class:`ExitCriteria`, Optional + exit_criteria: :class:`ExitCriteria` - Tell libEnsemble when to stop a run + Tell libEnsemble when to stop a run. - libE_specs: :obj:`dict` or :class:`LibeSpecs`, Optional + libE_specs: :class:`LibeSpecs`, Optional - Specifications for libEnsemble + Specifications for libEnsemble. - alloc_specs: :obj:`dict` or :class:`AllocSpecs`, Optional + alloc_specs: :class:`AllocSpecs`, Optional - Specifications for the allocation function + Specifications for the allocation function. persis_info: :obj:`dict`, Optional @@ -142,12 +140,12 @@ class Ensemble: executor: :class:`Executor`, Optional - libEnsemble Executor instance for use within simulation or generator functions + libEnsemble Executor instance for use within simulator functions or generators. H0: `NumPy structured array `_, Optional A libEnsemble history to be prepended to this run's history - :ref:`(example)` + :ref:`(example)`.
parse_args: bool, Optional @@ -159,24 +157,20 @@ class Ensemble: def __init__( self, - sim_specs: SimSpecs | dict | None = SimSpecs(), - gen_specs: GenSpecs | dict | None = GenSpecs(), - exit_criteria: ExitCriteria | dict | None = {}, - libE_specs: LibeSpecs | dict | None = LibeSpecs(), - alloc_specs: AllocSpecs | dict | None = AllocSpecs(), - persis_info: dict | None = {}, + sim_specs: SimSpecs = SimSpecs(), + gen_specs: GenSpecs = GenSpecs(), + exit_criteria: ExitCriteria = ExitCriteria(), + libE_specs: LibeSpecs = LibeSpecs(), + alloc_specs: AllocSpecs = AllocSpecs(), + persis_info: dict = {}, executor: Executor | None = None, H0: npt.NDArray | None = None, - parse_args: bool | None = False, + parse_args: bool = False, ): self.sim_specs = sim_specs self.gen_specs = gen_specs self.exit_criteria = exit_criteria - self._libE_specs: LibeSpecs | dict | None = None - if isinstance(libE_specs, dict): - self._libE_specs = LibeSpecs(**libE_specs) - else: - self._libE_specs = libE_specs + self._libE_specs: LibeSpecs = libE_specs self.alloc_specs = alloc_specs self.persis_info = persis_info self.executor = executor @@ -215,38 +209,98 @@ def _parse_args(self) -> tuple[int, bool, LibeSpecs]: return self.nworkers, self.is_manager, self._libE_specs - def ready(self) -> bool: - """Quickly verify that all necessary data has been provided""" - return all([i for i in [self.exit_criteria, self._libE_specs, self.sim_specs]]) + def ready(self) -> tuple[bool, list[str]]: + """Verify that all necessary data has been provided before calling :meth:`run`. + + Performs a pre-flight check on the ensemble configuration, covering: + + - A simulation callable (``sim_f`` or ``simulator``) is set on ``sim_specs``. + - At least one exit condition is configured on ``exit_criteria``. + - Workers are available (``nworkers > 0`` for local/threads/tcp comms, + or MPI comms is set, which infers workers from the MPI communicator). 
+ - If both ``gen_specs`` and ``sim_specs`` use the classic field-name interface, + the generator output field names are a superset of the simulator input field names. + + Returns + ------- + tuple[bool, list[str]] + A 2-tuple of ``(is_ready, issues)``. + ``is_ready`` is ``True`` when all checks pass. + ``issues`` is a list of human-readable strings describing each problem found; + it is empty when ``is_ready`` is ``True``. + + Example + ------- + .. code-block:: python + + ok, issues = sampling.ready() + if not ok: + for issue in issues: + print(f" - {issue}") + """ + issues: list[str] = [] + + # --- sim_specs: a callable must be set --- + sim_callable = getattr(self.sim_specs, "sim_f", None) or getattr(self.sim_specs, "simulator", None) + if not sim_callable: + issues.append( + "sim_specs is missing a callable: set 'sim_f' (a function) or 'simulator' (a gest-api object)." + ) + + # --- exit_criteria: at least one stop condition must be set --- + ec = self.exit_criteria + if ec is None or not any( + getattr(ec, field, None) is not None for field in ("sim_max", "gen_max", "wallclock_max", "stop_val") + ): + issues.append( + "exit_criteria has no stop condition: set at least one of " + "'sim_max', 'gen_max', 'wallclock_max', or 'stop_val'." + ) + + # --- workers: must be determinable --- + comms = getattr(self._libE_specs, "comms", "mpi") + if comms in ("local", "threads", "tcp"): + if not self.nworkers: + issues.append( + f"libE_specs.comms is '{comms}' but 'nworkers' is not set. " + "Set 'libE_specs.nworkers' or pass '--nworkers N' on the command line." + ) + # For 'mpi', worker count is derived from the MPI communicator at runtime; no check needed here. 
+ + # --- cross-spec field consistency (classic interface only) --- + gen_outputs = [f[0] for f in (getattr(self.gen_specs, "outputs", None) or [])] + sim_inputs = getattr(self.sim_specs, "inputs", None) or [] + if gen_outputs and sim_inputs: + missing = [field for field in sim_inputs if field not in gen_outputs] + if missing: + issues.append( + f"sim_specs.inputs requests field(s) {missing} that are not produced " + f"by gen_specs.outputs {gen_outputs}. Check that field names match." + ) + + return not issues, issues @property - def libE_specs(self) -> LibeSpecs | None: + def libE_specs(self) -> LibeSpecs: return self._libE_specs @libE_specs.setter def libE_specs(self, new_specs): - # We need to deal with libE_specs being specified as dict or class, and - # "not" overwrite the internal libE_specs["comms"]. - # Respect everything if libE_specs isn't set if not hasattr(self, "_libE_specs") or not self._libE_specs: - if isinstance(new_specs, dict): - self._libE_specs = LibeSpecs(**new_specs) - else: - self._libE_specs = new_specs + self._libE_specs = new_specs return # Cast new libE_specs temporarily to dict - if not isinstance(new_specs, dict): # exclude_defaults should only be enabled with Pydantic v2 - if new_specs.comms != "mpi" and new_specs.comms != self._libE_specs.comms: # passing in a non-default comms - raise ValueError(OVERWRITE_COMMS_WARN) - platform_specs_set = False - if new_specs.platform_specs != {}: # bugginess across Pydantic versions for recursively casting to dict - platform_specs_set = True - platform_specs = new_specs.platform_specs - new_specs = specs_dump(new_specs, exclude_none=True, exclude_defaults=True) - if platform_specs_set: - new_specs["platform_specs"] = specs_dump(platform_specs, exclude_none=True) + if new_specs.comms != "mpi" and new_specs.comms != self._libE_specs.comms: # passing in a non-default comms + raise ValueError(OVERWRITE_COMMS_WARN) + platform_specs_set = False + if new_specs.platform_specs != {}: # bugginess across 
Pydantic versions for recursively casting to dict + platform_specs_set = True + platform_specs = new_specs.platform_specs + new_specs = specs_dump(new_specs, exclude_none=True, exclude_defaults=True) + if platform_specs_set: + new_specs["platform_specs"] = specs_dump(platform_specs, exclude_none=True) # Unset "comms" if we already have a libE_specs that contains that field, that came from parse_args if new_specs.get("comms") and hasattr(self._libE_specs, "comms"): @@ -265,10 +319,10 @@ def run(self) -> tuple[npt.NDArray, dict, int]: Manager--worker intercommunications are parsed from the ``comms`` key of :ref:`libE_specs`. An MPI runtime is assumed by default - if ``--comms local`` wasn't specified on the command-line or in ``libE_specs``. + if ``-n N`` wasn't specified on the command-line or ``comms="local"`` set in ``libE_specs``. If a MPI communicator was provided in ``libE_specs``, then each ``.run()`` call - will initiate intercommunications on a **duplicate** of that communicator. + will communicate on a **duplicate** of that communicator. Otherwise, a duplicate of ``COMM_WORLD`` will be used. Returns @@ -326,20 +380,19 @@ def nworkers(self, value): def save_output(self, basename: str, append_attrs: bool = True): """ Writes out History array and persis_info to files. - If using a workflow_dir, will place with specified filename in that directory. + If ``workflow_dir_path`` is set in ``libE_specs``, places the file, with the specified filename, in that directory.
Parameters ---------- Format: ``_results_History_length=_evals=_ranks=`` - To have the filename be only the basename, set append_attrs=False + To have the filename be only the basename, set ``append_attrs=False`` Format: ``_results_History_length=_evals=_ranks=`` """ if self.is_manager: - if self._get_option("libE_specs", "workflow_dir_path"): - assert self.libE_specs is not None + if getattr(self.libE_specs, "workflow_dir_path", False): save_libE_output( self.H, self.persis_info, @@ -350,11 +403,3 @@ def save_output(self, basename: str, append_attrs: bool = True): ) else: save_libE_output(self.H, self.persis_info, basename, self.nworkers, append_attrs=append_attrs) - - def _get_option(self, specs, name): - """Gets a specs value, underlying spec is either a dict or a class""" - attr = getattr(self, specs) - if isinstance(attr, dict): - return attr.get(name) - else: - return getattr(attr, name) diff --git a/libensemble/executors/executor.py b/libensemble/executors/executor.py index 990ea2bc95..369308ada7 100644 --- a/libensemble/executors/executor.py +++ b/libensemble/executors/executor.py @@ -63,7 +63,7 @@ class ExecutorException(Exception): class TimeoutExpired(Exception): """Timeout exception raised when Timeout expires""" - def __init__(self, task: str, timeout: float) -> None: + def __init__(self, task: str, timeout: float | None) -> None: self.task = task self.timeout = timeout @@ -151,9 +151,9 @@ def __init__( self.stderr = stderr or self.name + ".err" self.workdir = workdir self.dry_run = dry_run - self.runline = None + self.runline: str | None = None self.run_attempts = 0 - self.env = {} + self.env: dict[str, str] = {} self.ngpus_req = 0 def reset(self) -> None: @@ -239,6 +239,7 @@ def _set_complete(self) -> None: self.state = "FINISHED" else: self.calc_task_timing() + assert self.process is not None self.errcode = self.process.returncode self.success = self.errcode == 0 self.state = "FINISHED" if self.success else "FAILED" @@ -254,6 +255,7 @@ def 
poll(self) -> None: return # Poll the task + assert self.process is not None poll = self.process.poll() if poll is None: self.state = "RUNNING" @@ -330,7 +332,7 @@ def done(self) -> bool: self.poll() return self.finished - def kill(self, wait_time: int = 60) -> None: + def kill(self, wait_time: int | None = 60) -> None: """Kills or cancels the supplied task Parameters @@ -426,11 +428,11 @@ def __init__(self) -> None: """ self.manager_signal = None - self.default_apps = {"sim": None, "gen": None} - self.apps = {} + self.default_apps: dict[str, Application | None] = {"sim": None, "gen": None} + self.apps: dict[str, Application] = {} self.wait_time = 60 - self.list_of_tasks = [] + self.list_of_tasks: list[Task] = [] self.workerID = None self.comm = None self.last_task = 0 @@ -448,12 +450,12 @@ def serial_setup(self): pass # To be overloaded @property - def sim_default_app(self) -> Application: + def sim_default_app(self) -> Application | None: """Returns the default simulation app""" return self.default_apps["sim"] @property - def gen_default_app(self) -> Application: + def gen_default_app(self) -> Application | None: """Returns the default generator app""" return self.default_apps["gen"] @@ -468,7 +470,7 @@ def get_app(self, app_name: str) -> Application: ) return app - def default_app(self, calc_type: str) -> Application: + def default_app(self, calc_type: str) -> Application | None: """Gets the default app for a given calc type""" app = self.default_apps.get(calc_type) jassert(calc_type in ["sim", "gen"], "Unrecognized calculation type", calc_type) @@ -541,10 +543,8 @@ def register_app( jassert(calc_type in self.default_apps, "Unrecognized calculation type", calc_type) self.default_apps[calc_type] = self.apps[app_name] - def manager_poll(self) -> int: + def manager_poll(self) -> int | None: """ - .. _manager_poll_label: - Polls for a manager signal The executor manager_signal attribute will be updated. 
@@ -552,12 +552,13 @@ def manager_poll(self) -> int: self.manager_signal = None # Reset + assert self.comm is not None # Check for messages; disregard anything but a stop signal if not self.comm.mail_flag(): - return + return None mtag, man_signal = self.comm.recv() if mtag != STOP_TAG: - return + return None # Process the signal and push back on comm (for now) self.manager_signal = man_signal @@ -580,8 +581,8 @@ def manager_kill_received(self) -> bool: def polling_loop( self, task: Task, timeout: int | None = None, delay: float = 0.1, poll_manager: bool = False ) -> int: - """Optional, blocking, generic task status polling loop. Operates until the task - finishes, times out, or is optionally killed via a manager signal. On completion, returns a + """Blocking, generic task status polling loop. Operates until the task + finishes, times out, or is killed via a manager signal. On completion, returns a presumptive :ref:`calc_status` integer. Useful for running an application via the Executor until it stops without monitoring its intermediate output. @@ -709,13 +710,13 @@ def submit( app_args: str | None = None, stdout: str | None = None, stderr: str | None = None, - dry_run: bool | None = False, - wait_on_start: bool | None = False, + dry_run: bool = False, + wait_on_start: bool = False, env_script: str | None = None, ) -> Task: """Create a new task and run as a local serial subprocess. - The created :class:`task` object is returned. + Returns the created :class:`task` object.
Parameters ---------- @@ -758,6 +759,7 @@ def submit( The launched task object """ + app: Application | None = None if app_name is not None: app = self.get_app(app_name) elif calc_type is not None: @@ -765,6 +767,8 @@ def submit( else: raise ExecutorException("Either app_name or calc_type must be set") + assert app is not None + default_workdir = os.getcwd() task = Task(app, app_args, default_workdir, stdout, stderr, self.workerID, dry_run) diff --git a/libensemble/executors/mpi_executor.py b/libensemble/executors/mpi_executor.py index 4547753741..5a0190d5c4 100644 --- a/libensemble/executors/mpi_executor.py +++ b/libensemble/executors/mpi_executor.py @@ -1,9 +1,9 @@ """ This module launches and controls the running of MPI applications. -In order to create an MPI executor, the calling script should contain: +In order to create an MPI executor, the script should contain:: -.. code-block:: python + from libensemble.executors.mpi_executor import MPIExecutor exctr = MPIExecutor() @@ -17,7 +17,7 @@ import time import libensemble.utils.launcher as launcher -from libensemble.executors.executor import Executor, ExecutorException, Task +from libensemble.executors.executor import Application, Executor, ExecutorException, Task from libensemble.executors.mpi_runner import MPIRunner from libensemble.resources.mpi_resources import get_MPI_variant @@ -183,7 +183,7 @@ def _launch_with_retries( else: break - def submit( + def submit( # type: ignore[override] self, calc_type: str | None = None, app_name: str | None = None, @@ -196,18 +196,18 @@ def submit( stdout: str | None = None, stderr: str | None = None, stage_inout: str | None = None, - hyperthreads: bool | None = False, - dry_run: bool | None = False, - wait_on_start: bool | None = False, + hyperthreads: bool = False, + dry_run: bool = False, + wait_on_start: bool = False, extra_args: str | None = None, - auto_assign_gpus: bool | None = False, - match_procs_to_gpus: bool | None = False, + auto_assign_gpus: bool = False, + 
match_procs_to_gpus: bool = False, env_script: str | None = None, mpi_runner_type: str | dict | None = None, ) -> Task: """Creates a new task, and either executes or schedules execution. - The created :class:`task` object is returned. + Returns the created :class:`task` object. The user must supply either the app_name or calc_type arguments (app_name is recommended). All other arguments are optional. @@ -304,6 +304,7 @@ def submit( then the available resources will be divided among workers. """ + app: Application | None = None if app_name is not None: app = self.get_app(app_name) elif calc_type is not None: else: raise ExecutorException("Either app_name or calc_type must be set") + assert app is not None + default_workdir = os.getcwd() task = Task(app, app_args, default_workdir, stdout, stderr, self.workerID, dry_run) diff --git a/libensemble/generators.py b/libensemble/generators.py index a1927b6de6..15ae0725e4 100644 --- a/libensemble/generators.py +++ b/libensemble/generators.py @@ -220,7 +220,9 @@ def finalize(self) -> None: def export( self, vocs_field_names: bool = False, as_dicts: bool = False ) -> tuple[npt.NDArray | list | None, dict | None, int | None]: - """Return the generator's results + """ + Return the generator's results. + Parameters ---------- vocs_field_names : bool, optional If True, use vocs variable/objective names for field names. Default is False. as_dicts : bool, optional If True, return local_H as list of dictionaries instead of numpy array. Default is False.
+ Returns ------- local_H : npt.NDArray | list diff --git a/libensemble/libE.py b/libensemble/libE.py index 9af1d52405..219e2cd8c4 100644 --- a/libensemble/libE.py +++ b/libensemble/libE.py @@ -189,7 +189,7 @@ def libE( libE_specs: :obj:`dict` or :class:`LibeSpecs`, Optional Specifications for libEnsemble - :doc:`(example)` + :doc:`(example)` H0: `NumPy structured array `_, Optional diff --git a/libensemble/resources/platforms.py b/libensemble/resources/platforms.py index 44b2e76b28..69c36242fc 100644 --- a/libensemble/resources/platforms.py +++ b/libensemble/resources/platforms.py @@ -230,16 +230,6 @@ class Polaris(Platform): scheduler_match_slots: bool = True -class Summit(Platform): - mpi_runner: str = "jsrun" - cores_per_node: int = 42 - logical_cores_per_node: int = 168 - gpus_per_node: int = 6 - gpu_setting_type: str = "option_gpus_per_task" - gpu_setting_name: str = "-g" - scheduler_match_slots: bool = False - - class Known_platforms(BaseModel): """A list of platforms with known configurations. @@ -287,7 +277,6 @@ class Known_platforms(BaseModel): perlmutter_c: PerlmutterCPU = PerlmutterCPU() perlmutter_g: PerlmutterGPU = PerlmutterGPU() polaris: Polaris = Polaris() - summit: Summit = Summit() # Dictionary of known systems (or system partitions) detectable by domain name @@ -295,7 +284,6 @@ class Known_platforms(BaseModel): "frontier.olcf.ornl.gov": "frontier", "hostmgmt.cm.aurora.alcf.anl.gov": "aurora", "hsn.cm.polaris.alcf.anl.gov": "polaris", - "summit.olcf.ornl.gov": "summit", # Need to detect gpu count } diff --git a/libensemble/specs.py b/libensemble/specs.py index 4ac858f5ad..2f33d447f3 100644 --- a/libensemble/specs.py +++ b/libensemble/specs.py @@ -88,8 +88,8 @@ class SimSpecs(BaseModel): simulator: object | None = None """ - A pre-initialized simulator object or callable in gest-api format. - When provided, sim_f defaults to gest_api_sim wrapper. + A callable (function) in gest-api format. 
+ When provided, ``sim_f`` defaults to the ``gest_api_sim`` wrapper. """ inputs: list[str] | None = Field(default=[], alias="in") @@ -486,7 +486,7 @@ class LibeSpecs(BaseModel): ``False`` by default to protect results. """ - workflow_dir_path: str | Path | None = "." + workflow_dir_path: str | Path = "." """ Optional path to the workflow directory. """ @@ -682,7 +682,7 @@ def set_calc_dirs_on_input_dir(self): worker_cmd: list[str] | None = [] """ TCP Only: Split string corresponding to worker/client Python process invocation. Contains - a local Python path, calling script, and manager/server format-fields for ``manager_ip``, + a local Python path, user script, and manager/server format-fields for ``manager_ip``, ``manager_port``, ``authkey``, and ``workerID``. ``nworkers`` is specified normally. """ diff --git a/libensemble/tests/functionality_tests/test_mpi_gpu_settings.py b/libensemble/tests/functionality_tests/test_mpi_gpu_settings.py index f307ca01be..d83e7a13cd 100644 --- a/libensemble/tests/functionality_tests/test_mpi_gpu_settings.py +++ b/libensemble/tests/functionality_tests/test_mpi_gpu_settings.py @@ -52,7 +52,7 @@ # Import libEnsemble items for this test from libensemble.libE import libE -from libensemble.resources.platforms import Aurora, Frontier, PerlmutterGPU, Platform, Polaris, Summit +from libensemble.resources.platforms import Aurora, Frontier, PerlmutterGPU, Platform, Polaris from libensemble.sim_funcs import six_hump_camel from libensemble.sim_funcs.var_resources import gpu_variable_resources as sim_f from libensemble.tools import parse_args @@ -190,7 +190,7 @@ del libE_specs["platform_specs"] # Fourth set - use platform setting ------------------------------------------------------------ - for platform in ["summit", "frontier", "perlmutter_g", "polaris", "aurora"]: + for platform in ["frontier", "perlmutter_g", "polaris", "aurora"]: print(f"\nRunning GPU setting checks (via known platform) for {platform} ------------------- ") 
         libE_specs["platform"] = platform
@@ -206,7 +206,7 @@
     del libE_specs["platform"]

     # Fifth set - use platform environment setting -----------------------------------------------
-    for platform in ["summit", "frontier", "perlmutter_g", "polaris", "aurora"]:
+    for platform in ["frontier", "perlmutter_g", "polaris", "aurora"]:
         print(f"\nRunning GPU setting checks (via known platform env. variable) for {platform} ----- ")
         os.environ["LIBE_PLATFORM"] = platform
@@ -222,7 +222,7 @@
     del os.environ["LIBE_PLATFORM"]

     # Sixth set - use platform_specs with known systems -------------------------------------------
-    for platform in [Summit, Frontier, PerlmutterGPU, Polaris, Aurora]:
+    for platform in [Frontier, PerlmutterGPU, Polaris, Aurora]:
         print(f"\nRunning GPU setting checks (via known platform - platform_specs) for {platform} ------------------- ")
         libE_specs["platform_specs"] = platform()
diff --git a/libensemble/tests/scaling_tests/forces/forces_app/build_forces.sh b/libensemble/tests/scaling_tests/forces/forces_app/build_forces.sh
index 8dd599ffed..972695850f 100755
--- a/libensemble/tests/scaling_tests/forces/forces_app/build_forces.sh
+++ b/libensemble/tests/scaling_tests/forces/forces_app/build_forces.sh
@@ -49,10 +49,3 @@ fi
 # Nvidia (nvc) compiler with mpicc and on Cray system with target (Perlmutter)
 # mpicc -DGPU -O3 -fopenmp -mp=gpu -o forces.x forces.c
 # cc -DGPU -Wl,-znoexecstack -O3 -fopenmp -mp=gpu -target-accel=nvidia80 -o forces.x forces.c
-
-# xl (plain and using mpicc on Summit)
-# xlc_r -DGPU -O3 -qsmp=omp -qoffload -o forces.x forces.c
-# mpicc -DGPU -O3 -qsmp=omp -qoffload -o forces.x forces.c
-
-# Summit with gcc (Need up to offload capable gcc: module load gcc/12.1.0) - slower than xlc
-# mpicc -DGPU -Ofast -fopenmp -Wl,-rpath=/sw/summit/gcc/12.1.0-0/lib64 -lm -foffload=nvptx-none forces.c -o forces.x
diff --git a/libensemble/tests/scaling_tests/forces/submission_scripts/summit_submit_mproc.sh b/libensemble/tests/scaling_tests/forces/submission_scripts/summit_submit_mproc.sh
deleted file mode 100755
index 268ba64a36..0000000000
--- a/libensemble/tests/scaling_tests/forces/submission_scripts/summit_submit_mproc.sh
+++ /dev/null
@@ -1,52 +0,0 @@
-#!/bin/bash -x
-#BSUB -P
-#BSUB -J libe_mproc
-#BSUB -W 20
-#BSUB -nnodes 4
-#BSUB -alloc_flags "smt1"
-
-# Script to run libEnsemble using multiprocessing on launch nodes.
-# Assumes Conda environment is set up.
-
-# To be run with central job management
-# - Manager and workers run on launch node.
-# - Workers submit tasks to the nodes in the job available.
-
-# Name of calling script-
-export EXE=run_libe_forces.py
-
-# Communication Method
-export COMMS="--comms local"
-
-# Number of workers.
-export NWORKERS="--nworkers 5"
-
-# Wallclock for libE. Slightly smaller than job wallclock
-#export LIBE_WALLCLOCK=15 # Optional if pass to script
-
-# Name of Conda environment
-export CONDA_ENV_NAME=
-
-export LIBE_PLOTS=true # Require plot scripts in $PLOT_DIR (see at end)
-export PLOT_DIR=..
-
-# Need these if not already loaded
-# module load python
-# module load gcc/4.8.5
-
-# Activate conda environment
-export PYTHONNOUSERSITE=1
-. activate $CONDA_ENV_NAME
-
-# hash -d python # Check pick up python in conda env
-hash -r # Check no commands hashed (pip/python...)
-
-# Launch libE.
-#python $EXE $NUM_WORKERS $LIBE_WALLCLOCK > out.txt 2>&1
-python $EXE $COMMS $NWORKERS > out.txt 2>&1
-
-if [[ $LIBE_PLOTS = "true" ]]; then
-  python $PLOT_DIR/plot_libe_calcs_util_v_time.py
-  python $PLOT_DIR/plot_libe_tasks_util_v_time.py
-  python $PLOT_DIR/plot_libe_histogram.py
-fi
diff --git a/libensemble/tests/unit_tests/test_ensemble.py b/libensemble/tests/unit_tests/test_ensemble.py
index 9e9946c3b5..0e5de32239 100644
--- a/libensemble/tests/unit_tests/test_ensemble.py
+++ b/libensemble/tests/unit_tests/test_ensemble.py
@@ -22,14 +22,14 @@ def test_ensemble_parse_args_false():
     from libensemble.specs import LibeSpecs

     # Ensemble(parse_args=False) by default, so these specs won't be overwritten:
-    e = Ensemble(libE_specs={"comms": "local", "nworkers": 4})
+    e = Ensemble(libE_specs=LibeSpecs(comms="local", nworkers=4))
     assert hasattr(e, "nworkers"), "nworkers should've passed from libE_specs to Ensemble class"
-    assert isinstance(e.libE_specs, LibeSpecs), "libE_specs should've been cast to class"
+    assert isinstance(e.libE_specs, LibeSpecs), "libE_specs should be a LibeSpecs instance"

-    # test pass attribute as dict
-    e = Ensemble(libE_specs={"comms": "local", "nworkers": 4})
+    # test passing a second instance
+    e = Ensemble(libE_specs=LibeSpecs(comms="local", nworkers=4))
     assert hasattr(e, "nworkers"), "nworkers should've passed from libE_specs to Ensemble class"
-    assert isinstance(e.libE_specs, LibeSpecs), "libE_specs should've been cast to class"
+    assert isinstance(e.libE_specs, LibeSpecs), "libE_specs should be a LibeSpecs instance"

     # test that adjusting Ensemble.nworkers also changes libE_specs
     e.nworkers = 8
@@ -182,6 +182,94 @@ def test_local_comms_without_nworkers():
     assert not flag, "'local' ensemble without nworkers should not be created"


+def test_ready_missing_sim_callable():
+    """ready() should flag a missing sim callable."""
+    from libensemble.ensemble import Ensemble
+    from libensemble.specs import ExitCriteria, LibeSpecs, SimSpecs
+
+    e = Ensemble(
+        libE_specs=LibeSpecs(comms="local", nworkers=4),
+        sim_specs=SimSpecs(),  # no sim_f or simulator
+        exit_criteria=ExitCriteria(sim_max=10),
+    )
+    ok, issues = e.ready()
+    assert not ok, "Should not be ready without a sim callable"
+    assert any("sim_f" in msg for msg in issues), f"Expected sim_f mention in issues: {issues}"
+
+
+def test_ready_missing_exit_criteria():
+    """ready() should flag an exit_criteria with no stop condition."""
+    from libensemble.ensemble import Ensemble
+    from libensemble.sim_funcs.simple_sim import norm_eval
+    from libensemble.specs import ExitCriteria, LibeSpecs, SimSpecs
+
+    e = Ensemble(
+        libE_specs=LibeSpecs(comms="local", nworkers=4),
+        sim_specs=SimSpecs(sim_f=norm_eval),
+        exit_criteria=ExitCriteria(),  # nothing set
+    )
+    ok, issues = e.ready()
+    assert not ok, "Should not be ready with no exit condition"
+    assert any("exit_criteria" in msg for msg in issues), f"Expected exit_criteria mention in issues: {issues}"
+
+
+def test_ready_missing_nworkers_local():
+    """ready() should flag local comms without nworkers."""
+    from libensemble.ensemble import Ensemble
+    from libensemble.sim_funcs.simple_sim import norm_eval
+    from libensemble.specs import ExitCriteria, LibeSpecs, SimSpecs
+
+    # Bypass the constructor ValueError by using mpi comms first,
+    # then patch to local after construction.
+    e = Ensemble(
+        libE_specs=LibeSpecs(comms="mpi"),
+        sim_specs=SimSpecs(sim_f=norm_eval),
+        exit_criteria=ExitCriteria(sim_max=10),
+    )
+    # Manually force comms=local and nworkers=0 on the internal specs object
+    e._libE_specs.comms = "local"
+    e._nworkers = 0
+    e._libE_specs.nworkers = 0
+
+    ok, issues = e.ready()
+    assert not ok, "Should not be ready with local comms and no nworkers"
+    assert any("nworkers" in msg for msg in issues), f"Expected nworkers mention in issues: {issues}"
+
+
+def test_ready_field_mismatch():
+    """ready() should flag when sim_specs.inputs requests fields not in gen_specs.outputs."""
+    from libensemble.ensemble import Ensemble
+    from libensemble.sim_funcs.simple_sim import norm_eval
+    from libensemble.specs import ExitCriteria, GenSpecs, LibeSpecs, SimSpecs
+
+    e = Ensemble(
+        libE_specs=LibeSpecs(comms="local", nworkers=4),
+        sim_specs=SimSpecs(sim_f=norm_eval, inputs=["x", "z"]),
+        gen_specs=GenSpecs(outputs=[("x", float, (1,))]),  # missing "z"
+        exit_criteria=ExitCriteria(sim_max=10),
+    )
+    ok, issues = e.ready()
+    assert not ok, "Should not be ready with mismatched gen/sim fields"
+    assert any("z" in msg for msg in issues), f"Expected missing field 'z' in issues: {issues}"
+
+
+def test_ready_happy_path():
+    """ready() should return (True, []) for a fully configured ensemble."""
+    from libensemble.ensemble import Ensemble
+    from libensemble.sim_funcs.simple_sim import norm_eval
+    from libensemble.specs import ExitCriteria, GenSpecs, LibeSpecs, SimSpecs
+
+    e = Ensemble(
+        libE_specs=LibeSpecs(comms="local", nworkers=4),
+        sim_specs=SimSpecs(sim_f=norm_eval, inputs=["x"], outputs=[("f", float)]),
+        gen_specs=GenSpecs(outputs=[("x", float, (1,))]),
+        exit_criteria=ExitCriteria(sim_max=10),
+    )
+    ok, issues = e.ready()
+    assert ok, f"Should be ready but got issues: {issues}"
+    assert issues == [], f"Issues should be empty but got: {issues}"
+
+
 if __name__ == "__main__":
     test_ensemble_init()
     test_ensemble_parse_args_false()
@@ -190,3 +278,8 @@ def test_local_comms_without_nworkers():
     test_ensemble_specs_update_libE_specs()
     test_ensemble_prevent_comms_overwrite()
     test_local_comms_without_nworkers()
+    test_ready_missing_sim_callable()
+    test_ready_missing_exit_criteria()
+    test_ready_missing_nworkers_local()
+    test_ready_field_mismatch()
+    test_ready_happy_path()
diff --git a/libensemble/tools/live_data/live_data.py b/libensemble/tools/live_data/live_data.py
index 88d1cebcb2..7d50d75a8b 100644
--- a/libensemble/tools/live_data/live_data.py
+++ b/libensemble/tools/live_data/live_data.py
@@ -1,8 +1,6 @@
 from abc import ABC, abstractmethod
-from typing import TYPE_CHECKING

-if TYPE_CHECKING:
-    import numpy.typing as npt
+import numpy.typing as npt


 class LiveData(ABC):
diff --git a/libensemble/tools/parse_args.py b/libensemble/tools/parse_args.py
index f89504e6ba..9f52129d9b 100644
--- a/libensemble/tools/parse_args.py
+++ b/libensemble/tools/parse_args.py
@@ -149,7 +149,7 @@ def _client_parse_args(args):
 def parse_args():
     """
-    Parses command-line arguments. Use in calling script.
+    Parses command-line arguments.

     .. code-block:: python
@@ -226,7 +226,7 @@ def parse_args():
     libE_specs: :obj:`dict`
         Settings and specifications for libEnsemble
-        :doc:`(example)`
+        :doc:`(example)`
     """
     args, misc_args = parser.parse_known_args(sys.argv[1:])
diff --git a/libensemble/tools/tools.py b/libensemble/tools/tools.py
index ea7e76a8cc..859664f5d3 100644
--- a/libensemble/tools/tools.py
+++ b/libensemble/tools/tools.py
@@ -1,5 +1,5 @@
 """
-The libEnsemble utilities module assists in writing consistent calling scripts
+The libEnsemble utilities module assists in writing consistent top-level scripts
 and user functions.
 """
diff --git a/pixi.lock b/pixi.lock
index 72332c8dda..3f15d4eec4 100644
--- a/pixi.lock
+++ b/pixi.lock
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:60267c2464fb5cbe166f6a1310fb545e8a629e7ef3bbfe14bef3be3d75a852a4
-size 1018785
+oid sha256:8c1880c602bbe256e0015f88b5384e31d4808177aac9b6179aabc26d9cf546aa
+size 1022855
diff --git a/pyproject.toml b/pyproject.toml
index 3307bdb7bd..3909b16c0b 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -134,6 +134,7 @@ ipdb = ">=0.13.13,<0.14"
 mypy = ">=1.19.1,<2"
 types-psutil = ">=6.1.0.20241221,<7"
 types-pyyaml = ">=6.0.12.20250915,<7"
+furo = ">=2025.12.19,<2026"

 [tool.pixi.tasks.build-docs]
 cmd = "cd docs && make html"
@@ -243,7 +244,7 @@ extend-exclude = ["*.bib", "*.xml", "docs/nitpicky"]
 # Initial, permissive mypy configuration for libensemble.
 # Allows incremental adoption. To be tightened in future releases.
 packages = ["libensemble.utils"]
-exclude = 'libensemble/utils/(launcher|loc_stack|runners|pydantic|output_directory)\.py$|libensemble/tests/(regression_tests|functionality_tests|unit_tests|scaling_tests)/.*'
+exclude = 'docs/conf\.py$|libensemble/utils/(launcher|loc_stack|runners|pydantic|output_directory)\.py$|libensemble/tests/(regression_tests|functionality_tests|unit_tests|scaling_tests)/.*'
 disable_error_code = ["import-not-found", "import-untyped"]
 ignore_missing_imports = true
 follow_imports = "skip"
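The new `test_ready_*` tests in this patch exercise four pre-flight validations: a sim callable is present, `exit_criteria` has at least one stop condition, `local` comms has a positive `nworkers`, and every field in `sim_specs.inputs` is produced by `gen_specs.outputs`. As a standalone illustration of that checking logic (not the actual `Ensemble.ready()` implementation; `check_ready` and its parameter names are hypothetical), a minimal sketch:

```python
# Hypothetical sketch of the validations the new ready() tests exercise.
# This is NOT libEnsemble code -- just a self-contained illustration.


def check_ready(sim_f=None, exit_criteria=None, comms="local", nworkers=0,
                sim_inputs=(), gen_outputs=()):
    """Return (ok, issues), mirroring the (ok, issues) shape in the tests above."""
    issues = []
    if sim_f is None:
        issues.append("no sim_f or simulator provided")
    if not exit_criteria:
        issues.append("exit_criteria has no stop condition set")
    if comms == "local" and not nworkers:
        issues.append("local comms requires nworkers > 0")
    # Every sim input must be produced by some gen output field
    gen_fields = {name for name, *_ in gen_outputs}
    for field in sim_inputs:
        if field not in gen_fields:
            issues.append(f"sim input '{field}' not produced by gen outputs")
    return (not issues), issues
```

For example, a fully configured call returns `(True, [])`, while requesting a sim input `"z"` that no gen output provides yields an issue naming `"z"`, matching the expectations in `test_ready_happy_path` and `test_ready_field_mismatch`.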