diff --git a/docs/guide/protocols/absolutebinding.rst b/docs/guide/protocols/absolutebinding.rst index f35f573a9..9d7580a19 100644 --- a/docs/guide/protocols/absolutebinding.rst +++ b/docs/guide/protocols/absolutebinding.rst @@ -1,3 +1,5 @@ +.. _userguide_abfe_protocol: + Absolute Binding Protocol ========================= diff --git a/docs/guide/protocols/septop.rst b/docs/guide/protocols/septop.rst index bb2e449ba..bd660953d 100644 --- a/docs/guide/protocols/septop.rst +++ b/docs/guide/protocols/septop.rst @@ -1,3 +1,5 @@ +.. _userguide_septop_protocol: + Separated Topologies Protocol ============================= diff --git a/docs/guide/setup/chemical_systems_and_thermodynamic_cycles.rst b/docs/guide/setup/chemical_systems_and_thermodynamic_cycles.rst index c4983dc01..966bf8b77 100644 --- a/docs/guide/setup/chemical_systems_and_thermodynamic_cycles.rst +++ b/docs/guide/setup/chemical_systems_and_thermodynamic_cycles.rst @@ -1,55 +1,201 @@ .. _userguide_chemicalsystems_and_components: -Chemical Systems, Components and Thermodynamic Cycles +Components, Chemical Systems and Thermodynamic Cycles ===================================================== +This page describes the core building blocks used to define simulation states in openfe: +:class:`.Component`\s, which describe what is physically present in a system; +:class:`.ChemicalSystem`\s, which combine components into a complete end state; +and thermodynamic cycles, which connect end states via alchemical transformations. + +.. _userguide_components: + +Components +---------- + +Components are the composable building blocks that define the chemical +composition of a simulated system. Splitting a system into components serves three purposes: + +1. Alchemical transformations can be easily understood by comparing the differences in components. +2. Components can be reused to compose different systems. +3. :class:`.Protocol`\s can apply component-specific behaviour, e.g. different force fields per component. + + +Component types — overview +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. list-table:: + :header-rows: 1 + :widths: 25 30 45 + + * - Component + - Role + - Key notes + * - :class:`.ProteinComponent` + - Biological assembly + - Typically the contents of a PDB file. May include crystallographic waters and ions (defined as HETATM entries), + and disulfide bonds (defined via CONECT records). + * - :class:`.SmallMoleculeComponent` + - Ligands and cofactors + - Can optionally contain atomic partial charges. If present, those will be used in the simulation. + * - :class:`.SolventComponent` + - Abstract solvent definition + - Defines solvent conditions and ion concentration. Does **not** include coordinates or box vectors. Solvent is added by the protocol at runtime. + * - :class:`.SolvatedPDBComponent` + - Explicitly solvated system + - Includes atomic coordinates and box vectors. Solvent is already present, + the protocol does not add any further solvation. + * - :class:`.ProteinMembraneComponent` + - Protein-membrane complex + - Subclass of :class:`.SolvatedPDBComponent`. Includes protein, membrane, solvent, + and box vectors. Replaces :class:`.ProteinComponent` in membrane systems. + +.. _userguide_solvation_models: + +Abstract vs explicit solvation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +These two approaches are **mutually exclusive**: + +* **Abstract solvation** — use a :class:`.SolventComponent`. The protocol adds solvent + during system preparation. +* **Explicit solvation** — use a :class:`.SolvatedPDBComponent` or + :class:`.ProteinMembraneComponent`. Solvent molecule coordinates (waters and ions) are explicitly defined in the inputs. + +Either define the solvent abstractly, or provide a fully solvated system — do not mix +both for the same leg of a transformation. + +.. note:: + + Some protocols, such as :class:`.SepTopProtocol` and :class:`.AbsoluteBindingProtocol`, + use a single :class:`.ChemicalSystem` to represent both the complex and solvent legs. + In this case, a :class:`.ChemicalSystem` may contain both a :class:`.SolventComponent` + and a :class:`.ProteinMembraneComponent`. However, these apply to *different* legs: the + :class:`.SolventComponent` is used only for the solvent leg, and the + :class:`.ProteinMembraneComponent` (which is already explicitly solvated) is used only + for the complex leg. The mutual exclusivity rule still holds per leg. + +Box vectors for explicitly solvated systems +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The components :class:`.SolvatedPDBComponent` and :class:`.ProteinMembraneComponent` +require periodic box vectors. These can be provided in three ways: + +1. **CRYST record in the PDB file** — OpenMM reads box vectors automatically. No additional arguments are needed:: + + membrane_protein = openfe.ProteinMembraneComponent.from_pdb_file('./protein_membrane.pdb') + +2. **Manual specification** — box vectors can be provided explicitly as numpy arrays with OpenFF units in OpenMM format via the ``box_vectors`` argument:: + + import numpy as np + import openff.units as offunit + + box_vectors = np.array([ + [6.9587, 0.0, 0.0], + [0.0, 5.9164, 0.0], + [0.0, 0.0, 9.2692] + ]) * offunit.nanometer + + membrane_protein = openfe.ProteinMembraneComponent.from_pdb_file( + './protein_membrane.pdb', box_vectors=box_vectors + ) + +3. **Inference from atomic coordinates** — box vectors can be estimated from the atomic + positions by passing ``infer_box_vectors=True``:: + + membrane_protein = openfe.ProteinMembraneComponent.from_pdb_file( + './protein_membrane.pdb', infer_box_vectors=True + ) + + .. warning:: + + Inferring box vectors from atomic positions can be inaccurate if the PDB originates + from a previous simulation where atoms may be distributed across periodic images. + + .. _userguide_chemical_systems: -Chemical Systems ----------------- +ChemicalSystem +-------------- -A :class:`.ChemicalSystem` represents the end state of an alchemical transformation, -which can then be input to a :class:`.Protocol`. +A :class:`.ChemicalSystem` is composed of components that together describe a model of the system to be simulated. +simulated system. It represents the **end state** of an alchemical transformation +and is the primary input a :class:`.Protocol` consumes to define a simulation state. -A :class:`.ChemicalSystem` **does** contain the following information (when present): +**What a ChemicalSystem defines** -* exact atomic information (including protonation state) of protein, ligands, co-factors, and any crystallographic - waters -* atomic positions of all explicitly defined components such as ligands or proteins -* the abstract definition of the solvation environment, if present +* Exact atomic information (including protonation state) of protein, ligands, + cofactors, and any crystallographic waters. +* Atomic positions of all explicitly defined components such as ligands or proteins. +* The abstract or explicit definition of the solvent environment (SolventComponent). -A :class:`.ChemicalSystem` does **NOT** include the following: +**What a ChemicalSystem does NOT define**, and are instead handled by the Protocol: -* forcefield applied to any component, including details on water model or virtual particles -* thermodynamic conditions (e.g. temperature or pressure) +Any simulation parameters including: +* Forcefield applied to any component, including water model or virtual particles. +* Thermodynamic conditions (e.g. temperature or pressure). +* These are handled by the :class:`.Protocol`. -.. _userguide_components: +.. _userguide_system_composition: + +System composition examples +--------------------------- + +The components that make up each :class:`.ChemicalSystem` depend on the protocol and +the nature of the system. The table below summarises the composition for each combination. -Components ----------- -A :class:`.ChemicalSystem` is composed of many 'component' objects, which together define overall system. +.. note:: -Examples of components include: + Protocol-specific behaviour: + For :class:`.SepTopProtocol` and :class:`.AbsoluteBindingProtocol`, a single + :class:`.ChemicalSystem` represents both legs of the thermodynamic cycle. The protocol + determines internally what is the complex leg and what is the solvent leg. + This differs from the :class:`.RelativeHybridTopologyProtocol`, where each leg (e.g. complex and solvent) is defined by + separate :class:`.ChemicalSystem`\s. This behaviour is expected to change in future versions. -* :class:`.ProteinComponent`: an entire biological assembly, typically the contents of a PDB file. -* :class:`.SmallMoleculeComponent`: typically ligands and cofactors -* :class:`.SolventComponent`: solvent conditions +.. list-table:: + :header-rows: 1 + :widths: 20 40 40 -Splitting the total system into components serves three purposes: + * - System + - :ref:`RBFE ` (:class:`.RelativeHybridTopologyProtocol`) + - :ref:`SepTop ` / :ref:`ABFE ` (:class:`.SepTopProtocol`, :class:`.AbsoluteBindingProtocol`) + * - **Standard protein–ligand** + - | **Complex leg:** + | :class:`.ProteinComponent` + :class:`.SmallMoleculeComponent`\s + :class:`.SolventComponent` + | + | **Solvent leg:** + | :class:`.SmallMoleculeComponent`\s + :class:`.SolventComponent` + - | **Single ChemicalSystem (both legs):** + | :class:`.ProteinComponent` + :class:`.SmallMoleculeComponent`\s + :class:`.SolventComponent` + * - **Membrane system** + - | **Complex leg:** + | :class:`.ProteinMembraneComponent` + :class:`.SmallMoleculeComponent`\s + | *(no* :class:`.SolventComponent` *— already explicitly solvated)* + | + | **Solvent leg:** + | :class:`.SmallMoleculeComponent`\s + :class:`.SolventComponent` + - | **Single ChemicalSystem (both legs):** + | :class:`.ProteinMembraneComponent` + :class:`.SmallMoleculeComponent`\s + :class:`.SolventComponent` + | *(protocol applies* :class:`.SolventComponent` *only in the solvent leg)* -1. alchemical transformations can be easily understood by comparing the differences in components. -2. components can be reused to compose different systems. -3. :class:`.Protocol`\s can have component-specific behavior. E.g. different force fields for each component. Thermodynamic Cycles -------------------- -We can now describe a thermodynamic cycle as a set of :class:`.ChemicalSystem`\s. -The exact end states to construct are detailed in the :ref:`pages for each specific Protocol `. +A thermodynamic cycle can be described as a set of :class:`.ChemicalSystem`\s (nodes) connected by +alchemical transformations (edges). The :class:`.Protocol` defines how the +:class:`.ChemicalSystem`\s map onto the cycle and how they are used in practice. +The same :class:`.ChemicalSystem` can be reused across multiple thermodynamic states +depending on the protocol. For details of which end states to construct, consult the +:ref:`pages for each specific Protocol `. -As an example, we can construct the classic relative binding free energy cycle by defining four components: two ligands, -a protein, and a solvent: +Hybrid topology RBFE example +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +As an example, the relative binding free energy cycle requires four +:class:`.ChemicalSystem`\s — one for each node in the cycle: .. figure:: ../protocols/img/rbfe_thermocycle.png :scale: 40% @@ -75,12 +221,27 @@ a protein, and a solvent: ligand_B_complex = openfe.ChemicalSystem(components={'ligand': ligand_B, 'protein': protein, 'solvent': solvent}) # ligand_A + solvent ligand_A_solvent = openfe.ChemicalSystem(components={'ligand': ligand_A, 'solvent': solvent}) - # ligand_A + solvent + # ligand_B + solvent ligand_B_solvent = openfe.ChemicalSystem(components={'ligand': ligand_B, 'solvent': solvent}) +Explicitly solvated variant +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When using a :class:`.SolvatedPDBComponent` or :class:`.ProteinMembraneComponent`, replace :class:`.ProteinComponent` +and :class:`.SolventComponent` for the complex leg. No separate :class:`.SolventComponent` +is required: + +:: + + # explicitly solvated protein-membrane complex (box vectors read from CRYST1 record) + protein_membrane = openfe.ProteinMembraneComponent.from_pdb_file('./protein_membrane.pdb') + + # ligand_A + explicitly solvated protein-membrane — no SolventComponent needed + ligand_A_complex = openfe.ChemicalSystem(components={'ligand': ligand_A, 'protein_membrane': protein_membrane}) + See Also -------- -* To see how to construct a :class:`.ChemicalSystem` \s from your files, see :ref:`the cookbook entry on loading molecules ` -* For details of what thermodynamic cycles to construct, consult the :ref:`pages for each specific Protocol ` +* To see how to construct a :class:`.ChemicalSystem` from your files, see :ref:`the cookbook entry on loading molecules ` +* For details of which thermodynamic cycles to construct, consult the :ref:`pages for each specific Protocol ` diff --git a/docs/guide/setup/defining_protocols.rst b/docs/guide/setup/defining_protocols.rst index e1d4a39b4..700f17f1f 100644 --- a/docs/guide/setup/defining_protocols.rst +++ b/docs/guide/setup/defining_protocols.rst @@ -10,6 +10,8 @@ there are multiple available ``Protocol``\s to choose from. For example, included in the ``openfe`` package are the following: * :class:`.RelativeHybridTopologyProtocol` + * :class:`.AbsoluteBindingProtocol` + * :class:`.SepTopProtocol` * :class:`.AbsoluteSolvationProtocol` * :class:`.PlainMDProtocol` @@ -47,6 +49,40 @@ For example, to customise the production run length of the RFE Protocol:: protocol = openmm_rfe.RelativeHybridTopologyProtocol(settings) +Adaptive Settings +~~~~~~~~~~~~~~~~~ + +.. warning:: + + The ``_adaptive_settings()`` method is experimental and subject to change. + +In addition to the ``.default_settings()`` method, some protocols +provide an ``_adaptive_settings`` method. This method generates recommended settings +based on properties of the input :class:`.ChemicalSystem`\s and, where required, the :class:`.AtomMapping`. + +For example:: + + from openfe.protocols import openmm_rfe + + settings = openmm_rfe.RelativeHybridTopologyProtocol._adaptive_settings( + stateA=stateA, + stateB=stateB, + mapping=mapping, + ) + + protocol = openmm_rfe.RelativeHybridTopologyProtocol(settings) + +The adaptive settings may modify parameters based on properties of the input systems. +For example (:class:`.RelativeHybridTopologyProtocol`): + +* Transformations involving a change in net charge use a larger number of lambda windows and longer production simulations. +* If both states contain a :class:`.ProteinComponent`, the solvation padding is set to 1 nm. + +Optionally, you can pass a preexisting settings object to the ``_adaptive_settings`` method via the ``initial_settings`` argument. If provided, an adapted copy of these settings will be returned instead +of using the default settings. + +In systems containing membrane-protein complexes (i.e. using a +:class:`.ProteinMembraneComponent`), adaptive settings select a membrane-appropriate barostat, the ``MonteCarloMembraneBarostat``. Creating Transformations from Protocols ----------------------------------------- diff --git a/news/user_guide_membranes.rst b/news/user_guide_membranes.rst new file mode 100644 index 000000000..0234c32eb --- /dev/null +++ b/news/user_guide_membranes.rst @@ -0,0 +1,23 @@ +**Added:** + +* + +**Changed:** + +* Updated the chemical systems user guide and the defining protocols user guide to reflect recent protocol updates, including adding membrane support. + +**Deprecated:** + +* + +**Removed:** + +* + +**Fixed:** + +* + +**Security:** + +*