On-the-fly transformations

An on-the-fly transformation is a function that automatically modifies the dynamic data contained in a trajectory Timestep (typically coordinates) as each frame is loaded into memory. It is called on every time step when that step becomes the current one, converting the data into your desired representation. A transformation function must also return the current Timestep, because transformations are often chained together.
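
To make the loading behaviour concrete, the following sketch uses FakeTimestep and FakeReader, two hypothetical stand-ins written for this illustration (they are not MDAnalysis classes), with positions held as plain Python lists so that it runs without any trajectory files. It assumes only what is described above: the chain runs in order each time a frame is loaded, and each function returns the Timestep.

```python
import copy


class FakeTimestep:
    """Hypothetical stand-in for a Timestep (not the MDAnalysis class)."""

    def __init__(self, frame, positions):
        self.frame = frame
        self.positions = positions  # list of [x, y, z] rows


class FakeReader:
    """Hypothetical reader that applies transformations whenever a frame is loaded."""

    def __init__(self, frames, transformations=()):
        self._frames = frames  # stored data, never modified by this reader
        self._transformations = tuple(transformations)

    def __getitem__(self, index):
        # Load a fresh copy of the stored frame, then run the chain in order.
        ts = FakeTimestep(index, copy.deepcopy(self._frames[index]))
        for transform in self._transformations:
            ts = transform(ts)  # each function must hand the Timestep back
        return ts


def shift_z(ts):
    """Move every atom by 1.0 along z and return the Timestep."""
    for row in ts.positions:
        row[2] += 1.0
    return ts


reader = FakeReader([[[0.0, 0.0, 0.0]], [[1.0, 1.0, 1.0]]], [shift_z])
first_read = reader[0].positions   # [[0.0, 0.0, 1.0]]
second_read = reader[0].positions  # [[0.0, 0.0, 1.0]] again, not 2.0
```

In this sketch the chain always runs on a freshly loaded copy, so reading the same frame twice gives the same transformed coordinates instead of an accumulated shift, and the stored data stay untouched.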

The MDAnalysis.transformations module contains a collection of ready-made transformations. For example, fit_rot_trans() can perform a mass-weighted alignment of an AtomGroup to a reference.

In [1]: import MDAnalysis as mda

In [2]: from MDAnalysis.tests.datafiles import TPR, XTC

In [3]: from MDAnalysis import transformations as trans

In [4]: u = mda.Universe(TPR, XTC)

In [5]: protein = u.select_atoms('protein')

In [6]: align_transform = trans.fit_rot_trans(protein, protein, weights='mass')

In [7]: u.trajectory.add_transformations(align_transform)

Other implemented transformations include functions to translate, rotate, fit an AtomGroup to a reference, and wrap or unwrap groups in the unit cell.

Although you can only call add_transformations() once per trajectory, you can pass it multiple transformations, which are executed in the order given. Collecting them in a list and unpacking it with * is a convenient way to do this. For example, the workflow below:

  • makes all molecules whole (unwraps them over periodic boundary conditions)

  • centers the protein in the box

  • wraps water back into the box

# create new Universe for new transformations
In [8]: u = mda.Universe(TPR, XTC)

In [9]: protein = u.select_atoms('protein')

In [10]: water = u.select_atoms('resname SOL')

In [11]: workflow = [trans.unwrap(u.atoms),
   ....:             trans.center_in_box(protein, center='geometry'),
   ....:             trans.wrap(water, compound='residues')]
   ....: 

In [12]: u.trajectory.add_transformations(*workflow)
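
Because the transformations run in the order given, the order of the workflow affects the result. The one-dimensional sketch below illustrates this with two hypothetical helpers, shift() and wrap(), acting on a hypothetical FakeTimestep stand-in. None of these are MDAnalysis functions, and apply_all() only mimics the in-order execution described above.

```python
class FakeTimestep:
    """Hypothetical stand-in for a Timestep holding x coordinates only."""

    def __init__(self, xs):
        self.xs = list(xs)


def shift(dx):
    """Return a transformation that translates every coordinate by dx."""
    def wrapped(ts):
        ts.xs = [x + dx for x in ts.xs]
        return ts
    return wrapped


def wrap(box_length):
    """Return a transformation that wraps coordinates into [0, box_length)."""
    def wrapped(ts):
        ts.xs = [x % box_length for x in ts.xs]
        return ts
    return wrapped


def apply_all(ts, workflow):
    """Apply each transformation in list order."""
    for transform in workflow:
        ts = transform(ts)
    return ts


# Same two steps, opposite order, starting from x = 9.0 in a box of length 10.
shift_then_wrap = apply_all(FakeTimestep([9.0]), [shift(2.0), wrap(10.0)]).xs
wrap_then_shift = apply_all(FakeTimestep([9.0]), [wrap(10.0), shift(2.0)]).xs
# shift_then_wrap is [1.0]; wrap_then_shift is [11.0], which lies outside the box
```

This is why the workflow above unwraps first, centers second, and wraps last: a wrapping step placed before the centering step would not guarantee that the final coordinates lie inside the box.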

If your transformation does not depend on something within the Universe (e.g. a chosen AtomGroup), you can also pass transformations directly when creating the Universe. The code below translates all coordinates by 1 angstrom in the positive z direction:

In [13]: u = mda.Universe(TPR, XTC, transformations=[trans.translate([0, 0, 1])])

If you need a different transformation, it is easy to implement your own.

Custom transformations

At its core, a transformation function takes a Timestep as its only input and returns that Timestep as its output.

In [14]: import numpy as np
   ....: 
   ....: def up_by_2(ts):
   ....:     """Translates atoms up by 2 angstrom"""
   ....:     ts.positions += np.array([0.0, 0.0, 2.0])
   ....:     return ts
   ....: 

In [15]: u = mda.Universe(TPR, XTC, transformations=[up_by_2])

If your transformation needs other arguments, define a wrapper function that accepts those arguments and returns the core transformation, which itself still takes only the Timestep.

In [16]: def up_by_x(x):
   ....:     """Translates atoms up by x angstrom"""
   ....:     def wrapped(ts):
   ....:         """Handles the actual Timestep"""
   ....:         ts.positions += np.array([0.0, 0.0, float(x)])
   ....:         return ts
   ....:     return wrapped
   ....: 

# load Universe with transformations that move it up by 7 angstrom
In [17]: u = mda.Universe(TPR, XTC, transformations=[up_by_x(5), up_by_x(2)])

Alternatively, you can use functools.partial() to bind the other arguments in advance, leaving the Timestep as the only remaining argument.

In [18]: import functools

In [19]: def up_by_x(ts, x):
   ....:     """Translates atoms up by x angstrom"""
   ....:     ts.positions += np.array([0.0, 0.0, float(x)])
   ....:     return ts
   ....: 

In [20]: up_by_5 = functools.partial(up_by_x, x=5)

In [21]: u = mda.Universe(TPR, XTC, transformations=[up_by_5])
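
The wrapper-function style and the functools.partial() style should be interchangeable. As a rough check, the sketch below applies both to a hypothetical FakeTimestep stand-in (not the MDAnalysis class) that stores positions as plain lists. The name up_by_x_closure is introduced here only to keep the two variants apart.

```python
import functools


class FakeTimestep:
    """Hypothetical stand-in for a Timestep (not the MDAnalysis class)."""

    def __init__(self, positions):
        self.positions = positions  # list of [x, y, z] rows


def up_by_x_closure(x):
    """Wrapper style: returns a function that takes only the Timestep."""
    def wrapped(ts):
        for row in ts.positions:
            row[2] += float(x)
        return ts
    return wrapped


def up_by_x(ts, x):
    """Plain style: x is bound later with functools.partial."""
    for row in ts.positions:
        row[2] += float(x)
    return ts


via_closure = up_by_x_closure(5)(FakeTimestep([[0.0, 0.0, 1.0]]))

up_by_5 = functools.partial(up_by_x, x=5)
via_partial = up_by_5(FakeTimestep([[0.0, 0.0, 1.0]]))

# A partial object records what was bound, which helps when inspecting a workflow.
bound_arguments = up_by_5.keywords  # {'x': 5}
```

Both calls leave the z coordinate at 6.0. One practical difference is that a partial object exposes its func and keywords attributes, whereas a closure hides the captured value.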

On-the-fly transformation functions can be applied to any property of a Timestep, not just the atom positions. For example, to give each frame of a trajectory a box:

In [22]: def set_box(ts):
   ....:     ts.dimensions = [10, 20, 30, 90, 90, 90]
   ....:     return ts
   ....: 

In [23]: u = mda.Universe(TPR, XTC, transformations=[set_box])
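
The box can be made configurable with the same wrapper pattern shown earlier. The sketch below is one possible variant, not a function provided by the library. The name set_box_to and the FakeTimestep stand-in are hypothetical, and the six values are assumed to follow the [lx, ly, lz, alpha, beta, gamma] layout (three lengths, then three angles in degrees) used in the example above.

```python
class FakeTimestep:
    """Hypothetical stand-in for a Timestep (not the MDAnalysis class)."""

    def __init__(self):
        self.dimensions = None


def set_box_to(dimensions):
    """Return a transformation that assigns the same box to every frame.

    dimensions: six numbers, [lx, ly, lz, alpha, beta, gamma].
    """
    box = [float(value) for value in dimensions]
    if len(box) != 6:
        raise ValueError("expected [lx, ly, lz, alpha, beta, gamma]")

    def wrapped(ts):
        ts.dimensions = list(box)  # copy so frames do not share one list
        return ts

    return wrapped


ts = set_box_to([10, 20, 30, 90, 90, 90])(FakeTimestep())
```

Validating the input once in the wrapper, rather than on every frame, means a malformed box is reported when the workflow is built instead of partway through the trajectory.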