Trajectories

In MDAnalysis, static data is contained in your universe Topology, while dynamic data is drawn from its trajectory at Universe.trajectory. This is typically loaded from a trajectory file and includes information such as:

  • atom coordinates (Universe.atoms.positions)

  • box size (Universe.dimensions)

  • velocities and forces (if your file format contains the data) (Universe.atoms.velocities)

Although these properties look static, they are actually dynamic, and the data contained within can change. In order to remain memory-efficient, MDAnalysis does not load every frame of your trajectory into memory at once. Instead, a Universe has a state: the particular timestep that it is currently associated with in the trajectory. When the timestep changes, the data in the properties above shifts accordingly.

The typical way to change a timestep is to index it. Universe.trajectory can be thought of as a list of Timesteps, a data structure that holds information for the current time frame. For example, you can query its length.

In [1]: import MDAnalysis as mda

In [2]: from MDAnalysis.tests.datafiles import PSF, DCD
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-2f9c58525a71> in <module>
----> 1 from MDAnalysis.tests.datafiles import PSF, DCD

~/checkouts/readthedocs.org/user_builds/mdauserguide/conda/rms-1.0/lib/python3.8/site-packages/MDAnalysis/tests/datafiles.py in <module>
     41 
     42 try:
---> 43     from MDAnalysisTests.datafiles import *
     44 except ImportError:
     45     print("*** ERROR ***")

~/checkouts/readthedocs.org/user_builds/mdauserguide/conda/rms-1.0/lib/python3.8/site-packages/MDAnalysisTests/__init__.py in <module>
    124 try:
    125     import matplotlib
--> 126     matplotlib.use('agg', warn=False)
    127 except ImportError:
    128     pass

TypeError: use() got an unexpected keyword argument 'warn'

In [3]: u = mda.Universe(PSF, DCD)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-3-62bb8796b82b> in <module>
----> 1 u = mda.Universe(PSF, DCD)

NameError: name 'PSF' is not defined

In [4]: len(u.trajectory)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-4-416bf3727284> in <module>
----> 1 len(u.trajectory)

NameError: name 'u' is not defined

When a trajectory is first loaded from a file, it is set to the first frame (with index 0), by default.

In [5]: print(u.trajectory.ts, u.trajectory.time)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-5-f8a79cad543b> in <module>
----> 1 print(u.trajectory.ts, u.trajectory.time)

NameError: name 'u' is not defined

Indexing the trajectory returns the timestep for that frame, and sets the Universe to point to that frame until the timestep next changes.

In [6]: u.trajectory[3]
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-6-558df155f0bc> in <module>
----> 1 u.trajectory[3]

NameError: name 'u' is not defined
In [7]: print('Time of fourth frame', u.trajectory.time)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-7-c138aab9df5c> in <module>
----> 1 print('Time of fourth frame', u.trajectory.time)

NameError: name 'u' is not defined

Many tasks involve applying a function to each frame of a trajectory. For these, you need to iterate through the frames, even if you don’t directly use the timestep. This is because the act of iterating moves the Universe onto the next frame, changing the dynamic atom coordinates.

Trajectories can also be sliced if you only want to work on a subset of frames.

In [8]: protein = u.select_atoms('protein')
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-8-61a96949b257> in <module>
----> 1 protein = u.select_atoms('protein')

NameError: name 'u' is not defined

In [9]: for ts in u.trajectory[:20:4]:
   ...:     rad = protein.radius_of_gyration()
   ...:     print('frame={}: radgyr={}'.format(ts.frame, rad))
   ...: 
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-9-c418d7e825ae> in <module>
----> 1 for ts in u.trajectory[:20:4]:
      2     rad = protein.radius_of_gyration()
      3     print('frame={}: radgyr={}'.format(ts.frame, rad))

NameError: name 'u' is not defined

Note that after iterating over the trajectory, the frame is always set back to the first frame, even if your loop stopped before the trajectory end.

In [10]: u.trajectory.frame
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-10-c0cd53eaa961> in <module>
----> 1 u.trajectory.frame

NameError: name 'u' is not defined

Because MDAnalysis will pull trajectory data directly from the file it is reading from, changes to atom coordinates and box dimensions will not persist once the frame is changed. The only way to make these changes permanent is to load the trajectory into memory, or to write a new trajectory to file for every frame. For example, to set a cubic box size for every frame and write it out to a file:

with mda.Writer('with_box.trr', 'w', n_atoms=u.atoms.n_atoms) as w:
    for ts in u.trajectory:
        ts.dimensions = [10, 10, 10, 90, 90, 90]
        w.write(u.atoms)

u_with_box = mda.Universe(PSF, 'with_box.trr')

Sometimes you may wish to only transform part of the trajectory, or to not write a file out. In these cases, MDAnalysis supports “on-the-fly” transformations that are performed on a frame when it is read.