Trajectories
In MDAnalysis, static data is contained in your Universe's topology, while dynamic data is drawn from its trajectory at Universe.trajectory. This is typically loaded from a trajectory file and includes information such as:
- atom coordinates (Universe.atoms.positions)
- box size (Universe.dimensions)
- velocities and forces, if your file format contains the data (Universe.atoms.velocities, Universe.atoms.forces)
Although these properties look static, they are actually dynamic, and the data contained within can change. In order to remain memory-efficient, MDAnalysis does not load every frame of your trajectory into memory at once. Instead, a Universe has a state: the particular timestep that it is currently associated with in the trajectory. When the timestep changes, the data in the properties above shifts accordingly.
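One way to picture this state is a reader that keeps a single buffer for the current frame and overwrites it whenever the frame changes. The sketch below is a toy model in plain Python, not MDAnalysis code: the ToyReader class, its attributes, and the coordinate values are all hypothetical and only illustrate the idea.

```python
class ToyReader:
    """Toy stand-in for a trajectory reader (hypothetical, not MDAnalysis code)."""

    def __init__(self, frames_on_disk):
        # Pretend this list is a file: frames are only copied out on demand.
        self._frames_on_disk = frames_on_disk
        self.frame = 0
        # Single buffer holding the data of the current frame.
        self.positions = list(frames_on_disk[0])

    def __len__(self):
        return len(self._frames_on_disk)

    def __getitem__(self, index):
        # Moving to another frame overwrites the shared buffer.
        self.frame = index
        self.positions = list(self._frames_on_disk[index])
        return self


# Three frames of x-coordinates for two atoms (made-up numbers).
reader = ToyReader([[0.0, 1.0], [0.5, 1.5], [1.0, 2.0]])

start = list(reader.positions)   # data for frame 0
reader[2]                        # move the state to frame 2
moved = list(reader.positions)   # same attribute, different data

reader.positions[0] = 99.0       # edit the buffer while on frame 2 ...
reader[0]
reader[2]                        # ... then leave the frame and come back
reread = list(reader.positions)  # the edit is gone: frame 2 was re-read

print(start, moved, reread)
```

The same attribute returns different data depending on the current frame, and an edit to the buffer is lost as soon as the frame is read again.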
The typical way to change the timestep is to index the trajectory. Universe.trajectory can be thought of as a list of Timesteps, a data structure that holds information for the current time frame. For example, you can query its length.
In [1]: import MDAnalysis as mda
In [2]: from MDAnalysis.tests.datafiles import PSF, DCD
In [3]: u = mda.Universe(PSF, DCD)
In [4]: len(u.trajectory)
When a trajectory is first loaded from a file, it is set to the first frame (with index 0), by default.
In [5]: print(u.trajectory.ts, u.trajectory.time)
Indexing the trajectory returns the timestep for that frame, and sets the Universe to point to that frame until the timestep next changes.
In [6]: u.trajectory[3]
In [7]: print('Time of fourth frame', u.trajectory.time)
Many tasks involve applying a function to each frame of a trajectory. For these, you need to iterate through the frames, even if you don’t directly use the timestep. This is because the act of iterating moves the Universe onto the next frame, changing the dynamic atom coordinates.
Trajectories can also be sliced if you only want to work on a subset of frames.
In [8]: protein = u.select_atoms('protein')
In [9]: for ts in u.trajectory[:20:4]:
   ...:     rad = protein.radius_of_gyration()
   ...:     print('frame={}: radgyr={}'.format(ts.frame, rad))
   ...:
Note that after iterating over the trajectory, the frame is set back to the first frame, even if the slice you iterated over stopped before the end of the trajectory.
In [10]: u.trajectory.frame
Because MDAnalysis pulls trajectory data directly from the file it is reading, changes to atom coordinates and box dimensions do not persist once the frame changes. To make such changes permanent, either load the trajectory into memory or write each modified frame out to a new trajectory file. For example, to set a cubic box size for every frame and write it out to a file:
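# open a TRR writer sized for all atoms in the Universe
with mda.Writer('with_box.trr', n_atoms=u.atoms.n_atoms) as w:
    for ts in u.trajectory:
        # box as [lx, ly, lz, alpha, beta, gamma]
        ts.dimensions = [10, 10, 10, 90, 90, 90]
        w.write(u.atoms)
# reload the written trajectory with the original topology
u_with_box = mda.Universe(PSF, 'with_box.trr')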
Sometimes you may wish to only transform part of the trajectory, or to not write a file out. In these cases, MDAnalysis supports “on-the-fly” transformations that are performed on a frame when it is read.