Atom Specification Language (ASL)

Why an atom specification language?

The Hierarchy

entry > molecule > chain > residue > atom  

The atom specification language is made up of five classes, which follow the hierarchy above. Each is listed below with its minimum acceptable abbreviation shown outside the square brackets.

Each class is optional. If a class is absent, all entities of that type are matched.

Atom Specification

A complete specification is a class name and some property specified by a property name and property list. The syntax is:

class.property propertylist

All names of properties and characters in property lists are treated in a case insensitive manner. Wildcards are supported for atom and set names. A '*' matches zero or more characters and a '?' matches any single character. You can include comments in a specification by placing a '#' character before the text you wish to hide.
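The wildcard rules above can be illustrated with a small Python sketch. It is not the ASL implementation: the helper names asl_wildcard_to_regex and wildcard_match are hypothetical, and the sketch only assumes what the text states, namely that '*' matches zero or more characters, '?' matches exactly one character, and matching is case insensitive.

```python
import re

def asl_wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a wildcard pattern into a compiled, case-insensitive regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # zero or more characters
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # every other character is literal
    return re.compile("".join(parts), re.IGNORECASE)

def wildcard_match(pattern: str, name: str) -> bool:
    """Return True when the whole name matches the wildcard pattern."""
    return asl_wildcard_to_regex(pattern).fullmatch(name) is not None
```

With these helpers, wildcard_match("C*", "CA") is true, while wildcard_match("C?", "C") is false because '?' requires one character to be present.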

Items in a property list may be separated by commas, white space, or both. Ranges (lower-upper) may be used where appropriate. Unterminated ranges are taken to include all available numbers. For example, if there are four molecules in the system, then the specifications:

mol. 2, 3, 4
mol. >=2
mol. 2-4
mol. >1

are equivalent. In a similar manner, the following specifications are equivalent.

mol. 1, 2, 3
mol. <=3
mol. 1-3
mol. <4
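The equivalences above can be checked with a minimal Python sketch that expands a numeric property list into the set of numbers it selects. The function expand_property_list is hypothetical and handles only the part of the specification after the class and property name. It assumes that ranges are written without spaces around the hyphen, and it covers integers only.

```python
import operator
import re

COMPARISONS = {"<": operator.lt, "<=": operator.le,
               ">": operator.gt, ">=": operator.ge}

def expand_property_list(text, available):
    """Return the numbers from `available` selected by a property list."""
    if not available:
        return set()
    lowest, highest = min(available), max(available)
    # Join a comparison operator to its number so "< 4" becomes "<4".
    text = re.sub(r"(<=|>=|<|>)\s+", r"\1", text)
    selected = set()
    # Items may be separated by commas, white space, or both.
    for item in re.split(r"[,\s]+", text.strip()):
        if not item:
            continue
        comparison = re.fullmatch(r"(<=|>=|<|>)(\d+)", item)
        span = re.fullmatch(r"(\d*)-(\d*)", item)
        if comparison:
            test = COMPARISONS[comparison.group(1)]
            bound = int(comparison.group(2))
            selected |= {n for n in available if test(n, bound)}
        elif span:
            # A missing bound extends the range to all available numbers.
            lower = int(span.group(1)) if span.group(1) else lowest
            upper = int(span.group(2)) if span.group(2) else highest
            selected |= {n for n in available if lower <= n <= upper}
        else:
            number = int(item)
            if number in available:
                selected.add(number)
    return selected
```

For a system with molecules 1 to 4, the lists "2, 3, 4", ">=2", "2-4", and ">1" all expand to the same set, as do "1, 2, 3", "<=3", "1-3", and "<4".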

There are predefined ASL labels for class and property designations. The labels, which are typically just the words for the actual values they represent, can be abbreviated. For example, the ASL expression atom.ptype could be abbreviated a.pt.
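One way to picture label abbreviation is as prefix matching with a minimum length. The Python sketch below is illustrative only. The class minimums come from the bracket notation used in this document (e[ntry], m[olecule], c[hain], r[esidue], a[tom]); treating "pt" as the minimum for ptype is an assumption based on the a.pt example, and the function names are hypothetical.

```python
# Full class names mapped to their minimum abbreviations.
CLASS_LABELS = {"entry": "e", "molecule": "m", "chain": "c",
                "residue": "r", "atom": "a"}

# Assumption: "pt" is the shortest accepted form of "ptype".
ATOM_PROPERTY_LABELS = {"ptype": "pt"}

def resolve_label(abbrev, labels):
    """Return the full label for an abbreviation, or None if it is invalid."""
    text = abbrev.lower()  # labels are case insensitive
    for full, minimum in labels.items():
        # Valid when it is a prefix of the full name and at least as long
        # as the minimum abbreviation.
        if full.startswith(text) and text.startswith(minimum):
            return full
    return None

def expand_label(expr):
    """Expand an abbreviated class.property pair such as 'a.pt'."""
    cls, _, prop = expr.partition(".")
    full_cls = resolve_label(cls, CLASS_LABELS)
    full_prop = resolve_label(prop, ATOM_PROPERTY_LABELS)
    if full_cls is None or full_prop is None:
        return None
    return f"{full_cls}.{full_prop}"
```

Under these assumptions, expand_label("a.pt") gives "atom.ptype", and resolve_label("mol", CLASS_LABELS) gives "molecule", which is why mol. works as a class designation in the examples.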

The standard ASL class and property designations are listed below. All labels shown below have their minimum acceptable abbreviations shown outside the square brackets:

e[ntry]

m[olecule]

c[hain]

This class designation allows you to specify atoms using chain attributes.

r[esidue]

This ASL class designation allows you to specify atoms based on residue properties. Combine with one of the following property specifications.

a[tom]

The atom class designator allows you to specify atoms according to their characteristics. Use this designator in combination with one of the properties described below. Note that property lists containing either ptypes or numbers may be used without explicit property specification. For example, the following are valid:

atom. 1,2,3
atom. CA

and return, respectively, atoms 1, 2, and 3, and any alpha carbons.

Generalized atom properties

Some structures may have additional properties available. These are referenced directly by appending their data names to the atom. class. Each property is of integer, real, boolean, or string type, and its data name begins with i_, r_, b_, or s_, respectively. These atom properties can be used in conjunction with any other ASL expression. Atoms that do not have a given property associated with them never match.

Some examples of using the ASL to address these properties are:

atom.i_my_integer_prop 1-4
atom.b_my_boolean_prop
atom.r_my_real_prop < 4.0
atom.s_my_string_prop LIG_
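The naming convention and the "missing property never matches" rule can be sketched in Python as follows. This is a toy model, not how the ASL stores atom data: atoms are represented as plain dictionaries, and the helpers property_type and atom_matches are hypothetical.

```python
# Data-name prefixes and the value types they encode.
PREFIX_TYPES = {"i_": int, "r_": float, "b_": bool, "s_": str}

def property_type(data_name):
    """Return the Python type implied by a data name, or None if unknown."""
    return PREFIX_TYPES.get(data_name[:2])

def atom_matches(atom_props, data_name, predicate):
    """Test one atom against a property condition.

    Atoms that lack the property never match, whatever the predicate is.
    """
    if data_name not in atom_props:
        return False
    return bool(predicate(atom_props[data_name]))

# Three toy atoms; the last one carries no extra properties at all.
atoms = [
    {"i_my_integer_prop": 3, "r_my_real_prop": 2.5},
    {"i_my_integer_prop": 9, "r_my_real_prop": 7.0},
    {},
]

# Rough analogue of: atom.r_my_real_prop < 4.0
real_below_4 = [i for i, props in enumerate(atoms)
                if atom_matches(props, "r_my_real_prop", lambda v: v < 4.0)]
```

Here real_below_4 contains only the index of the first atom: the second fails the comparison and the third has no such property.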

Operators

A number of operators are supported:

Operator Priority

The order of priority of operators is (in decreasing order):

If two equal-priority operators are used in a single ASL expression, they are evaluated in the order in which they are encountered, left to right. For example, the expression:

within 5.0 mol. 1 or mol. 2

returns the set of all atoms that are within 5.0 Å of either molecule 1 or molecule 2 ('or' has higher priority). The expression:

not atom.ptype CA,C,O,N and mol. 1

returns the set of molecule 1 side chain atoms. The following expression defines all alpha carbons and atoms in hydrophobic residues of molecule 1:

atom.ptype CA or mol. 1 and not res.pol polar

Parentheses can be used to override the order of evaluation. For example, the expression

not (atom.ptype CA,C,O,N or mol. 1)

produces all atoms that are neither in the backbone nor in molecule 1.
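The effect of grouping can be worked through with ordinary set algebra. The Python sketch below uses a made-up system of seven atoms and models 'not' as a complement, 'and' as an intersection, and 'or' as a union. The atom data are invented for illustration, and the sketch assumes, as the side chain example implies, that 'not' is applied before 'and'.

```python
# Toy system: atom number -> (molecule number, PDB atom type).
atoms = {
    1: (1, "N"), 2: (1, "CA"), 3: (1, "CB"), 4: (1, "C"), 5: (1, "O"),
    6: (2, "CA"), 7: (2, "CB"),
}

universe = set(atoms)
backbone = {n for n, (_, ptype) in atoms.items()
            if ptype in {"CA", "C", "O", "N"}}
mol_1 = {n for n, (mol, _) in atoms.items() if mol == 1}

# not atom.ptype CA,C,O,N and mol. 1
# 'not' is applied first, then 'and'.
side_chain_mol_1 = (universe - backbone) & mol_1

# not (atom.ptype CA,C,O,N or mol. 1)
# The parentheses force 'or' to be evaluated before 'not'.
outside_both = universe - (backbone | mol_1)
```

In this toy system side_chain_mol_1 is {3} and outside_both is {7}, so the parenthesized form selects a different set of atoms from the ungrouped one.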

Implicit Operators

When no operator is specified, the following operations are assumed:

Creating New Sets from Existing Ones

The names of existing sets may be used in expressions if they are prefixed with the word set. For example, if two sets having the names S1 and S2 are defined as:

S1: mol. 1
S2: atom.ptype C,O,N,CA

the following would be valid atom specifications:

set S1 and set S2
set S1 or set S2
within 5.0 set S1
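As a rough illustration of how named sets combine, the Python sketch below stores each set as a collection of atom numbers and evaluates the three specifications above. The coordinates and set contents are invented, the within helper is hypothetical, and two details are assumptions rather than documented behavior: that the distance test is inclusive, and that the reference atoms themselves are part of the result.

```python
import math

# Invented coordinates in angstroms: atom number -> (x, y, z).
coords = {
    1: (0.0, 0.0, 0.0),
    2: (3.0, 0.0, 0.0),
    3: (6.0, 0.0, 0.0),
    4: (12.0, 0.0, 0.0),
}

# Named sets stored as sets of atom numbers.
named_sets = {"S1": {1, 2}, "S2": {2, 3}}

def within(distance, reference, coords):
    """Atoms lying within `distance` of any atom in `reference`.

    Assumption: the test is inclusive, so reference atoms (distance 0)
    are included in the result.
    """
    return {atom for atom, xyz in coords.items()
            if any(math.dist(xyz, coords[ref]) <= distance
                   for ref in reference)}

s1, s2 = named_sets["S1"], named_sets["S2"]
both = s1 & s2                        # set S1 and set S2
either = s1 | s2                      # set S1 or set S2
near_s1 = within(5.0, s1, coords)     # within 5.0 set S1
```

With these invented values, both is {2}, either is {1, 2, 3}, and near_s1 is {1, 2, 3}, because atom 3 lies 3.0 Å from atom 2 while atom 4 is farther than 5.0 Å from every atom in S1.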

Matching of PDB Atom Names

The following strategy is used to match atoms specified by PDB atom names. Before matching an unquoted name that begins with a non-numeric character, a blank character is inserted in front of it, and it is padded with blanks on the right so that there is a total of four characters.

Examples:

Initial blank characters are not added to unquoted names that begin with numbers. However, right-padding characters are added so that there is a total of four characters.

Example:

Names with either double or single quotes are treated as is, except that they are right-padded if necessary with blanks so that each name has a total of four characters.

Examples:

The difference in the first example is significant: the first matches an alpha carbon, the second matches a calcium.
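The padding rules described in this section can be summarized in a short Python sketch. The function normalize_pdb_name is hypothetical and only mirrors the three rules stated above. Names that would end up longer than four characters are returned without truncation, since that case is not covered by the text.

```python
def normalize_pdb_name(name: str) -> str:
    """Pad a PDB atom name to four characters following the rules above."""
    quoted = (len(name) >= 2 and name[0] == name[-1]
              and name[0] in ("'", '"'))
    if quoted:
        # Quoted names are taken as is (without the quotes).
        body = name[1:-1]
    elif name[:1].isdigit():
        # Unquoted names starting with a number get no leading blank.
        body = name
    else:
        # Other unquoted names get one leading blank.
        body = " " + name
    # Right-pad with blanks to a total of four characters.
    return body.ljust(4)
```

Under these rules, the unquoted name CA becomes " CA " (the form that matches an alpha carbon), while the quoted name "CA" becomes "CA  " (the form that matches a calcium).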

Miscellaneous

Useful Hints when using ASL with the Project Facility

The order in which entries are included into and excluded from the Workspace affects the molecule numbers. For example, if you have two entries called "A" and "B" and you include into an empty Workspace first A and then B, the molecule numbers will be 1 for A and 2 for B; however, if you first include B and then A, the molecule numbers will be 1 for B and 2 for A.

This means that an expression such as mol. 1 matches different atoms in each of the above cases. In most cases it makes more sense to use entry ids or entry names.

For example, if you have an inhibitor and a receptor that are in different entries and wish to have a ribbon appear on only the receptor, use the entry name in the ASL expression, not the molecule number. This ensures that when the receptor is included, it, and only it, is used to generate the ribbons. Different inclusion orders of entries in the Workspace then result in the same matching atoms. So for ribbons with a receptor called receptor, it is more useful to use entry.name receptor as the ASL definition.

ASL Examples

This section gives some examples of the use of the ASL in real-life situations. Note that while these examples all use lower-case, the ASL expressions themselves are not case sensitive.


File: misc/asl.html
Last updated: 20 Jun 2014