Pybel API

pybel - A Python module that simplifies access to the Open Babel API

Global variables:
informats, outformats, descs, fps, forcefields, operations
Functions:
readfile(), readstring()
Classes:
Atom, Molecule, Outputfile, Fingerprint, Smarts, MoleculeData

Note

The openbabel.py module can be accessed through the ob global variable.

class pybel.Atom(OBAtom)

Represent an atom.

Required parameter:
OBAtom – an Open Babel :obapi:`OBAtom`
Attributes:
atomicmass, atomicnum, coords, exactmass, formalcharge, heavyvalence, heterovalence, hyb, idx, implicitvalence, isotope, partialcharge, spin, type, valence, vector.

The underlying Open Babel :obapi:`OBAtom` can be accessed using the attribute:

OBAtom
atomicmass

Atomic mass

atomicnum

Atomic number

coords

Coordinates of the atom

exactmass

Exact mass

formalcharge

Formal charge

heavyvalence

Number of non-hydrogen atoms attached

heterovalence

Number of heteroatoms attached

hyb

The hybridization of this atom: 1 for sp, 2 for sp2, 3 for sp3, …

For further details see :obapi:`OBAtom::GetHyb() <OpenBabel::OBAtom::GetHyb>`

idx

The index of the atom in the molecule (starts at 1)

implicitvalence

The maximum number of connections expected for this molecule

isotope

The isotope for this atom if specified; 0 otherwise.

partialcharge

Partial charge

spin

Spin multiplicity

type

Atom type

valence

Number of explicit connections

vector

Coordinates as a :obapi:`vector3` object.

class pybel.Fingerprint(fingerprint)

A molecular fingerprint.

Required parameters:
fingerprint – a vector calculated by :obapi:`OBFingerprint::FindFingerprint() <OpenBabel::OBFingerprint::FindFingerprint>`
Attributes:
bits
Methods:

The | operator can be used to calculate the Tanimoto coefficient. For example, given two Fingerprints a and b, the Tanimoto coefficient is given by:

tanimoto = a | b

The underlying fingerprint object can be accessed using the attribute fp.

bits

A list of bits set in the fingerprint

class pybel.Molecule(OBMol)

Represent a Pybel Molecule.

Required parameter:
OBMol – an Open Babel :obapi:`OBMol` or any type of Cinfony Molecule
Attributes:
atoms, charge, conformers, data, dim, energy, exactmass, formula, molwt, spin, sssr, title, unitcell.
Methods:
addh(), calcfp(), calcdesc(), draw(), localopt(), make3D(), removeh(), write()
The underlying Open Babel :obapi:`OBMol` can be accessed using the attribute:
OBMol

An iterator (__iter__()) is provided that iterates over the atoms of the molecule. This allows constructions such as the following:

for atom in mymol:
    print(atom)    
addh()

Add hydrogens.

atoms

A list of atoms of the molecule

calcdesc(descnames=[])

Calculate descriptor values.

Optional parameter:
descnames – a list of names of descriptors
See the descs variable for a list of available descriptors.

If descnames is not specified, all available descriptors are calculated.

calcfp(fptype='FP2')

Calculate a molecular fingerprint.

Optional parameters:
fptype – the fingerprint type (default is FP2).
See the fps variable for a list of of available fingerprint types.
charge

The charge on the molecule

conformers

Conformers of the molecule

data

Access the molecule’s data through a dictionary-like object, MoleculeData.

dim

Are the coordinates 2D, 3D or 0D?

draw(show=True, filename=None, update=False, usecoords=False)

Create a 2D depiction of the molecule.

Optional parameters:

show – display on screen (default is True)

filename – write to file (default is None)

update – update the coordinates of the atoms
This sets the atom coordinates to those determined by the structure diagram generator (default is False)
usecoords – use the current coordinates
This causes the current coordinates to be used instead of calculating new 2D coordinates (default is False)

OASA is used for 2D coordinate generation and depiction. Tkinter and Python Imaging Library are required for image display.

energy

The molecule’s energy

exactmass

The exact mass

formula

The molecular formula

localopt(forcefield='mmff94', steps=500)

Locally optimize the coordinates.

Optional parameters:
forcefield – default is mmff94.
See the forcefields variable for a list of available forcefields.

steps – default is 500

If the molecule does not have any coordinates, make3D() is called before the optimization. Note that the molecule needs to have explicit hydrogens. If not, call addh().

make3D(forcefield='mmff94', steps=50)

Generate 3D coordinates.

Optional parameters:
forcefield – default is mmff94.
See the forcefields variable for a list of available forcefields.

steps – default is 50

Once coordinates are generated, hydrogens are added and a quick local optimization is carried out with 50 steps and the MMFF94 forcefield. Call localopt() if you want to improve the coordinates further.

molwt

The molecular weight

removeh()

Remove hydrogens.

spin

The spin multiplicity

sssr

The Smallest Set of Smallest Rings (SSSR)

title

The molecule title

unitcell

Access any unit cell data

write(format='smi', filename=None, overwrite=False)

Write the molecule to a file or return a string.

Optional parameters:
format – chemical file format
See the outformats variable for a list of available output formats (default is smi)

filename – default is None

overwrite – overwrite the output file if it already exists?
Default is False.

If a filename is specified, the result is written to a file. Otherwise, a string is returned containing the result.

To write multiple molecules to the same file you should use the Outputfile class.

class pybel.MoleculeData(obmol)

Store molecule data in a dictionary-type object

Required parameters:
obmol – an Open Babel :obapi:`OBMol`

Methods and accessor methods are like those of a dictionary except that the data is retrieved on-the-fly from the underlying :obapi:`OBMol`.

Example:

>>> mol = readfile("sdf", 'head.sdf').next() # Python 2
>>> # mol = next(readfile("sdf", 'head.sdf')) # Python 3
>>> data = mol.data
>>> print(data)
{'Comment': 'CORINA 2.61 0041  25.10.2001', 'NSC': '1'}
>>> print(len(data), data.keys(), data.has_key("NSC"))
2 ['Comment', 'NSC'] True
>>> print(data['Comment'])
CORINA 2.61 0041  25.10.2001
>>> data['Comment'] = 'This is a new comment'
>>> for k,v in data.items():
...    print(k, "-->", v)
Comment --> This is a new comment
NSC --> 1
>>> del data['NSC']
>>> print(len(data), data.keys(), data.has_key("NSC"))
1 ['Comment'] False
clear()
has_key(key)
items()
iteritems()
keys()
update(dictionary)
values()
class pybel.Outputfile(format, filename, overwrite=False, opt=None)

Represent a file to which output is to be sent.

Although it’s possible to write a single molecule to a file by calling the write() method of a Molecule, if multiple molecules are to be written to the same file you should use the Outputfile class.

Required parameters:
format – chemical file format
See the outformats variable for a list of available output formats

filename

Optional parameters:
overwrite – overwrite the output file if it already exists?
Default is False
opt – a dictionary of format-specific options
For format options with no parameters, specify the value as None.
Methods:
write(), close()
close()

Close the output file to further writing.

write(molecule)

Write a molecule to the output file.

Required parameters:
molecule – A Molecule
class pybel.Smarts(smartspattern)

A Smarts Pattern Matcher

Required parameters:
smartspattern - A SMARTS pattern
Methods:
findall()

Example:

>>> mol = readstring("smi","CCN(CC)CC") # triethylamine
>>> smarts = Smarts("[#6][#6]") # Matches an ethyl group
>>> print(smarts.findall(mol)) 
[(1, 2), (4, 5), (6, 7)]

The numbers returned are the indices (starting from 1) of the atoms that match the SMARTS pattern. In this case, there are three matches for each of the three ethyl groups in the molecule.

findall(molecule)

Find all matches of the SMARTS pattern to a particular molecule.

Required parameters:
molecule - A Molecule
pybel.descs = ['LogP', 'MR', 'TPSA']

A list of supported descriptors

pybel.forcefields = ['uff', 'mmff94', 'ghemical']

A list of supported forcefields

pybel.fps = ['FP2', 'FP3', 'FP4']

A list of supported fingerprint types

pybel.informats = {}

A dictionary of supported input formats

pybel.operations = ['Gen3D']

A list of supported operations

pybel.outformats = {}

A dictionary of supported output formats

pybel.readfile(format, filename, opt=None)

Iterate over the molecules in a file.

Required parameters:
format – chemical file format
See the informats variable for a list of available input formats

filename

Optional parameters:
opt – a dictionary of format-specific options
For format options with no parameters, specify the value as None.

You can access the first molecule in a file using the next() method of the iterator (or the next() keyword in Python 3):

mol = readfile("smi", "myfile.smi").next() # Python 2
mol = next(readfile("smi", "myfile.smi"))  # Python 3

You can make a list of the molecules in a file using:

mols = list(readfile("smi", "myfile.smi"))

You can iterate over the molecules in a file as shown in the following code snippet:

>>> atomtotal = 0
>>> for mol in readfile("sdf", "head.sdf"):
...     atomtotal += len(mol.atoms)
...
>>> print(atomtotal)
43
pybel.readstring(format, string, opt=None)

Read in a molecule from a string.

Required parameters:
format – chemical file format
See the informats variable for a list of available input formats

string

Optional parameters:
opt – a dictionary of format-specific options
For format options with no parameters, specify the value as None.

Example:

>>> input = "C1=CC=CS1"
>>> mymol = readstring("smi", input)
>>> len(mymol.atoms)
5