Pybel API¶
pybel - A Python module that simplifies access to the Open Babel API
- Global variables:
informats
,outformats
,descs
,fps
,forcefields
,operations
- Functions:
readfile()
,readstring()
- Classes:
Atom
,Molecule
,Outputfile
,Fingerprint
,Smarts
,MoleculeData
Note
The openbabel.py
module can be accessed through the ob
global
variable.
-
class
pybel.
Atom
(OBAtom)¶ Represent an atom.
- Required parameter:
- OBAtom – an Open Babel :obapi:`OBAtom`
- Attributes:
atomicmass
,atomicnum
,coords
,exactmass
,formalcharge
,heavyvalence
,heterovalence
,hyb
,idx
,implicitvalence
,isotope
,partialcharge
,spin
,type
,valence
,vector
.
The underlying Open Babel :obapi:`OBAtom` can be accessed using the attribute:
OBAtom
-
atomicmass
¶ Atomic mass
-
atomicnum
¶ Atomic number
-
coords
¶ Coordinates of the atom
-
exactmass
¶ Exact mass
-
formalcharge
¶ Formal charge
-
heavyvalence
¶ Number of non-hydrogen atoms attached
-
heterovalence
¶ Number of heteroatoms attached
-
hyb
¶ The hybridization of this atom: 1 for sp, 2 for sp2, 3 for sp3, …
For further details see :obapi:`OBAtom::GetHyb() <OpenBabel::OBAtom::GetHyb>`
-
idx
¶ The index of the atom in the molecule (starts at 1)
-
implicitvalence
¶ The maximum number of connections expected for this molecule
-
isotope
¶ The isotope for this atom if specified; 0 otherwise.
-
partialcharge
¶ Partial charge
-
spin
¶ Spin multiplicity
-
type
¶ Atom type
-
valence
¶ Number of explicit connections
-
vector
¶ Coordinates as a :obapi:`vector3` object.
-
class
pybel.
Fingerprint
(fingerprint)¶ A molecular fingerprint.
- Required parameters:
- fingerprint – a vector calculated by :obapi:`OBFingerprint::FindFingerprint() <OpenBabel::OBFingerprint::FindFingerprint>`
- Attributes:
bits
- Methods:
The
|
operator can be used to calculate the Tanimoto coefficient. For example, given two Fingerprints a and b, the Tanimoto coefficient is given by:tanimoto = a | b
The underlying fingerprint object can be accessed using the attribute
fp
.-
bits
¶ A list of bits set in the fingerprint
-
class
pybel.
Molecule
(OBMol)¶ Represent a Pybel Molecule.
- Required parameter:
- OBMol – an Open Babel :obapi:`OBMol` or any type of Cinfony Molecule
- Attributes:
atoms
,charge
,conformers
,data
,dim
,energy
,exactmass
,formula
,molwt
,spin
,sssr
,title
,unitcell
.- Methods:
addh()
,calcfp()
,calcdesc()
,draw()
,localopt()
,make3D()
,removeh()
,write()
- The underlying Open Babel :obapi:`OBMol` can be accessed using the attribute:
OBMol
An iterator (
__iter__()
) is provided that iterates over the atoms of the molecule. This allows constructions such as the following:for atom in mymol: print(atom)
-
addh
()¶ Add hydrogens.
-
atoms
¶ A list of atoms of the molecule
-
calcdesc
(descnames=[])¶ Calculate descriptor values.
- Optional parameter:
- descnames – a list of names of descriptors
- See the
descs
variable for a list of available descriptors.
If descnames is not specified, all available descriptors are calculated.
-
calcfp
(fptype='FP2')¶ Calculate a molecular fingerprint.
- Optional parameters:
- fptype – the fingerprint type (default is
FP2
). - See the
fps
variable for a list of of available fingerprint types.
- fptype – the fingerprint type (default is
-
charge
¶ The charge on the molecule
-
conformers
¶ Conformers of the molecule
-
data
¶ Access the molecule’s data through a dictionary-like object,
MoleculeData
.
-
dim
¶ Are the coordinates 2D, 3D or 0D?
-
draw
(show=True, filename=None, update=False, usecoords=False)¶ Create a 2D depiction of the molecule.
Optional parameters:
show – display on screen (default is
True
)filename – write to file (default is
None
)- update – update the coordinates of the atoms
- This sets the atom coordinates to those
determined by the structure diagram generator
(default is
False
) - usecoords – use the current coordinates
- This causes the current coordinates to be used
instead of calculating new 2D coordinates
(default is
False
)
OASA is used for 2D coordinate generation and depiction. Tkinter and Python Imaging Library are required for image display.
-
energy
¶ The molecule’s energy
-
exactmass
¶ The exact mass
-
formula
¶ The molecular formula
-
localopt
(forcefield='mmff94', steps=500)¶ Locally optimize the coordinates.
- Optional parameters:
- forcefield – default is
mmff94
. - See the
forcefields
variable for a list of available forcefields.
steps – default is 500
- forcefield – default is
If the molecule does not have any coordinates,
make3D()
is called before the optimization. Note that the molecule needs to have explicit hydrogens. If not, calladdh()
.
-
make3D
(forcefield='mmff94', steps=50)¶ Generate 3D coordinates.
- Optional parameters:
- forcefield – default is
mmff94
. - See the
forcefields
variable for a list of available forcefields.
steps – default is
50
- forcefield – default is
Once coordinates are generated, hydrogens are added and a quick local optimization is carried out with 50 steps and the MMFF94 forcefield. Call
localopt()
if you want to improve the coordinates further.
-
molwt
¶ The molecular weight
-
removeh
()¶ Remove hydrogens.
-
spin
¶ The spin multiplicity
-
sssr
¶ The Smallest Set of Smallest Rings (SSSR)
-
title
¶ The molecule title
-
unitcell
¶ Access any unit cell data
-
write
(format='smi', filename=None, overwrite=False)¶ Write the molecule to a file or return a string.
- Optional parameters:
- format – chemical file format
- See the
outformats
variable for a list of available output formats (default issmi
)
filename – default is
None
- overwrite – overwrite the output file if it already exists?
- Default is
False
.
If a filename is specified, the result is written to a file. Otherwise, a string is returned containing the result.
To write multiple molecules to the same file you should use the
Outputfile
class.
-
class
pybel.
MoleculeData
(obmol)¶ Store molecule data in a dictionary-type object
- Required parameters:
- obmol – an Open Babel :obapi:`OBMol`
Methods and accessor methods are like those of a dictionary except that the data is retrieved on-the-fly from the underlying :obapi:`OBMol`.
Example:
>>> mol = readfile("sdf", 'head.sdf').next() # Python 2 >>> # mol = next(readfile("sdf", 'head.sdf')) # Python 3 >>> data = mol.data >>> print(data) {'Comment': 'CORINA 2.61 0041 25.10.2001', 'NSC': '1'} >>> print(len(data), data.keys(), data.has_key("NSC")) 2 ['Comment', 'NSC'] True >>> print(data['Comment']) CORINA 2.61 0041 25.10.2001 >>> data['Comment'] = 'This is a new comment' >>> for k,v in data.items(): ... print(k, "-->", v) Comment --> This is a new comment NSC --> 1 >>> del data['NSC'] >>> print(len(data), data.keys(), data.has_key("NSC")) 1 ['Comment'] False
-
clear
()¶
-
has_key
(key)¶
-
items
()¶
-
iteritems
()¶
-
keys
()¶
-
update
(dictionary)¶
-
values
()¶
-
class
pybel.
Outputfile
(format, filename, overwrite=False, opt=None)¶ Represent a file to which output is to be sent.
Although it’s possible to write a single molecule to a file by calling the
write()
method of aMolecule
, if multiple molecules are to be written to the same file you should use theOutputfile
class.- Required parameters:
- format – chemical file format
- See the
outformats
variable for a list of available output formats
filename
- Optional parameters:
- overwrite – overwrite the output file if it already exists?
- Default is
False
- opt – a dictionary of format-specific options
- For format options with no parameters, specify the value as None.
- Methods:
write()
,close()
-
close
()¶ Close the output file to further writing.
-
class
pybel.
Smarts
(smartspattern)¶ A Smarts Pattern Matcher
- Required parameters:
- smartspattern - A SMARTS pattern
- Methods:
findall()
Example:
>>> mol = readstring("smi","CCN(CC)CC") # triethylamine >>> smarts = Smarts("[#6][#6]") # Matches an ethyl group >>> print(smarts.findall(mol)) [(1, 2), (4, 5), (6, 7)]
The numbers returned are the indices (starting from 1) of the atoms that match the SMARTS pattern. In this case, there are three matches for each of the three ethyl groups in the molecule.
-
pybel.
descs
= ['LogP', 'MR', 'TPSA']¶ A list of supported descriptors
-
pybel.
forcefields
= ['uff', 'mmff94', 'ghemical']¶ A list of supported forcefields
-
pybel.
fps
= ['FP2', 'FP3', 'FP4']¶ A list of supported fingerprint types
-
pybel.
informats
= {}¶ A dictionary of supported input formats
-
pybel.
operations
= ['Gen3D']¶ A list of supported operations
-
pybel.
outformats
= {}¶ A dictionary of supported output formats
-
pybel.
readfile
(format, filename, opt=None)¶ Iterate over the molecules in a file.
- Required parameters:
- format – chemical file format
- See the
informats
variable for a list of available input formats
filename
- Optional parameters:
- opt – a dictionary of format-specific options
- For format options with no parameters, specify the value as None.
You can access the first molecule in a file using the next() method of the iterator (or the next() keyword in Python 3):
mol = readfile("smi", "myfile.smi").next() # Python 2 mol = next(readfile("smi", "myfile.smi")) # Python 3
You can make a list of the molecules in a file using:
mols = list(readfile("smi", "myfile.smi"))
You can iterate over the molecules in a file as shown in the following code snippet:
>>> atomtotal = 0 >>> for mol in readfile("sdf", "head.sdf"): ... atomtotal += len(mol.atoms) ... >>> print(atomtotal) 43
-
pybel.
readstring
(format, string, opt=None)¶ Read in a molecule from a string.
- Required parameters:
- format – chemical file format
- See the
informats
variable for a list of available input formats
string
- Optional parameters:
- opt – a dictionary of format-specific options
- For format options with no parameters, specify the value as None.
Example:
>>> input = "C1=CC=CS1" >>> mymol = readstring("smi", input) >>> len(mymol.atoms) 5