10 minutes to pdb2sql

This is a short introduction to pdb2sql.

Download PDB files

A handy tool, fetch, is provided to download PDB files from the PDB website.

In [1]: from pdb2sql import fetch

In [2]: fetch('3CRO', './pdb/')

In [3]: ls ./pdb
1AK4_10w.pdb  3CRO.pdb	 dummy.pdb	      ref.pdb
1AK4_5w.pdb   decoy.pdb  dummy_transform.pdb  test.pdb
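As an illustration of what such a download involves, the sketch below builds a download URL for a PDB ID and saves the file using only the standard library. The RCSB endpoint `https://files.rcsb.org/download/<ID>.pdb` and the helper names `pdb_url` and `download_pdb` are assumptions made for this example; they are not taken from pdb2sql, whose fetch may work differently.

```python
import os
import urllib.request

# Assumed download endpoint (RCSB file server); not taken from pdb2sql.
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdbid}.pdb"


def pdb_url(pdbid):
    """Build the download URL for a 4-character PDB ID (hypothetical helper)."""
    pdbid = pdbid.strip().upper()
    if len(pdbid) != 4 or not pdbid.isalnum():
        raise ValueError(f"not a valid PDB ID: {pdbid!r}")
    return RCSB_DOWNLOAD.format(pdbid=pdbid)


def download_pdb(pdbid, outdir="."):
    """Download one PDB file into outdir and return its path (needs network)."""
    os.makedirs(outdir, exist_ok=True)
    path = os.path.join(outdir, f"{pdbid.strip().upper()}.pdb")
    urllib.request.urlretrieve(pdb_url(pdbid), path)
    return path


print(pdb_url("3cro"))
```

Calling `download_pdb('3CRO', './pdb/')` would then mirror the fetch call above, provided network access is available.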

For clarity, the following examples use some dummy PDB files.

Get and set data

First, import pdb2sql as follows:

In [4]: from pdb2sql import pdb2sql

Create a SQL database instance:

In [5]: db = pdb2sql("./pdb/dummy.pdb")

The db is a SQL instance that contains one table named ATOM.

In this table, each row represents one atom, and columns are atom properties:

In [6]: db.print()
serial	name	altLoc	resName	chainID	resSeq	iCode	x	y	z	occ	temp	element	model
1	N		MET	A	1		-20.948	-13.418	28.32	1.0	46.93	N	0
2	CA		MET	A	1		-21.093	-12.112	28.939	1.0	52.5	C	0
3	C		MET	A	1		-22.482	-11.566	28.846	1.0	52.55	C	0
4	O		MET	A	1		-22.816	-10.393	28.618	1.0	52.75	O	0
5	CB		MET	A	1		-19.916	-11.178	28.789	1.0	59.92	C	0
6	CG		MET	A	1		-18.839	-11.701	29.713	1.0	80.88	C	0
7	SD		MET	A	1		-17.178	-11.517	29.038	1.0	95.94	S	0
8	CE		MET	A	1		-16.527	-13.173	29.365	1.0	90.58	C	0
9	N		GLN	A	2		-23.243	-12.593	29.074	1.0	51.78	N	0
10	CA		GLN	A	2		-24.639	-12.681	29.076	1.0	52.49	C	0
11	C		GLN	A	2		-25.268	-12.252	30.349	1.0	42.74	C	0
12	O		GLN	A	2		-24.688	-12.207	31.435	1.0	47.12	O	0
13	CB		GLN	A	2		-24.971	-14.147	28.858	1.0	45.95	C	0
14	CG		GLN	A	2		-24.141	-14.712	27.71	1.0	53.26	C	0
15	CD		GLN	A	2		-24.923	-15.776	27.001	1.0	68.74	C	0
16	OE1		GLN	A	2		-25.159	-16.851	27.563	1.0	82.61	O	0
17	NE2		GLN	A	2		-25.382	-15.458	25.797	1.0	76.83	N	0
18	N		THR	A	3		-26.513	-11.973	30.116	1.0	21.94	N	0
19	CA		THR	A	3		-27.44	-11.567	31.088	1.0	15.55	C	0
20	C		THR	A	3		-28.2	-12.824	31.459	1.0	5.55	C	0
21	O		THR	A	3		-27.96	-13.91	30.947	1.0	13.48	O	0
22	CB		THR	A	3		-28.318	-10.497	30.412	1.0	14.6	C	0
23	OG1		THR	A	3		-27.55	-9.329	30.158	1.0	7.6	O	0
24	CG2		THR	A	3		-29.542	-10.173	31.249	1.0	14.52	C	0
25	N		LEU	A	4		-29.107	-12.704	32.36	1.0	2.0	N	0
26	CA		LEU	A	4		-29.866	-13.85	32.734	1.0	2.0	C	0
27	C		LEU	A	4		-31.054	-13.904	31.795	1.0	23.22	C	0
28	O		LEU	A	4		-31.605	-14.952	31.468	1.0	22.54	O	0
29	CB		LEU	A	4		-30.361	-13.645	34.167	1.0	2.0	C	0
30	CG		LEU	A	4		-31.598	-14.458	34.413	1.0	2.0	C	0
31	CD1		LEU	A	4		-31.138	-15.884	34.58	1.0	10.7	C	0
32	CD2		LEU	A	4		-32.285	-13.993	35.678	1.0	2.0	C	0
33	N		ILE	B	5		-24.269	-24.311	62.813	1.0	9.48	N	0
34	CA		ILE	B	5		-23.843	-24.77	64.102	1.0	26.84	C	0
35	C		ILE	B	5		-22.871	-25.894	64.054	1.0	25.15	C	0
36	O		ILE	B	5		-21.848	-25.947	64.74	1.0	26.7	O	0
37	CB		ILE	B	5		-25.073	-25.455	64.56	1.0	2.0	C	0
38	CG1		ILE	B	5		-26.132	-24.381	64.698	1.0	8.15	C	0
39	CG2		ILE	B	5		-24.748	-26.175	65.844	1.0	15.39	C	0
40	CD1		ILE	B	5		-27.169	-24.675	65.763	1.0	4.14	C	0
41	N		ALA	B	6		-23.314	-26.842	63.273	1.0	5.6	N	0
42	CA		ALA	B	6		-22.569	-28.021	63.037	1.0	6.15	C	0
43	C		ALA	B	6		-21.334	-27.605	62.3	1.0	2.0	C	0
44	O		ALA	B	6		-20.346	-28.31	62.26	1.0	19.27	O	0
45	CB		ALA	B	6		-23.39	-28.992	62.204	1.0	28.29	C	0
46	N		LYS	B	7		-20.313	-24.801	63.054	1.0	42.89	N	0
47	CA		LYS	B	7		-19.947	-23.992	64.206	1.0	35.06	C	0
48	C		LYS	B	7		-19.63	-22.543	63.931	1.0	28.97	C	0
49	O		LYS	B	7		-18.474	-22.128	63.903	1.0	23.79	O	0
50	CB		LYS	B	7		-19.233	-24.631	65.379	1.0	29.85	C	0
51	CG		LYS	B	7		-18.891	-26.083	65.15	1.0	33.26	C	0
52	CD		LYS	B	7		-17.459	-26.383	65.536	1.0	40.19	C	0
53	CE		LYS	B	7		-17.355	-27.043	66.897	1.0	63.4	C	0
54	NZ		LYS	B	7		-17.415	-28.509	66.822	1.0	66.95	N	0
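To make the "one row per atom, one column per property" layout concrete, here is a minimal sketch that builds a comparable ATOM table with Python's standard sqlite3 module. It keeps only a subset of the columns and three rows copied from the listing above; it is an illustration of the table layout, not pdb2sql's own implementation.

```python
import sqlite3

# Three rows copied from the table above; only a subset of columns is kept.
atoms = [
    (1, "N", "MET", "A", 1, -20.948, -13.418, 28.32),
    (2, "CA", "MET", "A", 1, -21.093, -12.112, 28.939),
    (33, "N", "ILE", "B", 5, -24.269, -24.311, 62.813),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ATOM (serial INTEGER, name TEXT, resName TEXT, "
    "chainID TEXT, resSeq INTEGER, x REAL, y REAL, z REAL)"
)
conn.executemany("INSERT INTO ATOM VALUES (?, ?, ?, ?, ?, ?, ?, ?)", atoms)

# One row per atom, one column per property.
rows = conn.execute(
    "SELECT serial, name, chainID FROM ATOM ORDER BY serial"
).fetchall()
print(rows)
```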

Get data

Get chainID, residue number, residue name and atom name of all atoms:

In [7]: p = db.get('chainID, resSeq, resName, name')

In [8]: p
Out[8]: 
[['A', 1, 'MET', 'N'],
 ['A', 1, 'MET', 'CA'],
 ['A', 1, 'MET', 'C'],
 ['A', 1, 'MET', 'O'],
 ['A', 1, 'MET', 'CB'],
 ['A', 1, 'MET', 'CG'],
 ['A', 1, 'MET', 'SD'],
 ['A', 1, 'MET', 'CE'],
 ['A', 2, 'GLN', 'N'],
 ['A', 2, 'GLN', 'CA'],
 ['A', 2, 'GLN', 'C'],
 ['A', 2, 'GLN', 'O'],
 ['A', 2, 'GLN', 'CB'],
 ['A', 2, 'GLN', 'CG'],
 ['A', 2, 'GLN', 'CD'],
 ['A', 2, 'GLN', 'OE1'],
 ['A', 2, 'GLN', 'NE2'],
 ['A', 3, 'THR', 'N'],
 ['A', 3, 'THR', 'CA'],
 ['A', 3, 'THR', 'C'],
 ['A', 3, 'THR', 'O'],
 ['A', 3, 'THR', 'CB'],
 ['A', 3, 'THR', 'OG1'],
 ['A', 3, 'THR', 'CG2'],
 ['A', 4, 'LEU', 'N'],
 ['A', 4, 'LEU', 'CA'],
 ['A', 4, 'LEU', 'C'],
 ['A', 4, 'LEU', 'O'],
 ['A', 4, 'LEU', 'CB'],
 ['A', 4, 'LEU', 'CG'],
 ['A', 4, 'LEU', 'CD1'],
 ['A', 4, 'LEU', 'CD2'],
 ['B', 5, 'ILE', 'N'],
 ['B', 5, 'ILE', 'CA'],
 ['B', 5, 'ILE', 'C'],
 ['B', 5, 'ILE', 'O'],
 ['B', 5, 'ILE', 'CB'],
 ['B', 5, 'ILE', 'CG1'],
 ['B', 5, 'ILE', 'CG2'],
 ['B', 5, 'ILE', 'CD1'],
 ['B', 6, 'ALA', 'N'],
 ['B', 6, 'ALA', 'CA'],
 ['B', 6, 'ALA', 'C'],
 ['B', 6, 'ALA', 'O'],
 ['B', 6, 'ALA', 'CB'],
 ['B', 7, 'LYS', 'N'],
 ['B', 7, 'LYS', 'CA'],
 ['B', 7, 'LYS', 'C'],
 ['B', 7, 'LYS', 'O'],
 ['B', 7, 'LYS', 'CB'],
 ['B', 7, 'LYS', 'CG'],
 ['B', 7, 'LYS', 'CD'],
 ['B', 7, 'LYS', 'CE'],
 ['B', 7, 'LYS', 'NZ']]

Get x,y,z coordinates of all atoms:

In [9]: p = db.get('x,y,z')

In [10]: p
Out[10]: 
[[-20.948, -13.418, 28.32],
 [-21.093, -12.112, 28.939],
 [-22.482, -11.566, 28.846],
 [-22.816, -10.393, 28.618],
 [-19.916, -11.178, 28.789],
 [-18.839, -11.701, 29.713],
 [-17.178, -11.517, 29.038],
 [-16.527, -13.173, 29.365],
 [-23.243, -12.593, 29.074],
 [-24.639, -12.681, 29.076],
 [-25.268, -12.252, 30.349],
 [-24.688, -12.207, 31.435],
 [-24.971, -14.147, 28.858],
 [-24.141, -14.712, 27.71],
 [-24.923, -15.776, 27.001],
 [-25.159, -16.851, 27.563],
 [-25.382, -15.458, 25.797],
 [-26.513, -11.973, 30.116],
 [-27.44, -11.567, 31.088],
 [-28.2, -12.824, 31.459],
 [-27.96, -13.91, 30.947],
 [-28.318, -10.497, 30.412],
 [-27.55, -9.329, 30.158],
 [-29.542, -10.173, 31.249],
 [-29.107, -12.704, 32.36],
 [-29.866, -13.85, 32.734],
 [-31.054, -13.904, 31.795],
 [-31.605, -14.952, 31.468],
 [-30.361, -13.645, 34.167],
 [-31.598, -14.458, 34.413],
 [-31.138, -15.884, 34.58],
 [-32.285, -13.993, 35.678],
 [-24.269, -24.311, 62.813],
 [-23.843, -24.77, 64.102],
 [-22.871, -25.894, 64.054],
 [-21.848, -25.947, 64.74],
 [-25.073, -25.455, 64.56],
 [-26.132, -24.381, 64.698],
 [-24.748, -26.175, 65.844],
 [-27.169, -24.675, 65.763],
 [-23.314, -26.842, 63.273],
 [-22.569, -28.021, 63.037],
 [-21.334, -27.605, 62.3],
 [-20.346, -28.31, 62.26],
 [-23.39, -28.992, 62.204],
 [-20.313, -24.801, 63.054],
 [-19.947, -23.992, 64.206],
 [-19.63, -22.543, 63.931],
 [-18.474, -22.128, 63.903],
 [-19.233, -24.631, 65.379],
 [-18.891, -26.083, 65.15],
 [-17.459, -26.383, 65.536],
 [-17.355, -27.043, 66.897],
 [-17.415, -28.509, 66.822]]
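Because the coordinates come back as a plain list of [x, y, z] rows, they can be processed without further conversion. As a small sketch, the hypothetical helper `centroid` below computes the geometric center of the first three atoms listed above, using pure Python.

```python
# First three coordinates from the output above.
xyz = [
    [-20.948, -13.418, 28.32],
    [-21.093, -12.112, 28.939],
    [-22.482, -11.566, 28.846],
]


def centroid(coords):
    """Return the mean x, y and z of a list of [x, y, z] rows."""
    n = len(coords)
    if n == 0:
        raise ValueError("no coordinates given")
    # zip(*coords) turns rows into three columns: all x, all y, all z.
    return [sum(axis) / n for axis in zip(*coords)]


center = centroid(xyz)
print([round(c, 3) for c in center])
```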

Get x,y,z coordinates of chain A atoms:

In [11]: p = db.get('chainID, x,y,z', chainID=['A'])

In [12]: p
Out[12]: 
[['A', -20.948, -13.418, 28.32],
 ['A', -21.093, -12.112, 28.939],
 ['A', -22.482, -11.566, 28.846],
 ['A', -22.816, -10.393, 28.618],
 ['A', -19.916, -11.178, 28.789],
 ['A', -18.839, -11.701, 29.713],
 ['A', -17.178, -11.517, 29.038],
 ['A', -16.527, -13.173, 29.365],
 ['A', -23.243, -12.593, 29.074],
 ['A', -24.639, -12.681, 29.076],
 ['A', -25.268, -12.252, 30.349],
 ['A', -24.688, -12.207, 31.435],
 ['A', -24.971, -14.147, 28.858],
 ['A', -24.141, -14.712, 27.71],
 ['A', -24.923, -15.776, 27.001],
 ['A', -25.159, -16.851, 27.563],
 ['A', -25.382, -15.458, 25.797],
 ['A', -26.513, -11.973, 30.116],
 ['A', -27.44, -11.567, 31.088],
 ['A', -28.2, -12.824, 31.459],
 ['A', -27.96, -13.91, 30.947],
 ['A', -28.318, -10.497, 30.412],
 ['A', -27.55, -9.329, 30.158],
 ['A', -29.542, -10.173, 31.249],
 ['A', -29.107, -12.704, 32.36],
 ['A', -29.866, -13.85, 32.734],
 ['A', -31.054, -13.904, 31.795],
 ['A', -31.605, -14.952, 31.468],
 ['A', -30.361, -13.645, 34.167],
 ['A', -31.598, -14.458, 34.413],
 ['A', -31.138, -15.884, 34.58],
 ['A', -32.285, -13.993, 35.678]]

Get x,y,z coordinates of the atoms in residues 1 and 4 of chain A:

In [13]: p = db.get('chainID,resSeq,x,y,z', chainID=['A'], resSeq=['1', '4'])

In [14]: p
Out[14]: 
[['A', 1, -20.948, -13.418, 28.32],
 ['A', 1, -21.093, -12.112, 28.939],
 ['A', 1, -22.482, -11.566, 28.846],
 ['A', 1, -22.816, -10.393, 28.618],
 ['A', 1, -19.916, -11.178, 28.789],
 ['A', 1, -18.839, -11.701, 29.713],
 ['A', 1, -17.178, -11.517, 29.038],
 ['A', 1, -16.527, -13.173, 29.365],
 ['A', 4, -29.107, -12.704, 32.36],
 ['A', 4, -29.866, -13.85, 32.734],
 ['A', 4, -31.054, -13.904, 31.795],
 ['A', 4, -31.605, -14.952, 31.468],
 ['A', 4, -30.361, -13.645, 34.167],
 ['A', 4, -31.598, -14.458, 34.413],
 ['A', 4, -31.138, -15.884, 34.58],
 ['A', 4, -32.285, -13.993, 35.678]]

Get data of all atoms except those in MET and GLN residues:

In [15]: p = db.get('chainID, resSeq, resName, name', no_resName = ['MET', 'GLN'])

In [16]: p
Out[16]: 
[['A', 3, 'THR', 'N'],
 ['A', 3, 'THR', 'CA'],
 ['A', 3, 'THR', 'C'],
 ['A', 3, 'THR', 'O'],
 ['A', 3, 'THR', 'CB'],
 ['A', 3, 'THR', 'OG1'],
 ['A', 3, 'THR', 'CG2'],
 ['A', 4, 'LEU', 'N'],
 ['A', 4, 'LEU', 'CA'],
 ['A', 4, 'LEU', 'C'],
 ['A', 4, 'LEU', 'O'],
 ['A', 4, 'LEU', 'CB'],
 ['A', 4, 'LEU', 'CG'],
 ['A', 4, 'LEU', 'CD1'],
 ['A', 4, 'LEU', 'CD2'],
 ['B', 5, 'ILE', 'N'],
 ['B', 5, 'ILE', 'CA'],
 ['B', 5, 'ILE', 'C'],
 ['B', 5, 'ILE', 'O'],
 ['B', 5, 'ILE', 'CB'],
 ['B', 5, 'ILE', 'CG1'],
 ['B', 5, 'ILE', 'CG2'],
 ['B', 5, 'ILE', 'CD1'],
 ['B', 6, 'ALA', 'N'],
 ['B', 6, 'ALA', 'CA'],
 ['B', 6, 'ALA', 'C'],
 ['B', 6, 'ALA', 'O'],
 ['B', 6, 'ALA', 'CB'],
 ['B', 7, 'LYS', 'N'],
 ['B', 7, 'LYS', 'CA'],
 ['B', 7, 'LYS', 'C'],
 ['B', 7, 'LYS', 'O'],
 ['B', 7, 'LYS', 'CB'],
 ['B', 7, 'LYS', 'CG'],
 ['B', 7, 'LYS', 'CD'],
 ['B', 7, 'LYS', 'CE'],
 ['B', 7, 'LYS', 'NZ']]

Get data of all atoms that are neither in MET or GLN residues nor CA (alpha carbon) atoms:

In [17]: p = db.get('chainID, resSeq, resName, name', no_resName = ['MET', 'GLN'], no_name = ['CA'])

In [18]: p
Out[18]: 
[['A', 3, 'THR', 'N'],
 ['A', 3, 'THR', 'C'],
 ['A', 3, 'THR', 'O'],
 ['A', 3, 'THR', 'CB'],
 ['A', 3, 'THR', 'OG1'],
 ['A', 3, 'THR', 'CG2'],
 ['A', 4, 'LEU', 'N'],
 ['A', 4, 'LEU', 'C'],
 ['A', 4, 'LEU', 'O'],
 ['A', 4, 'LEU', 'CB'],
 ['A', 4, 'LEU', 'CG'],
 ['A', 4, 'LEU', 'CD1'],
 ['A', 4, 'LEU', 'CD2'],
 ['B', 5, 'ILE', 'N'],
 ['B', 5, 'ILE', 'C'],
 ['B', 5, 'ILE', 'O'],
 ['B', 5, 'ILE', 'CB'],
 ['B', 5, 'ILE', 'CG1'],
 ['B', 5, 'ILE', 'CG2'],
 ['B', 5, 'ILE', 'CD1'],
 ['B', 6, 'ALA', 'N'],
 ['B', 6, 'ALA', 'C'],
 ['B', 6, 'ALA', 'O'],
 ['B', 6, 'ALA', 'CB'],
 ['B', 7, 'LYS', 'N'],
 ['B', 7, 'LYS', 'C'],
 ['B', 7, 'LYS', 'O'],
 ['B', 7, 'LYS', 'CB'],
 ['B', 7, 'LYS', 'CG'],
 ['B', 7, 'LYS', 'CD'],
 ['B', 7, 'LYS', 'CE'],
 ['B', 7, 'LYS', 'NZ']]
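One way to picture how selection keywords and their `no_` negations combine is as conditions in a SQL WHERE clause joined with AND. The sketch below shows this with the standard sqlite3 module; the helper `build_where` is hypothetical and the five rows are a small excerpt, so this is an assumption about the general idea rather than pdb2sql's actual code.

```python
import sqlite3


def build_where(**selection):
    """Turn keywords such as chainID=['A'] or no_name=['CA'] into SQL.

    Hypothetical helper: a 'no_' prefix negates the condition, and all
    conditions are joined with AND.
    """
    clauses, params = [], []
    for key, values in selection.items():
        negate = key.startswith("no_")
        column = key[3:] if negate else key
        if not column.isidentifier():
            raise ValueError(f"bad column name: {column!r}")
        marks = ", ".join("?" for _ in values)
        op = "NOT IN" if negate else "IN"
        clauses.append(f"{column} {op} ({marks})")
        params.extend(values)
    return " AND ".join(clauses), params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ATOM (chainID TEXT, resSeq INTEGER, resName TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO ATOM VALUES (?, ?, ?, ?)",
    [
        ("A", 1, "MET", "N"),
        ("A", 1, "MET", "CA"),
        ("A", 3, "THR", "N"),
        ("A", 3, "THR", "CA"),
        ("B", 6, "ALA", "CB"),
    ],
)

where, params = build_where(no_resName=["MET", "GLN"], no_name=["CA"])
rows = conn.execute(
    f"SELECT resName, name FROM ATOM WHERE {where} ORDER BY rowid", params
).fetchall()
print(where)
print(rows)
```

Values are passed as bound parameters, and only the column names are interpolated after a basic identifier check.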

To get all data, a simple way is db.get('*').

A shortcut to get x,y,z coordinates:

In [19]: p = db.get_xyz()

In [20]: p
Out[20]: 
[[-20.948, -13.418, 28.32],
 [-21.093, -12.112, 28.939],
 [-22.482, -11.566, 28.846],
 [-22.816, -10.393, 28.618],
 [-19.916, -11.178, 28.789],
 [-18.839, -11.701, 29.713],
 [-17.178, -11.517, 29.038],
 [-16.527, -13.173, 29.365],
 [-23.243, -12.593, 29.074],
 [-24.639, -12.681, 29.076],
 [-25.268, -12.252, 30.349],
 [-24.688, -12.207, 31.435],
 [-24.971, -14.147, 28.858],
 [-24.141, -14.712, 27.71],
 [-24.923, -15.776, 27.001],
 [-25.159, -16.851, 27.563],
 [-25.382, -15.458, 25.797],
 [-26.513, -11.973, 30.116],
 [-27.44, -11.567, 31.088],
 [-28.2, -12.824, 31.459],
 [-27.96, -13.91, 30.947],
 [-28.318, -10.497, 30.412],
 [-27.55, -9.329, 30.158],
 [-29.542, -10.173, 31.249],
 [-29.107, -12.704, 32.36],
 [-29.866, -13.85, 32.734],
 [-31.054, -13.904, 31.795],
 [-31.605, -14.952, 31.468],
 [-30.361, -13.645, 34.167],
 [-31.598, -14.458, 34.413],
 [-31.138, -15.884, 34.58],
 [-32.285, -13.993, 35.678],
 [-24.269, -24.311, 62.813],
 [-23.843, -24.77, 64.102],
 [-22.871, -25.894, 64.054],
 [-21.848, -25.947, 64.74],
 [-25.073, -25.455, 64.56],
 [-26.132, -24.381, 64.698],
 [-24.748, -26.175, 65.844],
 [-27.169, -24.675, 65.763],
 [-23.314, -26.842, 63.273],
 [-22.569, -28.021, 63.037],
 [-21.334, -27.605, 62.3],
 [-20.346, -28.31, 62.26],
 [-23.39, -28.992, 62.204],
 [-20.313, -24.801, 63.054],
 [-19.947, -23.992, 64.206],
 [-19.63, -22.543, 63.931],
 [-18.474, -22.128, 63.903],
 [-19.233, -24.631, 65.379],
 [-18.891, -26.083, 65.15],
 [-17.459, -26.383, 65.536],
 [-17.355, -27.043, 66.897],
 [-17.415, -28.509, 66.822]]

Get chain IDs:

In [21]: p = db.get_chains()

In [22]: p
Out[22]: ['A', 'B']

Get residue list:

In [23]: p = db.get_residues()

In [24]: p
Out[24]: 
[('A', 'MET', 1),
 ('A', 'GLN', 2),
 ('A', 'THR', 3),
 ('A', 'LEU', 4),
 ('B', 'ILE', 5),
 ('B', 'ALA', 6),
 ('B', 'LYS', 7)]
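The chain and residue lists relate directly to the per-atom rows shown earlier: each is the set of distinct values in first-seen order. The pure-Python sketch below derives both from a few rows laid out like the output of db.get('chainID, resSeq, resName, name'); it illustrates the relationship and is not how pdb2sql computes them.

```python
# A few rows in the layout returned by db.get('chainID, resSeq, resName, name').
rows = [
    ["A", 1, "MET", "N"],
    ["A", 1, "MET", "CA"],
    ["A", 2, "GLN", "N"],
    ["B", 5, "ILE", "N"],
    ["B", 5, "ILE", "CA"],
]

# dict.fromkeys keeps the first occurrence of each key in insertion order.
chains = list(dict.fromkeys(chain for chain, _, _, _ in rows))
residues = list(
    dict.fromkeys(
        (chain, res_name, res_seq) for chain, res_seq, res_name, _ in rows
    )
)

print(chains)
print(residues)
```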

Filter the database

pdb2sql can create a new database by filtering the one we just created:

In [25]: db_chainA = db(chainID='A')

In [26]: db_chainA.print()
serial	name	altLoc	resName	chainID	resSeq	iCode	x	y	z	occ	temp	element	model
1	N		MET	A	1		-20.948	-13.418	28.32	1.0	46.93	N	0
2	CA		MET	A	1		-21.093	-12.112	28.939	1.0	52.5	C	0
3	C		MET	A	1		-22.482	-11.566	28.846	1.0	52.55	C	0
4	O		MET	A	1		-22.816	-10.393	28.618	1.0	52.75	O	0
5	CB		MET	A	1		-19.916	-11.178	28.789	1.0	59.92	C	0
6	CG		MET	A	1		-18.839	-11.701	29.713	1.0	80.88	C	0
7	SD		MET	A	1		-17.178	-11.517	29.038	1.0	95.94	S	0
8	CE		MET	A	1		-16.527	-13.173	29.365	1.0	90.58	C	0
9	N		GLN	A	2		-23.243	-12.593	29.074	1.0	51.78	N	0
10	CA		GLN	A	2		-24.639	-12.681	29.076	1.0	52.49	C	0
11	C		GLN	A	2		-25.268	-12.252	30.349	1.0	42.74	C	0
12	O		GLN	A	2		-24.688	-12.207	31.435	1.0	47.12	O	0
13	CB		GLN	A	2		-24.971	-14.147	28.858	1.0	45.95	C	0
14	CG		GLN	A	2		-24.141	-14.712	27.71	1.0	53.26	C	0
15	CD		GLN	A	2		-24.923	-15.776	27.001	1.0	68.74	C	0
16	OE1		GLN	A	2		-25.159	-16.851	27.563	1.0	82.61	O	0
17	NE2		GLN	A	2		-25.382	-15.458	25.797	1.0	76.83	N	0
18	N		THR	A	3		-26.513	-11.973	30.116	1.0	21.94	N	0
19	CA		THR	A	3		-27.44	-11.567	31.088	1.0	15.55	C	0
20	C		THR	A	3		-28.2	-12.824	31.459	1.0	5.55	C	0
21	O		THR	A	3		-27.96	-13.91	30.947	1.0	13.48	O	0
22	CB		THR	A	3		-28.318	-10.497	30.412	1.0	14.6	C	0
23	OG1		THR	A	3		-27.55	-9.329	30.158	1.0	7.6	O	0
24	CG2		THR	A	3		-29.542	-10.173	31.249	1.0	14.52	C	0
25	N		LEU	A	4		-29.107	-12.704	32.36	1.0	2.0	N	0
26	CA		LEU	A	4		-29.866	-13.85	32.734	1.0	2.0	C	0
27	C		LEU	A	4		-31.054	-13.904	31.795	1.0	23.22	C	0
28	O		LEU	A	4		-31.605	-14.952	31.468	1.0	22.54	O	0
29	CB		LEU	A	4		-30.361	-13.645	34.167	1.0	2.0	C	0
30	CG		LEU	A	4		-31.598	-14.458	34.413	1.0	2.0	C	0
31	CD1		LEU	A	4		-31.138	-15.884	34.58	1.0	10.7	C	0
32	CD2		LEU	A	4		-32.285	-13.993	35.678	1.0	2.0	C	0

In this example, db_chainA is a SQL database that includes only the atoms from chain A. All the selection keywords (chainID, resSeq, resName, name) and their negations (no_chainID, no_resSeq, no_resName, no_name) can be used and combined to obtain the new database.

Set data

Rename chain B to C:

In [27]: num_B_atoms = len(db.get('chainID', chainID=['B']))

In [28]: chainC = ['C'] * num_B_atoms

In [29]: db.get_chains()
Out[29]: ['A', 'B']

In [30]: db.update('chainID', chainC, chainID = ['B'])

In [31]: db.get_chains()
Out[31]: ['A', 'C']
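In plain SQL terms, renaming a chain amounts to an UPDATE restricted by a WHERE condition. The sketch below reproduces the before/after chain lists with the standard sqlite3 module on a four-row table; the table contents and the helper `chains` are made up for illustration, and this is not pdb2sql's own update code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ATOM (serial INTEGER, chainID TEXT)")
conn.executemany(
    "INSERT INTO ATOM VALUES (?, ?)",
    [(1, "A"), (2, "A"), (33, "B"), (34, "B")],
)


def chains(connection):
    """Return the distinct chain IDs, ordered by their first atom serial."""
    query = "SELECT chainID FROM ATOM GROUP BY chainID ORDER BY MIN(serial)"
    return [row[0] for row in connection.execute(query)]


before = chains(conn)
# Rename chain B to C for every matching row in a single statement.
cursor = conn.execute(
    "UPDATE ATOM SET chainID = ? WHERE chainID = ?", ("C", "B")
)
after = chains(conn)

print(before, after, cursor.rowcount)
```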

Update the x,y,z coordinates to translate the structure by [10,10,10] (np refers to NumPy, imported with import numpy as np):

In [32]: xyz_old = db.get_xyz()

In [33]: xyz = np.array(xyz_old) + 10.0

In [34]: db.update('x,y,z', xyz)

In [35]: xyz_new = db.get_xyz()

In [36]: print("old:\n", xyz_old)
old:
 [[-20.948, -13.418, 28.32], [-21.093, -12.112, 28.939], [-22.482, -11.566, 28.846], [-22.816, -10.393, 28.618], [-19.916, -11.178, 28.789], [-18.839, -11.701, 29.713], [-17.178, -11.517, 29.038], [-16.527, -13.173, 29.365], [-23.243, -12.593, 29.074], [-24.639, -12.681, 29.076], [-25.268, -12.252, 30.349], [-24.688, -12.207, 31.435], [-24.971, -14.147, 28.858], [-24.141, -14.712, 27.71], [-24.923, -15.776, 27.001], [-25.159, -16.851, 27.563], [-25.382, -15.458, 25.797], [-26.513, -11.973, 30.116], [-27.44, -11.567, 31.088], [-28.2, -12.824, 31.459], [-27.96, -13.91, 30.947], [-28.318, -10.497, 30.412], [-27.55, -9.329, 30.158], [-29.542, -10.173, 31.249], [-29.107, -12.704, 32.36], [-29.866, -13.85, 32.734], [-31.054, -13.904, 31.795], [-31.605, -14.952, 31.468], [-30.361, -13.645, 34.167], [-31.598, -14.458, 34.413], [-31.138, -15.884, 34.58], [-32.285, -13.993, 35.678], [-24.269, -24.311, 62.813], [-23.843, -24.77, 64.102], [-22.871, -25.894, 64.054], [-21.848, -25.947, 64.74], [-25.073, -25.455, 64.56], [-26.132, -24.381, 64.698], [-24.748, -26.175, 65.844], [-27.169, -24.675, 65.763], [-23.314, -26.842, 63.273], [-22.569, -28.021, 63.037], [-21.334, -27.605, 62.3], [-20.346, -28.31, 62.26], [-23.39, -28.992, 62.204], [-20.313, -24.801, 63.054], [-19.947, -23.992, 64.206], [-19.63, -22.543, 63.931], [-18.474, -22.128, 63.903], [-19.233, -24.631, 65.379], [-18.891, -26.083, 65.15], [-17.459, -26.383, 65.536], [-17.355, -27.043, 66.897], [-17.415, -28.509, 66.822]]

In [37]: print("new:\n", xyz_new)
new:
 [[-10.948, -3.4179999999999993, 38.32], [-11.093, -2.112, 38.939], [-12.482, -1.5660000000000007, 38.846000000000004], [-12.815999999999999, -0.3930000000000007, 38.617999999999995], [-9.916, -1.1780000000000008, 38.789], [-8.838999999999999, -1.7010000000000005, 39.713], [-7.178000000000001, -1.5169999999999995, 39.038], [-6.527000000000001, -3.173, 39.364999999999995], [-13.242999999999999, -2.593, 39.074], [-14.639, -2.680999999999999, 39.076], [-15.268, -2.2520000000000007, 40.349000000000004], [-14.687999999999999, -2.2070000000000007, 41.435], [-14.971, -4.147, 38.858000000000004], [-14.140999999999998, -4.712, 37.71], [-14.922999999999998, -5.776, 37.001000000000005], [-15.158999999999999, -6.850999999999999, 37.563], [-15.382000000000001, -5.458, 35.797], [-16.513, -1.9730000000000008, 40.116], [-17.44, -1.5670000000000002, 41.088], [-18.2, -2.824, 41.459], [-17.96, -3.91, 40.947], [-18.318, -0.4969999999999999, 40.412], [-17.55, 0.6709999999999994, 40.158], [-19.542, -0.17300000000000004, 41.248999999999995], [-19.107, -2.7040000000000006, 42.36], [-19.866, -3.8499999999999996, 42.734], [-21.054, -3.904, 41.795], [-21.605, -4.952, 41.468], [-20.361, -3.6449999999999996, 44.167], [-21.598, -4.458, 44.413], [-21.138, -5.884, 44.58], [-22.284999999999997, -3.9930000000000003, 45.678], [-14.268999999999998, -14.311, 72.813], [-13.843, -14.77, 74.102], [-12.870999999999999, -15.893999999999998, 74.054], [-11.847999999999999, -15.947, 74.74], [-15.073, -15.454999999999998, 74.56], [-16.132, -14.381, 74.698], [-14.748000000000001, -16.175, 75.844], [-17.169, -14.675, 75.763], [-13.314, -16.842, 73.273], [-12.568999999999999, -18.021, 73.037], [-11.334, -17.605, 72.3], [-10.346, -18.31, 72.25999999999999], [-13.39, -18.992, 72.20400000000001], [-10.312999999999999, -14.800999999999998, 73.054], [-9.947, -13.992, 74.206], [-9.629999999999999, -12.543, 73.931], [-8.474, -12.128, 73.90299999999999], [-9.233, -14.631, 75.379], [-8.890999999999998, -16.083, 75.15], [-7.459, -16.383, 75.536], [-7.355, -17.043, 76.897], [-7.414999999999999, -18.509, 76.822]]
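Values such as -3.4179999999999993 in the new coordinates are ordinary binary floating-point round-off from the addition. The short sketch below reproduces the effect for the first atom in pure Python and shows that rounding to three decimals, the precision used for coordinates in PDB files, recovers the expected numbers.

```python
old = [-20.948, -13.418, 28.32]

# Adding 10.0 can leave a tiny binary round-off error in the result.
shifted = [value + 10.0 for value in old]

# PDB files store coordinates with three decimals, so rounding to three
# places recovers the values one would expect.
cleaned = [round(value, 3) for value in shifted]

print(shifted)
print(cleaned)
```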

Update a column using index, e.g. change the x coordinates of the first 10 atoms to 2:

In [38]: x = np.ones(10) + 1

In [39]: db.update_column('x', values=x, index=list(range(10)))

In [40]: db.print('serial, name, x')
serial	 name	 x
1	N	2.0
2	CA	2.0
3	C	2.0
4	O	2.0
5	CB	2.0
6	CG	2.0
7	SD	2.0
8	CE	2.0
9	N	2.0
10	CA	2.0
11	C	-15.268
12	O	-14.687999999999999
13	CB	-14.971
14	CG	-14.140999999999998
15	CD	-14.922999999999998
16	OE1	-15.158999999999999
17	NE2	-15.382000000000001
18	N	-16.513
19	CA	-17.44
20	C	-18.2
21	O	-17.96
22	CB	-18.318
23	OG1	-17.55
24	CG2	-19.542
25	N	-19.107
26	CA	-19.866
27	C	-21.054
28	O	-21.605
29	CB	-20.361
30	CG	-21.598
31	CD1	-21.138
32	CD2	-22.284999999999997
33	N	-14.268999999999998
34	CA	-13.843
35	C	-12.870999999999999
36	O	-11.847999999999999
37	CB	-15.073
38	CG1	-16.132
39	CG2	-14.748000000000001
40	CD1	-17.169
41	N	-13.314
42	CA	-12.568999999999999
43	C	-11.334
44	O	-10.346
45	CB	-13.39
46	N	-10.312999999999999
47	CA	-9.947
48	C	-9.629999999999999
49	O	-8.474
50	CB	-9.233
51	CG	-8.890999999999998
52	CD	-7.459
53	CE	-7.355
54	NZ	-7.414999999999999

Add a new column type with value high:

In [41]: db.add_column('type', value = 'high', coltype = 'str')

In [42]: db.print('serial, name, type')
serial	 name	 type
1	N	high
2	CA	high
3	C	high
4	O	high
5	CB	high
6	CG	high
7	SD	high
8	CE	high
9	N	high
10	CA	high
11	C	high
12	O	high
13	CB	high
14	CG	high
15	CD	high
16	OE1	high
17	NE2	high
18	N	high
19	CA	high
20	C	high
21	O	high
22	CB	high
23	OG1	high
24	CG2	high
25	N	high
26	CA	high
27	C	high
28	O	high
29	CB	high
30	CG	high
31	CD1	high
32	CD2	high
33	N	high
34	CA	high
35	C	high
36	O	high
37	CB	high
38	CG1	high
39	CG2	high
40	CD1	high
41	N	high
42	CA	high
43	C	high
44	O	high
45	CB	high
46	N	high
47	CA	high
48	C	high
49	O	high
50	CB	high
51	CG	high
52	CD	high
53	CE	high
54	NZ	high
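At the SQL level, adding a column filled with a constant can be expressed as an ALTER TABLE followed by an UPDATE. The sketch below shows this with the standard sqlite3 module on a three-row table made up for illustration; it is not pdb2sql's own add_column code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ATOM (serial INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO ATOM VALUES (?, ?)", [(1, "N"), (2, "CA"), (3, "C")]
)

# Add the column, then fill every existing row with the same value.
conn.execute('ALTER TABLE ATOM ADD COLUMN "type" TEXT')
conn.execute('UPDATE ATOM SET "type" = ?', ("high",))

rows = conn.execute(
    'SELECT serial, name, "type" FROM ATOM ORDER BY serial'
).fetchall()
print(rows)
```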

PDB I/O

Read a PDB file or data into a list:

In [43]: pdb = pdb2sql.read_pdb('./pdb/dummy.pdb')

In [44]: pdb
Out[44]: 
['ATOM      1  N   MET A   1     -20.948 -13.418  28.320  1.00 46.93           N  \n',
 'ATOM      2  CA  MET A   1     -21.093 -12.112  28.939  1.00 52.50           C  \n',
 'ATOM      3  C   MET A   1     -22.482 -11.566  28.846  1.00 52.55           C  \n',
 'ATOM      4  O   MET A   1     -22.816 -10.393  28.618  1.00 52.75           O  \n',
 'ATOM      5  CB  MET A   1     -19.916 -11.178  28.789  1.00 59.92           C  \n',
 'ATOM      6  CG  MET A   1     -18.839 -11.701  29.713  1.00 80.88           C  \n',
 'ATOM      7  SD  MET A   1     -17.178 -11.517  29.038  1.00 95.94           S  \n',
 'ATOM      8  CE  MET A   1     -16.527 -13.173  29.365  1.00 90.58           C  \n',
 'ATOM      9  N   GLN A   2     -23.243 -12.593  29.074  1.00 51.78           N  \n',
 'ATOM     10  CA  GLN A   2     -24.639 -12.681  29.076  1.00 52.49           C  \n',
 'ATOM     11  C   GLN A   2     -25.268 -12.252  30.349  1.00 42.74           C  \n',
 'ATOM     12  O   GLN A   2     -24.688 -12.207  31.435  1.00 47.12           O  \n',
 'ATOM     13  CB  GLN A   2     -24.971 -14.147  28.858  1.00 45.95           C  \n',
 'ATOM     14  CG  GLN A   2     -24.141 -14.712  27.710  1.00 53.26           C  \n',
 'ATOM     15  CD  GLN A   2     -24.923 -15.776  27.001  1.00 68.74           C  \n',
 'ATOM     16  OE1 GLN A   2     -25.159 -16.851  27.563  1.00 82.61           O  \n',
 'ATOM     17  NE2 GLN A   2     -25.382 -15.458  25.797  1.00 76.83           N  \n',
 'ATOM     18  N   THR A   3     -26.513 -11.973  30.116  1.00 21.94           N  \n',
 'ATOM     19  CA  THR A   3     -27.440 -11.567  31.088  1.00 15.55           C  \n',
 'ATOM     20  C   THR A   3     -28.200 -12.824  31.459  1.00  5.55           C  \n',
 'ATOM     21  O   THR A   3     -27.960 -13.910  30.947  1.00 13.48           O  \n',
 'ATOM     22  CB  THR A   3     -28.318 -10.497  30.412  1.00 14.60           C  \n',
 'ATOM     23  OG1 THR A   3     -27.550  -9.329  30.158  1.00  7.60           O  \n',
 'ATOM     24  CG2 THR A   3     -29.542 -10.173  31.249  1.00 14.52           C  \n',
 'ATOM     25  N   LEU A   4     -29.107 -12.704  32.360  1.00  2.00           N  \n',
 'ATOM     26  CA  LEU A   4     -29.866 -13.850  32.734  1.00  2.00           C  \n',
 'ATOM     27  C   LEU A   4     -31.054 -13.904  31.795  1.00 23.22           C  \n',
 'ATOM     28  O   LEU A   4     -31.605 -14.952  31.468  1.00 22.54           O  \n',
 'ATOM     29  CB  LEU A   4     -30.361 -13.645  34.167  1.00  2.00           C  \n',
 'ATOM     30  CG  LEU A   4     -31.598 -14.458  34.413  1.00  2.00           C  \n',
 'ATOM     31  CD1 LEU A   4     -31.138 -15.884  34.580  1.00 10.70           C  \n',
 'ATOM     32  CD2 LEU A   4     -32.285 -13.993  35.678  1.00  2.00           C  \n',
 'TER      32      LEU A\n',
 'ATOM     33  N   ILE B   5     -24.269 -24.311  62.813  1.00  9.48           N  \n',
 'ATOM     34  CA  ILE B   5     -23.843 -24.770  64.102  1.00 26.84           C  \n',
 'ATOM     35  C   ILE B   5     -22.871 -25.894  64.054  1.00 25.15           C  \n',
 'ATOM     36  O   ILE B   5     -21.848 -25.947  64.740  1.00 26.70           O  \n',
 'ATOM     37  CB  ILE B   5     -25.073 -25.455  64.560  1.00  2.00           C  \n',
 'ATOM     38  CG1 ILE B   5     -26.132 -24.381  64.698  1.00  8.15           C  \n',
 'ATOM     39  CG2 ILE B   5     -24.748 -26.175  65.844  1.00 15.39           C  \n',
 'ATOM     40  CD1 ILE B   5     -27.169 -24.675  65.763  1.00  4.14           C  \n',
 'ATOM     41  N   ALA B   6     -23.314 -26.842  63.273  1.00  5.60           N  \n',
 'ATOM     42  CA  ALA B   6     -22.569 -28.021  63.037  1.00  6.15           C  \n',
 'ATOM     43  C   ALA B   6     -21.334 -27.605  62.300  1.00  2.00           C  \n',
 'ATOM     44  O   ALA B   6     -20.346 -28.310  62.260  1.00 19.27           O  \n',
 'ATOM     45  CB  ALA B   6     -23.390 -28.992  62.204  1.00 28.29           C  \n',
 'ATOM     46  N   LYS B   7     -20.313 -24.801  63.054  1.00 42.89           N  \n',
 'ATOM     47  CA  LYS B   7     -19.947 -23.992  64.206  1.00 35.06           C  \n',
 'ATOM     48  C   LYS B   7     -19.630 -22.543  63.931  1.00 28.97           C  \n',
 'ATOM     49  O   LYS B   7     -18.474 -22.128  63.903  1.00 23.79           O  \n',
 'ATOM     50  CB  LYS B   7     -19.233 -24.631  65.379  1.00 29.85           C  \n',
 'ATOM     51  CG  LYS B   7     -18.891 -26.083  65.150  1.00 33.26           C  \n',
 'ATOM     52  CD  LYS B   7     -17.459 -26.383  65.536  1.00 40.19           C  \n',
 'ATOM     53  CE  LYS B   7     -17.355 -27.043  66.897  1.00 63.40           C  \n',
 'ATOM     54  NZ  LYS B   7     -17.415 -28.509  66.822  1.00 66.95           N  \n',
 'END\n']

Convert SQL data to PDB-formatted data:

In [45]: pdb = db.sql2pdb()

In [46]: pdb
Out[46]: 
['ATOM      1  N   MET A   1       2.000  -3.418  38.320  1.00 46.93           N  ',
 'ATOM      2  CA  MET A   1       2.000  -2.112  38.939  1.00 52.50           C  ',
 'ATOM      3  C   MET A   1       2.000  -1.566  38.846  1.00 52.55           C  ',
 'ATOM      4  O   MET A   1       2.000  -0.393  38.618  1.00 52.75           O  ',
 'ATOM      5  CB  MET A   1       2.000  -1.178  38.789  1.00 59.92           C  ',
 'ATOM      6  CG  MET A   1       2.000  -1.701  39.713  1.00 80.88           C  ',
 'ATOM      7  SD  MET A   1       2.000  -1.517  39.038  1.00 95.94           S  ',
 'ATOM      8  CE  MET A   1       2.000  -3.173  39.365  1.00 90.58           C  ',
 'ATOM      9  N   GLN A   2       2.000  -2.593  39.074  1.00 51.78           N  ',
 'ATOM     10  CA  GLN A   2       2.000  -2.681  39.076  1.00 52.49           C  ',
 'ATOM     11  C   GLN A   2     -15.268  -2.252  40.349  1.00 42.74           C  ',
 'ATOM     12  O   GLN A   2     -14.688  -2.207  41.435  1.00 47.12           O  ',
 'ATOM     13  CB  GLN A   2     -14.971  -4.147  38.858  1.00 45.95           C  ',
 'ATOM     14  CG  GLN A   2     -14.141  -4.712  37.710  1.00 53.26           C  ',
 'ATOM     15  CD  GLN A   2     -14.923  -5.776  37.001  1.00 68.74           C  ',
 'ATOM     16  OE1 GLN A   2     -15.159  -6.851  37.563  1.00 82.61           O  ',
 'ATOM     17  NE2 GLN A   2     -15.382  -5.458  35.797  1.00 76.83           N  ',
 'ATOM     18  N   THR A   3     -16.513  -1.973  40.116  1.00 21.94           N  ',
 'ATOM     19  CA  THR A   3     -17.440  -1.567  41.088  1.00 15.55           C  ',
 'ATOM     20  C   THR A   3     -18.200  -2.824  41.459  1.00  5.55           C  ',
 'ATOM     21  O   THR A   3     -17.960  -3.910  40.947  1.00 13.48           O  ',
 'ATOM     22  CB  THR A   3     -18.318  -0.497  40.412  1.00 14.60           C  ',
 'ATOM     23  OG1 THR A   3     -17.550   0.671  40.158  1.00  7.60           O  ',
 'ATOM     24  CG2 THR A   3     -19.542  -0.173  41.249  1.00 14.52           C  ',
 'ATOM     25  N   LEU A   4     -19.107  -2.704  42.360  1.00  2.00           N  ',
 'ATOM     26  CA  LEU A   4     -19.866  -3.850  42.734  1.00  2.00           C  ',
 'ATOM     27  C   LEU A   4     -21.054  -3.904  41.795  1.00 23.22           C  ',
 'ATOM     28  O   LEU A   4     -21.605  -4.952  41.468  1.00 22.54           O  ',
 'ATOM     29  CB  LEU A   4     -20.361  -3.645  44.167  1.00  2.00           C  ',
 'ATOM     30  CG  LEU A   4     -21.598  -4.458  44.413  1.00  2.00           C  ',
 'ATOM     31  CD1 LEU A   4     -21.138  -5.884  44.580  1.00 10.70           C  ',
 'ATOM     32  CD2 LEU A   4     -22.285  -3.993  45.678  1.00  2.00           C  ',
 'ATOM     33  N   ILE C   5     -14.269 -14.311  72.813  1.00  9.48           N  ',
 'ATOM     34  CA  ILE C   5     -13.843 -14.770  74.102  1.00 26.84           C  ',
 'ATOM     35  C   ILE C   5     -12.871 -15.894  74.054  1.00 25.15           C  ',
 'ATOM     36  O   ILE C   5     -11.848 -15.947  74.740  1.00 26.70           O  ',
 'ATOM     37  CB  ILE C   5     -15.073 -15.455  74.560  1.00  2.00           C  ',
 'ATOM     38  CG1 ILE C   5     -16.132 -14.381  74.698  1.00  8.15           C  ',
 'ATOM     39  CG2 ILE C   5     -14.748 -16.175  75.844  1.00 15.39           C  ',
 'ATOM     40  CD1 ILE C   5     -17.169 -14.675  75.763  1.00  4.14           C  ',
 'ATOM     41  N   ALA C   6     -13.314 -16.842  73.273  1.00  5.60           N  ',
 'ATOM     42  CA  ALA C   6     -12.569 -18.021  73.037  1.00  6.15           C  ',
 'ATOM     43  C   ALA C   6     -11.334 -17.605  72.300  1.00  2.00           C  ',
 'ATOM     44  O   ALA C   6     -10.346 -18.310  72.260  1.00 19.27           O  ',
 'ATOM     45  CB  ALA C   6     -13.390 -18.992  72.204  1.00 28.29           C  ',
 'ATOM     46  N   LYS C   7     -10.313 -14.801  73.054  1.00 42.89           N  ',
 'ATOM     47  CA  LYS C   7      -9.947 -13.992  74.206  1.00 35.06           C  ',
 'ATOM     48  C   LYS C   7      -9.630 -12.543  73.931  1.00 28.97           C  ',
 'ATOM     49  O   LYS C   7      -8.474 -12.128  73.903  1.00 23.79           O  ',
 'ATOM     50  CB  LYS C   7      -9.233 -14.631  75.379  1.00 29.85           C  ',
 'ATOM     51  CG  LYS C   7      -8.891 -16.083  75.150  1.00 33.26           C  ',
 'ATOM     52  CD  LYS C   7      -7.459 -16.383  75.536  1.00 40.19           C  ',
 'ATOM     53  CE  LYS C   7      -7.355 -17.043  76.897  1.00 63.40           C  ',
 'ATOM     54  NZ  LYS C   7      -7.415 -18.509  76.822  1.00 66.95           N  ']
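The fixed-width layout of these lines can be illustrated with a small round trip: format one atom into an 80-column ATOM record, then read the fields back by column position. The column positions follow the ATOM record of the PDB format; the helper names `format_atom` and `parse_atom` are hypothetical, and this sketch is not pdb2sql's own conversion code.

```python
def format_atom(serial, name, res_name, chain_id, res_seq,
                x, y, z, occ, temp, element):
    """Format one atom as an 80-column PDB ATOM record (hypothetical helper)."""
    # Common convention: names shorter than 4 characters start in column 14.
    name_field = (" " + name if len(name) < 4 else name).ljust(4)
    alt_loc = " "  # column 17, left blank here
    i_code = " "   # column 27, left blank here
    return (
        f"{'ATOM':<6s}{serial:>5d} {name_field}{alt_loc}{res_name:>3s} "
        f"{chain_id}{res_seq:>4d}{i_code}{'':3s}{x:8.3f}{y:8.3f}{z:8.3f}"
        f"{occ:6.2f}{temp:6.2f}{'':10s}{element:>2s}{'':2s}"
    )


def parse_atom(line):
    """Read the main fields of an ATOM record back by column position."""
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "resName": line[17:20].strip(),
        "chainID": line[21],
        "resSeq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occ": float(line[54:60]),
        "temp": float(line[60:66]),
        "element": line[76:78].strip(),
    }


# Values of atom 1 as listed in the output above.
line = format_atom(1, "N", "MET", "A", 1, 2.0, -3.418, 38.32, 1.0, 46.93, "N")
atom = parse_atom(line)
print(repr(line))
print(atom)
```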

Write a PDB file from the SQL database:

In [47]: db.exportpdb('./pdb/test.pdb')

# show the test.pdb file
In [48]: ls ./pdb
1AK4_10w.pdb  3CRO.pdb	 dummy.pdb	      ref.pdb
1AK4_5w.pdb   decoy.pdb  dummy_transform.pdb  test.pdb

Interface calculation

Create an interface SQL database instance:

In [49]: from pdb2sql import interface

# use pdb2sql instance as input
In [50]: from pdb2sql import pdb2sql

In [51]: pdb_db = pdb2sql('./pdb/3CRO.pdb')

In [52]: db = interface(pdb_db)

# or use pdb file as input
In [53]: db = interface('./pdb/3CRO.pdb')

Interface atoms

In [54]: itf_atom = db.get_contact_atoms(cutoff = 3)

In [55]: itf_atom_pair = db.get_contact_atoms(cutoff = 3, return_contact_pairs=True)

In [56]: print("interface atom:\n", itf_atom)
interface atom:
 {'A': [13, 33, 34, 35, 55, 56, 58, 75, 97, 98, 116, 118, 137, 138, 158, 179, 180, 200, 215, 216, 219, 239, 256, 257, 258, 259, 276, 277, 279, 294, 295, 298, 316, 318, 338, 358, 360, 380, 401], 'B': [438, 459, 479, 480, 500, 519, 520, 540, 561, 562, 564, 583, 584, 585, 604, 606, 625, 646, 648, 649, 666, 668, 688, 706, 708, 728, 729, 748, 750, 770, 771, 788, 789, 791, 808, 810]}

In [57]: print("interface atom pairs:\n", itf_atom_pair)
interface atom pairs:
 {13: [810], 33: [810], 34: [791, 810], 35: [808, 810], 55: [770, 791], 56: [789], 58: [788], 75: [771], 97: [750], 98: [748], 116: [728, 729], 118: [728], 137: [708], 138: [706], 158: [688], 179: [668], 180: [666], 200: [646], 215: [649], 216: [648, 649], 219: [625], 239: [604], 256: [606], 257: [583, 584], 258: [583], 259: [583], 276: [585], 277: [562], 279: [561], 294: [564], 295: [564], 298: [519, 540], 316: [520], 318: [519, 520], 338: [500], 358: [480], 360: [479, 480], 380: [459], 401: [438]}
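The idea behind a distance cutoff can be sketched with a brute-force search: an atom of one chain is in contact with an atom of the other chain when the two are no farther apart than the cutoff. The coordinates below are made up, the helper `contact_pairs` is hypothetical, and treating the cutoff as inclusive is an assumption; pdb2sql's own algorithm may differ.

```python
import math

# Illustrative coordinates only; they are not taken from 3CRO.
atoms = [
    # (serial, chainID, (x, y, z))
    (1, "A", (0.0, 0.0, 0.0)),
    (2, "A", (10.0, 0.0, 0.0)),
    (3, "B", (0.0, 2.5, 0.0)),
    (4, "B", (1.0, 2.0, 0.0)),
    (5, "B", (20.0, 0.0, 0.0)),
]


def contact_pairs(atoms, chain1, chain2, cutoff):
    """Map each chain1 atom serial to the chain2 serials within cutoff."""
    first = [a for a in atoms if a[1] == chain1]
    second = [a for a in atoms if a[1] == chain2]
    pairs = {}
    for serial1, _, xyz1 in first:
        close = [
            serial2
            for serial2, _, xyz2 in second
            if math.dist(xyz1, xyz2) <= cutoff
        ]
        if close:
            pairs[serial1] = close
    return pairs


pairs = contact_pairs(atoms, "A", "B", cutoff=3.0)
# Collapse the pairs into one sorted list of contact atoms per chain.
contact_atoms = {
    "A": sorted(pairs),
    "B": sorted({s for partners in pairs.values() for s in partners}),
}
print(pairs)
print(contact_atoms)
```

The two results have the same shapes as itf_atom_pair and itf_atom above: a mapping from atom to partner atoms, and a per-chain list of contact atoms.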

Interface residues

In [58]: itf_residue = db.get_contact_residues(cutoff = 3)

In [59]: itf_residue_pair = db.get_contact_residues(cutoff = 3, return_contact_pairs=True)

In [60]: itf_residue
Out[60]: 
{'A': [('A', 1, 'DA'),
  ('A', 2, 'DA'),
  ('A', 3, 'DG'),
  ('A', 4, 'DT'),
  ('A', 5, 'DA'),
  ('A', 6, 'DC'),
  ('A', 7, 'DA'),
  ('A', 8, 'DA'),
  ('A', 9, 'DA'),
  ('A', 10, 'DC'),
  ('A', 11, 'DT'),
  ('A', 12, 'DT'),
  ('A', 13, 'DT'),
  ('A', 14, 'DC'),
  ('A', 15, 'DT'),
  ('A', 16, 'DT'),
  ('A', 17, 'DG'),
  ('A', 18, 'DT'),
  ('A', 19, 'DA'),
  ('A', 20, 'DT')],
 'B': [('B', 2, 'DA'),
  ('B', 3, 'DT'),
  ('B', 4, 'DA'),
  ('B', 5, 'DC'),
  ('B', 6, 'DA'),
  ('B', 7, 'DA'),
  ('B', 8, 'DG'),
  ('B', 9, 'DA'),
  ('B', 10, 'DA'),
  ('B', 11, 'DA'),
  ('B', 12, 'DG'),
  ('B', 13, 'DT'),
  ('B', 14, 'DT'),
  ('B', 15, 'DT'),
  ('B', 16, 'DG'),
  ('B', 17, 'DT'),
  ('B', 18, 'DA'),
  ('B', 19, 'DC'),
  ('B', 20, 'DT')]}

In [61]: itf_residue_pair
Out[61]: 
{('A', 1, 'DA'): [('B', 20, 'DT')],
 ('A', 2, 'DA'): [('B', 19, 'DC'), ('B', 20, 'DT')],
 ('A', 3, 'DG'): [('B', 18, 'DA'), ('B', 19, 'DC')],
 ('A', 4, 'DT'): [('B', 18, 'DA')],
 ('A', 5, 'DA'): [('B', 17, 'DT')],
 ('A', 6, 'DC'): [('B', 16, 'DG')],
 ('A', 7, 'DA'): [('B', 15, 'DT')],
 ('A', 8, 'DA'): [('B', 14, 'DT')],
 ('A', 9, 'DA'): [('B', 13, 'DT')],
 ('A', 10, 'DC'): [('B', 12, 'DG')],
 ('A', 11, 'DT'): [('B', 11, 'DA'), ('B', 12, 'DG')],
 ('A', 12, 'DT'): [('B', 10, 'DA')],
 ('A', 13, 'DT'): [('B', 9, 'DA'), ('B', 10, 'DA')],
 ('A', 14, 'DC'): [('B', 8, 'DG'), ('B', 9, 'DA')],
 ('A', 15, 'DT'): [('B', 6, 'DA'), ('B', 7, 'DA'), ('B', 8, 'DG')],
 ('A', 16, 'DT'): [('B', 6, 'DA')],
 ('A', 17, 'DG'): [('B', 5, 'DC')],
 ('A', 18, 'DT'): [('B', 4, 'DA')],
 ('A', 19, 'DA'): [('B', 3, 'DT')],
 ('A', 20, 'DT'): [('B', 2, 'DA')]}
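
Residue contacts can be thought of as atom contacts collapsed onto the (chainID, resSeq, resName) tuples shown above. The sketch below assumes that view; the function residue_pairs, the residue_of lookup, and the toy data are hypothetical and are not part of pdb2sql.

```python
def residue_pairs(atom_pairs, residue_of):
    """Collapse atom-level contacts into residue-level contacts."""
    pairs = {}
    for serial_a, partners in atom_pairs.items():
        found = pairs.setdefault(residue_of[serial_a], set())
        found.update(residue_of[serial_b] for serial_b in partners)
    # Sort for a stable, readable result.
    return {res: sorted(partners) for res, partners in sorted(pairs.items())}


# Hypothetical lookup: atom serial -> (chainID, resSeq, resName)
residue_of = {
    1: ("A", 1, "DA"), 2: ("A", 1, "DA"), 3: ("A", 2, "DA"),
    101: ("B", 19, "DC"), 102: ("B", 20, "DT"),
}
# Hypothetical atom-level contacts in the same shape as itf_atom_pair.
atom_pairs = {1: [102], 2: [102], 3: [101, 102]}

result = residue_pairs(atom_pairs, residue_of)
print(result)
```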

Structure superposition

pdb2sql can superpose two structures, using either the full structure or a subset chosen with selection keywords. For example, to superpose chain A of two PDB structures:

In [62]: from pdb2sql import superpose

In [63]: ref = pdb2sql('./pdb/1AK4_5w.pdb')

In [64]: decoy = pdb2sql('./pdb/1AK4_10w.pdb')

In [65]: superposed_decoy = superpose(decoy, ref, chainID='A', export=True)

This exports a new PDB file containing the decoy structure superposed onto the reference.

Structure alignment

pdb2sql can align a structure along a specific axis:

In [66]: from pdb2sql import align

In [67]: db = pdb2sql('./pdb/1AK4_10w.pdb')

In [68]: aligned_db = align(db, axis='z', export=True)

The alignment can also consider only part of the complex, using the selection keywords:

In [69]: aligned_db = align(db, axis='z', chainID='A')

Here chain A is aligned along the z-axis.

This creates a new PDB file containing the structure aligned along the z-axis. It is also possible to align an interface in a given plane:

In [70]: from pdb2sql import align_interface

In [71]: db = pdb2sql('./pdb/3CRO.pdb')

In [72]: aligned_db = align_interface(db, plane='xy', export=True)

By default, the interface formed by chains A and B is considered. If the structure contains multiple chains, you can specify which interface to consider:

In [73]: aligned_db = align_interface(db, plane='xy', chain1='L', chain2='R')

Here the interface between chains L and R is considered. Note that any other selection keyword can also be used to specify the interface.

Structure similarity calculation

Create a StructureSimilarity instance:

In [74]: from pdb2sql.StructureSimilarity import StructureSimilarity

In [75]: sim = StructureSimilarity('./pdb/decoy.pdb', './pdb/ref.pdb')

Interface RMSD

In [76]: irmsd_fast = sim.compute_irmsd_fast()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-76-0ffc09c59f8a> in <module>
----> 1 irmsd_fast = sim.compute_irmsd_fast()

~/checkouts/readthedocs.org/user_builds/pdb2sql/envs/latest/lib/python3.7/site-packages/pdb2sql/StructureSimilarity.py in compute_irmsd_fast(self, izone, method, cutoff, check)
    307         if check or self.enforce_residue_matching:
    308 
--> 309             self.check_residues()
    310 
    311             data_decoy = self.get_data_zone_backbone(

~/checkouts/readthedocs.org/user_builds/pdb2sql/envs/latest/lib/python3.7/site-packages/pdb2sql/StructureSimilarity.py in check_residues(self, **kwargs)
    100                 if self.enforce_residue_matching == True:
    101                     raise ValueError(
--> 102                         'Atoms not identical in ref and decoy.\n Set enforce_residue_matching=False to bypass this error.')
    103                 else:
    104                     warnings.warn('Atoms not identical in ref and decoy.')

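ValueError: Atoms not identical in ref and decoy.
 Set enforce_residue_matching=False to bypass this error.

The fast method fails here because the atoms in decoy.pdb and ref.pdb are not identical, so irmsd_fast is never assigned. This is why the later cells that use irmsd_fast raise NameError.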

In [77]: irmsd_pdb2sql = sim.compute_irmsd_pdb2sql()

In [78]: irmsd_fast
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-78-d58ceea2462b> in <module>
----> 1 irmsd_fast

NameError: name 'irmsd_fast' is not defined

In [79]: irmsd_pdb2sql
Out[79]: 1.135

Ligand RMSD

In [80]: lrmsd_fast = sim.compute_lrmsd_fast()

In [81]: lrmsd_pdb2sql = sim.compute_lrmsd_pdb2sql()

In [82]: lrmsd_fast
Out[82]: 6.655

In [83]: lrmsd_pdb2sql
Out[83]: 6.655
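
Both the interface and the ligand RMSD are root-mean-square deviations over matched atom pairs; in the usual CAPRI definitions they differ in which atoms are compared and which part of the complex is superposed first. The sketch below shows only the final RMSD step on coordinates that are assumed to be already superposed and matched one to one. The function rmsd and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical toy data: every atom of the decoy is shifted by 1 angstrom in z.
ref_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
decoy_xyz = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

value = rmsd(decoy_xyz, ref_xyz)
print(value)
```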

FNAT

Calculate the fraction of native contacts:

In [84]: fnat_fast = sim.compute_fnat_fast()

In [85]: fnat_pdb2sql = sim.compute_fnat_pdb2sql()

In [86]: fnat_fast
Out[86]: 0.790698

In [87]: fnat_pdb2sql
Out[87]: 0.790698
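
The fraction of native contacts is the share of reference (native) contacts that are also present in the decoy. A minimal sketch of that ratio follows, assuming contacts are represented as hashable residue pairs; the function fraction_native_contacts and the toy contact sets are hypothetical and do not reproduce how pdb2sql detects contacts.

```python
def fraction_native_contacts(native, decoy):
    """Share of native contacts that are reproduced in the decoy."""
    native, decoy = set(native), set(decoy)
    if not native:
        raise ValueError("the reference has no contacts")
    return len(native & decoy) / len(native)


# Hypothetical residue-residue contacts, written as (chain A resSeq, chain B resSeq).
native_contacts = {(1, 20), (2, 19), (2, 20), (3, 18)}
decoy_contacts = {(1, 20), (2, 19), (3, 18), (5, 17)}

fnat = fraction_native_contacts(native_contacts, decoy_contacts)
print(fnat)
```

Contacts that appear only in the decoy, such as (5, 17) here, do not change the value because the denominator counts native contacts only.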

DockQ score

In [88]: dockQ = sim.compute_DockQScore(fnat_fast, lrmsd_fast, irmsd_fast)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-88-a18b2f03db3a> in <module>
----> 1 dockQ = sim.compute_DockQScore(fnat_fast, lrmsd_fast, irmsd_fast)

NameError: name 'irmsd_fast' is not defined

In [89]: dockQ
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-89-2012a35270fa> in <module>
----> 1 dockQ

NameError: name 'dockQ' is not defined
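
For reference, here is a minimal sketch of the published DockQ formula, assuming compute_DockQScore follows it with the usual scaling constants d1 = 8.5 Å for the ligand RMSD and d2 = 1.5 Å for the interface RMSD. The function name dockq is hypothetical. The inputs are the values printed earlier in this session, with the interface RMSD taken from compute_irmsd_pdb2sql because the fast variant did not return a value.

```python
def dockq(fnat, lrmsd, irmsd, d1=8.5, d2=1.5):
    """DockQ = mean of Fnat and the two scaled RMSD terms (assumed formula)."""
    def scaled(rms, d):
        # Maps an RMSD in [0, inf) to a score in (0, 1]; 1 means no deviation.
        return 1.0 / (1.0 + (rms / d) ** 2)

    return (fnat + scaled(lrmsd, d1) + scaled(irmsd, d2)) / 3.0


# Values from the outputs above (irmsd from the pdb2sql variant).
score = dockq(fnat=0.790698, lrmsd=6.655, irmsd=1.135)
print(round(score, 3))
```

The score lies between 0 and 1, and a model that reproduces every native contact with zero RMSD scores exactly 1.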

Structure transformation

Create SQL instance:

In [90]: from pdb2sql import transform

In [91]: db = pdb2sql('./pdb/dummy_transform.pdb')

The atom coordinates are:

In [92]: db.get_xyz()
Out[92]: 
[[1.0, 0.0, 0.0],
 [-1.0, 0.0, 0.0],
 [0.0, 1.0, 0.0],
 [0.0, -1.0, 0.0],
 [0.0, 0.0, 1.0],
 [0.0, 0.0, -1.0]]

Rotations

Rotate the structure 180 degrees about the x-axis (np refers to NumPy, imported with import numpy as np):

In [93]: angle = np.pi

In [94]: axis = (1., 0., 0.)

In [95]: transform.rot_axis(db, axis, angle)

In [96]: db.get_xyz()
Out[96]: 
[[1.0, 0.0, 0.0],
 [-1.0, 0.0, 0.0],
 [0.0, -1.0, 1.2246467991473532e-16],
 [0.0, 1.0, -1.2246467991473532e-16],
 [0.0, -1.2246467991473532e-16, -1.0],
 [0.0, 1.2246467991473532e-16, 1.0]]
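
The values of about 1.2e-16 in this output are floating-point residue rather than real displacements: sin(pi) evaluates to 1.2246467991473532e-16 in double precision instead of exactly zero. The sketch below reproduces the rotation with Rodrigues' formula in plain Python. It is an illustration only: rotate_about_axis is a hypothetical helper that rotates about the origin, which coincides with the centroid of these six points, and it is not pdb2sql's implementation.

```python
import math


def rotate_about_axis(point, axis, angle):
    """Rotate `point` about `axis` (through the origin) by `angle` radians."""
    norm = math.sqrt(sum(c * c for c in axis))
    kx, ky, kz = (c / norm for c in axis)  # unit rotation axis k
    x, y, z = point
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    dot = kx * x + ky * y + kz * z  # k . v
    cross = (ky * z - kz * y, kz * x - kx * z, kx * y - ky * x)  # k x v
    # Rodrigues: v cos(a) + (k x v) sin(a) + k (k . v) (1 - cos(a))
    return tuple(
        v * cos_a + c * sin_a + k * dot * (1.0 - cos_a)
        for v, c, k in zip((x, y, z), cross, (kx, ky, kz))
    )


points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
          (0.0, -1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
rotated = [rotate_about_axis(p, (1.0, 0.0, 0.0), math.pi) for p in points]
for xyz in rotated:
    print(xyz)
```

Points on the rotation axis stay where they are, while the y and z coordinates of the other points change sign.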

Get random rotation axis and angle:

In [97]: axis, angle = transform.get_rot_axis_angle()

In [98]: axis
Out[98]: [-0.06788982155971889, 0.7589085252919632, -0.6476486874593103]

In [99]: angle
Out[99]: 0.06654169272612691
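
The returned axis is a unit vector and the angle is in radians. One common way to draw such a pair is sketched below; the sampling scheme (uniform direction on the sphere, uniform angle in [0, 2π)) and the function name random_axis_angle are assumptions for illustration and may not match what get_rot_axis_angle does internally.

```python
import math
import random


def random_axis_angle(rng=random):
    """Draw a random unit axis and a rotation angle in radians."""
    z = rng.uniform(-1.0, 1.0)            # uniform height on the unit sphere
    phi = rng.uniform(0.0, 2.0 * math.pi)  # uniform azimuth
    r = math.sqrt(1.0 - z * z)             # radius of the circle at height z
    axis = [r * math.cos(phi), r * math.sin(phi), z]
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return axis, angle


# A seeded generator makes the draw reproducible.
axis, angle = random_axis_angle(random.Random(0))
print(axis, angle)
```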

Translations

Translate the structure 5 Å along the y-axis:

In [100]: trans_vec = np.array([0,5,0])

In [101]: transform.translation(db, trans_vec)

In [102]: db.get_xyz()
Out[102]: 
[[1.0, 5.0, 0.0],
 [-1.0, 5.0, 0.0],
 [0.0, 4.0, 1.2246467991473532e-16],
 [0.0, 6.0, -1.2246467991473532e-16],
 [0.0, 5.0, -1.0],
 [0.0, 5.0, 1.0]]
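
The output shows that the translation acts on the already rotated coordinates: the tiny z residues from the rotation are still present, while the y residues disappear because they are smaller than the floating-point spacing around 5.0. A minimal sketch of the same arithmetic in plain Python, where translate is a hypothetical helper and not the pdb2sql function:

```python
def translate(coords, vector):
    """Return a copy of `coords` with `vector` added to every point."""
    return [[c + v for c, v in zip(point, vector)] for point in coords]


# Two of the rotated points from the previous step.
rotated = [
    [0.0, -1.0, 1.2246467991473532e-16],
    [0.0, -1.2246467991473532e-16, -1.0],
]
moved = translate(rotated, [0, 5, 0])
print(moved)
```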