Skip to content

Commit cb41a3b

Browse files
authored
add unambig-ti subcommand (#32)
* add unambig_ti stub * add `unambig-ti` subcommand * add test data * cleaning * update docs * cleaning * bump version * update `README.md`
1 parent d035a01 commit cb41a3b

File tree

8 files changed

+349
-134
lines changed

8 files changed

+349
-134
lines changed

Cargo.lock

Lines changed: 92 additions & 121 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
[package]
22
name = "haddock-restraints"
3-
version = "0.6.2"
3+
version = "0.7.0"
44
edition = "2021"
55
description = "Generate restraints to be used in HADDOCK"
66
license = "MIT"

README.md

Lines changed: 11 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,7 @@ A standalone command-line application to generate restraints to be used in HADDO
1818
- [`restraint`: Generate Unambiguous restraints to keep molecules together during docking](https://www.bonvinlab.org/haddock-restraints/restraint.html)
1919
- [`interface`: List residues in the interface](https://www.bonvinlab.org/haddock-restraints/interface.html)
2020
- [`z`: Generate Z-restraints for a protein](https://www.bonvinlab.org/haddock-restraints/z.html)
21+
- [`unambig-ti`: Generate unambiguous true-interface restraints from a PDB file](https://www.bonvinlab.org/haddock-restraints/unambig-ti.html)
2122

2223
## Usage
2324

@@ -31,9 +32,9 @@ OR
3132

3233
- Install it with [`cargo`](https://www.rust-lang.org/tools/install)
3334

34-
```bash
35-
cargo install haddock-restraints
36-
```
35+
```bash
36+
cargo install haddock-restraints
37+
```
3738

3839
## Execute
3940

@@ -44,12 +45,13 @@ Generate restraints to be used in HADDOCK
4445
Usage: haddock-restraints <COMMAND>
4546

4647
Commands:
47-
tbl Generate TBL file from input file
48-
ti Generate true-interface restraints from a PDB file
49-
restraint Generate Unambiguous restraints to keep molecules together during docking
50-
interface List residues in the interface
51-
z Generate Z-restraints for a protein
52-
help Print this message or the help of the given subcommand(s)
48+
tbl Generate TBL file from input file
49+
ti Generate true-interface restraints from a PDB file
50+
unambig-ti Generate unambiguous true-interface restraints from a PDB file
51+
restraint Generate unambiguous restraints to keep molecules together during docking
52+
interface List residues in the interface
53+
z Generate Z-restraints for a protein
54+
help Print this message or the help of the given subcommand(s)
5355

5456
Options:
5557
-h, --help Print help

docs/src/unambig-ti.md

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
# Generate _unambiguous true-interface_ restraints from a PDB file
2+
3+
This is a more strict version of the true-interface restraints that assigns
4+
specific pairs to each residue in the interface.
5+
6+
For each atom in each of the chains, the algorithm defines the pairs based on
7+
atomic interactions between any atom of a given residue in the interface and
8+
any other atom in an interacting chain. The restraints are later defined
9+
considering the CA-CA distances of such pairs; thus allowing for
10+
side-chain flexibility.
11+
12+
## Usage
13+
14+
To run the `unambig-ti` subcommand, you just need to provide the path to the
15+
PDB file and the cutoff distance. For example:
16+
17+
```bash
18+
haddock-restraints unambig-ti path/to/complex.pdb 5.0 > unambig.tbl
19+
```

docs/src/usage.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
# Usage
22

3-
`haddock-restraints` is a command-line tool that is organized with subcommands. Each subcommand has its own set of options and arguments. The main subcommands are:
3+
`haddock-restraints` is a command-line tool that is organized with subcommands.
4+
Each subcommand has its own set of options and arguments. The main subcommands are:
45

56
- [`tbl`: Generate .tbl file from a configuration file](./tbl.md)
67

@@ -11,3 +12,5 @@
1112
- [`interface`: List residues in the interface](./interface.md)
1213

1314
- [`z`: Generate restraints to keep the molecule aligned to the Z-axis](./z.md)
15+
16+
- [`unambig-ti`: Generate unambiguous true-interface restraints](./unambig-ti.md)

src/main.rs

Lines changed: 88 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,14 @@ enum Commands {
3333
#[arg(help = "Cutoff distance for interface residues")]
3434
cutoff: f64,
3535
},
36-
#[command(about = "Generate Unambiguous restraints to keep molecules together during docking")]
36+
#[command(about = "Generate unambiguous true-interface restraints from a PDB file")]
37+
UnambigTi {
38+
#[arg(help = "PDB file")]
39+
input: String,
40+
#[arg(help = "Cutoff distance for interface residues")]
41+
cutoff: f64,
42+
},
43+
#[command(about = "Generate unambiguous restraints to keep molecules together during docking")]
3744
Restraint {
3845
#[arg(help = "PDB file")]
3946
input: String,
@@ -45,7 +52,6 @@ enum Commands {
4552
#[arg(help = "Cutoff distance for interface residues")]
4653
cutoff: f64,
4754
},
48-
4955
#[command(about = "Generate Z-restraints for a protein")]
5056
Z {
5157
#[arg(required = true, help = "Input file")]
@@ -89,6 +95,9 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
8995
Commands::Ti { input, cutoff } => {
9096
let _ = true_interface(input, cutoff);
9197
}
98+
Commands::UnambigTi { input, cutoff } => {
99+
let _ = unambig_ti(input, cutoff);
100+
}
92101
Commands::Restraint { input } => {
93102
let _ = restraint_bodies(input);
94103
}
@@ -172,6 +181,74 @@ fn gen_tbl(input_file: &str) {
172181
println!("{}", tbl);
173182
}
174183

184+
/// Generates Unambiguous Topological Interactions (TIs) from a protein structure.
185+
///
186+
/// This function reads a PDB file, identifies the closest residue pairs based on a specified distance cutoff,
187+
/// and creates unambiguous interactors for each residue pair.
188+
///
189+
/// # Arguments
190+
///
191+
/// * input_file - A string slice that holds the path to the input PDB file.
192+
/// * cutoff - A reference to a f64 value specifying the distance cutoff (in Angstroms) for determining interactions.
193+
///
194+
/// # Returns
195+
///
196+
/// A Result<String, Box<dyn Error>> containing the generated TBL (Topological Restraints List) if successful.
197+
///
198+
/// # Functionality
199+
///
200+
/// 1. Loads the PDB file using the structure::load_pdb function.
201+
/// 2. Finds the closest residue pairs within the specified distance cutoff.
202+
/// 3. Creates Interactor instances for each residue pair.
203+
/// 4. Assigns chains, active/passive residues, and atoms to the interactors.
204+
/// 5. Generates the Topological Restraints List (TBL) using the created interactors.
205+
/// 6. Prints the generated TBL to stdout.
206+
///
207+
/// # Panics
208+
///
209+
/// This function will panic if:
210+
/// - The PDB file cannot be opened or parsed.
211+
fn unambig_ti(input_file: &str, cutoff: &f64) -> Result<String, Box<dyn Error>> {
212+
let pdb = match structure::load_pdb(input_file) {
213+
Ok(pdb) => pdb,
214+
Err(e) => {
215+
panic!("Error opening PDB file: {:?}", e);
216+
}
217+
};
218+
let pairs = structure::get_closest_residue_pairs(&pdb, *cutoff);
219+
220+
let mut interactors: Vec<Interactor> = Vec::new();
221+
let mut counter = 0;
222+
pairs.iter().for_each(|g| {
223+
let mut interactor_i = Interactor::new(counter);
224+
counter += 1;
225+
let mut interactor_j = Interactor::new(counter);
226+
interactor_j.add_target(counter - 1);
227+
interactor_i.add_target(counter);
228+
counter += 1;
229+
230+
interactor_i.set_chain(g.chain_i.as_str());
231+
interactor_i.set_active(vec![g.res_i as i16]);
232+
interactor_i.set_active_atoms(vec![g.atom_i.clone()]);
233+
interactor_i.set_target_distance(g.distance);
234+
235+
interactor_j.set_chain(g.chain_j.as_str());
236+
interactor_j.set_passive(vec![g.res_j as i16]);
237+
interactor_j.set_passive_atoms(vec![g.atom_j.clone()]);
238+
239+
interactors.push(interactor_i);
240+
interactors.push(interactor_j);
241+
});
242+
243+
// Make the restraints
244+
let air = Air::new(interactors);
245+
let tbl = air.gen_tbl().unwrap();
246+
247+
println!("{}", tbl);
248+
249+
Ok(tbl)
250+
}
251+
175252
/// Analyzes the true interface of a protein structure and generates Ambiguous Interaction Restraints (AIRs).
176253
///
177254
/// This function reads a PDB file, identifies the true interface between chains based on a distance cutoff,
@@ -645,4 +722,13 @@ assign ( resid 2 and segid A and name CA ) ( resid 8 and segid A and name CA ) 1
645722
Err(_e) => (),
646723
}
647724
}
725+
#[test]
726+
fn test_unambigti() {
727+
let expected_tbl = "assign ( resid 2 and segid A and name CA ) ( resid 10 and segid B and name CA ) 9.1 2.0 0.0\n\n";
728+
729+
match unambig_ti("tests/data/two_res.pdb", &5.0) {
730+
Ok(tbl) => assert_eq!(tbl, expected_tbl),
731+
Err(_e) => (),
732+
}
733+
}
648734
}

src/structure.rs

Lines changed: 116 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,17 @@ pub struct Bead {
1818
pub position: Vector3<f64>,
1919
}
2020

21+
#[derive(Debug, Clone)]
22+
pub struct Pair {
23+
pub chain_i: String,
24+
pub res_i: isize,
25+
pub atom_i: String,
26+
pub chain_j: String,
27+
pub res_j: isize,
28+
pub atom_j: String,
29+
pub distance: f64,
30+
}
31+
2132
/// Performs a neighbor search for residues within a specified radius of target residues.
2233
///
2334
/// This function uses a K-d tree to efficiently find residues that are within a given radius
@@ -623,6 +634,88 @@ pub fn process_pdb(input_pdb: &str) -> BufReader<Cursor<Vec<u8>>> {
623634
pdb_handler::pad_lines(temp_path)
624635
}
625636

637+
/// Finds the closest residue pairs between different protein chains within a specified distance cutoff.
638+
///
639+
/// This function analyzes inter-chain interactions by identifying the closest atom pairs between residues
640+
/// from different chains, using the alpha carbon (CA) atoms for distance calculation.
641+
///
642+
/// # Arguments
643+
///
644+
/// * pdb - A reference to a parsed PDB structure.
645+
/// * cutoff - A floating-point value specifying the maximum distance (in Angstroms) for considering residue pairs.
646+
///
647+
/// # Returns
648+
///
649+
/// A Vec<Pair> containing the closest residue pairs within the specified cutoff, sorted by distance.
650+
///
651+
/// # Functionality
652+
///
653+
/// 1. Iterates through all unique chain pairs in the PDB structure.
654+
/// 2. For each chain pair, finds the closest atoms between residues.
655+
/// 3. Selects pairs where the inter-residue distance is less than the specified cutoff.
656+
/// 4. Uses alpha carbon (CA) atoms for distance calculation and pair representation.
657+
/// 5. Sorts the resulting pairs by their inter-alpha carbon distance.
658+
///
659+
/// # Notes
660+
///
661+
/// - Only inter-chain interactions are considered.
662+
/// - The distance is calculated based on the closest atom pairs between residues.
663+
/// - The resulting pairs are sorted from shortest to longest distance.
664+
pub fn get_closest_residue_pairs(pdb: &pdbtbx::PDB, cutoff: f64) -> Vec<Pair> {
665+
let mut closest_pairs = Vec::new();
666+
667+
let chains: Vec<_> = pdb.chains().collect();
668+
669+
for i in 0..chains.len() {
670+
for j in (i + 1)..chains.len() {
671+
let chain_i = &chains[i];
672+
let chain_j = &chains[j];
673+
674+
for res_i in chain_i.residues() {
675+
let mut min_distance = f64::MAX;
676+
let mut closest_pair = None;
677+
678+
for res_j in chain_j.residues() {
679+
let mut atom_dist = f64::MAX;
680+
for atom_i in res_i.atoms() {
681+
for atom_j in res_j.atoms() {
682+
let distance = atom_i.distance(atom_j);
683+
if distance < atom_dist {
684+
atom_dist = distance;
685+
}
686+
}
687+
}
688+
if atom_dist < min_distance {
689+
min_distance = atom_dist;
690+
closest_pair = Some((res_j, res_i));
691+
}
692+
}
693+
694+
if min_distance < cutoff {
695+
if let Some((res_j, res_i)) = closest_pair {
696+
let ca_i = res_i.atoms().find(|atom| atom.name() == "CA").unwrap();
697+
let ca_j = res_j.atoms().find(|atom| atom.name() == "CA").unwrap();
698+
let ca_ca_dist = ca_i.distance(ca_j);
699+
closest_pairs.push(Pair {
700+
chain_i: chain_i.id().to_string(),
701+
res_i: res_i.serial_number(),
702+
atom_i: ca_i.name().to_string(),
703+
chain_j: chain_j.id().to_string(),
704+
res_j: res_j.serial_number(),
705+
atom_j: ca_j.name().to_string(),
706+
distance: ca_ca_dist,
707+
});
708+
}
709+
}
710+
}
711+
}
712+
}
713+
714+
closest_pairs.sort_by(|a, b| a.distance.partial_cmp(&b.distance).unwrap());
715+
716+
closest_pairs
717+
}
718+
626719
#[cfg(test)]
627720
mod tests {
628721

@@ -686,4 +779,27 @@ mod tests {
686779
.filter(|line| line.starts_with("ATOM"))
687780
.all(|line| line.len() == 80));
688781
}
782+
#[test]
783+
fn test_get_closest_residue_pairs() {
784+
let pdb = load_pdb("tests/data/two_res.pdb").unwrap();
785+
let observed_pairs = get_closest_residue_pairs(&pdb, 5.0);
786+
let expected_pairs = vec![Pair {
787+
chain_i: "A".to_string(),
788+
chain_j: "B".to_string(),
789+
atom_i: "CA".to_string(),
790+
atom_j: "CA".to_string(),
791+
res_i: 2,
792+
res_j: 10,
793+
distance: 9.1,
794+
}];
795+
assert_eq!(observed_pairs.len(), expected_pairs.len());
796+
let pair = &observed_pairs[0];
797+
assert_eq!(pair.chain_i, expected_pairs[0].chain_i);
798+
assert_eq!(pair.chain_j, expected_pairs[0].chain_j);
799+
assert_eq!(pair.atom_i, expected_pairs[0].atom_i);
800+
assert_eq!(pair.atom_j, expected_pairs[0].atom_j);
801+
assert_eq!(pair.res_i, expected_pairs[0].res_i);
802+
assert_eq!(pair.res_j, expected_pairs[0].res_j);
803+
assert!((pair.distance - 9.1).abs() < 0.1);
804+
}
689805
}

tests/data/two_res.pdb

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
ATOM 8 N THR A 2 15.115 11.555 5.265 1.00 7.81 N
2+
ATOM 9 CA THR A 2 13.856 11.469 6.066 1.00 8.31 C
3+
ATOM 10 C THR A 2 14.164 10.785 7.379 1.00 5.80 C
4+
ATOM 11 O THR A 2 14.993 9.862 7.443 1.00 6.94 O
5+
ATOM 12 CB THR A 2 12.732 10.711 5.261 1.00 10.32 C
6+
ATOM 13 CG2 THR A 2 12.484 11.442 3.895 1.00 11.90 C
7+
ATOM 14 OG1 THR A 2 13.308 9.439 4.926 1.00 12.81 O
8+
ATOM 47 N ARG B 10 7.647 4.909 10.005 1.00 3.73 N
9+
ATOM 48 CA ARG B 10 8.496 4.609 8.837 1.00 3.38 C
10+
ATOM 49 C ARG B 10 7.798 3.609 7.876 1.00 3.47 C
11+
ATOM 50 O ARG B 10 7.878 3.778 6.651 1.00 4.67 O
12+
ATOM 51 CB ARG B 10 9.847 4.020 9.305 1.00 3.95 C
13+
ATOM 52 CG ARG B 10 10.752 3.607 8.149 1.00 4.55 C
14+
ATOM 53 CD ARG B 10 11.226 4.699 7.244 1.00 5.89 C
15+
ATOM 54 NE ARG B 10 12.143 5.571 8.035 1.00 6.20 N
16+
ATOM 55 CZ ARG B 10 12.758 6.609 7.443 1.00 7.52 C
17+
ATOM 56 NH1 ARG B 10 12.539 6.932 6.158 1.00 10.68 N1+
18+
ATOM 57 NH2 ARG B 10 13.601 7.322 8.202 1.00 9.48 N

0 commit comments

Comments
 (0)