This repository provides a pipeline for classifying methyltransferases (MTases) using hidden Markov models (HMMs). The workflow has four steps: installing the required packages, running an HMMer search, detecting regions in the resulting alignment, and classifying the detected regions.
Before running the pipeline, you will need to install the required packages. Execute the following commands in your Colab Notebook:
!git clone https://github.com/MVolobueva/MTase-classification.git
!sudo apt-get -y install hmmer
!git clone https://github.com/isrusin/etsv
!python3 -m pip install -e etsv
The visualization of the steps in the pipeline is shown in the image below:
Conduct the HMMer search to identify Methyltransferase sequences. Use the following command:
!hmmsearch --cpu 3 -E 0.01 --domE 0.01 --incE 0.01 --incdomE 0.01 \
-o /dev/null --noali -A file.stk \
/content/MTase-classification/HMM_profiles/selected_profiles.hmm /content/MTase-classification/Sample_MTases/MTase_sequences.fasta
After running the HMMer search, the next step is to detect regions in the alignment. Use this command:
!./MTase-classification/Scripts/get_aln_regions.py \
/content/MTase-classification/profile-markup/All_profile_region.csv \
/content/file.stk > region_alignments.tsv
Finally, perform the classification of the detected regions with the following command:
!python ./MTase-classification/Scripts/classification.py \
--t /content/region_alignments.tsv \
--m several_cat_domains.tsv \
--c class.tsv
To illustrate the workings of the pipeline, we have developed a web application. You can access it at the following link:
MTase Classification Web Application
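When running the pipeline locally rather than through the web application, it can help to inspect the intermediate alignment written by the HMMer search (`file.stk`) before passing it to region detection. The following is a minimal sketch, not part of the repository: the function name `summarize_stockholm` is hypothetical, and it assumes the file is in Stockholm format with each alignment starting with a `# STOCKHOLM` header and ending with `//`. Whether a `#=GF ID` line naming the profile is present is also an assumption; the sketch reports `None` when it is missing.

```python
def summarize_stockholm(lines):
    """Return a list of (alignment_id, n_sequences) for Stockholm-format text.

    `lines` is any iterable of strings, for example an open file handle.
    alignment_id is taken from a '#=GF ID' line when one is present,
    otherwise it is None.
    """
    summaries = []
    current_id = None
    names = set()
    in_alignment = False

    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("# STOCKHOLM"):
            # Header line: start collecting a new alignment.
            in_alignment = True
            current_id = None
            names = set()
        elif line.strip() == "//":
            # Terminator line: close the current alignment.
            if in_alignment:
                summaries.append((current_id, len(names)))
            in_alignment = False
        elif line.startswith("#"):
            # Markup line; only the optional alignment ID is recorded.
            parts = line.split(None, 2)
            if len(parts) == 3 and parts[0] == "#=GF" and parts[1] == "ID":
                current_id = parts[2].strip()
        elif in_alignment and line.strip():
            # Sequence line: "<name> <aligned residues>". A set is used
            # because interleaved files repeat each name once per block.
            names.add(line.split()[0])

    return summaries
```

In a Colab cell this could be called as `summarize_stockholm(open("/content/file.stk"))`. An empty result, or alignments with zero sequences, would suggest that the search found no hits under the chosen E-value thresholds.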
For detailed information about the classification method used in this pipeline, please refer to the latest version of the manuscript available at: Classification Manuscript.
Contributions to this project are welcome! Please feel free to fork the repository and submit pull requests.
This project is licensed under the MIT License - see the LICENSE file for details.
We would like to acknowledge the developers and contributors of the libraries and tools used in this project. Special thanks to the HMMer team for their contribution to bioinformatics.
