Please use this identifier to cite or link to this item:
http://hdl.handle.net/123456789/1580
Title: | SProtFP: a machine learning-based method for functional classification of small ORFs in prokaryotes |
Authors: | Mohanty, Debasisa; Khanduja, Akshay |
Issue Date: | 7-Jan-2025 |
Publisher: | © The Author(s) 2025. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics |
Abstract: | Small proteins (≤100 amino acids) play important roles across all life forms, from unicellular bacteria to higher organisms. In this study, we have developed SProtFP, a machine learning-based method for functional annotation of prokaryotic small proteins into selected functional categories. SProtFP uses independent artificial neural networks (ANNs), trained on a combination of physicochemical descriptors, to classify small proteins as type 2 antitoxin, bacteriocin, DNA-binding, metal-binding, ribosomal, RNA-binding, type 1 toxin or type 2 toxin proteins. We have also trained a model for identification of small open reading frame (smORF)-encoded antimicrobial peptides (AMPs). Comprehensive benchmarking of SProtFP revealed an average area under the receiver operating characteristic curve (ROC-AUC) of 0.92 during 10-fold cross-validation, and ROC-AUCs of 0.94 and 0.93 on held-out balanced and imbalanced test sets, respectively. Using our method to annotate bacterial isolates from the human gut microbiome, we identified thousands of remote homologs of known small protein families and assigned putative functions to uncharacterized proteins. This highlights the utility of SProtFP for large-scale functional annotation of microbiome datasets, especially where sequence homology is low. SProtFP is freely available at http://www.nii.ac.in/sprotfp.html and can be combined with genome annotation tools such as ProsmORF-pred to uncover the functional repertoire of novel small proteins in bacteria. |
URI: | http://hdl.handle.net/123456789/1580 |
Appears in Collections: | Bioinformatics Centre, Publications |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
lqae186.pdf | | 1.76 MB | Adobe PDF |