Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/1580
Full metadata record
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Debasisa, Mohanty | - |
| dc.contributor.author | Khanduja, Akshay | - |
| dc.date.accessioned | 2025-03-25T10:24:39Z | - |
| dc.date.available | 2025-03-25T10:24:39Z | - |
| dc.date.issued | 2025-01-07 | - |
| dc.identifier.uri | http://hdl.handle.net/123456789/1580 | - |
| dc.description.abstract | Small proteins (≤100 amino acids) play important roles across all life forms, ranging from unicellular bacteria to higher organisms. In this study, we have developed SProtFP, a machine learning-based method for functional annotation of prokaryotic small proteins into selected functional categories. SProtFP uses independent artificial neural networks (ANNs), trained on a combination of physicochemical descriptors, to classify small proteins as antitoxin type 2, bacteriocin, DNA-binding, metal-binding, ribosomal protein, RNA-binding, type 1 toxin or type 2 toxin proteins. We have also trained a model for identification of small open reading frame (smORF)-encoded antimicrobial peptides (AMPs). Comprehensive benchmarking of SProtFP revealed an average area under the receiver operating characteristic curve (ROC-AUC) of 0.92 during 10-fold cross-validation, and ROC-AUCs of 0.94 and 0.93 on held-out balanced and imbalanced test sets, respectively. Utilizing our method to annotate bacterial isolates from the human gut microbiome, we could identify thousands of remote homologs of known small protein families and assign putative functions to uncharacterized proteins. This highlights the utility of SProtFP for large-scale functional annotation of microbiome datasets, especially in cases where sequence homology is low. SProtFP is freely available at http://www.nii.ac.in/sprotfp.html and can be combined with genome annotation tools such as ProsmORF-pred to uncover the functional repertoire of novel small proteins in bacteria. | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | © The Author(s) 2025. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics | en_US |
| dc.title | SProtFP: a machine learning-based method for functional classification of small ORFs in prokaryotes | en_US |
| dc.type | Article | en_US |
| dc.journal | NAR Genom Bioinform | en_US |
| dc.volumeno | 7 | en_US |
| dc.issueno | (1) | en_US |
| dc.pages | 186 | en_US |
Appears in Collections:Bioinformatics Centre, Publications

Files in This Item:
| File | Description | Size | Format |
| --- | --- | --- | --- |
| lqae186.pdf | | 1.76 MB | Adobe PDF |

