PLncPRO: Plant Long Non-Coding RNA Prediction by Random fOrests

Developed by JNU

Long non-coding RNAs (lncRNAs) make up a significant portion of non-coding RNAs and are involved in a variety of biological processes. They do not code for proteins and have a minimum transcript length of 200 bp. Accurate identification and annotation of lncRNAs is the first step toward deeper insight into their functions. Next-generation RNA sequencing makes it possible to study the whole transcriptome of any organism, and these data can be used to identify potential lncRNAs. We have developed a novel tool, PLncPRO, for predicting lncRNAs in plants from transcriptome data. PLncPRO is based on machine learning: it uses a random forest algorithm to build a training model from 71 features and classifies transcripts as coding or long non-coding. PLncPRO has very high prediction accuracy and is particularly well suited to plants. It also performed well on vertebrate transcriptome data. We demonstrated its utility by identifying novel lncRNAs in rice and chickpea. The availability of a plant-specific lncRNA prediction tool will provide a useful resource for discovering lncRNAs and understanding their roles in plants.
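The 200 bp minimum length mentioned above can be applied as a simple pre-filter before any classification step. The following is a minimal sketch under stated assumptions, not PLncPRO's own code: the function names `parse_fasta` and `filter_by_length`, the constant `MIN_LNCRNA_LENGTH`, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable

# Minimum transcript length for an lncRNA, as stated in the description above.
MIN_LNCRNA_LENGTH = 200


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {transcript_id: sequence} mapping."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The transcript ID is the first whitespace-separated header token.
            header = line[1:].split()
            current_id = header[0] if header else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may wrap, so append to the current record.
            records[current_id] += line.upper()
    return records


def filter_by_length(records: Dict[str, str],
                     min_length: int = MIN_LNCRNA_LENGTH) -> Dict[str, str]:
    """Keep only transcripts at least min_length long."""
    return {tid: seq for tid, seq in records.items() if len(seq) >= min_length}


# Toy input (hypothetical): one 40 nt transcript and one 240 nt transcript.
example = [
    ">tx1 short transcript",
    "ACGT" * 10,
    ">tx2 long transcript",
    "ACGT" * 30,
    "ACGT" * 30,
]
candidates = filter_by_length(parse_fasta(example))
print(sorted(candidates))
```

Transcripts that pass this filter are only length-eligible candidates; separating coding from long non-coding transcripts still requires a classifier such as the random forest model described above.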
