Extraction of miRNA Sequences from Infernal cmscan Output Using Python
3 months ago
Vijith ▴ 90

I have RNA data from running the cmscan program of the Infernal library.

The cmscan output is very detailed, covering all the different RNA families, their sequences, coordinates, secondary structure, etc. It is about 6 Gb of data. Now I want to extract the miRNA sequences alone and go on to predict their potential targets.

The challenge is that I don't have a suitable automated way to extract the miRNA sequences from the cmscan output.

Infernal doesn't have an appropriate utility to help with this.

This is a screenshot that shows a snippet of the cmscan output.

I can script in Python, and I need a head start, such as a pattern or guideline to follow, to extract the miRNA sequences alone.

infernal assembly rna genome mirna • 318 views
3 months ago

Hi,

The Infernal docs describe this format while stating that it is not meant to be parsed. The Infernal programs offer multiple parse-friendly output options (i.e. meant for post-processing).

Check the --tblout option and, more generally, the documentation: http://eddylab.org/infernal/Userguide.pdf. (The sequence itself will not be there, but the table lists the ID and the match start and end; from there it is an easy job to get the sequence from the scanned sequence database with e.g. Biopython.)
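To make the --tblout route concrete, here is a minimal sketch using only the standard library. It assumes the default cmscan table layout (whitespace-separated columns, "#" comment lines, model name in the first column, scanned sequence name in the third, 1-based coordinates with start greater than end on the minus strand); please check this against the header of your own file. The function names and the name-based miRNA filter are my own illustration, not anything Infernal provides.

```python
import re

# Assumed column order of the default cmscan --tblout format; compare with the
# "#"-prefixed header of your own file (other --fmt values add columns).
COLS = ["target_name", "target_acc", "query_name", "query_acc", "mdl",
        "mdl_from", "mdl_to", "seq_from", "seq_to", "strand", "trunc",
        "pass", "gc", "bias", "score", "evalue", "inc"]

# Hypothetical heuristic: treat families named like "mir-10", "MIR159" or
# "let-7" as miRNAs. Adjust to the family names or accessions you care about.
MIRNA_NAME = re.compile(r"^(mir|let-7)", re.IGNORECASE)

_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def parse_tblout(lines):
    """Yield one dict per hit line, skipping comments and blank lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        # Split off the fixed columns; whatever remains is the description.
        fields = line.split(None, len(COLS))
        if len(fields) < len(COLS):
            continue
        yield dict(zip(COLS, fields))


def read_fasta(lines):
    """Read FASTA into {id: sequence}; the id is the first word of the header."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            name, chunks = line[1:].split()[0], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def hit_sequence(hit, seqs):
    """Slice the hit out of its source sequence (reverse complement on '-')."""
    start, end = int(hit["seq_from"]), int(hit["seq_to"])
    lo, hi = min(start, end), max(start, end)
    sub = seqs[hit["query_name"]][lo - 1:hi]  # 1-based inclusive -> Python slice
    if hit["strand"] == "-":
        sub = sub.translate(_COMPLEMENT)[::-1]
    return sub


def mirna_sequences(tblout_lines, seqs):
    """Yield (family, seq_id, start, end, strand, sequence) for miRNA-like hits."""
    for hit in parse_tblout(tblout_lines):
        if MIRNA_NAME.match(hit["target_name"]):
            yield (hit["target_name"], hit["query_name"],
                   int(hit["seq_from"]), int(hit["seq_to"]),
                   hit["strand"], hit_sequence(hit, seqs))
```

With real data you would pass open file handles for the table and the FASTA file. For a large sequence database, loading everything into a dict may be too heavy; Biopython's SeqIO.index is one way to look up records by ID without reading the whole file into memory.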


Okay, I will follow that approach. Thank you.
