Extracting reads belonging to different RNA families from Rfam
6.5 years ago
Tm ★ 1.1k

I have Illumina data (1 x 50 bp) from a small RNA library. I am interested in identifying known and novel miRNAs in my plant, for which no reference genome is available. First, I want to count the reads aligning to non-coding RNAs other than miRNAs, such as tRNA, rRNA, snoRNA, snRNA, siRNA, etc. After removing those reads, I would like to align the remaining reads against miRBase using blastn or bowtie to identify known miRNAs. The problem is that I want to use Rfam to detect reads from the different RNA classes, so I concatenated the FASTA files of 2686 families (2,487,655 sequences), but the Rfam v13 sequence headers do not carry proper information about the class each sequence represents. So my question is: how can I count and extract the reads mapping to different classes of RNAs using Rfam?

Rfam miRNA analysis classification of RNAs • 2.1k views

Hello @toralmanvar,

I don't know about the Rfam issue, but be careful about choosing blastn in your case. BLAST is a heuristic that was originally developed for quick database searches. It is not guaranteed to find all occurrences of a short sequence like a miRNA; it will miss ~40% of possible hits when dealing with sequences of 20 bp. Also, it produces local alignments rather than end-to-end global matches. So I would not recommend it, but here are a few suggestions if you do use it:

  • Reduce the word size to about half the query length.
  • Increase the E-value threshold to see more hits with low statistical significance. Why? A short query is more likely to occur by chance in the database, so even a perfect match can have low statistical significance and may not be reported. Increasing the E-value threshold lets you look farther down the hit list and see matches that would normally be discarded (see this link).
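As a rough sketch of the two suggestions above, this Python helper only builds a blastn command line; it does not run anything. The file names, the database name, and the E-value of 1000 are placeholders I made up, not values from this thread. The flags themselves (`-task blastn-short`, `-word_size`, `-evalue`, `-outfmt 6`) are standard BLAST+ options. The `blastn-short` task already lowers the default word size, so the explicit `-word_size` here just follows the "half the query length" rule of thumb.

```python
def build_blastn_short_cmd(query_fasta, db_name, query_len,
                           evalue=1000, out_path="hits.tsv"):
    """Build a blastn command tuned for short queries such as miRNA reads.

    query_fasta, db_name, and out_path are hypothetical placeholders;
    evalue=1000 is an illustrative relaxed threshold, not a recommendation.
    """
    # Rule of thumb from the post: word size about half the query length.
    # blastn does not accept a word size below 4, so clamp to that minimum.
    word_size = max(4, query_len // 2)
    return [
        "blastn",
        "-task", "blastn-short",      # preset intended for short queries
        "-query", query_fasta,
        "-db", db_name,
        "-word_size", str(word_size),
        "-evalue", str(evalue),       # relaxed so short perfect matches are reported
        "-outfmt", "6",               # tabular output: query id, subject id, ...
        "-out", out_path,
    ]


cmd = build_blastn_short_cmd("reads.fa", "mirbase_mature", query_len=20)
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once BLAST+ is installed and the database has been built with `makeblastdb`.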

Also, read this paper.


Actually I want to classify the ncRNAs like the table shown here.
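One possible way to get such a per-class table is sketched below, under two assumptions that are not confirmed in this thread. First, each Rfam FASTA header was prefixed with its family accession (for example `RF00005|...`) before the per-family files were concatenated. Second, you have a family-to-class lookup, for instance built from Rfam's family-level annotation. The `family_to_class` dict, the `|` separator, and the example hits are all hypothetical. Each read is counted once, using its first reported hit.

```python
from collections import Counter


def count_reads_by_class(hits, family_to_class):
    """Count reads per RNA class from (read_id, subject_id) alignment hits.

    Assumes subject IDs look like 'RF00005|<original header>' (a hypothetical
    convention added when concatenating the per-family FASTA files).
    """
    seen_reads = set()
    counts = Counter()
    for read_id, subject_id in hits:
        if read_id in seen_reads:
            continue  # keep only the first hit per read
        seen_reads.add(read_id)
        family = subject_id.split("|", 1)[0]
        counts[family_to_class.get(family, "unclassified")] += 1
    return counts


# Hypothetical lookup; a real one would cover every family you concatenated.
family_to_class = {"RF00005": "tRNA", "RF00001": "rRNA"}

# Hypothetical hits, e.g. columns 1 and 2 of a tabular alignment output.
hits = [
    ("read1", "RF00005|X.1/1-73"),
    ("read1", "RF00001|Y.1/1-119"),   # second hit for read1 is ignored
    ("read2", "RF00001|Y.1/1-119"),
    ("read3", "RF99999|Z.1/1-50"),    # family missing from the lookup
]

counts = count_reads_by_class(hits, family_to_class)
for rna_class, n in sorted(counts.items()):
    print(rna_class, n)
```

The set of read IDs assigned to each class could be collected the same way and used to filter the FASTQ before aligning the remaining reads to miRBase.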
