Hello,
I am trying to remove ncRNA reads from my sequencing data using bowtie,
which requires reference sequences. My data are from Mus musculus, and I found the following sources:
- noncode.org, which says it provides NONCODEv6_mouse.fa.gz for noncoding sequences. However, Chrome flagged the site as untrusted and advised against downloading the file.
- RNAcentral: selecting ncRNA and Mus musculus returned 188,951 sequences. Should I use all of them? The results also include AI-generated summaries; should I include those?
- Are there any databases that people normally use to filter out rRNA, tRNA, etc. before mapping in a bioinformatics pipeline?
I appreciate your help.
What is this experiment about? You only mention removing things, but perhaps nothing needs to be removed. After aligning to the genome, you can count only what you need from the alignments.
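One way to "count only what you need" is to restrict the annotation handed to the counting tool. The sketch below is only an illustration, not a recommendation from this thread: it assumes an Ensembl-style GTF whose attribute column carries a `gene_biotype "..."` entry, and the function name `keep_biotype` is made up for the example.

```python
import re

# Ensembl-style GTF attribute, e.g.: gene_biotype "protein_coding";
BIOTYPE_RE = re.compile(r'gene_biotype "([^"]+)"')

def keep_biotype(gtf_lines, wanted=("protein_coding",)):
    """Yield GTF lines whose gene_biotype is in `wanted` (hypothetical helper).

    Comment/header lines (starting with '#') are passed through unchanged.
    Lines with fewer than 9 tab-separated fields are skipped.
    """
    for line in gtf_lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        match = BIOTYPE_RE.search(fields[8])
        if match and match.group(1) in wanted:
            yield line
```

Counting against the filtered annotation leaves reads on rRNA, tRNA and other ncRNA genes out of the counts without removing them from the alignment.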
It's about polysome profiling. I have seen some debate about whether removing such contaminants is necessary, but I am unsure in which circumstances it is. If it is necessary, do you have any suggested databases?
RNAcentral should be comprehensive, and it includes the NONCODE data.
Hello, Max. Do you mean that removing ncRNA is not necessary for all RNA sequencing?
Hello,
To remove ncRNA from your reads, you can map them to the mouse ncRNA sequences available in the Ensembl database [https://ftp.ensembl.org/pub/release-112/fasta/mus_musculus/]
Align the reads, exclude those that map, and proceed with your analysis on the remaining reads.
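As a rough sketch of that align-and-exclude step: the helpers below only build the command lines, they do not run anything. They assume Bowtie 2 (`bowtie2-build`, `bowtie2`) rather than the original Bowtie, whose options are similar but not identical, and single-end reads. All file names and the function names are placeholders invented for the example.

```python
import shlex

def build_index_cmd(ncrna_fasta, index_prefix):
    """Command to build a Bowtie 2 index from an ncRNA FASTA (placeholder paths)."""
    return ["bowtie2-build", ncrna_fasta, index_prefix]

def filter_cmd(index_prefix, reads_fastq, kept_fastq, threads=4):
    """Command to align single-end reads to the ncRNA index.

    --un writes the reads that did NOT align (the ones to keep) to kept_fastq;
    the alignments themselves are discarded by sending SAM output to /dev/null.
    """
    return [
        "bowtie2",
        "-p", str(threads),      # number of threads
        "-x", index_prefix,      # index built by bowtie2-build
        "-U", reads_fastq,       # unpaired input reads
        "--un", kept_fastq,      # unaligned reads = ncRNA-depleted reads
        "-S", "/dev/null",       # drop the ncRNA alignments
    ]

def as_shell(cmd):
    """Render a command list as a copy-pastable shell line."""
    return " ".join(shlex.quote(part) for part in cmd)
```

For example, `as_shell(filter_cmd("mouse_ncrna", "sample.fastq", "sample.no_ncrna.fastq"))` gives a line to paste into a terminal, or the list can be passed to `subprocess.run(cmd, check=True)`. The FASTQ written by `--un` is then the input for the genome or transcriptome alignment.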
Hope it helps!
Cheers,
That's very helpful. Thank you, Jeevii.