Hello,
I would like to create a kallisto index that contains both coding and noncoding RNA but excludes the ribosomal genes. I know I need to start from the cDNA and ncRNA FASTA files from the Ensembl website. However, I am not sure how to remove the ribosomal genes from the reference. Is there an easy way to do this?
Thank you
Do you want to remove both the rRNA genes and the ribosomal protein-coding genes? Also, why do you want to remove them in the first place? Doing so may cause some of their reads to be assigned to incorrect transcripts.
I would like to remove only the rRNA. We found out that a probe we used introduced rRNA contamination, so we would like to remove those transcripts from our data.
Then align to everything first and filter out the rRNA genes afterwards. Removing sequences you know are present in the sample from the reference could make the alignment inaccurate.
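As a rough illustration of filtering after quantification, here is a minimal Python sketch. It assumes the Ensembl ncRNA FASTA headers carry a `transcript_biotype:` field, that kallisto uses the first whitespace-delimited token of each header as the `target_id` in `abundance.tsv`, and that the biotypes `rRNA` and `Mt_rRNA` are the ones to drop. The function names and file paths are hypothetical, so check them against your own files.

```python
import csv

# Assumption: these Ensembl biotype labels cover the rRNA you want to drop.
RRNA_BIOTYPES = {"rRNA", "Mt_rRNA"}


def rrna_ids_from_fasta(lines):
    """Collect transcript IDs whose FASTA header has an rRNA transcript_biotype."""
    ids = set()
    for line in lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        fields = line[1:].split()
        if not fields:
            continue
        biotype = None
        for field in fields[1:]:
            if field.startswith("transcript_biotype:"):
                biotype = field.split(":", 1)[1]
                break
        if biotype in RRNA_BIOTYPES:
            # The first token of the header is taken as the transcript ID.
            ids.add(fields[0])
    return ids


def drop_rrna(abundance_lines, rrna_ids):
    """Return the rows of a kallisto abundance table that are not rRNA."""
    reader = csv.DictReader(abundance_lines, delimiter="\t")
    return [row for row in reader if row["target_id"] not in rrna_ids]


# Example usage (hypothetical paths):
# with open("ncrna.fa") as fasta, open("abundance.tsv") as table:
#     kept = drop_rrna(table, rrna_ids_from_fasta(fasta))
```

Note that the TPM values of the remaining rows no longer sum to one million once the rRNA rows are dropped, so they may need rescaling before comparing samples.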