4.2 years ago
jkkbuddika
Hi,
I am in the process of analyzing a collection of small RNA datasets from flies (Drosophila melanogaster). So far this is what I have done:
- Trim adapters with cutadapt and keep reads that are 18-30 nt long.
- Align to the Drosophila genome with Bowtie (not Bowtie2): 80-85% of reads have at least one alignment.
- Count miRNAs with featureCounts, using miRBase annotations.
- I also use ShortStack to identify potential novel small RNAs.
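
For reference, the length selection in the first step can be sketched in Python as follows. This only illustrates the 18-30 nt filter and is not how cutadapt works internally; the function names and the toy reads are made up for the example, and the parser assumes well-formed four-line FASTQ records.

```python
from typing import Iterable, Iterator, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, not needed here
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def filter_by_length(
    records: Iterable[FastqRecord], min_len: int = 18, max_len: int = 30
) -> Iterator[FastqRecord]:
    """Keep only reads whose length is within [min_len, max_len]."""
    for header, seq, qual in records:
        if min_len <= len(seq) <= max_len:
            yield header, seq, qual


# Toy input (hypothetical reads): one 21 nt read, one too short, one too long.
example = [
    "@read1", "TGAGGTAGTAGGTTGTATAGT", "+", "I" * 21,
    "@read2", "ACGTACGT", "+", "I" * 8,
    "@read3", "A" * 35, "+", "I" * 35,
]

kept = list(filter_by_length(parse_fastq(example)))
print([header for header, _, _ in kept])
```

Only the 21 nt read passes the filter in this toy input.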
Now I want to identify piRNAs and siRNAs in my data. I see there are a couple of piRNA databases (such as piRNAdb and piRBase). Can I use a GTF from one of these databases to identify piRNAs? How about siRNAs?
Thanks,
KB
An important point here is to make sure you do not lose multi-mapping reads, because many piRNAs have similar sequences.
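
One common way to keep multi-mapping reads without counting them several times is fractional counting: a read with N alignments adds 1/N to each feature it hits. featureCounts offers the `-M` and `--fraction` options for this. The Python below is only a minimal sketch of the idea, assuming the alignments have already been reduced to (read ID, feature) pairs; the function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def fractional_counts(alignments: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Give each alignment of a read a weight of 1 / (alignments of that read)."""
    pairs: List[Tuple[str, str]] = list(alignments)
    # Number of alignments reported for each read.
    hits_per_read = Counter(read_id for read_id, _ in pairs)
    counts: Dict[str, float] = defaultdict(float)
    for read_id, feature in pairs:
        counts[feature] += 1.0 / hits_per_read[read_id]
    return dict(counts)


# Toy data (hypothetical): r2 maps to two similar piRNA loci.
alignments = [
    ("r1", "piRNA_A"),
    ("r2", "piRNA_A"),
    ("r2", "piRNA_B"),
    ("r3", "piRNA_B"),
]

counts = fractional_counts(alignments)
print(counts)
```

Here the multi-mapping read r2 is split evenly between the two loci instead of being discarded, so the total across features still equals the number of reads.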