6.7 years ago
Arindam Ghosh
I have a set of differentially expressed genes from an RNA-seq analysis. How can I sort them into protein-coding genes, miRNA and ncRNA? I used HISAT2, StringTie and Ballgown for the analysis. Reference genome: GRCh38 with annotation from Ensembl release 84.
Either translate the sequences and run blastp, or run blastx directly, against a proteome. Sequences that match are proteins; sequences that do not match are either novel proteins (check for ORFs) or ncRNA. The ncRNA candidates can then be checked against Rfam, e.g. with Infernal.
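To make the "match / no match" split concrete, here is a rough sketch of how the BLAST results could be partitioned afterwards. It assumes the search was written in the standard 12-column tabular format (-outfmt 6), where column 1 is the query ID and column 11 is the e-value. The function name, the e-value cutoff of 1e-5 and the example IDs are made up for illustration and are not something from this thread.

```python
import csv
import io


def split_by_blast_hit(all_ids, blast_tsv, max_evalue=1e-5):
    """Partition query IDs into (with_hit, without_hit).

    blast_tsv is an open file-like object holding BLAST tabular output
    (-outfmt 6). The cutoff max_evalue is an assumed threshold.
    """
    hit_ids = set()
    for row in csv.reader(blast_tsv, delimiter="\t"):
        # Skip comment lines and anything that is not a full 12-column record.
        if len(row) < 12 or row[0].startswith("#"):
            continue
        query_id, evalue = row[0], float(row[10])
        if evalue <= max_evalue:
            hit_ids.add(query_id)
    with_hit = [i for i in all_ids if i in hit_ids]
    without_hit = [i for i in all_ids if i not in hit_ids]
    return with_hit, without_hit


# Hypothetical example: three query IDs, two BLAST records.
example_ids = ["MSTRG.1", "MSTRG.2", "MSTRG.3"]
example_blast = io.StringIO(
    "MSTRG.1\tsp|P1\t98.0\t200\t4\t0\t1\t600\t1\t200\t1e-120\t400\n"
    "MSTRG.3\tsp|P2\t30.0\t50\t35\t0\t1\t150\t1\t50\t2.5\t25\n"
)
coding, candidates = split_by_blast_hit(example_ids, example_blast)
print("likely protein coding:", coding)
print("check for ORFs / Rfam:", candidates)
```

The IDs in the second list are the ones that would go on to the ORF check and the Rfam/Infernal search. A weak hit (like the third query above) is treated as "no match" here, which is a design choice you may want to tune.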
That will be a tedious task, as I have ~2000 gene names and IDs.
I am not talking about using BLAST manually via the web interface ;P
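For ~2000 sequences a single command-line run against a local database is enough. A minimal sketch, assuming BLAST+ is installed, the sequences are in one FASTA file, and a protein database has already been built with makeblastdb. The file names, database name, e-value and thread count below are placeholders, not values from this thread.

```python
import subprocess


def build_blastx_command(query_fasta, protein_db, out_tsv, evalue=1e-5, threads=4):
    """Return the argument list for one batch blastx run (BLAST+ command line)."""
    return [
        "blastx",
        "-query", query_fasta,      # all sequences in one FASTA file
        "-db", protein_db,          # local protein database (from makeblastdb)
        "-outfmt", "6",             # tabular output, one line per hit
        "-evalue", str(evalue),     # assumed cutoff, adjust as needed
        "-num_threads", str(threads),
        "-out", out_tsv,
    ]


def run_blastx(query_fasta, protein_db, out_tsv):
    """Run the search; raises CalledProcessError if blastx exits with an error."""
    cmd = build_blastx_command(query_fasta, protein_db, out_tsv)
    subprocess.run(cmd, check=True)


# Hypothetical usage (needs blastx on PATH and an existing database):
# run_blastx("de_transcripts.fa", "human_proteome", "blastx_hits.tsv")
```

The whole set goes through in one call, so nothing has to be pasted into a web form.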