I have been asked to do differential expression and pathway analysis on some mouse RNA-seq data. I was provided with FASTQs, BAMs, and gene/transcript report tables from RSEM. The reads are 150 nt, paired-end, from a MiSeq. RSEM was run against GRCm38 with Ensembl 74 annotations. I'd like to reuse as much of the existing work as possible.
I have built some interactive volcano plots with ggvis and Shiny, and I'm starting to build a JBrowse site.
A decent fraction (10%?) of the differentially expressed genes [DESeq2 adjusted p < 0.05, abs(log2 fold change) > 1] are minimally annotated gene models with names like "GmNNN", where the Ns are digits.
Some of those gene models have richer annotations in Ensembl 94, and some have been retired by 94. I can use BLAST to manually show that some of the reads aligning to some of these models also align to other well-characterized genes, like protein or RNA components of ribosomes.
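To put a firmer number on that "10%?", here is a minimal sketch of how the fraction could be tallied from an exported DESeq2 results table. It assumes the table was written to CSV with DESeq2's standard `log2FoldChange` and `padj` columns plus a gene symbol column that I am calling `gene_name` (a hypothetical name; use whatever your export has). The rows below are made-up toy data, not my results.

```python
import csv
import io
import re

# "Gm" followed only by digits, e.g. Gm12345. Named genes such as Gmnn do not match.
GM_PATTERN = re.compile(r"^Gm\d+$")


def significant_genes(rows, padj_max=0.05, lfc_min=1.0):
    """Return gene names passing padj < padj_max and |log2FoldChange| > lfc_min."""
    hits = []
    for row in rows:
        try:
            padj = float(row["padj"])
            lfc = float(row["log2FoldChange"])
        except ValueError:
            # DESeq2 writes NA for genes it filtered out; skip those rows.
            continue
        if padj < padj_max and abs(lfc) > lfc_min:
            hits.append(row["gene_name"])  # hypothetical column name
    return hits


def gm_fraction(gene_names):
    """Fraction of the given gene names that look like GmNNN gene models."""
    if not gene_names:
        return 0.0
    gm = [name for name in gene_names if GM_PATTERN.match(name)]
    return len(gm) / len(gene_names)


# Toy table standing in for an exported DESeq2 results CSV (values are invented).
toy_csv = """gene_name,log2FoldChange,padj
Gm12345,2.1,0.001
Rpl13a,-1.5,0.01
Gm999,0.4,0.001
Actb,3.0,NA
Gmnn,1.2,0.02
"""

hits = significant_genes(csv.DictReader(io.StringIO(toy_csv)))
print(hits, gm_fraction(hits))
```

For a real file, replace `io.StringIO(toy_csv)` with an open file handle on the exported CSV.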
Are there well-documented, web-based ways to automatically show other loci that the reads could have aligned to?
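This isn't web-based, but to illustrate what I mean by "other loci", here is a rough sketch that groups alignment records by read and lists the reads placed on more than one reference sequence or position. It assumes the BAM keeps every reported alignment for a read (I have not checked whether the RSEM BAMs I was given do) and that it has been dumped to SAM text, e.g. with `samtools view -h`. The read and reference names in the toy records are hypothetical.

```python
from collections import defaultdict


def alignments_by_read(sam_lines):
    """Map (read name, mate number) -> set of (reference name, position)."""
    loci = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4:
            continue  # read is unmapped
        mate = 2 if flag & 0x80 else 1  # 0x80 marks the second read of a pair
        loci[(qname, mate)].add((rname, pos))
    return loci


def multi_mapped(loci):
    """Keep only reads that were placed at more than one locus."""
    return {read: sorted(places) for read, places in loci.items() if len(places) > 1}


# Toy SAM records (invented): r1 aligns to two references, r2 to one, r3 is unmapped.
toy_sam = [
    "@HD\tVN:1.6",
    "r1\t0\ttxA\t100\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r1\t256\ttxB\t40\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\ttxA\t200\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

multi = multi_mapped(alignments_by_read(toy_sam))
print(multi)
```

If the BAM is in transcript coordinates, the reference names will be transcript IDs, so they would still need mapping back to genes before comparing against the Gm models.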
OK, thanks. A colleague at Penn also said that starting over with new mapping and counting may be the best choice.
I haven't tried kallisto, but maybe this is just the motivation I need. Even if kallisto itself is fast, I have found running the raw FASTQs through Trimmomatic slow. (I think there's a lot of adapter read-through.) I probably don't have the best ideas about which parameters to set, or how to interpret the quality of the results.