We have RNA-seq data in BAM format and would like to perform gene expression analysis for specific genes. The most important step for us is preparing the count matrices.
Questions:
- Is Bioconductor’s rnaseqGene the best tool to use for gene expression analysis with RNA-seq data? It appears to cover all of the steps.
- Or is DESeq2 a better tool if we just want to obtain counts?
- After the gene expression analysis is complete, which tools are best for performing survival analysis using the counts?
Thank you.
The best thing to do is to try to obtain either the raw FASTQs (from which the BAMs were generated) or the raw counts (generated from the BAMs). If you start from the BAMs, you are likely to run into errors, because you will need a gene annotation file (usually GTF) that is compatible with your BAMs, i.e., one built on the same reference genome that the reads were aligned to.
Then, you can check some other discussions like this one: Can anyone suggest a good tutorial to learn RNA-seq analysis?
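On the point about GTF/BAM compatibility: one quick sanity check is to compare the reference sequence names in the BAM header (the `@SQ` lines, which you can print with `samtools view -H your.bam`) against the first column of the GTF. A mismatch such as `chr1` versus `1` is a common reason for ending up with zero counts. The following is only a minimal sketch; the function names and the toy header and GTF text are made up for illustration, and matching names alone do not prove that the genome builds are identical.

```python
def bam_contigs(header_text):
    """Collect reference names from the SN: field of @SQ header lines."""
    names = set()
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def gtf_contigs(gtf_text):
    """Collect sequence names from column 1 of a GTF, skipping comments."""
    names = set()
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t")[0])
    return names


def shared_contigs(header_text, gtf_text):
    """Return the contig names present in both the BAM header and the GTF."""
    return bam_contigs(header_text) & gtf_contigs(gtf_text)


# Toy inputs (hypothetical): a UCSC-style header and an Ensembl-style GTF line.
header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "@SQ\tSN:chr2\tLN:242193529\n"
)
gtf = '#!toy annotation\n1\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id "geneA";\n'

# Prints set(): no names overlap because "chr1" and "1" do not match.
print(shared_contigs(header, gtf))
```

If the intersection is empty or much smaller than expected, the annotation probably comes from a different source or build than the reference used for alignment.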
What is your starting data, i.e., what data do you have right now? That will dictate the program ('tool') that you eventually use.
New programs are released almost daily, so the field is flooded with a diverse range of programs to choose from.
We currently have RNA-seq sequences in BAM format.
Great! How were they produced? I imagine that you will say TopHat, TopHat2, or HISAT2.
The sequences were provided to us in .bam format.
Okay, but it is actually very important to understand how the BAMs were produced, that is, which alignment program was used and how it was run.
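If nobody can tell you how the files were made, the BAM header itself often records it. Many aligners write `@PG` lines containing the program name (`PN`), version (`VN`), and command line (`CL`), and `samtools view -H your.bam` prints the header as text. Below is a small sketch that pulls those fields out of header text. The function name and the example header are made up for illustration, and `@PG` lines are optional in the SAM format, so they may simply be absent from your files.

```python
def read_pg_records(header_text):
    """Parse @PG header lines into dictionaries keyed by tag (ID, PN, VN, CL, ...)."""
    records = []
    for line in header_text.splitlines():
        if not line.startswith("@PG"):
            continue
        record = {}
        for field in line.split("\t")[1:]:
            # Split on the first colon only, so colons inside CL are preserved.
            tag, _, value = field.partition(":")
            record[tag] = value
        records.append(record)
    return records


# Hypothetical header text, as `samtools view -H` might print it.
header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "@PG\tID:hisat2\tPN:hisat2\tVN:2.2.1\tCL:hisat2 -x index -1 r1.fq -2 r2.fq\n"
)

for pg in read_pg_records(header):
    print(pg.get("PN"), pg.get("VN"), pg.get("CL"))
```

The program name and command line then tell you which aligner and reference index were used, which in turn helps in choosing a matching annotation file.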