Hi,
I have a question about genome alignment. I am working with an RNA-Seq dataset to study the response of human liquid culture to different doses of a virus. I am trying to work out a good strategy, or the best-practice method, for genome mapping.
Maybe: identify the reads mapping to the virus and exclude them from the analysis. After this, extract the unaligned reads, map them against the hg38 genome, and quantify gene-level counts, followed by downstream analysis in either edgeR or DESeq2.
Additionally, I was thinking about the below scenarios:
- Why not just do the standard alignment against hg38 using HISAT or STAR > then quantify using RSEM or featureCounts.
(OR)
- Map reads first against the virus genome in question using Bowtie/Bowtie2 and store the unmapped reads as FASTQ files, then use these unmapped FASTQ files in HISAT or STAR to align against hg38 > quantify.
(OR)
- I was just reading about BBSplit from BBMap. Use this?
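The second scenario above (virus first, then hg38) could be sketched roughly as follows. This is only an outline under stated assumptions, not a validated pipeline: it assumes paired-end reads, Bowtie2 and HISAT2 indexes that have already been built, and HISAT2 rather than the original HISAT. The function name `two_step_commands`, the sample name, and every file and index name are hypothetical placeholders. The code only builds and prints the command lines; it does not run them.

```python
import shlex


def two_step_commands(sample, r1, r2, virus_index, human_index, gtf, threads=4):
    """Build argv lists for: Bowtie2 vs virus -> HISAT2 vs hg38 -> featureCounts.

    All output file names are derived from `sample` and are placeholders.
    """
    # Bowtie2 replaces '%' in the --un-conc-gz path with 1 or 2 (one file per mate).
    unaligned_pattern = f"{sample}.nonviral.R%.fq.gz"
    nonviral_r1 = unaligned_pattern.replace("%", "1")
    nonviral_r2 = unaligned_pattern.replace("%", "2")

    bowtie2 = [
        "bowtie2", "-p", str(threads),
        "-x", virus_index,
        "-1", r1, "-2", r2,
        # Pairs that did NOT align concordantly to the virus are written here.
        # Note this can include pairs where only one mate hit the virus.
        "--un-conc-gz", unaligned_pattern,
        # Viral alignments are kept so viral reads can be counted per sample.
        "-S", f"{sample}.virus.sam",
    ]
    hisat2 = [
        "hisat2", "-p", str(threads),
        "-x", human_index,
        "-1", nonviral_r1, "-2", nonviral_r2,
        "-S", f"{sample}.hg38.sam",
    ]
    feature_counts = [
        "featureCounts", "-T", str(threads),
        "-p",  # paired-end input; how pairs are counted depends on the Subread version
        "-a", gtf,
        "-o", f"{sample}.counts.txt",
        f"{sample}.hg38.sam",
    ]
    return [bowtie2, hisat2, feature_counts]


if __name__ == "__main__":
    commands = two_step_commands(
        "sample1", "sample1_R1.fq.gz", "sample1_R2.fq.gz",
        "virus_index", "hg38_index", "genes.gtf",
    )
    for argv in commands:
        print(shlex.join(argv))
```

Printing the commands first makes it easy to inspect them before running anything. Keeping the viral SAM file, rather than discarding it, leaves the option of counting viral reads per sample later.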
Thank you very much for the help.
Toufiq
Are the viral transcripts ending up in the final dataset? If so, you may want to check whether there is any correlation with the initial dosing. So while you could simply split off and remove the viral reads, doing what ATPoint suggests may be the way to go.
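To make the dosing check concrete, here is a minimal sketch. It assumes you already have, per sample, the number of reads assigned to the virus and the total number of reads, and it uses a Spearman rank correlation as one reasonable choice among several. The helper names (`viral_fraction`, `ranks`, `spearman`) are hypothetical, and the numbers in the example are made-up placeholders, not real data.

```python
import math


def viral_fraction(viral_reads, total_reads):
    """Fraction of a sample's reads that were assigned to the virus."""
    return viral_reads / total_reads


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Placeholder values for illustration only.
    doses = [0.1, 0.1, 1.0, 1.0, 10.0, 10.0]
    viral_reads = [1200, 900, 15000, 11000, 98000, 120000]
    total_reads = [20_000_000] * 6
    fractions = [viral_fraction(v, t) for v, t in zip(viral_reads, total_reads)]
    print("Spearman rho (dose vs viral fraction):", spearman(doses, fractions))
```

Using the fraction of viral reads rather than raw counts keeps differences in sequencing depth between samples from driving the result.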