10.6 years ago
int11ap1
Hello guys,
I am working on an RNA-seq project with several paired-end datasets. After alignment with TopHat2, about 1-5% of reads have a mate mapped to a different chromosome. Before running Cufflinks to annotate transcripts, would you recommend removing those reads (or rerunning TopHat2 with the option --no-discordant)? Is this a key step?
Thanks in advance.
Sorry, this isn't answering your question but asking a new one: what tool (or commands) did you use to find out how many reads had mates mapped to a different chromosome?
samtools flagstat FILE.bam
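In the flagstat output, the relevant line is the one reading "with mate mapped to a different chr". As a rough cross-check, the same kind of count can be sketched with awk on SAM text. This is only an approximation under stated assumptions: it compares RNAME (column 3) with RNEXT (column 7) and does not reproduce every rule flagstat applies. The function name count_discordant_chr and the toy records are made up for illustration.

```shell
# Count alignment records whose mate is mapped to a different reference.
# Reads SAM text on stdin; header lines (starting with @) are skipped.
# Hypothetical helper, not part of samtools.
count_discordant_chr() {
  awk -F'\t' '!/^@/ && $3 != "*" && $7 != "*" && $7 != "=" && $7 != $3 { n++ }
              END { print n + 0 }'
}

# Toy SAM records (invented for illustration): pair r1 maps within chr1,
# pair r2 is split between chr1 and chr2.
toy_sam() {
  printf '@SQ\tSN:chr1\tLN:1000\n'
  printf '@SQ\tSN:chr2\tLN:1000\n'
  printf 'r1\t99\tchr1\t100\t50\t10M\t=\t200\t110\tACGTACGTAC\tIIIIIIIIII\n'
  printf 'r1\t147\tchr1\t200\t50\t10M\t=\t100\t-110\tACGTACGTAC\tIIIIIIIIII\n'
  printf 'r2\t65\tchr1\t300\t50\t10M\tchr2\t400\t0\tACGTACGTAC\tIIIIIIIIII\n'
  printf 'r2\t129\tchr2\t400\t50\t10M\tchr1\t300\t0\tACGTACGTAC\tIIIIIIIIII\n'
}

toy_sam | count_discordant_chr   # prints 2
```

On real data the input would come from something like `samtools view FILE.bam | count_discordant_chr`.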
I thought Bowtie rejected alignments of read pairs whose mates are farther apart than the estimated fragment size. Maybe TopHat retains them for splicing event detection?
I don't think TopHat2 will use read pairs aligning to different chromosomes for annotating transcripts. This is my personal guess, but I am quite positive about it, so you do not need to remove these reads. If you still want to, you can use these commands:
samtools view -H Your.bam > Your.sam
samtools view Your.bam | awk '$7 == "=" {print}' >> Your.sam
samtools view -bS Your.sam > Your_new.bam
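The three commands above can also be written as a single filter that passes header lines through and keeps only records whose RNEXT column is "=", which avoids the intermediate Your.sam file. The sketch below shows just the awk part on invented toy records so that it runs without samtools; the function name keep_same_chr_pairs is hypothetical. Like the original awk command, it also drops any record whose RNEXT is not "=", including records where RNEXT is "*".

```shell
# Keep SAM header lines and records whose mate is on the same reference
# (RNEXT, column 7, equal to "="). Hypothetical helper for illustration.
keep_same_chr_pairs() {
  awk -F'\t' '/^@/ || $7 == "="'
}

# Toy SAM input (invented): pair r1 within chr1, pair r2 split across
# chr1 and chr2.
toy_sam() {
  printf '@SQ\tSN:chr1\tLN:1000\n'
  printf '@SQ\tSN:chr2\tLN:1000\n'
  printf 'r1\t99\tchr1\t100\t50\t10M\t=\t200\t110\tACGTACGTAC\tIIIIIIIIII\n'
  printf 'r1\t147\tchr1\t200\t50\t10M\t=\t100\t-110\tACGTACGTAC\tIIIIIIIIII\n'
  printf 'r2\t65\tchr1\t300\t50\t10M\tchr2\t400\t0\tACGTACGTAC\tIIIIIIIIII\n'
  printf 'r2\t129\tchr2\t400\t50\t10M\tchr1\t300\t0\tACGTACGTAC\tIIIIIIIIII\n'
}

# Only the two header lines and the two r1 records remain.
toy_sam | keep_same_chr_pairs

# With samtools available, the same filter could be used in one pipeline
# (older samtools versions may need -S when reading SAM from stdin):
#   samtools view -h Your.bam | keep_same_chr_pairs | samtools view -b -o Your_new.bam -
```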