Variation In The Unique Reads Stats
11.1 years ago
alok.helix ▴ 120

Hello, and thank you for reading this post!

I am trying to compare read-mapping statistics against the reference genome. For my TopHat alignment, I counted unique reads with grep -w "NH:i:1" and grep -w "HI:i:1"; both gave the same number. I also ran a ready-made Python script (BAMstat.py) from the RNAseqQC toolkit to count unique reads, and it gave the same number as my command-line count.
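
For reference, the tag-based count above can be sketched in Python. This is only a rough equivalent of the grep commands, not what BAMstat.py does: the function name `count_tag` is hypothetical, and it assumes uncompressed SAM text with the tag written exactly as given. Because grep counts lines, each mate of a pair is counted separately, so the sketch reports both the number of matching records and the number of distinct read names.

```python
def count_tag(sam_lines, tag):
    """Count SAM records carrying `tag` (e.g. "NH:i:1") as an optional field.

    Returns (matching_records, distinct_read_names).
    `count_tag` is a hypothetical helper written for this illustration.
    """
    records = 0
    names = set()
    for line in sam_lines:
        if line.startswith("@"):      # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        # Optional TAG:TYPE:VALUE fields start at column 12 (index 11).
        if tag in fields[11:]:
            records += 1
            names.add(fields[0])      # QNAME is shared by both mates
    return records, len(names)
```

It could be called as `count_tag(open("accepted_hits.sam"), "NH:i:1")`, where the file name is a placeholder for a SAM export of the alignment.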

I then aligned the paired-end Illumina data with BWA (version 0.6.2-r126) using aln and sampe, and generated SAM and BAM files. To count unique reads I used grep -w "XT:A:U" and grep -w "X1:i:0". The two counts differ by about 5 million reads, with "X1:i:0" giving the higher number.
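
One way to see where the two grep counts diverge is to cross-tabulate the tags per record. The sketch below assumes the tag meanings given in the BWA manual (XT is the alignment type, with U for unique and R for repeat; X1 is the number of suboptimal hits), so a record can carry X1:i:0 without being XT:A:U. The function name `tabulate_bwa_tags` is hypothetical, and the input is assumed to be uncompressed SAM text.

```python
from collections import Counter


def tabulate_bwa_tags(sam_lines):
    """Count records by their (XT value, X1 value) pair.

    A missing tag is recorded as None. `tabulate_bwa_tags` is a
    hypothetical helper written for this illustration.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        tags = {}
        for item in fields[11:]:          # optional TAG:TYPE:VALUE fields
            parts = item.split(":", 2)
            if len(parts) == 3:
                tags[parts[0]] = parts[2]
        counts[(tags.get("XT"), tags.get("X1"))] += 1
    return counts
```

Printing the result, for example with `for key, n in tabulate_bwa_tags(open("aln.sam")).items(): print(key, n)` (the file name is a placeholder), would show how many X1:i:0 records are typed R or carry no XT tag rather than being typed U.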

When I ran the BAMstat.py script on the BWA alignment, it reported significantly fewer uniquely aligned reads than the TopHat unique-read count. Why is there so much variation in the read stats?

genomics illumina bwa tophat alignment bowtie2 • 2.3k views