10.3 years ago
nayshool
Hello everyone!
We have received a few samples back from BGI China, which we had sent for exome sequencing plus basic analysis. We got the FASTQ files and BAM files. I made my own BAM file using BWA and Picard tools with the following commands:
BWA:
bwa mem human_g1k_v37.fasta file1.fq.gz file2.fq.gz file3.fq.gz file4.fq.gz > result.sam
PICARD:
java -jar SortSam.jar I=result.sam O=result.bam SORT_ORDER=coordinate
The BAM file that I got (2 Gb) is about half the size of the BAM file BGI sent me (4.2 Gb).
Here are the samtools flagstat results for both files:
My BAM file:
[gen-biorep@gen-biorep-3 VA6]$ samtools flagstat result.bam
28842836 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
28824861 + 0 mapped (99.94%:-nan%)
28842836 + 0 paired in sequencing
14423043 + 0 read1
14419793 + 0 read2
28593748 + 0 properly paired (99.14%:-nan%)
28814004 + 0 with itself and mate mapped
10857 + 0 singletons (0.04%:-nan%)
165631 + 0 with mate mapped to a different chr
131466 + 0 with mate mapped to a different chr (mapQ>=5)
BGI BAM file:
[gen-biorep@gen-biorep-3 result_alignment]$ samtools flagstat VA6.rmdup.bam
57668392 + 0 in total (QC-passed reads + QC-failed reads)
3039053 + 0 duplicates
57369744 + 0 mapped (99.48%:-nan%)
57668392 + 0 paired in sequencing
28834196 + 0 read1
28834196 + 0 read2
56991680 + 0 properly paired (98.83%:-nan%)
57273460 + 0 with itself and mate mapped
96284 + 0 singletons (0.17%:-nan%)
198072 + 0 with mate mapped to a different chr
181774 + 0 with mate mapped to a different chr (mapQ>=5)
Why do half of my reads disappear? How can I solve this problem?
Thank you,
Omri
Hello nayshool!
It appears that your post has been cross-posted to another site: SeqAnswers (where I happen to have already replied).
This is typically not recommended as it runs the risk of annoying people in both communities.
i.e., a duplicate of http://seqanswers.com/forums/showthread.php?t=45597
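One thing worth checking, offered as a guess rather than a diagnosis: `bwa mem` is documented as taking one reads file and an optional mates file (`bwa mem ref.fa reads.fq [mates.fq]`), so with four FASTQ files on the command line it is possible that only the first two were aligned. That would fit the flagstat totals above (28842836 records versus 57668392). A minimal sketch for counting the reads in each input, so the sums can be compared against the BAM totals; the helper name `count_fastq_reads` is made up for illustration, and it assumes plain four-line FASTQ records:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count reads in a gzipped FASTQ file.
# Assumes standard four-line records (header, sequence, '+', qualities).
count_fastq_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Print one "file<TAB>read count" line per FASTQ given on the command line.
for fq in "$@"; do
    printf '%s\t%s\n' "$fq" "$(count_fastq_reads "$fq")"
done
```

Saved as, say, `count_reads.sh` (a hypothetical name), it could be run as `bash count_reads.sh file1.fq.gz file2.fq.gz file3.fq.gz file4.fq.gz`. If the first two files alone add up to roughly the record count of `result.bam`, aligning each read pair separately and merging the sorted BAM files would be the next thing to try.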