Problem Analyzing Tumor-Normal Pairs With VarScan
12.3 years ago
tommivat ▴ 250

I am using VarScan 2.2.11 to analyze a tumor-normal pair of whole-exome NGS data. I used the 1000 Genomes reference genome (from here), and my .bam files are sorted. I use a shell script similar to this to call VarScan and get the following summary after my run:

2 015 987 104 positions in tumor
2 015 519 315 positions shared in normal
   90 913 571 had sufficient coverage for comparison
            0 were called Reference
          496 were mixed SNP-indel calls and filtered
   90 834 408 were called Germline
            0 were called LOH
            0 were called Somatic
       78 667 were called Unknown
            0 were called Variant

Obviously something is wrong, since almost all positions with sufficient coverage are called Germline and none are called Reference. Can you point out what I am doing wrong, please?

varscan cancer samtools next-gen • 4.0k views

I suspect that the reference used in the alignment (which I didn't do myself) and the one used in the paired analysis have to be the same. In this case, I do not (yet) know which reference was used in the alignment. Can you confirm whether this could cause the problem above?


The header of your BAM file should contain information about which reference was used for alignment.
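As a rough sketch of where to look in the header: the SAM format allows optional `UR:` (sequence URI) and `AS:` (assembly name) tags on `@SQ` lines, and aligners often record their command line on `@PG` lines. The helper name `ref_hints` and the file name in the usage comment are made up for illustration, and any of these tags may simply be absent from a given BAM.

```shell
# Read a SAM header on stdin and print the lines that may name the reference:
# whole @PG lines (these usually carry the aligner command line) and any
# UR:/AS: tags found on @SQ lines. All of these tags are optional.
ref_hints() {
    awk -F'\t' '
        $1 == "@PG" { print; next }
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^(UR|AS):/) print $i
        }'
}

# Usage with a placeholder file name:
#   samtools view -H tumor.bam | ref_hints | sort -u
```

If nothing is printed, the header carries no explicit hint, and the contig names and lengths on the `@SQ` lines are the only clue left.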

12.3 years ago

You most likely need to realign one or both of your samples to the same reference. To check whether the BAM files actually use different references:

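# Extract the sequence dictionary (@SQ header lines) from each BAM
samtools view -H normal.bam | grep "^@SQ" > normal
samtools view -H tumor.bam | grep "^@SQ" > tumor
# Show the two dictionaries side by side; differing lines are flagged
diff -y normal tumor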

The above is just a simple way of grabbing the sequence dictionary from the header and comparing. If there are any differences between your files, then you will need to align them both to the same reference.


Thank you for the answer! There are no differences in the headers. Do you still think I need to realign both samples to my current reference? How is such a realignment done? (I realize this must be a simple task, but I haven't worked with samtools much.)


I think that you are probably using a reference genome sequence that does not match what your two BAM files were aligned to. Try tracking down the reference that your BAM files were aligned to and specifying that as the reference for VarScan2.
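One way to test a candidate reference, sketched under assumptions: compare the contig names and lengths in the BAM header with those in the FASTA index that `samtools faidx` writes. The helper name `sq_dict` and the file names in the usage comments are placeholders, not anything from the original post. Matching names and lengths are necessary but do not prove the sequences are identical.

```shell
# Print "name<TAB>length" for every @SQ line of a SAM header read from stdin.
sq_dict() {
    awk -F'\t' '$1 == "@SQ" {
        name = ""; len = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^SN:/) name = substr($i, 4)   # contig name
            if ($i ~ /^LN:/) len  = substr($i, 4)   # contig length
        }
        print name "\t" len
    }'
}

# Usage with placeholder file names:
#   samtools faidx reference.fa                       # writes reference.fa.fai
#   samtools view -H tumor.bam | sq_dict | sort > bam.dict
#   cut -f1,2 reference.fa.fai | sort > ref.dict      # .fai columns 1-2: name, length
#   diff bam.dict ref.dict && echo "names and lengths match"
```

A mismatch here (for example `chr1` versus `1`, or different contig lengths) would suggest that the FASTA is not the one the reads were aligned to.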


This solved the problem. Thanks for the help!
