Marking Duplicates And Comparing 2 Samples?
12.8 years ago
Angel ▴ 220

Hey,

Two follow-up questions for BioStar:

1) I used Picard "MarkDuplicates" to mark duplicates and named the output BAM "marked.bam". I then used snpEff to annotate SNPs, indels, etc. The file written from marked.bam is smaller than the one from the unmarked BAM (about 5000 rows fewer). So I am assuming snpEff is not taking PCR duplicates into account. Is this correct?

2) Now I want to compare the snpEff results between two samples. Any recommendations for software to do this? I will obviously compare SNPs, indels, etc.

Thanks very much again. *[Edited]

exome statistics • 2.8k views

snpeff does not call SNPs, it annotates them. Are you missing a step (calling variants) in the explanation?


Sorry ... thanks for correcting me. I am using vcftools to call variants after marking duplicates, and snpEff to annotate the variants.


Hey,

Since no one replied, I tried converting the snpEff output to BED format and using "subtractBed" to find the differences between the two samples. Is this the only way, or the right way, to do it?

The differences are many, so how would I find something that is biologically meaningful?
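One way to picture the comparison described above is as a set operation on variant keys rather than on intervals. The following is only a minimal sketch, assuming both samples are available as plain-text VCF with the standard tab-separated columns. The function names `load_variant_keys` and `compare_samples` and the example records are made up for illustration and are not part of snpEff, vcftools, or bedtools.

```python
import io


def load_variant_keys(handle):
    """Collect (chrom, pos, ref, alt) keys from VCF-formatted text lines."""
    keys = set()
    for line in handle:
        # Skip blank lines, meta lines (##...) and the header line (#CHROM...).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # A record may list several comma-separated ALT alleles; key each one.
        for alt in alts.split(","):
            keys.add((chrom, pos, ref, alt))
    return keys


def compare_samples(keys_a, keys_b):
    """Split variant keys into those private to each sample and those shared."""
    return {
        "only_a": keys_a - keys_b,
        "only_b": keys_b - keys_a,
        "shared": keys_a & keys_b,
    }


# Hypothetical records, used only to show the behaviour.
HEADER = "##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
sample_a = io.StringIO(
    HEADER
    + "chr1\t100\t.\tA\tG\t50\tPASS\t.\n"
    + "chr1\t200\t.\tC\tT\t60\tPASS\t.\n"
    + "chr2\t300\t.\tG\tGA\t40\tPASS\t.\n"
)
sample_b = io.StringIO(
    HEADER
    + "chr1\t200\t.\tC\tT\t55\tPASS\t.\n"
    + "chr2\t300\t.\tG\tGAA\t45\tPASS\t.\n"
    + "chr3\t400\t.\tT\tC\t70\tPASS\t.\n"
)

result = compare_samples(load_variant_keys(sample_a), load_variant_keys(sample_b))
for name in ("only_a", "only_b", "shared"):
    print(name, sorted(result[name]))
```

Keying on the allele as well as the position means the two different insertions at chr2:300 are reported as sample-specific, whereas a coordinate-only interval subtraction would treat them as the same site. For real data, dedicated tools such as `bcftools isec` or the `--diff` options of vcftools are likely a better fit than a hand-rolled script, and indels usually need normalizing before they are compared.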


You need to rewrite your question and make sure you explain each step; I couldn't understand it. In the comments you mentioned that you used vcftools to call SNPs. Do you mean bcftools from samtools, which takes the mpileup output from samtools and does the variant calling? Also, snpEff's input doesn't carry PCR duplicate information, as that information is lost by that point.
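To make the point about where duplicate information lives more concrete: duplicate marking is recorded per read in the BAM, as bit 0x400 of the SAM FLAG field, and by default Picard MarkDuplicates only sets that bit without removing the reads. A VCF has no equivalent per-read field, so any effect of duplicates has to come from how the variant caller treats flagged reads. The sketch below is an illustration only, assuming the alignments have been dumped to SAM text (for example with `samtools view -h`); the helper names and the three example reads are hypothetical.

```python
DUPLICATE_FLAG = 0x400  # SAM FLAG bit meaning "PCR or optical duplicate"


def is_duplicate(sam_line):
    """Return True if the alignment line has the duplicate bit set."""
    flag = int(sam_line.split("\t")[1])  # FLAG is the second SAM column
    return bool(flag & DUPLICATE_FLAG)


def count_duplicates(sam_lines):
    """Return (total alignments, alignments flagged as duplicates)."""
    total = 0
    duplicates = 0
    for line in sam_lines:
        # Header lines start with '@'; skip them and any blank lines.
        if not line.strip() or line.startswith("@"):
            continue
        total += 1
        if is_duplicate(line):
            duplicates += 1
    return total, duplicates


# Hypothetical SAM records; read2 has FLAG 1123 = 99 + 1024 (duplicate bit set).
records = [
    "@HD\tVN:1.4\tSO:coordinate",
    "read1\t99\tchr1\t100\t60\t50M\t=\t200\t150\t*\t*",
    "read2\t1123\tchr1\t100\t60\t50M\t=\t200\t150\t*\t*",
    "read3\t147\tchr1\t200\t60\t50M\t=\t100\t-150\t*\t*",
]

total, duplicates = count_duplicates(records)
print(f"{duplicates} of {total} alignments are flagged as duplicates")
```

Running a count like this on marked.bam and on the unmarked BAM would show whether marking changed the flags at all. Whether flagged reads are then skipped depends on the variant caller and its settings, so it is worth checking the documentation of the caller actually used.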
