12.8 years ago
Angel
Hey,
Two follow-up questions for BioStar:
1) I used Picard "MarkDuplicates" to mark duplicates and named the resulting BAM "marked.bam". I then used snpEff to annotate SNPs, indels, etc. The output produced from marked.bam is smaller than the one from the unmarked BAM (about 5000 fewer rows). So I am assuming snpEff is not taking PCR duplicates into account. Is this correct?
2) Now I want to compare the snpEff results between two samples. Any recommendations for software to do this? I will obviously compare SNPs, indels, etc.
Thanks very much again. *[Edited]
snpeff does not call SNPs, it annotates them. Are you missing a step (calling variants) in the explanation?
Sorry, and thanks for correcting me. I am using vcftools to call variants after marking duplicates, and snpEff to annotate the variants.
Hey,
Since no one replied, I tried converting the snpEff output to BED format and using "subtractBed" to find the differences between the two samples. Is this the only way, or the right way, to do it?
The differences are many, so how would I find something that is biologically meaningful?
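For what it's worth, subtracting BED intervals compares positions only, so two samples with different alleles at the same site would look identical. A rough sketch of an allele-aware comparison in Python follows. It assumes plain tab-separated VCF lines, and the function names (`read_variants`, `compare`) and the example records are made up for illustration, not taken from any tool mentioned in this thread.

```python
from typing import Dict, Iterable, Set, Tuple

# A variant is identified by chromosome, position, reference and alternate allele.
Variant = Tuple[str, int, str, str]


def read_variants(lines: Iterable[str]) -> Set[Variant]:
    """Collect (chrom, pos, ref, alt) keys from VCF-formatted lines."""
    variants: Set[Variant] = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # A record may list several ALT alleles separated by commas.
        for alt in alts.split(","):
            variants.add((chrom, pos, ref, alt))
    return variants


def compare(a: Set[Variant], b: Set[Variant]) -> Dict[str, Set[Variant]]:
    """Split variants into those unique to each sample and those shared."""
    return {"only_a": a - b, "only_b": b - a, "shared": a & b}


# Hypothetical records standing in for two annotated VCF files.
sample_a = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tT\t60\tPASS\t.",
    "chr2\t300\t.\tG\tGA\t40\tPASS\t.",
]
sample_b = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t55\tPASS\t.",
    "chr1\t200\t.\tC\tA\t60\tPASS\t.",
]

result = compare(read_variants(sample_a), read_variants(sample_b))
for name in ("only_a", "only_b", "shared"):
    print(name, sorted(result[name]))
```

With real data you would pass open file handles instead of the lists. Note that chr1:200 ends up in both "only" sets here because the ALT alleles differ, which a position-only subtraction would miss.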
You need to rewrite your question and make sure you explain each step; I couldn't understand it. In the comments you mentioned that you used vcftools to call SNPs. Do you mean bcftools from samtools, which takes the mpileup output from samtools and does the variant calling? Also, the snpEff input doesn't carry PCR duplication information, as that information is lost by that stage.
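To illustrate the last point: the duplicate mark lives on each read in the BAM, as bit 0x400 of the FLAG field defined in the SAM specification, and a VCF record has no per-read flags at all. So any effect of MarkDuplicates has to come from the variant caller; as far as I know, samtools mpileup skips duplicate-flagged reads by default. A small sketch of reading that bit from SAM text lines (for example the output of `samtools view marked.bam`); the records below are invented for the example.

```python
DUP_FLAG = 0x400  # SAM FLAG bit: read is a PCR or optical duplicate


def is_duplicate(sam_line: str) -> bool:
    """Return True if the SAM record has the duplicate bit set."""
    flag = int(sam_line.split("\t")[1])  # FLAG is the second column
    return bool(flag & DUP_FLAG)


# Hypothetical SAM records; the second one carries the duplicate bit (99 + 1024).
sam_records = [
    "read1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "read2\t1123\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "read3\t0\tchr1\t500\t60\t50M\t*\t0\t0\t*\t*",
]

n_dup = sum(is_duplicate(rec) for rec in sam_records)
print(f"{n_dup} of {len(sam_records)} reads are flagged as duplicates")
```

Counting flagged reads like this on marked.bam is a quick sanity check that the marking step did something before you compare the downstream variant calls.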