Hello, in my project I am going to compare bioinformatics tools for tumor clonal evolution analysis (sciClone and others). I have paired-end whole-exome sequencing results for three samples from one patient: Control, Primary and Relapse tumor. First I have to detect somatic mutations, and I am going to do that with VarScan. However, I don't quite understand which processing pipeline I should choose, starting from the step of creating .mpileup files with Samtools.
I've found some possibilities, which include:
-creating .mpileup files separately for the normal and tumor samples, and then passing both of them as input to VarScan, as follows (in short):
samtools mpileup -f hg19.fa normal.bam > normal.bam.mpileup
samtools mpileup -f hg19.fa tumor.bam > tumor.bam.mpileup
java -jar VarScan.jar somatic normal.bam.mpileup tumor.bam.mpileup --output-snp snp --output-indel indel
(source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3971343/)
-creating a single .mpileup file for the paired normal-tumor samples, and then giving the result as the input to VarScan:
samtools mpileup -f reference.fasta normal.bam tumor.bam > normal-tumor.mpileup
java -jar VarScan.jar somatic normal-tumor.mpileup output.basename
(source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4278659/)
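To make the comparison concrete, here is a minimal dry-run sketch of how the two schemes might be laid out for my three samples. It only prints the commands and runs nothing. The file names control.bam, primary.bam and relapse.bam and all output names are hypothetical, and the sketch assumes that Control can serve as the normal sample and that each tumor sample is compared against it separately. The commands themselves are copied from the two sources above, so any extra options VarScan may need are not shown.

```shell
#!/bin/sh
# Dry run only: prints the commands for both schemes, executes nothing.
# control.bam, primary.bam, relapse.bam and all output names are hypothetical.
REF=hg19.fa
NORMAL=control.bam   # assumption: the Control sample is used as the normal

# Scheme 1: one pileup per sample, two pileup files passed to VarScan.
scheme_separate() {
    tumor=$1
    base=${tumor%.bam}   # strip the .bam suffix for output names
    echo "samtools mpileup -f $REF $NORMAL > $NORMAL.mpileup"
    echo "samtools mpileup -f $REF $tumor > $tumor.mpileup"
    echo "java -jar VarScan.jar somatic $NORMAL.mpileup $tumor.mpileup --output-snp $base.snp --output-indel $base.indel"
}

# Scheme 2: one combined normal-tumor pileup passed to VarScan as a single file.
scheme_combined() {
    tumor=$1
    base=${tumor%.bam}
    echo "samtools mpileup -f $REF $NORMAL $tumor > control-$base.mpileup"
    echo "java -jar VarScan.jar somatic control-$base.mpileup $base.output"
}

# One normal-tumor comparison per tumor sample.
for tumor in primary.bam relapse.bam; do
    scheme_separate "$tumor"
    scheme_combined "$tumor"
done
```

Written this way, the only visible difference is whether VarScan receives two pileup files or one combined file. The BAM inputs and the reference are the same in both cases.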
My problem is that I don't quite understand which way I should choose. Could someone please explain to me what the difference between these two schemes is? I am a complete beginner in NGS data analysis, so the simpler the words, the better :) And just to be sure: can I treat my "Control" sample as the normal sample?
I would appreciate any help.
Kind regards, Agata
So does the term "-mpileup" instruct it to take the output of the mpileup before the pipe and use it at that specific place in the varscan command? I'm confused because I'm reading in other places that you can just use a single "-" when piping with SAMtools and it will know to take it as the output of the previous command. Would both "-" and "-mpileup" work in this case?
Where does the argument for the output file go?