I've been given a set of three BAM files (father, mother, child), and I expect the child to contain some de novo heterozygous variants. samtools mpileup has been used to find the small variants, but what would be your protocol for extracting the larger de novo indels from the child?
How large are the indels you're looking for? If they're longer than the read length, you'll need a structural variation detector: Hydra, Pindel or Genome STRiP.
I would 1) combine those BAM files into a bamlist, and 2) run GATK over that bamlist, which produces a combined VCF file with alleles called in the proband and both parents. Then 3) annotate that VCF with SeattleSeq, using the VCF (indels only) option. You should get an annotated VCF that you can then parse, pulling out the indels present only in the proband and not in the parents. If you have SNP array data on the family, that can be used as validation.
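The final parsing step could look roughly like the sketch below. It is only an illustration in plain Python, not part of GATK or SeattleSeq: the function names and the sample column names you pass in are made up, and it assumes a standard multi-sample VCF with a GT field. It keeps records where the child is heterozygous for an indel allele and both parents are homozygous reference.

```python
def allele_is_indel(ref, alt):
    # Length difference between REF and ALT marks an indel; symbolic
    # alleles such as <DEL> and the '*' placeholder are ignored here.
    return not alt.startswith("<") and alt != "*" and len(alt) != len(ref)


def genotype(sample_field, fmt_keys):
    # Return the allele indices from the GT sub-field, or None if any is missing.
    gt = sample_field.split(":")[fmt_keys.index("GT")]
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return [int(a) for a in alleles]


def denovo_indels(vcf_lines, child, father, mother):
    """Yield (chrom, pos, ref, alt) for candidate de novo indels.

    child, father and mother are the sample names used in the #CHROM
    header line of the VCF (hypothetical names, supplied by the caller).
    """
    col = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            col = {name: i for i, name in enumerate(fields)}
            continue
        if col is None:
            raise ValueError("no #CHROM header line before the first record")
        ref, alts = fields[3], fields[4].split(",")
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        c = genotype(fields[col[child]], fmt)
        f = genotype(fields[col[father]], fmt)
        m = genotype(fields[col[mother]], fmt)
        if c is None or f is None or m is None:
            continue  # skip sites with a missing call in any trio member
        child_is_het = len(set(c)) == 2
        child_has_indel = any(
            allele_is_indel(ref, alts[a - 1]) for a in set(c) if a > 0
        )
        parents_hom_ref = set(f) == {0} and set(m) == {0}
        if child_is_het and child_has_indel and parents_hom_ref:
            yield fields[0], int(fields[1]), ref, fields[4]
```

You would call it as `denovo_indels(open("trio.vcf"), "child", "father", "mother")`, where the file name and sample names are placeholders. A real filter would also want thresholds on depth and genotype quality in all three samples, since an undercalled parent looks identical to a de novo event.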
Oops, sorry Pierre, I did not spot your reply. I assume that means 30x depth overall, so about 15x per chromosome/allele/haplotype.
If you want larger indels, then I think assembly is the only reliable way to call them (you will have to adjust for my bias here, as I write and maintain a variant assembler). You should have plenty of coverage to do it in this situation. There are a few options:
1) My own tool, Cortex: dump the reads as FASTQ and pass them to Cortex. I think you will get better results if you do the whole genome, as you avoid errors in the mapper, but that will need ~100 GB of RAM. You can jointly assemble the entire trio and directly compare their genomes.
2) Jared Simpson's SGA assembler, which also calls variants.
3) Heng Li's fermi assembler, which can also call variants; it was recently published in Bioinformatics, I think.
I only have experience of option 1, I'm afraid, but I'm sure Jared, Heng and I would be happy to field further questions about details.
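To give a feel for what "directly compare their genomes" means, here is a toy sketch in Python. It is not how Cortex, SGA or fermi are implemented: it just collects the k-mers seen in the child's reads and subtracts those seen in either parent, so what remains are sequences private to the child. The function names and the value of k are assumptions made for illustration, and it ignores reverse complements, sequencing errors and k-mer counts, all of which a real assembler has to handle.

```python
def kmers(reads, k):
    # Collect every substring of length k from every read.
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            seen.add(read[i:i + k])
    return seen


def child_private_kmers(child_reads, father_reads, mother_reads, k=4):
    # k-mers present in the child but absent from both parents;
    # clusters of these point at candidate de novo variation.
    return kmers(child_reads, k) - kmers(father_reads, k) - kmers(mother_reads, k)


# Toy example: the child carries a single inserted T relative to both parents.
father = ["AAAACCCCGGGG"]
mother = ["AAAACCCCGGGG"]
child = ["AAAACCTCCGGGG"]
private = child_private_kmers(child, father, mother, k=4)
# private == {"ACCT", "CCTC", "CTCC", "TCCG"}
```

All of the private k-mers overlap the inserted base, which is why this kind of comparison can localize a variant without relying on the mapper.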
For indels longer than the read length, an assembler (Cortex, SGA or fermi) is an alternative to a structural variation detector.