Somatic indel calling with Pindel
9.4 years ago
sichan ▴ 90

Hello,

I noticed that Pindel was previously reported to not be able to call somatic indels without post-processing, e.g. see these posts:

However, I noticed that the latest version of Pindel (0.2.5b1) has the --NormalSamples option. According to the usage prompt, this flag turns on germline variant filtering:

-N/--NormalSamples
Turn on germline filtering, less sensistive and you may miss somatic calls (default false)

I run Pindel as follows:

pindel -f hg19.fa -i bam_config_file.txt -o Test1 --NormalSamples

My bam configuration file looks like:

/path/to/tumor/bam insert_size Tumor
/path/to/normal/bam insert_size Normal

However, the output still contains germline indels. So perhaps the --NormalSamples option hasn't been fully implemented yet, and I'll still need to post-process the results to get somatic indels from Pindel?

Thanks for any suggestions.

5.5 years ago

A late answer but I am currently using Pindel.

Current versions of Pindel (as of June 8th, 2019) do perform somatic indel calling.

Here is my config file:

BB40.bam        150 Tumor
BM_Control.bam  150 Normal

Here are my commands to loop through each chromosome (mouse; excluding sex chromosomes):

mkdir -p pindel/ ;

for chr in {1..19}; do
  # One output directory per chromosome; -p avoids an error on re-runs
  mkdir -p pindel/chr"${chr}"/

  /Programs/pindel/pindel \
    --fasta /ReferenceMaterial/mm10/mm10.fasta \
    --config-file pindelConfigs/BM39.txt \
    --chromosome chr"${chr}" \
    --number_of_threads 2 \
    --min_num_matched_bases 30 \
    --report_inversions TRUE \
    --min_inversion_size 50 \
    --report_duplications TRUE \
    --report_long_insertions TRUE \
    --report_breakpoints TRUE \
    -o pindel/chr"${chr}"/ ;

  /Programs/pindel/pindel2vcf \
    --pindel_output_root pindel/chr"${chr}"/ \
    --reference /ReferenceMaterial/mm10/mm10.fasta \
    --reference_name mm10 \
    --reference_date 99999999 \
    --min_coverage 1 \
    --het_cutoff 0.01 \
    --hom_cutoff 0.99 \
    --vcf pindel/chr"${chr}"/chr"${chr}".vcf ;

  bgzip -f pindel/chr"${chr}"/chr"${chr}".vcf ;

  tabix -f -p vcf pindel/chr"${chr}"/chr"${chr}".vcf.gz ;

  /Programs/bcftools-1.9/bcftools view -Ov --exclude-uncalled --min-ac=1 \
    pindel/chr"${chr}"/chr"${chr}".vcf.gz \
    > pindel/chr"${chr}"/chr"${chr}".filt.vcf ;
done

The resulting VCFs contain the calls for both the tumor and the normal sample, and indicate whether each variant is present or absent in each of them.
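
If you then want only the candidate somatic calls from such a two-sample VCF, one possible post-processing step is sketched below. This is not part of the workflow above, just a minimal illustration under some assumptions: the sample columns are named Tumor and Normal (the tags used in the config file), GT is the first FORMAT sub-field, and the function name somatic_candidates is made up for this example. It looks the sample columns up by name rather than by position, because the column order in the VCF is not guaranteed here.

```shell
# Sketch: keep VCF records where the tumor sample carries a non-reference
# allele and the normal sample is called homozygous reference.
# Assumptions (not from the original post): sample columns are named after
# the config-file tags ("Tumor", "Normal"), and GT is the first FORMAT
# sub-field. The function name is hypothetical.
somatic_candidates() {
  local tumor="${1:-Tumor}" normal="${2:-Normal}"
  awk -F'\t' -v tumor="$tumor" -v normal="$normal" '
    /^##/ { print; next }                      # pass meta-information lines through
    /^#CHROM/ {
      for (i = 10; i <= NF; i++) {             # locate the sample columns by name
        if ($i == tumor)  t = i
        if ($i == normal) n = i
      }
      print; next
    }
    t && n {
      split($t, tg, ":"); split($n, ng, ":")   # first sub-field is the genotype
      tumor_alt  = (tg[1] ~ /[1-9]/)           # any non-reference allele index
      normal_ref = (ng[1] == "0/0" || ng[1] == "0|0")
      if (tumor_alt && normal_ref) print
    }
  '
}
```

It reads a VCF on standard input, for example: somatic_candidates Tumor Normal < pindel/chr1/chr1.filt.vcf > pindel/chr1/chr1.somatic.vcf. Keep in mind that a 0/0 genotype in the normal may only reflect low coverage there, so it is worth checking the read support in the normal before trusting a call as somatic.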

Kevin


thanks for this, exactly what I was looking for


What if I have multiple pairs of data? Do I need to generate a separate config file for each pair?

Regards, Najeeb


"Do I need to generate a separate config file for each pair?"

If you want separate somatic calls for each pair, then I believe so, yes.

I am not 100% confident, though, and Pindel's functionality changes depending on the version. Please review all command-line options for the version that you are using and, if possible, perform some tests with a few BAM files.
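
If you do go with one config file per pair, writing them by hand gets tedious, so here is a minimal sketch of how they could be generated. Everything in it is an assumption for illustration rather than something Pindel provides: the tab-separated pairs table (columns: pair ID, tumor BAM, normal BAM, insert size), the function name write_pair_configs, and the output file naming. The config layout itself (BAM path, insert size, tag) follows the examples earlier in this thread.

```shell
# Sketch: write one Pindel config file per tumor/normal pair.
# Assumptions (hypothetical, not from the thread): a tab-separated table
# with the columns pair_id, tumor_bam, normal_bam, insert_size, and one
# output file named <pair_id>.txt per row.
write_pair_configs() {
  local pairs_tsv="$1" out_dir="$2"
  local tab
  tab="$(printf '\t')"
  mkdir -p "$out_dir"
  while IFS="$tab" read -r pair_id tumor_bam normal_bam insert_size; do
    [ -z "$pair_id" ] && continue              # skip blank lines
    # Two lines per config: tumor first, then its matched normal
    printf '%s\t%s\tTumor\n%s\t%s\tNormal\n' \
      "$tumor_bam" "$insert_size" "$normal_bam" "$insert_size" \
      > "$out_dir/$pair_id.txt"
  done < "$pairs_tsv"
}
```

For example, write_pair_configs pairs.tsv pindelConfigs would create pindelConfigs/<pair_id>.txt for each row, and each of those files can then be passed to a separate Pindel run via --config-file.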
