filtering pileup file with samtools for SomaticSniper
2
0
Entering edit mode
9.2 years ago
TriS ★ 4.7k

Hi all

I'm working with matched tumor-normal samples and I want to call somatic variants using Somatic Sniper.

So far I got the main function working with

bam-somaticsniper -Q 40 -G -L -f genome/hg19.fa Results/kayrotypicT.bam Results/kayrotypicN.bam SomSniperResults.txt

and now I need to filter the results.

I'm following what the SomaticSniper guide says and running the pipeline following this guide on Synapse.

first question is:

The SomaticSniper guide says explicitly NOT to use mpileup,

You will also need to generate a samtools pileup (not mpileup) indel file.

So, an old post explains the difference..while I think samtools replaced pileup with mpileup, so, since mpileup should output pileup still, is mpileup fine then?

Since the example uses a -vcf output I'm creating the pileup and doing the filtering in the following way

# create pileup file 
samtools mpileup -A -B -vuf genomes/hg19.fa Results/kayrotypic.bam > SomSniperResults/var.raw.pileup

A snapshot of my output looks like this:

##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  RS-01926692_JC590_RS-01925823
chr1    10008   .       A       <X>     0       .       DP=1;I16=0,1,0,0,33,1089,0,0,0,0,0,0,0,0,0,0;QS=1,0;MQ0F=1      PL      0,3,4
chr1    10009   .       A       <X>     0       .       DP=1;I16=0,1,0,0,30,900,0,0,0,0,0,0,1,1,0,0;QS=1,0;MQ0F=1       PL      0,3,4
chr1    10010   .       C       <X>     0       .       DP=1;I16=0,1,0,0,33,1089,0,0,0,0,0,0,2,4,0,0;QS=1,0;MQ0F=1      PL      0,3,4

However, when I call

samtools.pl varFilter test.pileup | awk '$6>=20'

I get the following error:

Argument "DP=1;I16=0,1,0,0,33,1089,0,0,0,0,0,0,0,0,0,0;QS=1,0;MQ0F=1" isn't numeric in numeric lt (<) at samtools.pl line 117, <> line .

which comes from the fact that the 7th column should be numeric while what I have is the info from the vcf file.

I also tried not to output a vcf file and tried to use bcftools as:

samtools mpileup -A -B -guf /genomes/hg19.fa chr20_bam.bam > test.pileup
bcftools/bcftools view -Hgc test.pileup

[bcf_sync] incorrect number of fields (0 != 5) at 0:0

I looked at the post here but it didn't solve my problem since I'm not piping and I am testing only one small chr (chr20) from the bam file

So I'm not totally sure how to get it to work.

Thanks!

-- txt was edited with some updated info --

pileup SNP SomaticSniper samtools • 4.6k views
ADD COMMENT
1
Entering edit mode
9.1 years ago
TriS ★ 4.7k

alright, not sure ppl will have the same problem but here's what I think can be the solution:

1) I had IT install the newer samtools 1.2 + bcftools 1.2 on the server

2) using the following command line:

samtools mpileup -A -B -uf genomes/hg19.fa Results_Norm/norm.bam | bcftools call -c | vcfutils.pl varFilter -Q 20 | awk 'NR > 55 {print}'> indel_pileup_Norm.pileup
samtools mpileup -A -B -uf genomes/hg19.fa Results_Tum/Tum.bam | bcftools call -c | vcfutils.pl varFilter -Q 20 | awk 'NR > 55 {print}' > indel_pileup_Tum.pileup

I was able to get (hopefully) the correct files for normal and tumor samples to then use in SomaticSniper

ADD COMMENT
0
Entering edit mode

It works well. Thanks.

ADD REPLY
0
Entering edit mode
8.0 years ago
guangxujin • 0

It works well. Thanks.

ADD COMMENT

Login before adding your answer.

Traffic: 2734 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6