Question

large bam file and get small mpileup file

0


9.6 years ago

bingnas ▴ 10

Hello all

I have a BAM file (file.bam, about 4.6 GB) that I converted to a pileup file (file.pileup, about 8.3 GB) with samtools:

samtools mpileup -B -f genome.fa file.bam > file.pileup

Then I used VarScan to call variants:

java -Xmx2g -jar $VARSCAN_DIR/VarScan.v2.3.7.jar mpileup2snp file.pileup --min-coverage 10 --min-base-qual 30 --output-vcf 1 > sample1.vcf

But the resulting sample1.vcf seems too small: it is only 11,450 KB.

Does anyone know how I can check that the BAM file is suitable for generating the pileup file?

Also, how can I tell whether the pileup file is good input for VarScan?

Thank you in advance for your help

SNP • 3.4k views

ADD COMMENT • link updated 23 months ago by Ram 44k • written 9.6 years ago by bingnas ▴ 10

0


I am now satisfied with the result. Thank you both so much.

ADD REPLY • link 9.6 years ago by bingnas ▴ 10

0


It could also be that most bases fall below the filters (--min-coverage 10 and a Phred score of 30), as Devon pointed out... just guessing from the filters. Did you run QC?
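As a rough sketch of what a basic QC pass on the BAM file could look like, the snippet below wraps two standard samtools subcommands, `quickcheck` and `flagstat`. The function name `bam_qc` is hypothetical and only for illustration; it assumes samtools is on the PATH.

```shell
# Hypothetical helper: basic sanity checks on a BAM file before running mpileup.
bam_qc() {
    bam="$1"
    # quickcheck exits non-zero if the file looks invalid at the start
    # or is missing its end-of-file block (for example, a truncated copy).
    samtools quickcheck "$bam" || { echo "BAM failed quickcheck: $bam" >&2; return 1; }
    # flagstat prints counts of total, mapped, duplicate and properly paired reads.
    samtools flagstat "$bam"
}
```

Calling `bam_qc file.bam` and checking that the proportion of mapped reads is plausible is one way to get some confidence in the input before generating the pileup.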

ADD REPLY • link 7.6 years ago by cpad0112 21k

Ram · Accepted Answer · 2015-05-12

1


9.6 years ago

Ashutosh Pandey 12k

The file sizes look reasonable to me. The pileup file is typically significantly bigger than the BAM file, and the VCF file is much smaller than both the BAM and the pileup file. I don't see any evident problem here. Perhaps the threshold of 10 reads to call SNPs is too stringent, and as a result you are not getting many variants.
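To see whether the coverage threshold is what limits the calls, one option is to count how many pileup positions actually reach that depth. This is a minimal sketch, assuming a single-sample pileup in which column 4 is the read depth; the function name `depth_summary` is made up for this example.

```shell
# Hypothetical helper: report how many pileup positions meet a minimum depth.
# Assumes single-sample samtools mpileup output, where column 4 is the depth.
depth_summary() {
    pileup="$1"
    min_cov="${2:-10}"   # default matches --min-coverage 10 from the question
    awk -F '\t' -v min="$min_cov" '
        { total++; if ($4 >= min) pass++ }
        END { printf "%d of %d positions have depth >= %d\n", pass, total, min }
    ' "$pileup"
}
```

Running `depth_summary file.pileup 10` gives a quick indication: if only a small fraction of positions pass, a low variant count is expected from the filter alone.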

ADD COMMENT • link updated 23 months ago by Ram 44k • written 9.6 years ago by Ashutosh Pandey 12k

0


I agree completely. I'll add that another possibility is that there simply aren't many variants versus the reference in the sample being looked at. The simplest way to determine all of this is just to look through the data a bit.
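One simple way to start looking through the data is to count the variant records in the VCF rather than judging by file size. The sketch below counts every line that is not a header (header lines start with `#`); the function name `count_variants` is hypothetical.

```shell
# Hypothetical helper: count variant records in a VCF, skipping '#' header lines.
count_variants() {
    grep -vc '^#' "$1"
}
```

For example, `count_variants sample1.vcf` prints the number of called sites, which can then be compared against what is expected for the organism and the sequencing depth.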

ADD REPLY • link updated 23 months ago by Ram 44k • written 9.6 years ago by Devon Ryan 105k