I have a dataset of 130 WGR samples, but SNP calling keeps getting ‘Killed’ partway through. A similar issue posted previously suggested a problem with either the code or the BAM files, so I investigated both. First I reduced the input to 2 samples and reran the code; the process finished successfully. I then checked the BAM files themselves: I ran multiple sets of 50 with different samples, and they all finished successfully. The ‘Killed’ error reappeared when I increased the sample size to 75.
Here is my code and an example of the error:
angsd -doMaf 2 -GL 1 \
    -doMajorMinor 1 -bam bam.filelist \
    -out SNPcalling_Trial1 \
    -SNP_pval 1e-6 -minMapQ 30 -minQ 30 \
    -uniqueOnly 1 -remove_bads 1 -only_proper_pairs 1 \
    -minMaf 0.05 -minInd 120 -setMinDepthInd 5 \
    -geno_mindepth 5
The run ends with ‘Killed’:
-> angsd version: 0.933 (htslib: 1.9) build(May 6 2020 21:25:11)
-> NB: you are doing qscore distribution, set -minQ to zero if you want to count qscores below: 30
-> SNP-filter using a pvalue: 1.000000e-06 correspond to 23.928127 likelihood units
[bammer_main] 130 samples in 135 input files
-> Parsing 130 number of samples
-> Allocated ~ 10 million nodes to the nodepool, this is not an estimate of the memory usage
--
-> Allocated ~ 190 million nodes to the nodepool, this is not an estimate of the memory usage
-> Printing at chr: scaffold_81380 pos:134963 chunknumber 1840400 contains 461 sites
Killed
I am running this on Linux with a ulimit -n of 1024. Is it a memory issue? Please advise on how to proceed. Thanks!
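(Not an ANGSD-specific answer, but for what it's worth: a bare ‘Killed’ with no error from the program itself usually means the kernel OOM killer ended the process, which the kernel log can confirm. And since 50 samples finishes while 75 does not, peak memory plausibly scales with sample count. A rough sketch — the peak-RSS number below is made up for illustration; measure your own completed run with `/usr/bin/time -v`:)

```shell
# If the OOM killer fired, it is normally logged; check with one of:
#   dmesg | grep -i 'out of memory'
#   journalctl -k | grep -i oom
# Note: ulimit -n (open files) is a separate limit; the one that matters
# for 'Killed' is available RAM.

# Hypothetical peak memory of a run that completed (e.g. 50 samples),
# taken from `/usr/bin/time -v angsd ...` -> "Maximum resident set size".
peak_gb_50=40     # GB, made-up number for illustration
samples=130

# Crude linear extrapolation to the full sample set:
awk -v p="$peak_gb_50" -v n="$samples" \
    'BEGIN { printf "~%.0f GB needed\n", p * n / 50 }'
# -> prints "~104 GB needed"
```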
Hi, I'm facing the same problem with ANGSD, and I have just 32 samples. Have you found any solution? Thanks!
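For anyone hitting this: ANGSD's memory footprint grows with both sample count and the number of buffered sites, and a common workaround is to run it in batches of scaffolds via `-rf` and concatenate the per-region outputs afterwards. Below is a minimal dry-run sketch, with a toy `.fai` standing in for the real `samtools faidx` index — all file names are hypothetical, the exact region-line syntax should be checked against your ANGSD version's `-r`/`-rf` documentation, and the `echo` makes it a dry run (drop it to actually execute):

```shell
# Toy stand-in for the reference index; in practice use the .fai produced by
# `samtools faidx ref.fa` for your genome.
printf 'scaffold_1\t100\t5\t60\t61\nscaffold_2\t100\t5\t60\t61\n' > ref.fa.fai

# One regions file per N scaffolds (N=1 here for the toy data; use a few
# thousand per batch for a real fragmented assembly).
cut -f1 ref.fa.fai | split -l 1 - regions_

# Dry run: print the per-chunk angsd command instead of executing it.
for rf in regions_*; do
  echo angsd -doMaf 2 -GL 1 -doMajorMinor 1 -bam bam.filelist \
       -rf "$rf" -SNP_pval 1e-6 -minMapQ 30 -minQ 30 \
       -out "SNPcalling_${rf}"
done
```

Each chunk then holds only its own scaffolds in memory, so peak usage stays bounded regardless of how fragmented the assembly is.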