19 months ago
mathalfilip
I am trying to run DownsampleSam with Picard version 2.26.5 using the following script, and I get an error about Provider GCS.
Code:
import os,sys
from multiprocessing import Pool

# the original depth of NA12878 and YH-1
nadepth=812.40
yhdepth=407.25
work_dph=int(sys.argv[1])
napercent = [0.99,0.95,0.90,0.80,0.70,0.60]
yhpercent = [0.01,0.05,0.10,0.20,0.30,0.40]
#yhpercent = [0.05]
NAtotal='/media/marina/marina2TB/BIOFILES/bams/na.addRG.mdup.bam'
YHtotal='/media/marina/marina2TB/BIOFILES/bams/na.addRG.mdup.bam'
downsample="java -Xmx3000m -Djava.io.tmpdir=/media/marina/marina2TB/BIOFILES/tmp -jar /home/marina/opt/picard.jar DownsampleSam "
# Important! Random seed must NOT be changed! Or you won't get the same bam file!
seeds = {1:6666,2:8888,3:9999}

def extract(script):
    os.system(script)

p=Pool(18)
if sys.argv[2]=='NA':
    for i in napercent:
        na_dph = int(work_dph*i)
        p_na = na_dph/nadepth
        for a in seeds:
            rand_seed = seeds[a]
            script = "%s I=%s O=NA-%sX-%s.bam R=%s P=%s A=0.00000001"%(downsample,NAtotal,na_dph,a,rand_seed,p_na)
            p.apply_async(extract,args=(script,))
elif sys.argv[2]=='YH':
    for i in yhpercent:
        yh_dph = int(work_dph*i)
        p_yh = yh_dph/yhdepth
        for a in seeds:
            rand_seed = seeds[a]
            script = "%s I=%s O=YH-%sX-%s.bam R=%s P=%s A=0.00000001"%(downsample,YHtotal,yh_dph,a,rand_seed,p_yh)
            p.apply_async(extract,args=(script,))
p.close()
p.join()
Error:
Provider GCS is not available; Picard version: 2.26.5
1) I think this is just a warning.
2) There must be another error displayed.
3) Look at samtools view --subsample.
4) Don't use loops; use a workflow manager instead.
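
Points 2 and 3 could be sketched roughly as follows. This is only an illustration, not a drop-in replacement for the script in the question: the helper names `subsample_cmd` and `run_and_report` are made up here, the range check on the fraction is my own assumption, and it assumes a samtools recent enough to support `--subsample` and `--subsample-seed`. The idea is to build each command as an argument list and capture stderr and the exit code, so that whatever the real error is does not get lost when `os.system` runs inside a pool worker.

```python
import subprocess
import sys


def subsample_cmd(in_bam, out_bam, fraction, seed):
    """Build a `samtools view --subsample` command as an argument list.

    The strict (0, 1) range check is an assumption of this sketch,
    meant to catch a target depth above the original depth early.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be between 0 and 1, got %r" % fraction)
    return [
        "samtools", "view", "-b",           # write BAM output
        "--subsample", "%.6f" % fraction,   # fraction of reads to keep
        "--subsample-seed", str(seed),      # fixed seed for reproducibility
        "-o", out_bam,
        in_bam,
    ]


def run_and_report(cmd):
    """Run one command and return (exit code, captured stderr)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if proc.returncode != 0:
        # Print the failing command together with its stderr.
        sys.stderr.write(
            "FAILED (%d): %s\n%s\n" % (proc.returncode, " ".join(cmd), proc.stderr)
        )
    return proc.returncode, proc.stderr
```

With the numbers from the question, a call might look like `run_and_report(subsample_cmd(NAtotal, "NA-29X-1.bam", 29 / 812.40, 6666))`. If you keep the `Pool`, note that `apply_async` drops exceptions silently unless you keep the returned result objects and call `.get()` on them, so it is worth running a single command outside the pool first and reading its full output.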