ashaneev07 · 6.0 years ago
Hi, I got the following error while running Picard's MarkDuplicates. Does anyone have experience with this command? I need help.
> java -jar picard.jar MarkDuplicates I=300BP.sorted O=marked_duplicates_300.bam M=marked_dup_metrics.txt REMOVE_DUPLICATES=true &
********** NOTE: Picard's command line syntax is changing.
**********
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
**********
********** The command line looks like this in the new syntax:
**********
********** MarkDuplicates -I /300BP.sorted -O marked_duplicates_300.bam -M marked_dup_metrics.txt -REMOVE_DUPLICATES true
**********
15:13:11.216 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/home/Documents/Tools_NGS/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Wed Nov 21 15:13:11 IST 2018] MarkDuplicates INPUT=[300BP.sorted] OUTPUT=marked_duplicates_300.bam METRICS_FILE=marked_dup_metrics.txt REMOVE_DUPLICATES=true MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 TAG_DUPLICATE_SET_MEMBERS=false REMOVE_SEQUENCING_DUPLICATES=false TAGGING_POLICY=DontTag CLEAR_DT=true ADD_PG_TAG_TO_READS=true ASSUME_SORTED=false DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates READ_NAME_REGEX=<optimized capture of last three ':' separated fields as numeric values> OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 MAX_OPTICAL_DUPLICATE_SET_SIZE=300000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Wed Nov 21 15:13:11 IST 2018] Executing as home@home-Lenovo-H30-50 on Linux 4.4.0-31-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_171-b11; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.18.14-SNAPSHOT
INFO 2018-11-21 15:13:11 MarkDuplicates Start of doWork freeMemory: 240890984; totalMemory: 251658240; maxMemory: 3720871936
INFO 2018-11-21 15:13:11 MarkDuplicates Reading input file and constructing read end information.
INFO 2018-11-21 15:13:11 MarkDuplicates Will retain up to 13481420 data points before spilling to disk.
[Wed Nov 21 15:13:13 IST 2018] picard.sam.markduplicates.MarkDuplicates done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=1302331392
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMException: /tmp/home/CSPI.8946166571745516868.tmp/20922.tmp not found
at htsjdk.samtools.util.FileAppendStreamLRUCache$Functor.makeValue(FileAppendStreamLRUCache.java:64)
at htsjdk.samtools.util.FileAppendStreamLRUCache$Functor.makeValue(FileAppendStreamLRUCache.java:49)
at htsjdk.samtools.util.ResourceLimitedMap.get(ResourceLimitedMap.java:76)
at htsjdk.samtools.CoordinateSortedPairInfoMap.getOutputStreamForSequence(CoordinateSortedPairInfoMap.java:180)
at htsjdk.samtools.CoordinateSortedPairInfoMap.put(CoordinateSortedPairInfoMap.java:164)
at picard.sam.markduplicates.util.DiskBasedReadEndsForMarkDuplicatesMap.put(DiskBasedReadEndsForMarkDuplicatesMap.java:65)
at picard.sam.markduplicates.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:543)
at picard.sam.markduplicates.MarkDuplicates.doWork(MarkDuplicates.java:232)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:295)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
Caused by: java.io.FileNotFoundException: /tmp/home/CSPI.8946166571745516868.tmp/20922.tmp (Too many open files)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at htsjdk.samtools.util.FileAppendStreamLRUCache$Functor.makeValue(FileAppendStreamLRUCache.java:61)
... 10 more
Can you try with
I tried as you mentioned, and now it shows:
I=300BP.sorted.bam' is not a valid command
Sorry, I forgot the main command after the jar...
Maybe not the main problem, but the extension of 300BP.sorted should be '.bam'.
What is the output of
file 300BP.sorted
?
The 300BP.sorted file is a sorted BAM file.
That is not the output of the command 'file'.
Could you please explain your previous statement more fully? I didn't get any output file from the above command.
file <filename>
prints out information about the file type. For a valid BAM file you should get the following message in your terminal:
fin swimmer
Yeah, I got it.
$ file 300BP.sorted.bam
300BP.sorted.bam: gzip compressed data, extra field
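That output is expected: BAM files are BGZF-compressed (a gzip variant), which is why `file` reports them as gzip data. As a quick sanity check independent of `file`, one can inspect the first two bytes directly (a sketch assuming a POSIX shell with `od`; the filename is taken from this thread):

```shell
# A BAM file is BGZF-compressed, so it must begin with the
# gzip magic bytes 1f 8b; anything else means it is not gzip/BGZF at all.
head -c 2 300BP.sorted.bam | od -An -tx1
# a valid BAM prints: 1f 8b
```

Note this only confirms the gzip container, not that the BAM records inside are intact.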
It looks like MarkDuplicates needs to create many temporary files and keep them all open at once. Most distributions have a default limit of 1024 open files per process. You can check this with
ulimit -n
. For the current shell you can raise it to a higher number, e.g.
ulimit -n 2048
.
fin swimmer
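The log above also shows that MarkDuplicates defaults to MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000, well above a 1024 limit, so an alternative is to lower that option instead of raising the limit. A sketch combining both approaches (filenames are taken from this thread; the value 1000 is an arbitrary choice kept below a 1024 limit, not a recommendation):

```shell
# Check the soft limit on open files for the current shell
ulimit -n

# Try to raise it for this shell session
# (cannot exceed the hard limit, shown by: ulimit -Hn)
ulimit -n 2048

# Alternatively, keep Picard's file-handle usage below the limit
# via MAX_FILE_HANDLES_FOR_READ_ENDS_MAP (default 8000, see the log above)
java -jar picard.jar MarkDuplicates \
    I=300BP.sorted.bam \
    O=marked_duplicates_300.bam \
    M=marked_dup_metrics.txt \
    REMOVE_DUPLICATES=true \
    MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000
```

Lowering MAX_FILE_HANDLES_FOR_READ_ENDS_MAP may slow the run slightly, but it avoids needing elevated permissions.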
I got this:
bash: ulimit: open files: cannot modify limit: Operation not permitted
There seem to be many possible reasons for this message, and many solutions. As I don't know your system, I would recommend searching the web for this error message to find a way to increase the limit on your system.
fin swimmer
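For what it's worth, one common way to raise the limit persistently is shown below. This is a sketch that assumes a Linux system using pam_limits (the exact mechanism varies by distribution), and the username "home" is only a guess based on the log line earlier in this thread:

```shell
# Hypothetical persistent fix on systems using pam_limits:
# as root, add these lines to /etc/security/limits.conf,
# then log out and back in (replace "home" with your username):
#
#   home  soft  nofile  4096
#   home  hard  nofile  8192
#
# Verify the new soft limit after logging back in:
ulimit -n
```

If this has no effect, the limit may be set elsewhere (e.g. a systemd or SSH configuration), which is why searching for the exact error message on your distribution is the safest route.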