Revert bam to fastq
4.1 years ago
MAPK ★ 2.1k

I am trying to revert a BAM file to FASTQ with Picard (RevertSam piped into SamToFastq), running inside Docker. The same Docker image has worked fine for samples from other projects, but on this new project I am getting the errors shown below. Can someone please help me resolve this? Thanks in advance.

The command I am executing is this:

# RevertSam
if [ ! -z "${TIMING}" ]; then TIMING=(/usr/bin/time -v); fi

JAVAOPTS="-Xms2g -Xmx${MEM}g -XX:+UseSerialGC -Dpicard.useLegacyParser=false"
CUR_STEP="RevertSam"
start=$(${DATE}); echo "[$(display_date ${start})] ${CUR_STEP} starting"
"${TIMING[@]}" /usr/bin/java ${JAVAOPTS} -jar "${PICARD}" \
  "${CUR_STEP}" \
  -I "${BAMFILE}" \
  -O /dev/stdout \
  -SORT_ORDER queryname \
  -COMPRESSION_LEVEL 0 \
  -VALIDATION_STRINGENCY SILENT \
  | /usr/bin/java ${JAVAOPTS} -jar "${PICARD}" \
      SamToFastq \
      -I /dev/stdin \
      -OUTPUT_PER_RG true \
      -RG_TAG ID \
      -OUTPUT_DIR "${OUT_DIR}" \
      -VALIDATION_STRINGENCY SILENT

This is the error I am getting:

INFO    2020-10-02 19:13:52     RevertSam       Reverted 1,116,000,000 records.  Elapsed time: 02:40:02s.  Time for last 1,000,000:    7s.  Last read position: */*
INFO    2020-10-02 19:13:59     RevertSam       Reverted 1,117,000,000 records.  Elapsed time: 02:40:09s.  Time for last 1,000,000:    6s.  Last read position: */*
INFO    2020-10-02 19:14:05     RevertSam       Reverted 1,118,000,000 records.  Elapsed time: 02:40:15s.  Time for last 1,000,000:    6s.  Last read position: */*
[Fri Oct 02 19:14:21 CDT 2020] picard.sam.SamToFastq done. Elapsed time: 160.53 minutes.
Runtime.totalMemory()=2075918336
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" java.lang.NullPointerException
        at picard.sam.SamToFastq$FastqWriters.access$300(SamToFastq.java:500)
        at picard.sam.SamToFastq.handleRecord(SamToFastq.java:314)
        at picard.sam.SamToFastq.doWork(SamToFastq.java:206)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
        at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
        at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
[Fri Oct 02 19:14:21 CDT 2020] picard.sam.RevertSam done. Elapsed time: 160.53 minutes.
Runtime.totalMemory()=2076049408
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.util.RuntimeIOException: Write error; BinaryCodec in writemode; streamed file (filename not available)
        at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:222)
        at htsjdk.samtools.util.BinaryCodec.writeByteBuffer(BinaryCodec.java:188)
        at htsjdk.samtools.util.BinaryCodec.writeByte(BinaryCodec.java:199)
        at htsjdk.samtools.util.BinaryCodec.writeByte(BinaryCodec.java:203)
        at htsjdk.samtools.util.BlockCompressedOutputStream.writeGzipBlock(BlockCompressedOutputStream.java:434)
        at htsjdk.samtools.util.BlockCompressedOutputStream.deflateBlock(BlockCompressedOutputStream.java:409)
        at htsjdk.samtools.util.BlockCompressedOutputStream.write(BlockCompressedOutputStream.java:305)
        at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:220)
        at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:212)
        at htsjdk.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:168)
        at htsjdk.samtools.BAMFileWriter.writeAlignment(BAMFileWriter.java:144)
        at htsjdk.samtools.SAMFileWriterImpl.close(SAMFileWriterImpl.java:210)
        at picard.sam.RevertSam$RevertSamWriter.close(RevertSam.java:685)
        at picard.sam.RevertSam.doWork(RevertSam.java:318)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
        at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
        at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
Caused by: java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:211)
        at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
        at java.nio.channels.Channels.writeFully(Channels.java:101)
        at java.nio.channels.Channels.access$000(Channels.java:61)
        at java.nio.channels.Channels$1.write(Channels.java:174)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
        at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:220)
        ... 16 more
Command exited with non-zero status 1
        Command being timed: "/usr/bin/java -Xms2g -Xmx6g -XX:+UseSerialGC -Dpicard.useLegacyParser=false -jar /usr/bin/picard.jar RevertSam -I /RAW/WGS/test^LP6005117-DNA_G04^test_WGS/LP6005117-DNA_G04.bam -O /dev/stdout -SORT_ORDER queryname -COMPRESSION_LEVEL 0 -VALIDATION_STRINGENCY SILENT"
        User time (seconds): 9148.97
        System time (seconds): 212.47
        Percent of CPU this job got: 97%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 2:40:34
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 2474540
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 1
        Minor (reclaiming a frame) page faults: 13058533
        Voluntary context switches: 226054
        Involuntary context switches: 108827
        Swaps: 0
        File system inputs: 1111056
        File system outputs: 300051776
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 1
sam bam • 2.1k views

Have you tried BEDTools' bamtofastq?


Hi Kevin, thanks for your reply. This step is part of a pipeline I use for many projects, so I can't switch to another tool for it.


Pierre will likely pick this up when he logs in

Exception in thread "main" htsjdk.samtools.util.RuntimeIOException: Write error; BinaryCodec in writemode; streamed file (filename not available)
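
Reading the two stack traces together, the write error may be a consequence rather than the cause: SamToFastq (the reading end of the pipe) reports its NullPointerException first, and RevertSam's write then fails with `Caused by: java.io.IOException: Broken pipe`, which is what a writer sees once its reader has gone away. A minimal sketch of how to record which side of a pipe failed, using bash's `PIPESTATUS`; `producer` and `consumer` are hypothetical stand-ins for the two Picard processes, not part of the original pipeline:

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins: the producer writes forever (like RevertSam),
# the consumer reads a little and then fails (like SamToFastq did).
producer() { yes "record"; }
consumer() { head -n 3 >/dev/null; return 1; }

producer 2>/dev/null | consumer
# Copy PIPESTATUS immediately; the next command overwrites it.
status=("${PIPESTATUS[@]}")

echo "producer exit: ${status[0]}"   # non-zero: its reader disappeared
echo "consumer exit: ${status[1]}"   # the failure that came first
```

In the real script, capturing `PIPESTATUS` right after the RevertSam | SamToFastq pipeline (or running with `set -o pipefail`) would make it explicit that the SamToFastq side is the one to investigate.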


Write failures are usually caused by either network issues or disk space issues. Is your disk full, or has your disk quota been reached?
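
A quick way to rule out the disk-space explanation is to look at the free space where the FASTQ files are written and where temporary files land. The sketch below assumes `OUT_DIR` is the output directory from the question and that temporary files go to `TMPDIR` or `/tmp`; both fallbacks are assumptions so that the snippet runs on its own. It also assumes filesystem names without spaces, since it reads a fixed `df` column.

```shell
#!/usr/bin/env bash
# Print the free space, in kilobytes, on the filesystem holding a directory.
avail_kb() {
  # -P: POSIX format, one line per filesystem; -k: 1024-byte units
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Fall back to the current directory and /tmp when the variables are unset.
out_dir="${OUT_DIR:-.}"
tmp_dir="${TMPDIR:-/tmp}"

out_free="$(avail_kb "${out_dir}")"
tmp_free="$(avail_kb "${tmp_dir}")"
echo "free in ${out_dir}: ${out_free} kB"
echo "free in ${tmp_dir}: ${tmp_free} kB"
```

Inside Docker, these numbers should be taken from within the container, since the mounted volumes are what the Java processes actually write to.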


You are not really helping us by not including all necessary information in your question. For example, there are a dozen bash variables for which we don't know the values.

If you read the output log really carefully, you will see what I suspect is the cause of the failure. See this snippet:

Command exited with non-zero status 1
    Command being timed: "/usr/bin/java -Xms2g -Xmx6g -XX:+UseSerialGC -Dpicard.useLegacyParser=false -jar /usr/bin/picard.jar RevertSam -I /RAW/WGS/test^LP6005117-DNA_G04^test_WGS/LP6005117-DNA_G04.bam -O /dev/stdout -SORT_ORDER queryname -COMPRESSION_LEVEL 0 -VALIDATION_STRINGENCY SILENT"

Look at the input file path; it has ^ characters in a folder name:

-I /RAW/WGS/test^LP6005117-DNA_G04^test_WGS/LP6005117-DNA_G04.bam

I don't think it's because of how the folder is named. We name our folders SampleName^Barcode^Project. The same naming scheme and the same pipeline were used for numerous other projects without this problem.
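
This can be checked in isolation. As long as the path is passed through a quoted variable, as `"${BAMFILE}"` is in the command above, bash treats `^` as an ordinary character. A small sketch, using a hypothetical directory name that only mimics the naming scheme:

```shell
#!/usr/bin/env bash
# Build a throwaway directory whose name contains '^' (hypothetical name).
workdir="$(mktemp -d)"
sample_dir="${workdir}/sample^BARCODE^project"
mkdir -p "${sample_dir}"
printf 'ok\n' > "${sample_dir}/demo.txt"

# Read the file back through a quoted variable, as the pipeline does.
demo_file="${sample_dir}/demo.txt"
content="$(cat "${demo_file}")"
echo "read back: ${content}"

rm -rf "${workdir}"
```

The log in the question also shows RevertSam processing over a billion records from that path, so the file was clearly opened and read.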


This particular BAM may be corrupted. Have you tried regenerating or obtaining another copy?
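
One cheap check for a truncated file is whether it ends with the 28-byte empty BGZF block that the SAM/BAM specification defines as the end-of-file marker. The sketch below is only a partial test: a missing marker suggests truncation, but its presence does not prove the rest of the file is intact (a tool such as `samtools quickcheck` is a fuller check, if it is available). The demo files are small stand-ins, not real BAMs.

```shell
#!/usr/bin/env bash
# Hex form of the 28-byte empty BGZF block that ends a complete BAM file.
BGZF_EOF_HEX="1f8b08040000000000ff0600424302001b0003000000000000000000"

# Succeeds if the last 28 bytes of the file match the BGZF EOF marker.
has_bgzf_eof() {
  local tail_hex
  tail_hex="$(tail -c 28 "$1" | od -An -v -tx1 | tr -d ' \t\n')"
  [ "${tail_hex}" = "${BGZF_EOF_HEX}" ]
}

# Stand-in files: one ends with the marker, the other is cut short.
demo_dir="$(mktemp -d)"
printf 'data\037\213\010\004\000\000\000\000\000\377\006\000\102\103\002\000\033\000\003\000\000\000\000\000\000\000\000\000' \
  > "${demo_dir}/complete.bin"
printf 'data\037\213\010' > "${demo_dir}/truncated.bin"

if has_bgzf_eof "${demo_dir}/complete.bin"; then
  complete_result="marker present"
else
  complete_result="marker missing"
fi
if has_bgzf_eof "${demo_dir}/truncated.bin"; then
  truncated_result="marker present"
else
  truncated_result="marker missing"
fi
echo "complete.bin:  ${complete_result}"
echo "truncated.bin: ${truncated_result}"

rm -rf "${demo_dir}"
```

On the real data this would be called as `has_bgzf_eof "${BAMFILE}"`; `tail -c` reads only the end of the file, so it stays fast even on a very large BAM.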


What does ls -lh ${BAMFILE} show?

Is this an internally developed pipeline?


Yes, it is an internally developed pipeline. We have used it for thousands of samples without any problem, and have only encountered this error with these particular BAM files.

This is the output of ls -lh ${BAMFILE}:

-rw-r----- 1 c lab 85G Jul 14  2014 /RAW/WGS/test^LP6005117-DNA_G04^test_WGS/LP6005117-DNA_G04.bam
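
Since the file is present and readable, another thing worth checking is the read groups. The NullPointerException is raised in SamToFastq.handleRecord while the command runs with -OUTPUT_PER_RG true, so one plausible reading (an assumption, not something confirmed in this thread) is that some reads carry a read group that is not declared in the header, or none at all. A sketch of the comparison, with stand-in ID lists in place of real data; the function name `undeclared_rgs` is hypothetical:

```shell
#!/usr/bin/env bash
# Print read-group IDs seen on reads but not declared in the header.
# $1: file with one header @RG ID per line
# $2: file with one RG value per read, one per line
undeclared_rgs() {
  # comm -13 keeps lines that appear only in the second sorted input
  comm -13 <(sort -u "$1") <(sort -u "$2")
}

# Stand-in data. With samtools installed, the lists could be produced
# roughly like this (assumed commands, adjust to your setup):
#   samtools view -H "${BAMFILE}" | grep '^@RG' | grep -o 'ID:[^[:space:]]*' | cut -c4-
#   samtools view "${BAMFILE}" | head -n 1000000 | grep -o 'RG:Z:[^[:space:]]*' | cut -c6-
demo_dir="$(mktemp -d)"
printf 'rg1\nrg2\n' > "${demo_dir}/header_ids.txt"
printf 'rg1\nrg3\nrg1\n' > "${demo_dir}/read_ids.txt"

missing="$(undeclared_rgs "${demo_dir}/header_ids.txt" "${demo_dir}/read_ids.txt")"
echo "read groups missing from header: ${missing}"

rm -rf "${demo_dir}"
```

Reads with no RG tag at all would not show up in this comparison, so it would also help to count how many reads lack the tag entirely.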