Picard Samtofastq Only Extracts One Read, Then Throws An Error
12.4 years ago
Davy ▴ 410

I'm trying to extract FASTQ files from BAM files. According to its documentation, Picard's SamToFastq accepts either a BAM or a SAM file as input.

But when I run it, it extracts only one read and then exits with the error below. Any help is appreciated.

    [davy@xxxx picard-tools-1.70]$ java -jar SamToFastq.jar I=/home/davy/xxx_trio_data/xxxx-1.bam F=/home/davy/xxx_trio_data/1005-1.fastq
    [Wed Jun 20 14:14:21 BST 2012] net.sf.picard.sam.SamToFastq INPUT=/home/davy/xxx_trio_data/xxxx-1.bam FASTQ=/home/davy/xxxx_trio_data/xxxx-1.fastq    OUTPUT_PER_RG=false RE_REVERSE=true INCLUDE_NON_PF_READS=false READ1_TRIM=0 READ2_TRIM=0 INCLUDE_NON_PRIMARY_ALIGNMENTS=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
    [Wed Jun 20 14:14:21 BST 2012] Executing as davy@xxxxx.xxxx.xxxx.xx.xx on Linux 2.6.34.9-69.fc13.x86_64 amd64; OpenJDK 64-Bit Server VM 1.6.0_18-b18; Picard version: 1.70(1215)
    [Wed Jun 20 14:14:21 BST 2012] net.sf.picard.sam.SamToFastq done. Elapsed time: 0.00 minutes.
    Runtime.totalMemory()=2029715456
    FAQ:  http://sourceforge.net/apps/mediawiki/picard/index.php?title=Main_Page
    Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
            at java.util.ArrayList.rangeCheck(ArrayList.java:571)
            at java.util.ArrayList.get(ArrayList.java:349)
            at net.sf.picard.sam.SamToFastq.doWork(SamToFastq.java:156)
            at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
            at net.sf.picard.sam.SamToFastq.main(SamToFastq.java:118)

Edit: First few alignments from SAM file:

  C010CACXX110731:5:1301:17327:162246     99      1       10025   0       76M     =       10045   95      TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC    A?DFFFD@DGGGE?CHGGD@DEFGD@DGFHD@EHHGB@FGEGF@EHHEE@DEHHF@CGHEE@EGGGDA;CDEC?DE   MD:Z:76 PG:Z:bwa.2      RG:Z:C010C.5    AM:i:0  NM:i:0  SM:i:0  MQ:i:20 OQ:Z:@@CFFFFFHHHHGFIJJIIJIGIIGIGIIJGGJJJI@GEHGIEIJIJGHIGGIJGJEHIGEHHHFFDF9>AAAAC=      UQ:i:0
    C010CACXX110731:3:2208:5779:47927       163     1       10028   0       76M     =       10343   362     CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCT    ;CEB=BDEFC=AEGFD?EDFFE<EGFHC@EGFGC?BHHGC>FFFFE>:FCDD5=BEGB?=DGDE@CC7@C:<>ED#   MD:Z:76 PG:Z:bwa.11     RG:Z:C010C.3    AM:i:0  NM:i:0  SM:i:0  MQ:i:0  OQ:Z:@@CFFFFFGGDBDHIFIGGIIIEHGGHDHEIIJDDDHHI@FHEBFHA;F==@38@AD>A9AEAHF>@2;>66;?B#      UQ:i:0
    C010CACXX110731:6:1104:11773:149879     99      1       10032   0       68M8S   =       10178   188     AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACC    ABDEGC@EEEGE@DFGEE@CGFFD?EGFED>EHCFD@CGEGE@EGADC?@@DFE?DEDEA<ECFGC?#########   MD:Z:68 PG:Z:bwa.19     RG:Z:C010C.6    AM:i:0  NM:i:0  SM:i:0  MQ:i:0  OQ:Z:@@@DFDFFDFHHHIIIGJIGIIIGEIIFGIGIJ@CFHGHGIGIGI>@BB=;=CGEGE@E>>CAED@D#########      UQ:i:0
    C010CACXX110731:5:1301:17327:162246     147     1       10045   20      76M     =       10025   -95     ACCCTACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAACCCTAACCCT    #B?>=>%DFG?>3HHI@B@IIHAA<GGJAA@HHG>:1III?@?IHI>?>GAI>ABHGH?B@IGH=;GGH==;FCD:   MD:Z:6A69       PG:Z:bwa.2      RG:Z:C010C.5    AM:i:0  NM:i:1  SM:i:20 MQ:i:0  OQ:Z:#?<5>9'@EA?;)FDFHC?HHD@@7CEHFBBIGD?9*IIHDD?HGF??:GAGGFCJIHEGFJIFD<HHFDB=F@@?      UQ:i:4
    B02FEACXX110730:1:1208:17291:87203      99      1       10050   0       56M20S  =       10180   167     AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACA    BBCFEB>CGEH>?=GFFD?EEFFD?FEFF/<;>C@D(5@@,>?DGH,F@FGGID?#####################   MD:Z:56 PG:Z:bwa.6      RG:Z:B02FE.1    AM:i:0  NM:i:0  SM:i:0  MQ:i:0  OQ:Z:@@@DD?DDHBH;F;DEGBGGCHGHIICEE):9:?:D)0:9)9BDGH(=FHIFFCF#####################      UQ:i:0
    B02FEACXX110730:4:2204:14605:132091     163     1       10050   0       58M18S  =       10180   174     AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCGTAACC    ;ABBD?<?CA?C?CEGEC?DFE9;4@?C=8<AEGAC@E;BCB>DBCF@BADFE>@<E###################   MD:Z:58 PG:Z:bwa.3      RG:Z:B02FE.4    AM:i:0  NM:i:0  SM:i:0  MQ:i:0  OQ:Z:?@@AD>D?FF=DFGGHAHIGGE88<A?C;3??FDABGF9????FB?F;@=BGG6F7C###################      UQ:i:0
    B02FEACXX110730:5:1302:15620:33259      99      1       10057   0       53M23S  =       10182   174     ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAAACCCTGACCCTAAGCCTAACCC    ACFGB@DEEEC=DEGGD@DGHGD@EFEEEAEDDEE?AEEFC@EFGGE@DHFB########################   MD:Z:53 PG:Z:bwa.4      RG:Z:B02FE.5    AM:i:0  NM:i:0  SM:i:0  MQ:i:0  OQ:Z:@CCFDFFDFDFAFGIIIJJIJJIJJGGEGCCFBFGA@FDFHCGGIIIIHIG?########################      UQ:i:0

So I can tentatively say it's not the BAM file: I obtained an example BAM file from http://code.google.com/p/gasv/downloads/ and I still get the same error message. If anyone with a working version of Picard could try this file and let me know whether it works, I might be able to tell whether the problem is my install of Picard or my Java install. Thanks for the help.

picard sam bam next-gen • 5.3k views

Since the error happens right away, you should also post the first few lines of your SAM file; it is very likely that your file is malformed in some way.

Thanks Istvan, but as I mentioned, I am using a BAM file, and the first few lines of a BAM file are not human-readable. Maybe someone could point me to a sample BAM file that is known to work, so I can verify whether the problem is with my system or with the file?

What I meant was to convert the BAM to SAM and then show the first few lines of that.

I converted the BAM file to a SAM file using BAMtools. I tried to include the header section of the SAM file and the first few alignments in my post above, but that exceeds the character limit. Instead I've included only the first few alignments: if there were a problem with the header, I would have expected Picard never to extract the first read at all.

Something weird is going on with the formatting of my post above. Anyway, I downloaded another BAM file from http://code.google.com/p/gasv/downloads/list and ran that, but it still gave me the same error. So it's probably not the file. Any other ideas?

12.4 years ago
Davy ▴ 410

As it turns out, the data is paired-end, not single-end as I had initially thought, and Picard requires a second output file in this case, specified with the SECOND_END_FASTQ (F2) option.
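
For anyone hitting the same exception, here is a rough sketch of how the pairing could have been spotted from the SAM lines in the question, and what the corrected call might look like. It checks bit 0x1 of the FLAG column (set for paired reads) on the seven flag values posted above, then prints a SamToFastq command without running it. The file names `input.bam`, `reads_1.fastq` and `reads_2.fastq` are placeholders I made up, and the command uses the same `KEY=value` syntax as the Picard 1.70 call in the question.

```shell
#!/bin/sh
# Sketch: decide whether reads are paired-end by checking bit 0x1 of the
# SAM FLAG column, then print a matching SamToFastq command (dry run only).
# The FLAG values are the ones shown in the question; in practice they
# could come from something like: samtools view input.bam | head | cut -f2

flags="99 163 99 147 99 163 99"

paired=0
total=0
for flag in $flags; do
    total=$((total + 1))
    # Bit 0x1 set means the read is part of a pair.
    if [ $((flag & 1)) -eq 1 ]; then
        paired=$((paired + 1))
    fi
done
echo "$paired of $total records are flagged as paired"

# Hypothetical file names, for illustration only.
in_bam="input.bam"
fq1="reads_1.fastq"
fq2="reads_2.fastq"

if [ "$paired" -gt 0 ]; then
    # Paired-end data: give SamToFastq a second output file via F2.
    cmd="java -jar SamToFastq.jar I=$in_bam F=$fq1 F2=$fq2"
else
    cmd="java -jar SamToFastq.jar I=$in_bam F=$fq1"
fi

# Print the command rather than executing it.
echo "$cmd"
```

Flags such as 99, 147 and 163 are all odd, so every record in the excerpt is marked as paired, which is consistent with the tool failing as soon as it needed a second output file.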

12.4 years ago

I saw the exact same error using Picard tools 1.60 SamToFastq on a .bam file. Could it be a problem with how Java is set up on some machines?

I'd like some way to test this. As I said above, a sample BAM file known to work would be ideal. They're big, though, so finding one could be tricky.
