error in samfile header
9.4 years ago
bioguy24 ▴ 230

I am getting an error in Picard when using a modified BED file that was generated with the commands below. I apologize for the long post; I am just trying to be as thorough as I can because I cannot figure it out. Thank you :).

samtools view -H IonXpress_009_150603.bam > header.sam
cat header.sam your_file.bed > new_file.bed

Example Bed file for BI=

@HD VN:1.4  GO:none SO:coordinate
@SQ SN:chr1 LN:249250621
@SQ SN:chr2 LN:243199373
@SQ SN:chr3 LN:198022430
@SQ SN:chr4 LN:191154276
@SQ SN:chr5 LN:180915260
@SQ SN:chr6 LN:171115067
@SQ SN:chr7 LN:159138663
@SQ SN:chr8 LN:146364022
@SQ SN:chr9 LN:141213431
@SQ SN:chr10    LN:135534747
@SQ SN:chr11    LN:135006516
@SQ SN:chr12    LN:133851895
@SQ SN:chr13    LN:115169878
@SQ SN:chr14    LN:107349540
@SQ SN:chr15    LN:102531392
@SQ SN:chr16    LN:90354753
@SQ SN:chr17    LN:81195210
@SQ SN:chr18    LN:78077248
@SQ SN:chr19    LN:59128983
@SQ SN:chr20    LN:63025520
@SQ SN:chr21    LN:48129895
@SQ SN:chr22    LN:51304566
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@SQ SN:chrM LN:16569
@RG ID:8AH6U.IonXpress_009  PL:IONTORRENT   PU:Unspecified/P1.1.17/IonXpress_009    FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG DT:2015-06-03T14:03:02-0700 SM:E1   PG:tmap KS:TCAGTGAGCGGAACGAT    CN:TorrentServer/Proton
@RG ID:8AH6U.IonXpress_009.1    PL:IONTORRENT   PU:Unspecified/P1.1.17/IonXpress_009    FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG DT:2015-06-03T13:27:05-0700 SM:E1   PG:tmap KS:TCAGTGAGCGGAACGAT    CN:TorrentServer/Proton
@RG ID:8AH6U.IonXpress_009.10   PL:IONTORRENT   PU:Unspecified/P1.1.17/IonXpress_009

Example Bed file for TI=

@HD VN:1.4  GO:none SO:coordinate
@SQ SN:chr1 LN:249250621
@SQ SN:chr2 LN:243199373
@SQ SN:chr3 LN:198022430
@SQ SN:chr4 LN:191154276
@SQ SN:chr5 LN:180915260
@SQ SN:chr6 LN:171115067
@SQ SN:chr7 LN:159138663
@SQ SN:chr8 LN:146364022
@SQ SN:chr9 LN:141213431
@SQ SN:chr10    LN:135534747
@SQ SN:chr11    LN:135006516
@SQ SN:chr12    LN:133851895
@SQ SN:chr13    LN:115169878
@SQ SN:chr14    LN:107349540
@SQ SN:chr15    LN:102531392
@SQ SN:chr16    LN:90354753
@SQ SN:chr17    LN:81195210
@SQ SN:chr18    LN:78077248
@SQ SN:chr19    LN:59128983
@SQ SN:chr20    LN:63025520
@SQ SN:chr21    LN:48129895
@SQ SN:chr22    LN:51304566
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@SQ SN:chrM LN:16569
@RG ID:8AH6U.IonXpress_009  PL:IONTORRENT   PU:Unspecified/P1.1.17/IonXpress_009    FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG DT:2015-06-03T14:03:02-0700 SM:E1   PG:tmap KS:TCAGTGAGCGGAACGAT    CN:TorrentServer/Proton
@RG ID:8AH6U.IonXpress_009.1    PL:IONTORRENT   PU:Unspecified/P1.1.17/IonXpress_009    FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG DT:2015-06-03T13:27:05-0700 SM:E1   PG:tmap KS:TCAGTGAGCGGAACGAT    CN:TorrentServer/Proton
@RG ID:8AH6U.IonXpress_009.10   PL:IONTORRENT   PU:Unspecified/P1.1.17/IonXpress_009

Error:

dnascopev@ubuntu:/media/C2F8EFBFF8EFAFB9$ picard-tools CalculateHsMetrics BI=sam_sort_5column_xgen_probes.bed TI=sam_sort_5column_xgen_targets.bed I=IonXpress_009_150603.bam O=IonXpress_009_150603_all_IDT.CalculateHSmetrics
[Tue Jun 23 14:07:49 CDT 2015] net.sf.picard.analysis.directed.CalculateHsMetrics BAIT_INTERVALS=sam_sort_5column_xgen_probes.bed TARGET_INTERVALS=sam_sort_5column_xgen_targets.bed INPUT=IonXpress_009_150603.bam OUTPUT=IonXpress_009_150603_all_IDT.CalculateHSmetrics    TMP_DIR=/tmp/dnascopev VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Tue Jun 23 14:07:50 CDT 2015] net.sf.picard.analysis.directed.CalculateHsMetrics done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=138412032
Exception in thread "main" java.lang.IllegalArgumentException: Program record with group id bc already exists in SAMFileHeader!
    at net.sf.samtools.SAMFileHeader.addProgramRecord(SAMFileHeader.java:197)
    at net.sf.samtools.SAMTextHeaderCodec.parsePGLine(SAMTextHeaderCodec.java:150)
    at net.sf.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:90)
    at net.sf.picard.util.IntervalList.fromReader(IntervalList.java:181)
    at net.sf.picard.util.IntervalList.fromFile(IntervalList.java:152)
    at net.sf.picard.analysis.directed.HsMetricsCalculator.<init>(HsMetricsCalculator.java:83)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.doWork(CalculateHsMetrics.java:83)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:158)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.main(CalculateHsMetrics
picard • 5.6k views

Does your header have any @PG lines?


The BAM file is from an Ion Torrent run and was the source of the header used in the samtools step. Does that help? Thank you :).


No, just look and see if any lines start with @PG. If there are any, post them all.


As Devon said, the problem is with the @PG lines. The error says that the program record ID "bc" is duplicated among the @PG lines. Run samtools view -H IonXpress_009_150603.bam | grep "^@PG" | cut -f2 | sort | uniq -c and check whether the count for any particular ID is greater than 1 (just to warn you, a tmap-aligned BAM file will have plenty of PG IDs). The problem is interesting: I thought Picard only required RG IDs to be unique, but evidently it requires PG IDs to be unique too. Have you tried setting VALIDATION_STRINGENCY to LENIENT? It may or may not help. Otherwise, you can remove the @PG lines from the header of the BAM file and from the BED files and rerun the command.
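
A minimal sketch of that duplicate check, wrapped in a small function. The name dup_pg_ids is made up for illustration, and it looks for the ID: field anywhere on the @PG line rather than assuming it is always the second column:

```shell
# Print every @PG ID that occurs more than once in a SAM header read from stdin.
# dup_pg_ids is a hypothetical helper name, not part of samtools or Picard.
dup_pg_ids() {
  awk -F'\t' '/^@PG/ {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^ID:/) print $i    # emit the ID: field wherever it sits
  }' | sort | uniq -d              # uniq -d keeps only repeated values
}

# Usage with the BAM from this thread (needs samtools on the PATH):
#   samtools view -H IonXpress_009_150603.bam | dup_pg_ids
```

No output would mean every @PG ID in that header is unique.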


So I should try this? (Sorry, I am still learning.) Thank you :).

picard-tools \
  CalculateHsMetrics \
  BI=sam_sort_5column_xgen_probes.bed \
  TI=sam_sort_5column_xgen_targets.bed \
  I=IonXpress_009_150603.bam \
  VALIDATION_STRINGENCY=LENIENT \
  O=IonXpress_009_150603_all_IDT.CalculateHSmetrics

or to remove all occurrences of @PG

sed 's/@PG//' BI.bed > newBI.bed
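
A small sketch of how the two sed forms differ, using a made-up three-line header (the header text here is only an illustration, not taken from the real files). The substitution form strips the text "@PG" but leaves the rest of the line in place, while the delete form drops the whole line:

```shell
# Toy header for illustration only.
header=$(printf '@HD\tVN:1.4\n@PG\tID:bc\tPN:BaseCaller\n@SQ\tSN:chr1\tLN:249250621\n')

# s/@PG//  removes the four characters "@PG"; the line itself survives.
substituted=$(printf '%s\n' "$header" | sed 's/@PG//')

# /^@PG/d  deletes every line that starts with @PG.
deleted=$(printf '%s\n' "$header" | sed '/^@PG/d')

printf 'after substitution:\n%s\n' "$substituted"
printf 'after deletion:\n%s\n' "$deleted"
```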

Hey, don't be sorry. This is how you learn. Actually, this problem is new to me too. I would go with the first option, that is, running with VALIDATION_STRINGENCY=LENIENT. If it still throws the same error, go with VALIDATION_STRINGENCY=SILENT. Let me know if it doesn't work.


I tried both and got the same error:

dnascopev@ubuntu:/media/C2F8EFBFF8EFAFB9$ picard-tools CalculateHsMetrics BI=sam_sort_5column_xgen_probes.bed TI=sam_sort_5column_xgen_targets.bed I=IonXpress_009_150603.bam VALIDATION_STRINGENCY=LENIENT O=IonXpress_009_150603_all_IDT.CalculateHSmetrics
[Wed Jun 24 13:01:24 CDT 2015] net.sf.picard.analysis.directed.CalculateHsMetrics BAIT_INTERVALS=sam_sort_5column_xgen_probes.bed TARGET_INTERVALS=sam_sort_5column_xgen_targets.bed INPUT=IonXpress_009_150603.bam OUTPUT=IonXpress_009_150603_all_IDT.CalculateHSmetrics VALIDATION_STRINGENCY=LENIENT    TMP_DIR=/tmp/dnascopev VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Wed Jun 24 13:01:25 CDT 2015] net.sf.picard.analysis.directed.CalculateHsMetrics done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=140705792
Exception in thread "main" java.lang.IllegalArgumentException: Program record with group id bc already exists in SAMFileHeader!
    at net.sf.samtools.SAMFileHeader.addProgramRecord(SAMFileHeader.java:197)
    at net.sf.samtools.SAMTextHeaderCodec.parsePGLine(SAMTextHeaderCodec.java:150)
    at net.sf.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:90)
    at net.sf.picard.util.IntervalList.fromReader(IntervalList.java:181)
    at net.sf.picard.util.IntervalList.fromFile(IntervalList.java:152)
    at net.sf.picard.analysis.directed.HsMetricsCalculator.<init>(HsMetricsCalculator.java:83)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.doWork(CalculateHsMetrics.java:83)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:158)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.main(CalculateHsMetrics.java:68)
dnascopev@ubuntu:/media/C2F8EFBFF8EFAFB9$ picard-tools CalculateHsMetrics BI=sam_sort_5column_xgen_probes.bed TI=sam_sort_5column_xgen_targets.bed I=IonXpress_009_150603.bam VALIDATION_STRINGENCY=SILENT O=IonXpress_009_150603_all_IDT.CalculateHSmetrics
[Wed Jun 24 13:01:55 CDT 2015] net.sf.picard.analysis.directed.CalculateHsMetrics BAIT_INTERVALS=sam_sort_5column_xgen_probes.bed TARGET_INTERVALS=sam_sort_5column_xgen_targets.bed INPUT=IonXpress_009_150603.bam OUTPUT=IonXpress_009_150603_all_IDT.CalculateHSmetrics VALIDATION_STRINGENCY=SILENT    TMP_DIR=/tmp/dnascopev VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Wed Jun 24 13:01:56 CDT 2015] net.sf.picard.analysis.directed.CalculateHsMetrics done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=139788288
Exception in thread "main" java.lang.IllegalArgumentException: Program record with group id bc already exists in SAMFileHeader!
    at net.sf.samtools.SAMFileHeader.addProgramRecord(SAMFileHeader.java:197)
    at net.sf.samtools.SAMTextHeaderCodec.parsePGLine(SAMTextHeaderCodec.java:150)
    at net.sf.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:90)
    at net.sf.picard.util.IntervalList.fromReader(IntervalList.java:181)
    at net.sf.picard.util.IntervalList.fromFile(IntervalList.java:152)
    at net.sf.picard.analysis.directed.HsMetricsCalculator.<init>(HsMetricsCalculator.java:83)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.doWork(CalculateHsMetrics.java:83)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:158)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.main(CalculateHsMetrics.java:68)

Thank you for your help :).


OK, so we need to remove the @PG lines from the header of the SAM file, since the error is in the SAM header. The easiest way is to write the current header to a text file, remove the lines that start with @PG, and then use samtools reheader to create a new BAM file. See samtools reheader. Another thing: why are you appending the header of the BAM file to your BI and TI BED files? A BED file doesn't need a SAM header. It should just follow chr start end strand, so I would remove the SAM header from the BED files. First, try deleting the SAM header from the BED files and run the command. If that doesn't work, try removing the @PG lines from the BAM file as I suggested above. Let me know.
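
A step-by-step sketch of the reheader route described above. The function names and the intermediate file name header_noPG.sam are made up for illustration; samtools view -H and samtools reheader are the only real tools assumed:

```shell
# Drop @PG lines from a SAM header read from stdin (hypothetical helper).
strip_pg() {
  sed '/^@PG/d'
}

# Write a copy of a BAM whose header has no @PG lines.
# $1 = input BAM, $2 = output BAM. Needs samtools on the PATH.
reheader_without_pg() {
  in_bam=$1
  out_bam=$2
  # 1. Dump the current header and filter it into a text file.
  samtools view -H "$in_bam" | strip_pg > header_noPG.sam
  # 2. Attach the edited header to the original alignments.
  samtools reheader header_noPG.sam "$in_bam" > "$out_bam"
}

# Usage with the file name from this thread:
#   reheader_without_pg IonXpress_009_150603.bam IonXpress_009_150603_newheader.bam
```

Keeping the edited header in a file makes it easy to inspect before the new BAM is written.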


I removed all occurrences of @PG in the header and tried running picard

picard-tools CalculateHsMetrics BI=sam_sort_5column_xgen_probes.bed TI=sam_sort_5column_xgen_targets.bed I=IonXpress_009_150603.bam  O=IonXpress_009_150603_all_IDT.CalculateHSmetrics

picard-tools CalculateHsMetrics BI=sam_sort_5column_xgen_probes.bed TI=sam_sort_5column_xgen_targets.bed I=IonXpress_009_150603.bam  VALIDATION_STRINGENCY=SILENT O=IonXpress_009_150603_all_IDT.CalculateHSmetrics

Both gave the same error:

Exception in thread "main" java.lang.IllegalArgumentException: Cannot add sequence that already exists in SAMSequenceDictionary: chr1
    at net.sf.samtools.SAMSequenceDictionary.setSequences(SAMSequenceDictionary.java:62)
    at net.sf.samtools.SAMSequenceDictionary.<init>(SAMSequenceDictionary.java:40)
    at net.sf.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:106)
    at net.sf.picard.util.IntervalList.fromReader(IntervalList.java:181)
    at net.sf.picard.util.IntervalList.fromFile(IntervalList.java:152)
    at net.sf.picard.analysis.directed.HsMetricsCalculator.<init>(HsMetricsCalculator.java:83)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.doWork(CalculateHsMetrics.java:83)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:158)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.main(CalculateHsMetrics.java:68)

It's a new error, so it looks like the @PG lines were removed. Hopefully this is closer. Thank you :).


That would seem to indicate that you have duplicate @SQ lines. Why not reproduce this on a ~1000 line file and post that to GitHub so we can look at a working (well, not working) example?


I am just repeating what Devon has already indicated. It seems that there are duplicates (chr1) in your @SQ lines. Check the header of your new bam file.
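
Since the stack trace passes through IntervalList.fromFile, the header being parsed at that point is the one inside the BI/TI file, so it may be worth checking those files as well. A rough sketch, where dup_sq_names is a made-up helper name and the file is assumed to be tab-delimited:

```shell
# Print each sequence name that appears on more than one @SQ line of a file.
# $1 = a SAM header file, or a BED/interval file with a header prepended.
dup_sq_names() {
  awk -F'\t' '/^@SQ/ {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^SN:/) print substr($i, 4)   # strip the "SN:" prefix
  }' "$1" | sort | uniq -d
}

# Usage with a file name from this thread:
#   dup_sq_names sam_sort_5column_xgen_probes.bed
```

If this prints chr1 and the other chromosomes, the header was most likely prepended to the file more than once.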


I am not sure how to post on GitHub (I am watching tutorials now).

header.txt below was created using this command:

samtools view -H IonXpress_009_150603.bam > header.txt

then removed the unwanted lines:

sed -i -e '/@PG/d' header.txt

followed by this command to create the BI= and TI= files:

cat header.txt your_file.bed > new_file.bed

header.txt

@HD    VN:1.4    GO:none    SO:coordinate
@SQ    SN:chr1    LN:249250621
@SQ    SN:chr2    LN:243199373
@SQ    SN:chr3    LN:198022430
@SQ    SN:chr4    LN:191154276
@SQ    SN:chr5    LN:180915260
@SQ    SN:chr6    LN:171115067
@SQ    SN:chr7    LN:159138663
@SQ    SN:chr8    LN:146364022
@SQ    SN:chr9    LN:141213431
@SQ    SN:chr10    LN:135534747
@SQ    SN:chr11    LN:135006516
@SQ    SN:chr12    LN:133851895
@SQ    SN:chr13    LN:115169878
@SQ    SN:chr14    LN:107349540
@SQ    SN:chr15    LN:102531392
@SQ    SN:chr16    LN:90354753
@SQ    SN:chr17    LN:81195210
@SQ    SN:chr18    LN:78077248
@SQ    SN:chr19    LN:59128983
@SQ    SN:chr20    LN:63025520
@SQ    SN:chr21    LN:48129895
@SQ    SN:chr22    LN:51304566
@SQ    SN:chrX    LN:155270560
@SQ    SN:chrY    LN:59373566
@SQ    SN:chrM    LN:16569
@RG    ID:8AH6U.IonXpress_009    PL:IONTORRENT    PU:Unspecified/P1.1.17/IonXpress_009    FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG    DT:2015-06-03T14:03:02-0700    SM:E1    PG:tmap    KS:TCAGTGAGCGGAACGAT    CN:TorrentServer/Proton
@RG    ID:8AH6U.IonXpress_009.1    PL:IONTORRENT    PU:Unspecified/P1.1.17/IonXpress_009    FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG    DT:2015-06-03T13:27:05-0700    SM:E1    PG:tmap    KS:TCAGTGAGCGGAACGAT    CN:TorrentServer/Proton
@RG    ID:8AH6U.IonXpress_009.10    PL:IONTORRENT    PU:Unspecified/P1.1.17/IonXpress_009    FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG    DT:2015-06-03T11:44:53-0700    SM:E1    PG:tmap    KS:TCAGTGAGCGGAACGAT    CN:TorrentServer/Proton
@RG    ID:8AH6U.IonXpress_009.11    PL:IONTORRENT    PU:Unspecified/P1.1.17/IonXpress_009    FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG    DT:2015-06-03T11:39:13-0700    SM:E1    PG:tmap    KS:TCAGTGAGCGGAACGAT    CN:TorrentServer/Proton
@RG    ID:8AH6U.IonXpress_009.12    PL:IONTORRENT    PU:Unspecified/P1.1.17/IonXpress_009

Thank you :)


GitHub was just an example; you can use anything else (Dropbox, Google Drive, Pastebin, etc.). We just need a small example that completely reproduces all the error messages. That means we need at least a few alignments too.


Well, this is not what I asked you to do. I didn't ask you to put the new header in the BED files, but in the original BAM file, creating a new BAM file. Also, you don't need a header in the BED files. Your BED files should follow this format (https://genome.ucsc.edu/FAQ/FAQformat.html#format1). To reiterate:

  1. Create a new BAM file using the new header. Use the samtools reheader function.
  2. Remove the header info from the BED files. Do you have any other data in the BED files? I mean other than the BAM header.

Is the reheader command below right (or close)?

samtools view -H IonXpress_009.bam | sed -e 's/header.txt/' | samtools reheader - IonXpress_009.bam > IonXpress_009_newhead.bam

to create a new BAM without the @PG lines (header.txt has the @PG lines removed)

The original format of the bed file was:

chr12    9220367    9220487    +
chr12    9220739    9220859    +
chr12    9221325    9221445    +
chr12    9221328    9221448    +

I sorted that file and added a 5th column concatenating chr:start-end

Should I have just used the original?

Thank you :).


Your BED files look fine. You can have more than 4 columns. The BED files should be sorted, which you did, so you can use the original BED files. As for the header, you can do it step by step, or as a one-step command:

samtools view -H Input.bam | sed '/^@PG/d' | samtools reheader - Input.bam > Input_newheader.bam

I am getting closer now, thanks to your patience and expertise:

dnascopev@ubuntu:/media/C2F8EFBFF8EFAFB9$ picard-tools CalculateHsMetrics BI=sort_5column_xgen_probes.bed TI=sort_5column_xgen_targets.bed I=IonXpress_009_150603_newheader.bam  O=IonXpress_009_150603_all_IDT.CalculateHSmetrics
[Thu Jun 25 14:58:52 CDT 2015] net.sf.picard.analysis.directed.CalculateHsMetrics BAIT_INTERVALS=sort_5column_xgen_probes.bed TARGET_INTERVALS=sort_5column_xgen_targets.bed INPUT=IonXpress_009_150603_newheader.bam OUTPUT=IonXpress_009_150603_all_IDT.CalculateHSmetrics    TMP_DIR=/tmp/dnascopev VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Thu Jun 25 14:58:52 CDT 2015] net.sf.picard.analysis.directed.CalculateHsMetrics done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=77791232
Exception in thread "main" java.lang.NullPointerException
    at net.sf.picard.util.IntervalList.fromReader(IntervalList.java:187)
    at net.sf.picard.util.IntervalList.fromFile(IntervalList.java:152)
    at net.sf.picard.analysis.directed.HsMetricsCalculator.<init>(HsMetricsCalculator.java:83)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.doWork(CalculateHsMetrics.java:83)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:158)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.main(CalculateHsMetrics.java:68)

Is this an error in the way I invoke Picard? Should it be

java -jar picard-tools \
  CalculateHsMetrics \
  BI=sort_5column_xgen_probes.bed \
  TI=sort_5column_xgen_targets.bed \
  I=IonXpress_009_150603_newheader.bam \
  O=IonXpress_009_150603_all_IDT.CalculateHSmetrics

or

picard-tools \
  CalculateHsMetrics \
  BI=sort_5column_xgen_probes.bed \
  TI=sort_5column_xgen_targets.bed \
  I=IonXpress_009_150603_newheader.bam \
  O=IonXpress_009_150603_all_IDT.CalculateHSmetrics \
  R=/home/dnascopev/Desktop/hg19_fasta/hg19.fasta

Thank you :)


The first one is correct. I think you should spend some time going through the Picard manual. Make sure all the files are sorted and indexed.


I sorted and indexed the BED files and posted the first lines of each below. I am trying to read the manual, but the site is blocked by our firewall, so I have to use my phone. Thank you :).

dnascopev@ubuntu:/media/C2F8EFBFF8EFAFB9$ picard-tools CalculateHsMetrics BI=sort_index_5column_xgen_probes.bed TI=sort_index_5column_xgen_targets.bed I=IonXpress_009_150603_newheader.bam  O=IonXpress_009_150603_all_IDT.CalculateHSmetrics
[Fri Jun 26 08:32:55 CDT 2015] net.sf.picard.analysis.directed.CalculateHsMetrics BAIT_INTERVALS=sort_index_5column_xgen_probes.bed TARGET_INTERVALS=sort_index_5column_xgen_targets.bed INPUT=IonXpress_009_150603_newheader.bam OUTPUT=IonXpress_009_150603_all_IDT.CalculateHSmetrics    TMP_DIR=/tmp/dnascopev VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Fri Jun 26 08:32:55 CDT 2015] net.sf.picard.analysis.directed.CalculateHsMetrics done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=61079552
Exception in thread "main" java.lang.IllegalStateException: Interval list file must contain header.
    at net.sf.picard.util.IntervalList.fromReader(IntervalList.java:176)
    at net.sf.picard.util.IntervalList.fromFile(IntervalList.java:152)
    at net.sf.picard.analysis.directed.HsMetricsCalculator.<init>(HsMetricsCalculator.java:83)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.doWork(CalculateHsMetrics.java:83)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:158)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.main(CalculateHsMetrics.java:68)

when I run:

cat header.txt sort_index_5column_xgen_probes.bed > sam_sort_index_5column_xgen_probes.bed

to put in the sam header I get:

dnascopev@ubuntu:/media/C2F8EFBFF8EFAFB9$ picard-tools CalculateHsMetrics BI=sam_sort_index_5column_xgen_probes.bed TI=sam_sort_index_5column_xgen_targets.bed I=IonXpress_009_150603_newheader.bam  O=IonXpress_009_150603_all_IDT.CalculateHSmetrics
[Fri Jun 26 08:29:34 CDT 2015] net.sf.picard.analysis.directed.CalculateHsMetrics BAIT_INTERVALS=sam_sort_index_5column_xgen_probes.bed TARGET_INTERVALS=sam_sort_index_5column_xgen_targets.bed INPUT=IonXpress_009_150603_newheader.bam OUTPUT=IonXpress_009_150603_all_IDT.CalculateHSmetrics    TMP_DIR=/tmp/dnascopev VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Fri Jun 26 08:29:35 CDT 2015] net.sf.picard.analysis.directed.CalculateHsMetrics done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=90308608
Exception in thread "main" net.sf.picard.PicardException: Invalid interval record contains 4 fields: chr1    133573    133692    chr1:133573-133692
    at net.sf.picard.util.IntervalList.fromReader(IntervalList.java:192)
    at net.sf.picard.util.IntervalList.fromFile(IntervalList.java:152)
    at net.sf.picard.analysis.directed.HsMetricsCalculator.<init>(HsMetricsCalculator.java:83)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.doWork(CalculateHsMetrics.java:83)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:158)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.main(CalculateHsMetrics.java:68)

sort_index_5column_xgen_probes.bed

chr1    133573    133692    chr1:133573-133692
chr1    659937    660056    chr1:659937-660056
chr1    809529    809649    chr1:809529-809649
chr1    955542    955662    chr1:955542-955662

sam_sort_index_5column_xgen_probes.bed

@HD    VN:1.4    GO:none    SO:coordinate
@SQ    SN:chr1    LN:249250621
@SQ    SN:chr2    LN:243199373
@SQ    SN:chr3    LN:198022430
@SQ    SN:chr4    LN:191154276
@SQ    SN:chr5    LN:180915260
@SQ    SN:chr6    LN:171115067
@SQ    SN:chr7    LN:159138663
@SQ    SN:chr8    LN:146364022
@SQ    SN:chr9    LN:141213431
@SQ    SN:chr10    LN:135534747
@SQ    SN:chr11    LN:135006516
@SQ    SN:chr12    LN:133851895
@SQ    SN:chr13    LN:115169878
@SQ    SN:chr14    LN:107349540
@SQ    SN:chr15    LN:102531392
@SQ    SN:chr16    LN:90354753
@SQ    SN:chr17    LN:81195210
@SQ    SN:chr18    LN:78077248
@SQ    SN:chr19    LN:59128983
@SQ    SN:chr20    LN:63025520
@SQ    SN:chr21    LN:48129895
@SQ    SN:chr22    LN:51304566
@SQ    SN:chrX    LN:155270560
@SQ    SN:chrY    LN:59373566
@SQ    SN:chrM    LN:16569
@RG    ID:8AH6U.IonXpress_009    PL:IONTORRENT    PU:Unspecified/P1.1.17/IonXpress_009    FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG    DT:2015-06-03T14:03:02-0700    SM:E1    PG:tmap    KS:TCAGTGAGCGGAACGAT    CN:TorrentServer/Proton
@RG    ID:8AH6U.IonXpress_009.1    PL:IONTORRENT    PU:Unspecified/P1.1.17/IonXpress_009
chr1    133573    133692    chr1:133573-133692
chr1    659937    660056    chr1:659937-660056
chr1    809529    809649    chr1:809529-809649
chr1    955542    955662    chr1:955542-955662
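
For reference, the "Invalid interval record contains 4 fields" message is consistent with Picard's interval list format, which expects a SAM-style header followed by five tab-separated fields per record (sequence, start, end, strand, name) with 1-based inclusive coordinates. A rough conversion sketch under some assumptions: the BED is tab-delimited and uses standard 0-based half-open coordinates, a missing strand is set to "+", and only the @HD and @SQ header lines are carried over. The function names and the output name probes.interval_list are made up for illustration:

```shell
# Keep only the @HD and @SQ lines of a SAM header read from stdin.
dict_header() {
  grep -E '^@(HD|SQ)'
}

# Convert BED records on stdin to 5-column interval-list records.
# Accepts chr/start/end plus either a name or a strand (then a name).
# Lines with fewer than 3 fields (e.g. a trailing blank line) are skipped.
bed_to_interval_body() {
  awk 'BEGIN { FS = OFS = "\t" }
       NF >= 3 {
         strand = "+"; name = ""
         if ($4 == "+" || $4 == "-") { strand = $4; name = $5 }
         else if (NF >= 4)           { name = $4 }
         if (name == "") name = $1 ":" ($2 + 1) "-" $3
         print $1, $2 + 1, $3, strand, name   # BED start is 0-based
       }'
}

# Usage with file names from this thread (needs samtools on the PATH):
#   { samtools view -H IonXpress_009_150603_newheader.bam | dict_header
#     bed_to_interval_body < sort_index_5column_xgen_probes.bed
#   } > probes.interval_list
```

Newer Picard releases also include a BedToIntervalList tool that does this conversion from a BED file and a sequence dictionary, which may be the simpler route if it is available in your installation.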

Perhaps you have a blank line at the end of your BED file (the error pertains to that file)?


Thank you for all your help, it is working now :).

9.4 years ago
bioguy24 ▴ 230

It looks like there are a lot of @PG lines in both the BI= and TI= BED files.

Example of BI=

@PG    ID:bc    PN:BaseCaller    VN:4.4-11/b4969eb    CL:BaseCaller --barcode-filter 0.01 --barcode-filter-minreads 10 --keypass-filter on --phasing-residual-filter=2.0 --num-unfiltered 1000 --barcode-filter-postpone 1 --calibration-file basecaller_results/recalibration/hpTable.txt --phase-estimation-file basecaller_results/recalibration/BaseCaller.json --model-file basecaller_results/recalibration/hpModel.txt --input-dir=sigproc_results --librarykey=TCAG --tfkey=ATCG --run-id=8AH6U --output-dir=basecaller_results --block-col-offset 11592 --block-row-offset 1332 --datasets=basecaller_results/datasets_pipeline.json --trim-adapter ATCACCGACTGCCCATAGAGAGGCTGAGAC
@PG    ID:bc.1    PN:BaseCaller    VN:4.4-11/b4969eb    CL:BaseCaller --barcode-filter 0.01 --barcode-filter-minreads 10 --keypass-filter on --phasing-residual-filter=2.0 --num-unfiltered 1000 --barcode-filter-postpone 1 --calibration-file basecaller_results/recalibration/hpTable.txt --phase-estimation-file basecaller_results/recalibration/BaseCaller.json --model-file basecaller_results/recalibration/hpModel.txt --input-dir=sigproc_results --librarykey=TCAG --tfkey=ATCG --run-id=8AH6U --output-dir=basecaller_results --block-col-offset 3864 --block-row-offset 2664 --datasets=basecaller_results/datasets_pipeline.json --trim-adapter ATCACCGACTGCCCATAGAGAGGCTGAGAC
@PG    ID:bc.10    PN:BaseCaller    VN:4.4-11/b4969eb    CL:BaseCaller --barcode-filter 0.01 --barcode-filter-minreads 10 --keypass-filter on --phasing-residual-filter=2.0 --num-unfiltered 1000 --barcode-filter-postpone 1 --calibration-file basecaller_results/recalibration/hpTable.txt --phase-estimation-file basecaller_results/recalibration/BaseCaller.json --model-file basecaller_results/recalibration/hpModel.txt --input-dir=sigproc_results --librarykey=TCAG --tfkey=ATCG --run-id=8AH6U --output-dir=basecaller_results --block-col-offset 6440 --block-row-offset 6660 --datasets=basecaller_results/datasets_pipeline.json --trim-adapter ATCACCGACTGCCCATAGAGAGGCTGAGAC
@PG    ID:bc.11    PN:BaseCaller    VN:4.4-11/b4969eb    CL:BaseCaller --barcode-filter 0.01 --barcode-filter-minreads 10 --keypass-filter on --phasing-residual-filter=2.0 --num-unfiltered 1000 --barcode-filter-postpone 1 --calibration-file basecaller_results/recalibration/hpTable.txt --phase-estimation-file basecaller_results/recalibration/BaseCaller.json --model-file basecaller_results/recalibration/hpModel.txt --input-dir=sigproc_results --librarykey=TCAG --tfkey=ATCG --run-id=8AH6U --output-dir=basecaller_results --block-col-offset 5152 --block-row-offset 6660 --datasets=basecaller_results/datasets_pipeline.json --trim-adapter ATCACCGACTGCCCATAGAGAGGCTGAGAC
@PG    ID:bc.12    PN:BaseCaller    VN:4.4-11/b4969eb    CL:BaseCaller --barcode-filter 0.01 --barcode-filter-minreads 10 --keypass-filter on --phasing-residual-filter=2.0 --num-unfiltered 1000 --barcode-filter-postpone 1 --calibration-file basecaller_results/recalibration/hpTable.txt --phase-estimation-file basecaller_results/recalibration/BaseCaller.json --model-file basecaller_results/recalibration/hpModel.txt --input-dir=sigproc_results --librarykey=TCAG --tfkey=ATCG --run-id=8AH6U --output-dir=basecaller_results --block-col-offset 9016 --block-row-offset 6660 --datasets=basecaller_results/datasets_pipeline.json --trim-adapter ATCACCGACTGCCCATAGAGAGGCTGAGAC

Example of TI=

Is there a way to attach files here? Each file contains many @PG lines. Thank you :).
