GATK errors with RealignerTargetCreator and IndelRealignment
6.4 years ago
KG • 0

When I run the following RealignerTargetCreator command, all of the output (the intervals) is printed to the terminal instead of being saved to the file:

java -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R hg19_woj.fa -I reads_Qfsorted.bam -o forIndelRealigner.intervals

The following is what I received before the output:

INFO  21:08:35,602 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.8-0-ge9d806836, Compiled 2017/07/28 21:26:50
INFO  21:08:35,602 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
INFO  21:08:35,610 HelpFormatter - For support and documentation go to https://software.broadinstitute.org/gatk
INFO  21:08:35,611 HelpFormatter - [Fri Jul 13 21:08:35 PDT 2018] Executing on Linux 4.15.0-23-generic amd64
INFO  21:08:35,611 HelpFormatter - OpenJDK 64-Bit Server VM 1.8.0_171-8u171-b11-0ubuntu0.18.04.1-b11
INFO  21:08:35,614 HelpFormatter - Program Args: -T RealignerTargetCreator -R hg19_woj.fa -I reads_Qfsorted.bam -o forIndelRealigner.intervals
INFO  21:08:35,616 HelpFormatter - Executing as as@as-VirtualBox on Linux 4.15.0-23-generic amd64; OpenJDK 64-Bit Server VM 1.8.0_171-8u171-b11-0ubuntu0.18.04.1-b11.
INFO  21:08:35,618 HelpFormatter - Date/Time: 2018/07/13 21:08:35
INFO  21:08:35,619 HelpFormatter - ----------------------------------------------------------------------------------
INFO  21:08:35,619 HelpFormatter - ----------------------------------------------------------------------------------
ERROR StatusLogger Unable to create class org.apache.logging.log4j.core.impl.Log4jContextFactory specified in jar:file:/home/as/RNAP/GenomeAnalysisTK.jar!/META-INF/log4j-provider.properties
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
INFO  21:08:35,836 GenomeAnalysisEngine - Deflater: IntelDeflater
INFO  21:08:35,836 GenomeAnalysisEngine - Inflater: IntelInflater
INFO  21:08:35,837 GenomeAnalysisEngine - Strictness is SILENT
INFO  21:08:36,027 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Cove

Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
INFO  21:10:02,508 GenomeAnalysisEngine - Deflater: IntelDeflater
INFO  21:10:02,510 GenomeAnalysisEngine - Inflater: IntelInflater
INFO  21:10:02,511 GenomeAnalysisEngine - Strictness is SILENT
INFO  21:10:02,742 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO  21:10:02,753 SAMDataSource$SAMReaders - Initializing SAMRecords in serial

This is a sample part of the output:

662-9316721-9320273-9320347, chrY-9320357-9320431-9321930-9322004, chrY-9322001-9322075-9322162-9322236, chrY-9322197-9322271-9357275-9357334-9360885-9360899, chrY-9322257-9322271-9357275-9357334-9360885-9360959, chrY-9323687-9323761-9344027-9344101, chrY-9325066-9325140-9325159-9325233, chrY-9325351-9325425-9326033-9326107, chrY-9326036-9326110-9326239-9326313, chrY-9326276-9326350-9326452-9326526, chrY-9326523-9326597-9326693-9326767, chrY-9326523-9326597-9326704-9326778, chrY-9326711-9326785-9327480-9327554, chrY-9336935-9337009-9340560-9340634, chrY-9336950-9337009-9340560-9340634, chrY-9340571-9340645-9342449-9342523, chrY-9340644-9340718-9342217-9342291, chrY-9342288-9342362-9342449-9342523, chrY-9342484-9342558-9342754-9342768, chrY-9345349-9345423-9345442-9345516, chrY-9345633-9345707-9346315-9346389, chrY-9346318-9346392-9346521-9346595, chrY-9346558-9346632-9346734-9346808, chrY-9346805-9346879-9346975-9347049, chrY-9346805-9346879-9346986-9347060, chrY-9346993-9347067-9347762-9347836, chrY-9357260-9357334-9360885-9360959, chrY-9357275-9357334-9360885-9360959, chrY-9360896-9360970-9362776-9362850, chrY-9360969-9361043-9362544-9362618, chrY-9362615-9362689-9362776-9362850, chrY-9362811-9362885-9363082-9363096, chrY-9365547-9365621-9365883-9365957, chrY-9365946-9366020-9366628-9366702, chrY-9366631-9366705-9366834-9366908, chrY-9366871-9366945-9367047-9367121, chrY-9367118-9367192-9367288-9367362, chrY-9367118-9367192-9367299-9367373, chrY-9367306-9367380-9367508-9367582, chrY-9367306-9367380-9368075-9368149, chrY-9374241-9374297-9382734-9382808, chrY-9382805-9382879-9382966-9383038-9383040-9383041, chrY-9382878-9382879-9382966-9383038-9383040-9383075-9383272-9383310, chrY-9383000-9383038-9383040-9383075-9383272-9383336-9383598-9383607, chrY-9383066-9383075-9383272-9383336-9383598-9383608-9383769-9383780-9384061-9384106-9384120-9384125, chrY-9383273-9383336-9383598-9383608-9383769-9383780-9384061-9384106-9384120-9384136, 
chrY-9383285-9383336-9383598-9383608-9383769-9383780-9384061-9384106-9384120-9384148, chrY-9383331-9383336-9383598-9383608-9383769-9383780-9384061-9384106-9384120-9384149-9384257-9384264-9384401-9384437, chrY-9384062-9384106-9384120-9384149-9384257-9384264-9384401-9384467, chrY-9384070-9384106-9384120-9384149-9384257-9384264-9384401-9384475, chrY-9448406-9448480-9450030-9450104, chrY-9448406-9448480-9452391-9452465, chrY-9450069-9450143-9452391-9452465, chrY-9452418-9452492-9452699-9452762, chrY-9528770-9528844-9529322-9529396, chrY-9529452-9529526-9531119-9531193, chrY-9529452-9529526-9531390-9531464, chrY-9544604-9544678-9544925-9544999, chrY-9544604-9544678-9546154-9546228, chrY-9545106-9545180-9546154-9546228, chrY-9546193-9546267-9548185-9548259, chrY-9546292-9546366-9548185-9548259, chrY-9548587-9548661-9549350-9549424, chrY-9549350-9549424-9551732-9551806, chrY-9551819-9551893-9552835-9552871, chrY-9555292-9555366-9558704-9558778, chrY-9574025-9574099-9579933-9580007, chrY-9579936-9580010-9582487-9582561, chrY-9582499-9582573-9589958-9590032, chrY-9590032-9590106-9590799-9590873, chrY-9590881-9590955-9593786-9593860, chrY-9590948-9591022-9598604-9598678, chrY-9593790-9593864-9593985-9594059, chrY-9594108-9594182-9595406-9595480, chrY-9598667-9598741-9601098-9601172, chrY-9601132-9601206-9608070-9608144, chrY-9608155-9608229-9611654-9611728, chrY-9638842-9638916-9642383-9642457, chrY-9642420-9642494-9646920-9646994, chrY-9646920-9646994-9647680-9647718-9650809-9650844, chrY-9646959-9646994-9647680-9647718-9650809-9650854, chrY-9650935-9651009-9653511-9653585, chrY-9653579-9653653-9654904-9654978, chrY-9748407-9748463-9748577-9748651, chrY-9748648-9748722-9749263-9749337, chrY-9904163-9904218-9904910-9904984]

Are these even intervals? And does anyone know why no file is being generated?

I decided to just continue with the next step, Indel realignment, using the following command:

java -jar GenomeAnalysisTK.jar -T IndelRealigner -R hg19_woj.fa -I reads_sorted.bam -known chr20.vcf -targetIntervals forIndelRealigner.intervals -o realignedbam.bam

I am getting the following error:

ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
INFO  20:58:31,513 GenomeAnalysisEngine - Deflater: IntelDeflater
INFO  20:58:31,516 GenomeAnalysisEngine - Inflater: IntelInflater
INFO  20:58:31,517 GenomeAnalysisEngine - Strictness is SILENT
INFO  20:58:31,668 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO  20:58:31,672 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO  20:58:35,217 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 3.54
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 3.8-0-ge9d806836):
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions https://software.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: Lexicographically sorted human genome sequence detected in reference. Please see https://software.broadinstitute.org/gatk/documentation/article?id=1328for more information. Error details: reference contigs = [chr10, chr11, chr11_gl000202_random, chr12, chr13, chr14, chr15, chr16, chr17_ctg5_hap1, chr17, chr17_gl000203_random, chr17_gl000204_random, chr17_gl000205_random, chr17_gl000206_random, chr18, chr18_gl000207_random, chr19, chr19_gl000208_random, chr19_gl000209_random, chr1, chr1_gl000191_random, chr1_gl000192_random, chr20, chr21, chr21_gl000210_random, chr22, chr2, chr3, chr4_ctg9_hap1, chr4, chr4_gl000193_random, chr4_gl000194_random, chr5, chr6_apd_hap1, chr6_cox_hap2, chr6_dbb_hap3, chr6, chr6_mann_hap4, chr6_mcf_hap5, chr6_qbl_hap6, chr6_ssto_hap7, chr7, chr7_gl000195_random, chr8, chr8_gl000196_random, chr8_gl000197_random, chr9, chr9_gl000198_random, chr9_gl000199_random, chr9_gl000200_random, chr9_gl000201_random, chrM, chrUn_gl000211, chrUn_gl000212, chrUn_gl000213, chrUn_gl000214, chrUn_gl000215, chrUn_gl000216, chrUn_gl000217, chrUn_gl000218, chrUn_gl000219, chrUn_gl000220, chrUn_gl000221, chrUn_gl000222, chrUn_gl000223, chrUn_gl000224, chrUn_gl000225, chrUn_gl000226, chrUn_gl000227, chrUn_gl000228, chrUn_gl000229, chrUn_gl000230, chrUn_gl000231, chrUn_gl000232, chrUn_gl000233, chrUn_gl000234, chrUn_gl000235, chrUn_gl000236, chrUn_gl000237, chrUn_gl000238, chrUn_gl000239, chrUn_gl000240, chrUn_gl000241, chrUn_gl000242, chrUn_gl000243, chrUn_gl000244, chrUn_gl000245, chrUn_gl000246, chrUn_gl000247, chrUn_gl000248, chrUn_gl000249, chrX, chrY]

Does anyone know why this might be happening?

Thanks in advance for your help!

GATK software error • 5.7k views

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

"I decided to just continue with the next step"

That is not always a good idea. Make sure no critical errors occurred before you move on to the next step.

6.4 years ago
Tm ★ 1.1k

To solve this error:

ERROR StatusLogger Unable to create class org.apache.logging.log4j.core.impl.Log4jContextFactory specified in jar:file:/home/as/RNAP/GenomeAnalysisTK.jar!/META-INF/log4j-provider.properties
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...

Try GATK release 3.8-1. According to the developers, this issue was fixed in that release. See the linked GATK forum thread for more detail.

Looking at your RealignerTargetCreator and IndelRealigner commands, you did not pass a "-known" VCF file to RealignerTargetCreator, but you did pass one to IndelRealigner. It is better to use the same file for both steps. See the GATK documentation.
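One way to keep the two steps consistent is to generate both command lines from a single set of inputs. The following is only a sketch in Python: `realignment_commands` is a hypothetical helper, the file names are the ones from the question, and the flags are the ones already used in the question, with `-known` added to the first step. Passing the same BAM to both steps is an assumption here, since the question uses two different BAM names.

```python
import shlex


def realignment_commands(reference, bam, known_vcf, intervals, out_bam,
                         jar="GenomeAnalysisTK.jar"):
    """Build the two GATK3 command lines so both share one -known file.

    Hypothetical helper: it only assembles argument lists and runs nothing.
    """
    base = ["java", "-jar", jar]
    target_creator = base + [
        "-T", "RealignerTargetCreator",
        "-R", reference,
        "-I", bam,
        "-known", known_vcf,        # same known-indels file as below
        "-o", intervals,
    ]
    indel_realigner = base + [
        "-T", "IndelRealigner",
        "-R", reference,
        "-I", bam,
        "-known", known_vcf,        # same known-indels file as above
        "-targetIntervals", intervals,
        "-o", out_bam,
    ]
    return target_creator, indel_realigner


# File names taken from the question; the shared BAM is an assumption.
commands = realignment_commands(
    "hg19_woj.fa", "reads_Qfsorted.bam", "chr20.vcf",
    "forIndelRealigner.intervals", "realignedbam.bam",
)
for cmd in commands:
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed lines can be pasted into a shell as they are. The intervals file written by the first command is the one the second command reads through `-targetIntervals`.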

Lastly, for the error message below:

ERROR MESSAGE: Lexicographically sorted human genome sequence detected in reference. Please see https://software.broadinstitute.org/gatk/documentation/article?id=1328for more information. Error details: reference contigs = [chr10, chr11, chr11_gl000202_random, chr12, chr13, chr14, chr15, chr16, chr17_ctg5_hap1, chr17, chr17_gl000203_random, chr17_gl000204_random, chr17_gl000205_random, chr17_gl000206_random, chr18, chr18_gl000207_random, chr19, chr19_gl000208_random, chr19_gl000209_random, chr1, chr1_gl000191_random, chr1_gl000192_random, chr20, chr21, chr21_gl000210_random, chr22, chr2, chr3, chr4_ctg9_hap1, chr4, chr4_gl000193_random, chr4_gl000194_random, chr5, chr6_apd_hap1, chr6_cox_hap2, chr6_dbb_hap3, chr6, chr6_mann_hap4, chr6_mcf_hap5, chr6_qbl_hap6, chr6_ssto_hap7, chr7, chr7_gl000195_random, chr8, chr8_gl000196_random, chr8_gl000197_random, chr9, chr9_gl000198_random, chr9_gl000199_random, chr9_gl000200_random, chr9_gl000201_random, chrX, chrY]

Can you please check the order of the chromosomes in your BAM and in your reference genome file? GATK is very particular about contig order. Check the linked page.
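A minimal sketch of that check in Python, assuming you have saved the BAM header as text (for example with `samtools view -H`) and have the reference `.fai` index at hand. The function names are hypothetical and not part of GATK or Picard, and the "chr10 before chr2" test is only a simple heuristic for spotting string-sorted contigs, not necessarily the check GATK itself performs.

```python
def contigs_from_sam_header(header_text):
    """Return contig names from the @SQ lines of a SAM/BAM header, in file order."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def contigs_from_fai(fai_text):
    """Return contig names from a FASTA index (.fai): the first tab-separated column."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def looks_lexicographic(names):
    """Heuristic: True if chr10 is listed before chr2, as string sorting would do."""
    if "chr10" in names and "chr2" in names:
        return names.index("chr10") < names.index("chr2")
    return False


def same_order(bam_contigs, ref_contigs):
    """True if the contigs present in both lists appear in the same relative order."""
    shared = set(bam_contigs) & set(ref_contigs)
    bam_shared = [c for c in bam_contigs if c in shared]
    ref_shared = [c for c in ref_contigs if c in shared]
    return bam_shared == ref_shared


# Toy inputs for illustration only; the lengths and offsets are made up.
toy_header = "@HD\tVN:1.5\n@SQ\tSN:chr10\tLN:100\n@SQ\tSN:chr1\tLN:200\n@SQ\tSN:chr2\tLN:150\n"
toy_fai = "chr1\t200\t6\t50\t51\nchr2\t150\t216\t50\t51\nchr10\t100\t375\t50\t51\n"

bam_contigs = contigs_from_sam_header(toy_header)
ref_contigs = contigs_from_fai(toy_fai)
print("BAM looks lexicographically sorted:", looks_lexicographic(bam_contigs))
print("BAM and reference agree on order:", same_order(bam_contigs, ref_contigs))
```

With real files, read the header and the `.fai` from disk and pass their contents to the two parsing functions. If the reference itself is the one listed as chr10, chr11, ..., chr1, as in the error message, then it is the reference ordering that GATK is rejecting.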


Thanks a lot. The first error was resolved by installing GATK version 3.8-1. For the "Lexicographically sorted human genome sequence detected in reference" error, I used ReorderSam as suggested, with this command:

java -jar ReorderSam.jar I=read_qfsorted.bam O=reordered.bam R=hg19.fa CREATE_INDEX=TRUE

and got this error:

To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
Exception in thread "main" picard.PicardException: New reference sequence does not contain a matching contig for chr1-10002766-10002840-10003307-10003381
    at picard.sam.ReorderSam.buildSequenceDictionaryMap(ReorderSam.java:225)
    at picard.sam.ReorderSam.doWork(ReorderSam.java:106)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:183)
    at picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:124)
    at picard.sam.ReorderSam.main(ReorderSam.java:85)

It prints something to the terminal but does not produce the output BAM file.

Please advise.

Thanks in advance


Did you read the error? The problem is stated clearly in the error message.
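To spell it out: the message says the BAM header contains a contig name (here `chr1-10002766-10002840-10003307-10003381`) that does not exist in the reference given as `R=`. Below is a minimal sketch in Python for listing such names, assuming the contig names have already been extracted from the BAM header and from the reference index. `missing_contigs` is a hypothetical helper, not a Picard function, and the lists are toy data apart from the one name taken from the error message.

```python
def missing_contigs(bam_contigs, ref_contigs):
    """Return BAM contig names that are absent from the reference, in BAM order."""
    ref = set(ref_contigs)
    return [name for name in bam_contigs if name not in ref]


# Toy lists for illustration; only the long contig name comes from the error above.
bam_contigs = ["chr1", "chr1-10002766-10002840-10003307-10003381", "chrY"]
ref_contigs = ["chr1", "chr2", "chrY"]

print(missing_contigs(bam_contigs, ref_contigs))
```

If that list is not empty, the BAM was probably aligned against a different reference, one that includes those extra sequences, so reordering alone will not make the two files agree. Note that the GATK commands in the question use hg19_woj.fa, while the ReorderSam command uses hg19.fa.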


I am not sure how to resolve the problem, though.


Thanks for all the help so far, guys! I was able to get the GATK RealignerTargetCreator step to run. However, it has been running for about 2-3 hours at this point, and it is currently at "preparing for traversal over 1 BAM files." Is that normal?


We'll need a little more context than that. Can you copy-paste the stdout content in case this is still an issue?
