GATK IndelRealigner issue "TOO MANY READS"
9.4 years ago
ravi.uhdnis ▴ 220

Hi, I ran the following GATK IndelRealigner command. For some intervals it did not attempt realignment because there were "too many reads". What should I do to get the reads in these intervals realigned? Is it normal to see this message when running IndelRealigner? Any advice is appreciated, thanks.

/usr/bin/java -Djava.io.tmpdir=./tmp/ -Xmx8g -jar /usr/local/GenomeAnalysisTK.jar \
                                            -T IndelRealigner \
                                            -R /san/illumina_two/rsindhu_sge/Human_ref_genomes/GRCh38/FINAL/GRCh38.fa \
                                            -I ETH002102_CGATGT_L003_R1R2.bwa.sam.sort.dedup.mark.bam \
                                            -targetIntervals ETH002102_CGATGT_L003_R1R2.bwa.sam.sort.dedup.mark.bam.list \
                                            -o ETH002102_CGATGT_L003_R1R2.bwa.sam.sort.dedup.mark.realign.bam



INFO  16:53:33,833 HelpFormatter - --------------------------------------------------------------------------------
INFO  16:53:33,835 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.3-0-g37228af, Compiled 2014/10/24 01:07:22
INFO  16:53:33,836 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO  16:53:33,836 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO  16:53:33,840 HelpFormatter - Program Args: -T IndelRealigner -R /san/illumina_two/rsindhu_sge/Human_ref_genomes/GRCh38/FINAL/GRCh38.fa -I ETH002102_CGATGT_L003_R1R2.bwa.sam.sort.dedup.mark.bam -targetIntervals ETH002102_CGATGT_L003_R1R2.bwa.sam.sort.dedup.mark.bam.list -o ETH002102_CGATGT_L003_R1R2.bwa.sam.sort.dedup.mark.realign.bam
INFO  16:53:33,843 HelpFormatter - Executing as rsindhu@master on Linux 2.6.18-164.15.1.el5 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_75-b13.
INFO  16:53:33,843 HelpFormatter - Date/Time: 2015/06/25 16:53:33
INFO  16:53:33,843 HelpFormatter - --------------------------------------------------------------------------------
INFO  16:53:33,843 HelpFormatter - --------------------------------------------------------------------------------
INFO  16:53:34,441 GenomeAnalysisEngine - Strictness is SILENT
INFO  16:53:34,492 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO  16:53:34,501 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO  16:53:34,516 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO  16:53:34,589 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO  16:53:34,593 GenomeAnalysisEngine - Done preparing for traversal
INFO  16:53:34,594 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO  16:53:34,594 ProgressMeter -                 | processed |    time |    per 1M |           |   total | remaining
INFO  16:53:34,595 ProgressMeter -        Location |     reads | elapsed |     reads | completed | runtime |   runtime
INFO  16:53:35,074 ReadShardBalancer$1 - Loading BAM index data
INFO  16:53:35,075 ReadShardBalancer$1 - Done loading BAM index data
INFO  16:54:04,605 ProgressMeter -      1:33014475   1100011.0    30.0 s      27.0 s        1.1%    46.8 m      46.3 m
INFO  16:54:34,614 ProgressMeter -      1:71858066   2500025.0    60.0 s      24.0 s        2.3%    43.0 m      42.0 m
INFO  16:55:04,623 ProgressMeter -     1:109359649   3900039.0    90.0 s      23.0 s        3.5%    42.4 m      40.9 m
INFO  16:55:26,231 IndelRealigner - Not attempting realignment in interval 1:125180127-125180352 because there are too many reads.
INFO  16:55:26,616 IndelRealigner - Not attempting realignment in interval 1:125180467-125180761 because there are too many reads.
INFO  16:55:34,632 ProgressMeter -     1:143207956   4800100.0   120.0 s      25.0 s        4.6%    43.1 m      41.1 m
.....contd.....
INFO  17:01:34,740 ProgressMeter -       3:5813802   1.8290541E7     8.0 m      26.0 s       16.1%    49.7 m      41.7 m
INFO  17:02:04,749 ProgressMeter -      3:39162268   1.9490554E7     8.5 m      26.0 s       17.2%    49.5 m      41.0 m
INFO  17:02:34,757 ProgressMeter -      3:75444386   2.0790567E7     9.0 m      25.0 s       18.3%    49.1 m      40.1 m
INFO  17:02:55,372 IndelRealigner - Not attempting realignment in interval 3:93470414-93470789 because there are too many reads.
INFO  17:03:04,767 ProgressMeter -      3:98675310   2.1890714E7     9.5 m      26.0 s       19.1%    49.7 m      40.2 m
INFO  17:03:34,775 ProgressMeter -     3:131018111   2.3090726E7    10.0 m      25.0 s       20.1%    49.6 m      39.6 m
.....contd.....
INFO  17:51:36,188 ProgressMeter -      Y:11722137   1.08653337E8    58.0 m      32.0 s       98.5%    58.9 m      52.0 s
INFO  17:52:06,197 ProgressMeter -      Y:11722137   1.08653337E8    58.5 m      32.0 s       98.5%    59.4 m      52.0 s
INFO  17:52:36,205 ProgressMeter -      Y:56678628   1.0895334E8    59.0 m      32.0 s      100.0%    59.0 m       0.0 s
INFO  17:53:06,215 ProgressMeter -      Y:56886710   1.09353353E8    59.5 m      32.0 s      100.0%    59.5 m       0.0 s
INFO  17:53:20,168 ProgressMeter -            done   1.09986502E8    59.8 m      32.0 s      100.0%    59.8 m       0.0 s
INFO  17:53:20,168 ProgressMeter - Total runtime 3585.57 secs, 59.76 min, 1.00 hours

INFO  17:53:20,172 MicroScheduler - 0 reads were filtered out during the traversal out of approximately 109986502 total reads (0.00%)
INFO  17:53:20,172 MicroScheduler -   -> 0 reads (0.00% of total) failing MalformedReadFilter
INFO  17:53:20,849 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
INFO  17:53:20,850 HttpMethodDirector - Retrying request
INFO  17:53:20,854 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
INFO  17:53:20,855 HttpMethodDirector - Retrying request
INFO  17:53:20,859 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
INFO  17:53:20,859 HttpMethodDirector - Retrying request
INFO  17:53:20,864 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
INFO  17:53:20,864 HttpMethodDirector - Retrying request
INFO  17:53:20,869 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
INFO  17:53:20,869 HttpMethodDirector - Retrying request

I removed some of the progress lines and replaced them with '.....contd.....' because the question was exceeding the word limit.

Assembly alignment genome GATK next-gen • 4.8k views

Yes, this is normal. These regions may lie in copy-number gains or repetitive sequence, which leads to non-uniform or very high coverage. Variant calls in such regions are usually unreliable, so you can ignore the message unless you are particularly interested in that region. Alternatively, you can try downsampling your BAM file with the -dcov option in GATK (see here: https://www.broadinstitute.org/gatk/guide/tagged?tag=downsampling).
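
To judge whether any skipped region matters for your analysis, it may help to list them first. A minimal sketch, assuming the GATK console output was saved to a file (the name realign.log and the function name skipped_intervals are hypothetical):

```shell
# Print every interval that IndelRealigner reported as skipped,
# one per line, in the chr:start-end form used in the log message.
skipped_intervals() {
    grep 'Not attempting realignment in interval' "$1" \
        | sed -E 's/.*in interval ([^ ]+) because.*/\1/'
}

# Example: skipped_intervals realign.log > skipped.intervals
```

Each resulting line is a region string, so if samtools is available and the BAM is indexed, something like `samtools view -c your.bam 1:125180127-125180352` should give a rough read count for that region.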

9.4 years ago
Zaag ▴ 870

Increase (some of) these parameters:

--maxConsensuses
--maxReadsForConsensuses
--maxReadsForRealignment
--maxReadsInMemory

I don't know the defaults for GATK 3.3, but they should be listed on the GATK website.
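
As a sketch of what that could look like, here is the command from the question with --maxReadsForRealignment raised, since that is the limit the "too many reads" message most likely refers to. The default is believed to be 20000 in GATK 3.x, but check the documentation for your version. The value 100000 is purely illustrative, and the helper name realign_cmd is hypothetical; it only prints the command so it can be reviewed before running.

```shell
# Print the original IndelRealigner call with a higher per-interval read limit.
# $1 is the new limit; choose it from the depth seen in the skipped regions.
realign_cmd() {
    max_reads="$1"
    prefix=ETH002102_CGATGT_L003_R1R2.bwa.sam.sort.dedup.mark
    echo /usr/bin/java -Djava.io.tmpdir=./tmp/ -Xmx8g -jar /usr/local/GenomeAnalysisTK.jar \
        -T IndelRealigner \
        -R /san/illumina_two/rsindhu_sge/Human_ref_genomes/GRCh38/FINAL/GRCh38.fa \
        -I "$prefix.bam" \
        -targetIntervals "$prefix.bam.list" \
        -o "$prefix.realign.bam" \
        --maxReadsForRealignment "$max_reads"
}

realign_cmd 100000          # prints the command for review
# realign_cmd 100000 | sh   # uncomment to actually run it
```

Holding more reads per interval is likely to need more memory and time, so -Xmx8g may have to be increased along with the limit.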
