GATK Interval list for processing mouse genome WGS
6.4 years ago
husensofteng ▴ 410

I am trying to apply the GATK best practices for variant calling to mouse WGS data. According to GATK, one of the reasons for using an interval list when processing WGS data is "to exclude regions that have bad or uninformative data where a tool is getting stuck". The GATK bundle includes such an interval list for the human genome; however, I have not been able to find any interval list of "good" genomic regions for the mouse genome (GRCm38).

Also, since GATK does not provide information on the provenance of this file, it is hard to replicate it for the mouse genome. I am looking for an interval list for GRCm38 or information on the source of the human interval file. Any help is highly appreciated.

GATK WGS GRCm38 genome • 3.3k views

Were you able to find this interval list?


Please use ADD COMMENT and not the answer field for comments.

5.0 years ago
igor 13k

The calling regions file was generated using Picard IntervalListTools. The command is actually included in the file itself and described on the GATK support forum:

So it looks like it's taking the full genome, then removing the N bases (with no padding). The "call intervals" are the remaining good bases.
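
If no starting interval list exists for mouse, the same idea (keep every stretch of the reference that is not N) can be approximated directly from the GRCm38 FASTA. The following is only a rough sketch, not the command GATK used: the function names `non_n_intervals` and `to_interval_rows` are made up for illustration, the `+` strand and `.` name columns are placeholders, and a valid Picard interval list additionally needs the SAM-style header (`@HD`/`@SQ` lines) from the reference `.dict` file in front of these rows. Picard also ships a `ScatterIntervalsByNs` tool that appears to serve this purpose, so it is worth checking its documentation before rolling your own.

```python
def non_n_intervals(fasta_handle):
    """Yield (contig, start, end) for each maximal run of non-N bases.

    Coordinates are 1-based and inclusive, as in Picard interval lists.
    """
    contig = None
    pos = 0           # bases consumed so far on the current contig
    run_start = None  # 1-based start of the open non-N run, if any
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # New contig: close any run still open on the previous one.
            if run_start is not None:
                yield contig, run_start, pos
            contig = line[1:].split()[0]
            pos = 0
            run_start = None
            continue
        for base in line:
            pos += 1
            if base in "Nn":
                if run_start is not None:
                    yield contig, run_start, pos - 1
                    run_start = None
            elif run_start is None:
                run_start = pos
    # Close the last run at end of file.
    if run_start is not None:
        yield contig, run_start, pos


def to_interval_rows(intervals):
    """Format intervals as tab-separated interval-list body rows.

    Strand '+' and name '.' are placeholder values (an assumption).
    """
    return [f"{c}\t{s}\t{e}\t+\t." for c, s, e in intervals]
```

The per-base loop is written for clarity and will be slow on a full mammalian genome; it is meant to show the logic (no padding, N runs removed, everything else kept) rather than to be run as-is in production.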


If I'm not mistaken, this seems to require an interval list in the first place, which doesn't seem to be available for mouse.
