Creating Hg19 Reference Index
12.5 years ago
Davy ▴ 410

I am creating a reference index so that I can align my FASTQ reads. Should I include all the "funny" chromosomes, like the chr6hapcox or chrUnxxxxxxx ones?

I will be using GATK downstream for other parts of the analysis, and I know it complains when the chromosomes are out of order. What should I do about these non-canonical chromosomes, which (at least to me) have no implicit ordering?

bwa hg19 next-gen • 5.9k views
12.5 years ago

See Heng Li's page: http://lh3lh3.users.sourceforge.net/humanref.shtml

For variant discovery, RNA-seq and ChIP-seq, it is recommended to use the entire primary assembly, including assembled chromosomes AND unlocalized/unplaced contigs, for the purpose of read mapping. Not including unlocalized and unplaced contigs potentially leads to more mapping errors.


I realise this, but the problem is that it causes GATK to throw an error when parsing the BAM files. Any ideas on how to keep GATK from breaking?


You can map with your favorite tool and then filter the funny hits out of the SAM/BAM file afterwards.
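To make the filtering idea concrete, here is a minimal sketch in Python that works on SAM text lines. It is only an illustration: the function names are made up for this example, and the definition of "canonical" (chr1 to chr22, chrX, chrY, chrM in UCSC naming) is an assumption, not something GATK prescribes. It drops `@SQ` header lines and alignment records for any other contig, and also drops records whose mate sits on a dropped contig. Unmapped reads (reference name `*`) are dropped as well.

```python
import re

# Assumption: "canonical" means chr1..chr22, chrX, chrY, chrM (UCSC hg19 naming).
CANONICAL = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y|M)$")


def is_canonical(name):
    """Return True if the contig name is one of the assumed canonical names."""
    return CANONICAL.match(name) is not None


def filter_sam_lines(lines):
    """Yield only the SAM lines that refer to canonical contigs.

    Header lines other than @SQ are passed through unchanged.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("@"):
            if line.startswith("@SQ"):
                # @SQ lines are tab-separated TAG:VALUE pairs; SN is the contig name.
                tags = dict(f.split(":", 1) for f in line.split("\t")[1:])
                if not is_canonical(tags.get("SN", "")):
                    continue
            yield line
            continue
        cols = line.split("\t")
        rname, rnext = cols[2], cols[6]  # columns 3 and 7 of a SAM record
        if not is_canonical(rname):
            continue
        # RNEXT is "=" (same contig as RNAME) or "*" (no mate information).
        if rnext not in ("=", "*") and not is_canonical(rnext):
            continue
        yield line
```

Note that this sketch does not repair the flags of a read whose mate was removed, so a real pipeline would probably need an extra step to fix mate information.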

12.5 years ago
Rok ▴ 190

If you have your chromosomes in separate FASTA files, you should merge them all into one whole-genome FASTA file. Then use CreateSequenceDictionary from Picard Tools to create a dictionary for this whole genome. The dictionary stores the order of the chromosomes as they appear in the whole-genome file.
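As a rough sketch of the merging step, the Python below concatenates per-chromosome FASTA files in a fixed order. The helper names and the order itself (chr1 to chr22, chrX, chrY, chrM, then every other contig sorted by name) are assumptions made for illustration; the important point is that the merged file has one well-defined order that the dictionary will then record. It also assumes each input file holds the single record for the contig it is keyed by.

```python
import re


def contig_sort_key(name):
    """Sort key: chr1..chr22, chrX, chrY, chrM, then all other contigs by name."""
    m = re.match(r"^chr(\d+|X|Y|M)$", name)
    if m is None:
        # Non-canonical contigs go last, in plain string order.
        return (1, 0, name)
    token = m.group(1)
    rank = int(token) if token.isdigit() else {"X": 23, "Y": 24, "M": 25}[token]
    return (0, rank, name)


def merge_fasta(paths_by_contig, out_path):
    """Concatenate FASTA files into out_path and return the contig order used.

    paths_by_contig maps a contig name to the path of its FASTA file.
    """
    order = sorted(paths_by_contig, key=contig_sort_key)
    with open(out_path, "w") as out:
        for name in order:
            with open(paths_by_contig[name]) as handle:
                for line in handle:
                    # Guard against a file that lacks a trailing newline.
                    out.write(line if line.endswith("\n") else line + "\n")
    return order
```

The returned list is the order you would expect to see again in the dictionary built from the merged file.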

When you map against an index built from the whole-genome FASTA file, most mapping software should produce a SAM/BAM header that matches the dictionary file. If it happens to be different (TopHat sometimes seems to sort chromosomes a bit differently), you can use ReorderSam from Picard Tools to reorder the SAM/BAM file so that it follows the dictionary. Be careful to provide ReorderSam with the correct dictionary file.
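One way to check whether the header and the dictionary agree before running GATK is to compare their `@SQ` lines. The sketch below assumes you already have both as text lines (a Picard `.dict` file is itself a SAM-style header, and the header of a BAM file can be printed with `samtools view -H`). The function names are hypothetical and only for illustration.

```python
def sq_entries(header_lines):
    """Return the (name, length) pairs of the @SQ lines, in file order."""
    entries = []
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        # @SQ lines are tab-separated TAG:VALUE pairs, e.g. SN:chr1 and LN:249250621.
        tags = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
        entries.append((tags["SN"], int(tags["LN"])))
    return entries


def first_mismatch(dict_lines, bam_header_lines):
    """Return (index, dict_entry, bam_entry) for the first difference, or None.

    A missing entry on either side is reported as None.
    """
    a = sq_entries(dict_lines)
    b = sq_entries(bam_header_lines)
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return (i, x, y)
    if len(a) != len(b):
        i = min(len(a), len(b))
        return (i, a[i] if i < len(a) else None, b[i] if i < len(b) else None)
    return None
```

If `first_mismatch` returns `None`, the contig names, lengths and order are the same on both sides; otherwise the returned tuple points at the first contig where they diverge.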
