Hi. I am currently analysing variation data from various species. I first worked on 1000 Genomes data, for which I downloaded the VCF files (from 1000 Genomes) and then filtered out all non-coding sequences. I did this using exon coordinates downloaded from the UCSC Table Browser (in .bed format).
Now I am trying to do the same thing for Denisovan and Altai Neanderthal variation data (downloaded from here: http://cdna.eva.mpg.de/neandertal/altai/). However, UCSC does not appear to provide table browser data for either species, and I cannot locate exon co-ordinates anywhere.
The site that provides the data does have exome data (here: http://cdna.eva.mpg.de/neandertal/exomes/). However, the .bed file contains the coordinates of the longest transcript of the protein-coding genes analyzed in El Sidron, Vindija and Altai Neandertals all in one file. As such I cannot isolate the Altai Neanderthal exon co-ordinates. There is also nothing there for Denisovans.
Is anyone able to point me in the right direction for the exon co-ordinates or, failing that, suggest an alternative approach? Many thanks.
Just to clarify, the coordinates in the Neanderthal and Denisovan VCF files are computed on hg19, so they should not differ from the human reference coordinates.
Thank you. As the Neanderthal and Denisovan sequences are mapped to hg19, does this mean I can use the hg19 exome co-ordinates to filter my vcf files?
Without knowing exactly what you want to do: yes, that would give you the exonic regions in the archaic humans.
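To make that concrete, here is a minimal Python sketch of filtering VCF records against an hg19 exon BED file. It is only an illustration under some assumptions: the function names (`norm_chrom`, `load_bed`, `in_exon`, `filter_vcf`) are made up for this example, the inputs are plain uncompressed text lines, and only each record's POS is tested, so an indel that starts just outside an exon would be dropped. It also strips a leading `chr` from chromosome names, since UCSC BED files use `chr1` while the VCF files may use `1`; check your own files before relying on that.

```python
import bisect
from collections import defaultdict


def norm_chrom(name):
    """Strip a leading 'chr' so 'chr1' (UCSC style) and '1' compare equal."""
    return name[3:] if name.startswith("chr") else name


def load_bed(lines):
    """Read BED lines (0-based, half-open) into merged, sorted intervals.

    Returns {chrom: (starts, ends)} with two parallel sorted lists.
    """
    raw = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        raw[norm_chrom(chrom)].append((int(start), int(end)))

    merged = {}
    for chrom, intervals in raw.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:          # overlapping or adjacent: extend
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = ([s for s, _ in out], [e for _, e in out])
    return merged


def in_exon(exons, chrom, pos):
    """True if 1-based VCF position `pos` lies in a BED interval.

    A BED interval [start, end) covers 1-based positions start+1 .. end,
    so the test is start < pos <= end.
    """
    if chrom not in exons:
        return False
    starts, ends = exons[chrom]
    i = bisect.bisect_left(starts, pos) - 1  # last interval with start < pos
    return i >= 0 and pos <= ends[i]


def filter_vcf(vcf_lines, exons):
    """Yield header lines unchanged and only those records whose POS is exonic."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        chrom, pos = line.split("\t", 2)[:2]
        if in_exon(exons, norm_chrom(chrom), int(pos)):
            yield line
```

Usage would look something like `exons = load_bed(open("hg19_exons.bed"))` followed by writing out the lines from `filter_vcf(open("altai.vcf"), exons)`, where both file names are placeholders. For whole-genome files an existing tool such as `bedtools intersect` or `bcftools view` with a regions file is likely the more practical route; the sketch is mainly meant to show the coordinate conventions involved.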
Consider this answer: "Looking for neanderthal genomes to download".
I don't think you are going to find contiguous genome data for these ancient genomes.