Hi,
I am running some analysis on BAM files that I downloaded from ICGC. I don't have the original FASTQ files, just the BAMs. The BAMs were aligned against a different reference than the one I use in my pipeline: I use hg19 from UCSC, and although I am not sure which reference was used for the BAMs, I think it is the Ensembl reference. The result is a different naming convention ('1' vs 'chr1', 'GL000241.1' vs 'chrUn_gl000241') and a different contig order. This causes problems, for example when working with GATK. What is the right way to handle such a situation?
Thanks, Michal.
Either download the files your pipeline needs for that other reference and run against it, or extract the reads from the BAM files and map them against your own reference. I don't think there is an easy way out here.
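For the second option, here is a rough sketch of what the re-mapping could look like. It only builds the command strings, it does not run anything. The assumptions are mine, not from the original post: paired-end reads, a reasonably recent samtools (one where `collate` can write to stdout with `-O`), bwa as the aligner, and an hg19 FASTA already indexed with `bwa index`. The function name and the output file names are made up for illustration.

```python
import shlex


def remap_commands(bam, reference_fasta, out_prefix, threads=4):
    """Build two shell commands: BAM -> paired FASTQ, then FASTQ -> new sorted BAM.

    Assumes paired-end data, samtools and bwa on PATH, and a bwa-indexed reference.
    """
    q = shlex.quote  # protect paths that contain spaces or shell metacharacters
    r1 = f"{out_prefix}_R1.fastq.gz"
    r2 = f"{out_prefix}_R2.fastq.gz"

    # Group mates together, then write read 1 / read 2 to separate files.
    # Singletons and reads flagged as neither/both mates are discarded here.
    extract = (
        f"samtools collate -u -O {q(bam)} | "
        f"samtools fastq -1 {q(r1)} -2 {q(r2)} -0 /dev/null -s /dev/null -n"
    )

    # Align against the pipeline's own reference and coordinate-sort the result.
    align = (
        f"bwa mem -t {int(threads)} {q(reference_fasta)} {q(r1)} {q(r2)} | "
        f"samtools sort -o {q(out_prefix + '.hg19.bam')} -"
    )
    return [extract, align]
```

Calling `remap_commands("sample.bam", "hg19.fa", "sample")` gives two strings you can inspect and then paste into a shell. Note that read groups are not carried over in this sketch; GATK usually wants them, so you would likely need to add them back (for example with the `-R` option of `bwa mem`).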
Thanks a lot. I had hoped there would be a simple way to do that, but I guess I'll have to work hard for it... :-/
It is additional work but not necessarily hard :)
Hopefully your BAM contains both mapped and unmapped reads. Otherwise you are missing part of the original data.
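One way to check this is to count records by the unmapped bit (0x4) in the SAM FLAG field. The sketch below is a minimal version that works on SAM text lines, for example the output of `samtools view -h`; the function name is hypothetical. It counts alignment records, so secondary and supplementary alignments are included in the mapped total.

```python
def count_mapped_unmapped(sam_lines):
    """Return (mapped, unmapped) record counts from an iterable of SAM text lines."""
    mapped = 0
    unmapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines (@HD, @SQ, @RG, ...).
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & 0x4:  # 0x4 = segment unmapped
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped
```

If the unmapped count comes back as zero, the unmapped reads were probably dropped before the file was published. For a quick check without Python, `samtools view -c -f 4 in.bam` prints the number of unmapped records directly.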