7.8 years ago
dario.garvan
Many researchers simply take a gene database (e.g. GENCODE Genes) and mapped reads (e.g. a BAM file) and produce a table of counts. Some genes in the table have more ambiguity than others regarding the reads that were assigned within their boundaries. For example, in a human dataset I'm analysing, some reads which map to a particular HLA gene actually originate from another gene in the same family, which becomes apparent when the alternative contigs of the human genome assembly are considered. How can accurate gene-level (not allele-level) summaries (counts) of such polymorphic regions be obtained?
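To make the ambiguity concrete, here is a minimal sketch, not a real pipeline. It assumes each read has already been reduced to the set of genes it aligns to (for instance by parsing primary and secondary alignments); the function name `summarize_ambiguity`, the input layout and the toy read IDs are all hypothetical and only for illustration.

```python
from collections import Counter


def summarize_ambiguity(read_to_genes):
    """Count uniquely and ambiguously assigned reads per gene.

    read_to_genes: dict mapping a read ID to the set of gene names
    that the read aligns to (hypothetical, pre-computed input).
    """
    unique = Counter()      # reads that align to exactly one gene
    ambiguous = Counter()   # reads shared between two or more genes

    for genes in read_to_genes.values():
        if len(genes) == 1:
            unique[next(iter(genes))] += 1
        else:
            # A shared read is tallied once for every gene it touches.
            for gene in genes:
                ambiguous[gene] += 1

    summary = {}
    for gene in sorted(set(unique) | set(ambiguous)):
        total = unique[gene] + ambiguous[gene]
        summary[gene] = {
            "unique": unique[gene],
            "ambiguous": ambiguous[gene],
            "ambiguous_fraction": ambiguous[gene] / total,
        }
    return summary


# Toy data (made up): two reads are shared between two HLA family members.
example = {
    "read1": {"HLA-A"},
    "read2": {"HLA-A", "HLA-H"},
    "read3": {"HLA-H"},
    "read4": {"HLA-A", "HLA-H"},
    "read5": {"ACTB"},
}

for gene, stats in summarize_ambiguity(example).items():
    print(gene, stats)
```

A gene with a high `ambiguous_fraction` is one whose count depends heavily on how shared reads are resolved, which is the situation I see for the HLA genes.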
I am afraid that genes coding for immune receptors in particular are too complex to be analyzed with conventional high-throughput methods. SNP data for these genes in resources such as the 1000 Genomes Project are not reliable, and analyzing them calls for different approaches. I think there are special kits available for HLA.