What exactly are unmappable regions?
22 months ago
DS ▴ 70

What exactly are "unmappable regions"? My understanding from some Google searches is that they are short regions of the genome that are difficult to map. Is this correct? If so, why are there short regions and long regions? Aren't they randomly split?

Thank you.

region genome mappable genetics gene • 1.1k views
22 months ago
mark.ziemann ★ 1.9k

The genome contains a lot of what is called "repetitive DNA": sequences that appear many times throughout the genome. For example, LINE1 and Alu are two types of repetitive sequence that make up a large fraction of the human genome. Naturally, repetitive DNA gets sequenced along with everything else in assays like WGS and ChIP-seq, but aligners have a hard time figuring out where such a read comes from, because the sequence could have originated from many different places. The same thing happens with paralogous genes that have very similar sequences: the aligner can't tell exactly where the read originated. This is why, in short read sequencing, a lot of reads are discarded from the analysis; we don't know their true genomic origin. Long read sequencing mostly avoids this problem, because a long read usually extends past the repeat into unique flanking sequence.
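To make the ambiguity concrete, here is a minimal sketch using a made-up toy reference and exact string matching. Real aligners use indexed, mismatch-tolerant search, so this only illustrates the idea; `find_hits`, the sequences, and the read names are all hypothetical.

```python
def find_hits(read, reference):
    """Return every 0-based start position where `read` matches `reference` exactly."""
    hits = []
    start = reference.find(read)
    while start != -1:
        hits.append(start)
        # Step forward by one so overlapping matches are also found
        start = reference.find(read, start + 1)
    return hits


# Toy reference (made up): the same repeat unit occurs twice,
# separated and flanked by unique sequence.
REPEAT = "GGCATTCAGGTACCA"
UNIQUE = "TTGACCGTAAGC"
reference = "ACGTTGCA" + REPEAT + UNIQUE + REPEAT + "CCATGGAT"

repeat_read = REPEAT[2:12]  # a read that falls entirely inside the repeat
unique_read = UNIQUE        # a read that falls in unique sequence

for name, read in [("repeat_read", repeat_read), ("unique_read", unique_read)]:
    hits = find_hits(read, reference)
    status = "ambiguous" if len(hits) > 1 else "unique"
    print(f"{name}: {len(hits)} hit(s) at {hits} -> {status}")
```

The read taken from the repeat matches at both copies, so nothing in the read itself says which copy it came from. A read long enough to reach into the unique flanking sequence would match only once.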


So instead of randomly placing each read at one of its "predicted" genomic origins, we just discard all of them?


It depends on the alignment parameters you define, but I'd say in most cases a read would be aligned to multiple locations and assigned a lower "mapping quality" score.
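As a rough illustration of why multi-mapping reads end up with low mapping quality: the SAM format defines MAPQ as -10 * log10 of the probability that the reported position is wrong. The sketch below assumes a read has n equally good candidate locations and one is reported at random, so the chance of being wrong is 1 - 1/n. This is a simplification; real aligners estimate MAPQ with their own heuristics, and `toy_mapq`, `MAPQ_CAP`, and the threshold of 10 are hypothetical choices for this example.

```python
import math

MAPQ_CAP = 60  # hypothetical ceiling for this toy model


def toy_mapq(n_equally_good_hits):
    """Phred-scaled mapping quality under a toy model: the read has
    n equally good locations and one of them is reported at random."""
    if n_equally_good_hits < 1:
        raise ValueError("a mapped read needs at least one hit")
    p_wrong = 1.0 - 1.0 / n_equally_good_hits
    if p_wrong == 0.0:
        return MAPQ_CAP  # unique hit: cap instead of returning infinity
    return min(MAPQ_CAP, round(-10 * math.log10(p_wrong)))


# Hypothetical reads with the number of equally good locations each one has
hit_counts = {"read_A": 1, "read_B": 2, "read_C": 10}
MIN_MAPQ = 10  # example threshold, chosen arbitrarily here

for name, n in hit_counts.items():
    q = toy_mapq(n)
    verdict = "keep" if q >= MIN_MAPQ else "filter out"
    print(f"{name}: {n} location(s), toy MAPQ {q} -> {verdict}")
```

Under this model only the uniquely mapping read survives a MAPQ filter, which is effectively how ambiguous reads get "discarded" downstream even though the aligner did report a position for them.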
