Hello!
Do mapped reads lie along individual chromosomes or along the whole genome? In other words, if a mapped read is long enough, does it have to end exactly at the end of a chromosome, or can it lie somewhere between two neighboring chromosomes?
thanks!
A read aligns only to specific bases of a reference sequence, but some of its bases are allowed to extend past either the start or the end of a chromosome. Aligners can mark those bases as "soft clipped".
For example, if the first five bases of a chromosome are GGGGG, then a read containing the sequence AAAAAGGGGG can align to the five G's at the start of the chromosome with this CIGAR string:
5S5M
That indicates the five G's in AAAAAGGGGG align to the first five bases of the chromosome, while the five A's are "soft clipped" and extend past the start of the chromosome.
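To make the 5S5M example concrete, here is a minimal Python sketch that splits a read into its soft-clipped and aligned parts according to a CIGAR string. The function names (split_cigar, describe_alignment) are made up for illustration and are not part of any aligner or library; the only thing taken from the SAM format is which operations consume read bases (M, I, S, = and X).

```python
import re

# One CIGAR operation: a length followed by a single operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def split_cigar(cigar):
    """Turn a CIGAR string such as '5S5M' into [(5, 'S'), (5, 'M')]."""
    return [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]

def describe_alignment(read, cigar):
    """Pair each read-consuming CIGAR operation with the read bases it covers.

    Hypothetical helper for illustration only. Operations that do not
    consume read bases (D, N, H, P) are skipped.
    """
    parts = []
    pos = 0
    for length, op in split_cigar(cigar):
        if op in "MIS=X":  # these operations consume bases of the read
            parts.append((op, read[pos:pos + length]))
            pos += length
    return parts

for op, bases in describe_alignment("AAAAAGGGGG", "5S5M"):
    label = "soft clipped" if op == "S" else "aligned"
    print(op, bases, label)
```

For the read above this reports the five A's under S (soft clipped) and the five G's under M (aligned), which matches the description of the read hanging off the start of the chromosome.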
Note also that "unaligned" reads, which do not align to the genome at all, can be stored in BAM files.
The genome is divided into separate chromosomes, which are different DNA molecules. Even if you sequence a very (very) long read, longer than a whole chromosome, that single read will not continue into another chromosome. However, you can have paired-end reads that align to different chromosomes, e.g. read1 to chrom1 and read2 to chrom2, in the case of a chromosomal translocation.
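As a rough illustration of the paired-end case, the sketch below checks a single SAM record to see whether the read and its mate were placed on different chromosomes. It assumes the standard tab-separated SAM columns (FLAG in column 2, RNAME in column 3, RNEXT in column 7); the function name and the example records are invented for this post and do not come from real data.

```python
def mate_on_other_chromosome(sam_line):
    """Return True if a mapped read's mate is mapped to a different reference.

    Hypothetical helper for illustration; expects one SAM alignment line.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname, rnext = fields[2], fields[6]

    paired = flag & 0x1          # read is part of a pair
    read_unmapped = flag & 0x4   # this read is unmapped
    mate_unmapped = flag & 0x8   # the mate is unmapped
    if not paired or read_unmapped or mate_unmapped:
        return False
    # '=' means the mate is on the same reference; '*' means unknown.
    if rname == "*" or rnext in ("=", "*"):
        return False
    return rname != rnext

# Made-up records: the first has its mate on chrom2, the second on the same chromosome.
split_pair = "read1\t65\tchrom1\t100\t60\t10M\tchrom2\t500\t0\tAAAAAGGGGG\t*"
same_chrom = "read2\t65\tchrom1\t100\t60\t10M\t=\t300\t210\tAAAAAGGGGG\t*"
print(mate_on_other_chromosome(split_pair))
print(mate_on_other_chromosome(same_chrom))
```

A pair flagged this way is only a candidate: mates on different chromosomes can also come from mapping errors or repeats, so this alone does not prove a translocation.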