14 months ago
Hello,
I am working with DNA sequencing libraries and I have found many split reads randomly distributed (not clustered) along the genome in my samples. When I BLAST the sequence of these reads, I always find this behaviour:
For some reason, the split region maps to the same DNA locus as the rest of the read, but in a different orientation.
Any clue? Thanks!
Are you working with regular Sanger sequence data or some form of high-throughput data? Remember that BLAST is a local aligner, so it is always going to find these local HSPs.
This is Illumina paired-end, 350bp (150bp each read). However, I noticed that my DNA fragments are smaller than expected for this library. Do you think that is related?
Even if the inserts are small, if this is DNA-seq, then the alignment of a single read should not be split (unless you have some chimeric library fragments). Are you showing the alignment from one of the read pairs above?
Please use a proper NGS aligner rather than BLAST+ (unless you have a specific reason to do so).
The data was aligned with BWA-MEM, then I used IGV and saw these split reads. To find the origin of the split region of the read, I used BLAST. As you say, the image corresponds to the alignment of one read pair.
As you suggest, I also thought about chimeras in my data. But if I understand this correctly, chimeras would arise from the joining of any two DNA fragments (even fragments from different chromosomes or very distant loci). However, I always see that the two parts of the same read belong to the same DNA region.
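
One way to check this across all split reads, rather than by eye in IGV, is to look at the `SA` tag that BWA-MEM writes for split alignments. The following is only a rough sketch in plain Python: the function names, the labels, and the `max_distance` cutoff of 1000 bp are hypothetical choices for illustration, and the example record is made up, not taken from my data. It assumes each input is one SAM text line (for example, as printed by `samtools view`) and that the `SA` tag follows the SAM specification layout `rname,pos,strand,CIGAR,mapQ,NM;`.

```python
def parse_sa_tag(sa_value):
    """Parse an SA tag value ('rname,pos,strand,CIGAR,mapQ,NM;...') into dicts."""
    hits = []
    for entry in sa_value.rstrip(";").split(";"):
        if not entry:
            continue
        rname, pos, strand, cigar, mapq, nm = entry.split(",")
        hits.append({"rname": rname, "pos": int(pos), "strand": strand,
                     "cigar": cigar, "mapq": int(mapq), "nm": int(nm)})
    return hits


def classify_split(sam_line, max_distance=1000):
    """Label each supplementary alignment of one SAM record.

    max_distance is an arbitrary cutoff (an assumption) for calling two
    alignment start positions "the same locus".
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname = fields[2]
    pos = int(fields[3])
    # FLAG bit 0x10 set means the read is aligned to the reverse strand.
    strand = "-" if flag & 0x10 else "+"
    # Optional tags start at column 12; look for the SA tag among them.
    sa = next((f[5:] for f in fields[11:] if f.startswith("SA:Z:")), None)
    if sa is None:
        return ["not_split"]
    labels = []
    for hit in parse_sa_tag(sa):
        if hit["rname"] != rname:
            labels.append("different_chromosome")
        elif abs(hit["pos"] - pos) > max_distance:
            labels.append("same_chromosome_distant")
        elif hit["strand"] != strand:
            labels.append("same_locus_inverted")
        else:
            labels.append("same_locus_same_strand")
    return labels


# Made-up record: forward-strand primary alignment with a reverse-strand
# supplementary alignment 40 bp away on the same chromosome.
example = "\t".join([
    "read1", "99", "chr1", "10000", "60", "90M60S", "=", "10050", "200",
    "*", "*", "SA:Z:chr1,10040,-,90S60M,60,0;",
])
print(classify_split(example))
```

If most split reads come out as `same_locus_inverted` rather than `different_chromosome` or `same_chromosome_distant`, that would agree with what I see in IGV: the two parts of a read come from the same region in opposite orientations, rather than from unrelated fragments joined at random.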