I have a pair of mapped reads that look like this:
SRR8441376.12918543 99 g4_3 1773574 45 40S25M10S = 1773574 25 CAGCGTGAGCGGTTCGCTGAAGTCCGGCCCCAGTTCGCACATTTCGGTGACGAACTGCAACACCCGGTCTTGTTC AAAAAEEEAEEEEEEEEEE6EEEEEEEEEEEEEEEEEEEEAEEEEEEAE<EEA/EEEE//E/EEAEE/EEEAAAA MD:Z:19T5 PG:Z:MarkDuplicates NM:i:1 AS:i:20 XS:i:0
SRR8441376.12918543 147 g4_3 1773574 45 26S25M23S = 1773574 -25 CGCTGAAGTCCGGCCCCAGTTCGCACATTTCGGTGACGAACTGCAACACCCGGTCTTGTTCCGCCTGGTAGTCC EEE/EEEEEE<EE<EEEEEEEEEEEEAEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAAAA MD:Z:19T5 PG:Z:MarkDuplicates NM:i:1 AS:i:20 XS:i:0
The TLEN looks to be 25 bp? I have already trimmed the reads. I BLASTed both reads against the genome, and each one aligns over only 25 bp. Does this mean that the rest of each read is adapter?
How about your previous questions? Take some time to comment on and validate the answers.
Insert size of ~20bp with ATAC-seq? ; Understanding 9th field of Bam file - insert size? ; How can I use featurecounts after generating a bam file using BWA? ; Using FeatureCounts for ChIP-seq normalised files? ; ....
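25 bp is what your CIGAR strings tell you as well: 40S25M10S means 40 soft-clipped bases, 25 aligned bases, then 10 soft-clipped bases, and 26S25M23S likewise has only 25 aligned bases. More on CIGARs in the SAM specification; reading is key ;) To check whether the clipped bases are adapter, BLAST them against NCBI nr or run FastQC on the reads.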
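If it helps to see the arithmetic, here is a minimal Python sketch of how the CIGAR strings above break down. It uses only the standard library; the function names `cigar_summary` and `clipped_ends` are made up for illustration and are not part of samtools, pysam, or any other tool. The second function assumes SEQ is stored as in the SAM record, so that soft-clipped bases are still present in it.

```python
import re

# One CIGAR element: a length followed by an operation letter (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_summary(cigar):
    """Sum the lengths of each CIGAR operation, e.g. {'S': 50, 'M': 25}."""
    totals = {}
    for length, op in CIGAR_OP.findall(cigar):
        totals[op] = totals.get(op, 0) + int(length)
    return totals


def clipped_ends(seq, cigar):
    """Return the (leading, trailing) soft-clipped bases of a read.

    Hypothetical helper: hard clips (H) are ignored because those bases
    are not present in SEQ.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return seq[:left], seq[len(seq) - right:]


# The two CIGAR strings from the question.
print(cigar_summary("40S25M10S"))  # {'S': 50, 'M': 25}
print(cigar_summary("26S25M23S"))  # {'S': 49, 'M': 25}

# Toy read (not from the question) to show which bases get clipped.
print(clipped_ends("AAACCCCCGG", "3S5M2S"))  # ('AAA', 'GG')
```

The sequences returned by `clipped_ends` are the ones you would then compare against known adapter sequences or BLAST, since the aligner has only told you that they did not align, not what they are.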