Hi,
I am interested in finding the insertion sites of transposable elements in a genome of interest. The genome has been sequenced using paired-end sequencing, and the reads were then aligned to an index built from the wild-type genome plus the insert sequence.
samtools flagstat alignment.sorted.bam
2415270 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
2188174 + 0 mapped (90.60%:-nan%)
2415270 + 0 paired in sequencing
1207635 + 0 read1
1207635 + 0 read2
2187112 + 0 properly paired (90.55%:-nan%)
2187284 + 0 with itself and mate mapped
890 + 0 singletons (0.04%:-nan%)
158 + 0 with mate mapped to a different chr
12 + 0 with mate mapped to a different chr (mapQ>=5)
How should I go about finding the number and sites of inserts in this genome? I think I should focus on the reads whose mates map to a different chromosome. Is it possible to filter by SAM flags to obtain these reads? If so, which SAM flags should I use?
Thanks in advance for your help!
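For anyone following the question above, here is a minimal sketch of the selection it describes: keep records that are paired (flag bit 0x1), where neither the read nor its mate is unmapped (bits 0x4 and 0x8 both clear), and where the mate's reference (RNEXT, column 7) differs from the read's own (RNAME, column 3). The function names and the example records are hypothetical, made up for illustration. The input is assumed to be SAM text lines, for example as printed by `samtools view alignment.sorted.bam`.

```python
# SAM flag bits, as defined in the SAM specification
PAIRED = 0x1
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def mate_on_other_chr(sam_line, min_mapq=0):
    """True if the record is paired, both mates are mapped, and the mate
    lies on a different reference sequence than the read itself."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname, mapq, rnext = fields[2], int(fields[4]), fields[6]
    if not flag & PAIRED:
        return False
    if flag & (READ_UNMAPPED | MATE_UNMAPPED):
        return False
    # RNEXT is "=" when the mate is on the same reference as the read
    if rnext == "=" or rnext == rname:
        return False
    # Optional threshold on the read's own mapping quality
    return mapq >= min_mapq


def count_interchromosomal(sam_lines, min_mapq=0):
    """Count records (reads, not pairs) that pass the filter; header lines are skipped."""
    return sum(
        1
        for line in sam_lines
        if not line.startswith("@") and mate_on_other_chr(line, min_mapq)
    )


# Hypothetical records: a header line, one pair spanning chr1 and the insert,
# one proper pair member, one read with an unmapped mate, one low-MAPQ read.
records = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t97\tchr1\t100\t60\t50M\tinsert\t20\t0\t*\t*",
    "r1\t145\tinsert\t20\t60\t50M\tchr1\t100\t0\t*\t*",
    "r2\t99\tchr1\t200\t60\t50M\t=\t350\t200\t*\t*",
    "r3\t73\tchr2\t500\t3\t50M\t=\t500\t0\t*\t*",
    "r4\t65\tchr2\t500\t3\t50M\tchr3\t10\t0\t*\t*",
]

print(count_interchromosomal(records))              # all mapping qualities
print(count_interchromosomal(records, min_mapq=5))  # only reads with MAPQ >= 5
```

Note that this counts individual reads, so a pair spanning two references contributes two records. Secondary or supplementary alignments, if the aligner emits them, are not excluded here and may need separate handling.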
Identifying discordantly mapped reads
Hi,
Thank you for your reply!
As you suggested, I have tried the following
However, in both cases I obtained the number 172, which does not tally with my flagstat output.