For simplicity, let us say I have one reference sequence that is 450 bp long. I used STAR to map paired-end reads to this sequence, and now I want to keep only the read pairs in which read 2 maps to the end of the sequence (roughly positions 440-450 of 450). What is the best way to do this? I've been trying to see if it's possible with pysam but am having some issues.
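
To make the goal concrete, here is a minimal sketch of the kind of filter I have in mind. It rests on a few assumptions: "maps to the end" is taken to mean that the read 2 alignment overlaps 1-based positions 440-450 (0-based, half-open `[439, 450)`), the input is a BAM file, and the function names and file names are hypothetical placeholders rather than anything from STAR or pysam.

```python
def overlaps_window(ref_start, ref_end, win_start, win_end):
    """Return True if the 0-based, half-open alignment [ref_start, ref_end)
    overlaps the 0-based, half-open window [win_start, win_end)."""
    return ref_start < win_end and ref_end > win_start


def filter_pairs(in_bam, out_bam, win_start=439, win_end=450):
    """Write every pair whose read 2 overlaps the window (hypothetical helper).

    Assumption: positions 440-450 (1-based) correspond to [439, 450) in
    pysam's 0-based coordinates.
    """
    import pysam  # third-party; imported here so overlaps_window works without it

    # Pass 1: collect the names of pairs whose read 2 hits the window.
    keep = set()
    with pysam.AlignmentFile(in_bam, "rb") as bam:
        for read in bam:
            # Only primary, mapped alignments decide whether a pair is kept.
            if read.is_unmapped or read.is_secondary or read.is_supplementary:
                continue
            if read.is_read2 and overlaps_window(
                read.reference_start, read.reference_end, win_start, win_end
            ):
                keep.add(read.query_name)

    # Pass 2: write all records (both mates) belonging to the kept pairs.
    with pysam.AlignmentFile(in_bam, "rb") as bam, \
            pysam.AlignmentFile(out_bam, "wb", template=bam) as out:
        for read in bam:
            if read.query_name in keep:
                out.write(read)
```

Calling something like `filter_pairs("Aligned.out.bam", "read2_at_end.bam")` (file names are placeholders) would then write the kept pairs. Two passes are used so that both mates are written without requiring the file to be sorted by name. If the requirement is instead that read 2 *starts* in the window, the overlap test would need to check the read's 5' end, which depends on its strand.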
Thanks!