I have a lot of mate-pair libraries with unknown read orientation. What is currently the fastest way to determine it? An additional complication: some of the libraries have no reference genome to map against.
Fastest solution would be to ask the person who prepared the libraries :)
Seriously: even if you don't have the reference genome, you can map a subset of the reads to the closest available reference genome to test the orientation. Just make sure you use stringent mapping settings (no soft clipping, no mismatches, etc.). Plotting the distances between mates should give you a rough estimate of the insert size as well as the orientation.
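Once a subset is mapped, the orientation check itself is simple enough to sketch. The snippet below is a minimal illustration, not a tested pipeline: it assumes you have already extracted, for each pair mapped to the same reference sequence, the leftmost mapped position of each mate (the SAM POS field) and whether each mate is on the reverse strand (SAM FLAG bit 0x10). The function names `classify_pair` and `summarize` are made up for this example, and overlapping mates or pairs spanning rearrangements relative to the proxy reference are not handled specially.

```python
from collections import Counter
from statistics import median


def classify_pair(pos1, rev1, pos2, rev2):
    """Classify one pair mapped to the same reference sequence.

    pos1/pos2 are leftmost mapped positions; rev1/rev2 are True when the
    mate maps to the reverse strand. Returns 'FR' (mates point toward each
    other), 'RF' (mates point away from each other), or 'tandem' (same strand).
    """
    if rev1 == rev2:
        return "tandem"
    # Pick out the forward-strand mate and the reverse-strand mate.
    fwd_pos, rev_pos = (pos1, pos2) if not rev1 else (pos2, pos1)
    # Forward mate upstream of the reverse mate means they face inward.
    return "FR" if fwd_pos <= rev_pos else "RF"


def summarize(pairs):
    """Count orientations and report the median start-to-start distance
    for each orientation class (a rough proxy for insert size)."""
    counts = Counter()
    distances = {}
    for pos1, rev1, pos2, rev2 in pairs:
        label = classify_pair(pos1, rev1, pos2, rev2)
        counts[label] += 1
        distances.setdefault(label, []).append(abs(pos2 - pos1))
    medians = {label: median(values) for label, values in distances.items()}
    return counts, medians


if __name__ == "__main__":
    # Toy input: (pos1, is_reverse1, pos2, is_reverse2), invented for illustration.
    example = [
        (100, False, 400, True),
        (5000, False, 2000, True),
        (2000, True, 5200, False),
    ]
    counts, medians = summarize(example)
    print(counts)
    print(medians)
```

For a mate-pair library you would expect a dominant RF class with a large median distance; a noticeable FR class with a short distance would be consistent with paired-end contamination.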
Edit: aren't all mate pairs RF and paired end FR oriented?
I've moved this to an answer since I think this is about the best possible answer.
BTW: Yes, all mate pairs should be RF and PE reads should be FR. Note that mate-pair libraries will still contain PE reads (and more than just a few).
I don't think your last note is necessarily true. I was assembling my strand-specific RNA-seq reads with Trinity, and I believe their orientation was RF (dUTP option).
Here, RF is just short-hand for "point away from each other", which is correct. PE reads will point toward each other. This is due purely to how the library construction works.