Hi there. I'm looking to extract the chromosome, the start coordinate of read 1, and the end coordinate of read 2 from paired-end NGS data into one BED file. I have over 7 million reads, and I am not sure that every read has a mate. I have used the bamtobed function of bedtools and sorted the output by read name (e.g. M01269....). Here is an example of what I need. For the following:
chrII 404128 404259 M01269:176:000000000-BW364:1:1101:10000:12221/1 60 -
chrII 404126 404251 M01269:176:000000000-BW364:1:1101:10000:12221/2 60 +
chrVII 350990 351120 M01269:176:000000000-BW364:1:1101:10000:24715/1 60 -
chrVII 350971 351093 M01269:176:000000000-BW364:1:1101:10000:24715/2 60 +
chrXII 527617 527747 M01269:176:000000000-BW364:1:1101:10000:26164/1 60 +
chrXII 527627 527753 M01269:176:000000000-BW364:1:1101:10000:26164/2 60 -
chrVII 826318 826449 M01269:176:000000000-BW364:1:1101:10000:8567/1 60 +
chrVII 826335 826461 M01269:176:000000000-BW364:1:1101:10000:8567/2 60 -
chrXII 880431 880562 M01269:176:000000000-BW364:1:1101:10001:14255/1 60 +
chrXII 880448 880574 M01269:176:000000000-BW364:1:1101:10001:14255/2 60 -
I need:
chrII 404128 404251
chrVII 350990 351093
chrXII 527617 527753
chrVII 826318 826461
chrXII 880431 880574
Any help would be much appreciated. Thanks.
Hello rjobmc,
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time. Thank you!
Hello. Could you please explain what 60 and - or + stand for?
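Most likely the mapping quality (bamtobed writes MAPQ to the BED score column by default; the reads here are about 130 bp long, so 60 is not the read length) and the strand (+ or -), rania.hamdy1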
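For the pairing itself, one option is a short Python script run on the bamtobed output shown in the question. This is only a minimal sketch, not a tested pipeline. The function name pair_spans is made up for illustration. It assumes that read names end in /1 and /2 as in the example, that both mates must map to the same chromosome, and that all records fit in memory. Reads without a mate are dropped, and if the same mate appears more than once (e.g. secondary alignments), the last record wins. If you still have the BAM, bedtools bamtobed also has a -bedpe option that reports both mates on one line from a name-sorted BAM, which may be simpler.

```python
from collections import defaultdict


def pair_spans(lines):
    """Yield (chrom, start_of_read1, end_of_read2) for each complete pair.

    `lines` is any iterable of whitespace-separated BED lines as produced by
    bedtools bamtobed (chrom, start, end, name, score, strand).
    """
    # read name without the /1 or /2 suffix -> {"1": record, "2": record}
    mates = defaultdict(dict)
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, start, end, name = fields[:4]
        base, sep, mate = name.rpartition("/")
        if not sep or mate not in ("1", "2"):
            continue  # assumption: mates are marked with /1 and /2
        mates[base][mate] = (chrom, int(start), int(end))

    for pair in mates.values():
        if "1" not in pair or "2" not in pair:
            continue  # read has no mate in the file
        chrom1, start1, _ = pair["1"]
        chrom2, _, end2 = pair["2"]
        if chrom1 != chrom2:
            continue  # assumption: ignore pairs split across chromosomes
        yield chrom1, start1, end2


# Example usage (file names are placeholders):
# with open("reads.bed") as src, open("pairs.bed", "w") as out:
#     for chrom, start, end in pair_spans(src):
#         out.write(f"{chrom}\t{start}\t{end}\n")
```

This takes the start of /1 and the end of /2 literally, as asked. If what you actually want is the outer span of the fragment regardless of strand, replace start1 and end2 with the min of the two starts and the max of the two ends.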