3.4 years ago
daewowo
I want to extract only the mapped read IDs from multiple .sam files (each SAM file contains reads mapped to a unique set of genomes), then create new SAM files that exclude any read ID that is mapped in one of the other SAM files.
Just a simple method to extract all the mapped read IDs from a SAM file would be a good start, but I haven't been able to find a tool that does this. (I could write it in Python, but that would be horrendously slow.)
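I'm not sure I understand exactly what you want to do, but you can extract only the mapped reads from a SAM file with samtools by filtering on the FLAG field. Bit 0x4 of the flag means "read unmapped", so excluding it (samtools view -F 4) keeps only mapped records, and the read ID is the first column of each record.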
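If you would rather do the same flag test without samtools, here is a minimal Python sketch. It assumes plain-text SAM input (not BAM) and makes a single streaming pass, collecting the QNAME of every record whose FLAG does not have bit 0x4 set. The function name `mapped_read_ids` and the three example records are made up for illustration.

```python
import io

UNMAPPED = 0x4  # SAM FLAG bit meaning "segment unmapped"


def mapped_read_ids(sam_lines):
    """Return the set of read IDs (QNAME) that have at least one mapped record."""
    ids = set()
    for line in sam_lines:
        if line.startswith("@"):  # header lines start with '@'; QNAMEs cannot
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:  # skip blank or malformed lines
            continue
        if not int(fields[1]) & UNMAPPED:  # column 2 is the FLAG
            ids.add(fields[0])  # column 1 is the read ID
    return ids


# Hypothetical three-record SAM file: read2 is unmapped (flag 4).
example = io.StringIO(
    "@HD\tVN:1.6\n"
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n"
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n"
    "read3\t16\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\n"
)
print(sorted(mapped_read_ids(example)))  # ['read1', 'read3']
```

For a real file you would pass an open file handle instead of the `io.StringIO` object, e.g. `with open("sample.sam") as fh: ids = mapped_read_ids(fh)`, where `sample.sam` is a placeholder name.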
For the removal step, see the related post "How to efficiently remove a list of reads from BAM file?"
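The filtering step from the question could look roughly like the sketch below, again assuming plain-text SAM and that you already have one set of mapped read IDs per file. The helper names `ids_mapped_elsewhere` and `drop_reads`, the file names, and the records are all hypothetical. For each file it builds the union of IDs mapped in the other files, then drops any record carrying one of those IDs while keeping header lines.

```python
def ids_mapped_elsewhere(mapped_ids_by_file, name):
    """Union of the mapped read IDs from every file except `name`."""
    others = [ids for other, ids in mapped_ids_by_file.items() if other != name]
    return set().union(*others)


def drop_reads(sam_lines, unwanted):
    """Yield header lines plus every record whose read ID is not in `unwanted`."""
    for line in sam_lines:
        if line.startswith("@") or line.split("\t", 1)[0] not in unwanted:
            yield line


# Hypothetical input: r2 is mapped in both files, so it is removed from a.sam.
mapped = {"a.sam": {"r1", "r2"}, "b.sam": {"r2", "r3"}}
records_a = [
    "@HD\tVN:1.6\n",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]
kept = list(drop_reads(records_a, ids_mapped_elsewhere(mapped, "a.sam")))
print([line.split("\t", 1)[0] for line in kept])  # ['@HD', 'r1']
```

Because only the ID sets are held in memory and each SAM file is streamed line by line, memory use depends on the number of distinct mapped reads rather than on file size.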