8.4 years ago
aylward.megan
I have sequencing reads from a non-human primate sequencing project, but they also contain some human contamination. To remove the contaminants, I would like to align the reads to both the human and the non-human primate reference genomes, then keep only the reads whose primary alignment is to the non-human primate genome. I know that particular regions can be subset using samtools view. However, the non-human primate genome has over 3 million scaffolds. Is there a way to select all reads by genome rather than by scaffold?
Use BBSplit (bbsplit.sh) from BBMap to bin the reads by reference genome. See the thread "How to remove contamination from NGS data":
http://seqanswers.com/forums/showthread.php?t=41288
Ok thanks for your suggestion.
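For the alignment-based approach described in the question, here is a minimal Python sketch of selecting reads by genome rather than by scaffold. It assumes the reads were aligned to a single combined human + primate reference and that a FASTA index (.fai) of the primate genome is available to supply the scaffold names. The function names `load_contig_names` and `primary_reads_on` are made up for illustration, and this is not how BBSplit works internally.

```python
# Sketch only: filter SAM text records by which genome the primary alignment hit.
# Assumes one combined reference, with the primate scaffold names taken from its .fai.

UNMAPPED = 0x4         # SAM flag: read is unmapped
SECONDARY = 0x100      # SAM flag: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag: supplementary (chimeric) alignment


def load_contig_names(fai_lines):
    """Return the set of contig names (first column) from .fai index lines."""
    return {line.split("\t", 1)[0] for line in fai_lines if line.strip()}


def primary_reads_on(sam_lines, target_contigs):
    """Yield SAM records whose primary alignment is on one of target_contigs."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        rname = fields[2]
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue  # keep primary alignments of mapped reads only
        # A set lookup stays fast even with millions of scaffold names.
        if rname in target_contigs:
            yield line
```

One way to use it is to pipe `samtools view combined.bam` into a small script that builds the set with `load_contig_names(open("primate.fa.fai"))` (hypothetical file names) and writes every record yielded by `primary_reads_on(sys.stdin, contigs)` to standard output. For paired-end data the two mates can land on different genomes, so you may also want to check the mate reference name (column 7) before keeping a pair.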