6.1 years ago
kanshenglong
Hi, after mapping nanopore reads to the reference genome, I want to extract the raw reads from the BAM/SAM file. Can someone help me here? Thank you in advance!
samtools fastq should work. See the inline help for options. That said, if you did the mapping yourself, don't you have the original files? Or are you looking to get only the mapped/unmapped reads?
Yes, I did the mapping myself, and I am looking to get only the mapped/unmapped reads. I want to use graphmap and samtools to filter the nanopore reads that match the plastid genome, so that I can avoid nuclear contamination when I assemble the plastid genome. Thanks for your help!
This information should have been in the original posting.
Take a look at: How To Filter Mapped Reads With Samtools
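As a minimal sketch of the filtering step (not taken from the linked post): samtools fastq accepts the SAM flag filters -f (require) and -F (exclude), so mapped and unmapped reads can be written straight to FASTQ. The function names and file names below are hypothetical. Excluding 0x904 drops unmapped, secondary, and supplementary records, so that each long read should be written once.

```shell
# Write primary mapped reads to FASTQ.
# -F 0x904 excludes unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
extract_mapped() {
    samtools fastq -F 0x904 "$1" > "$2"
}

# Write unmapped reads to FASTQ.
# -f 4 keeps only records with the "unmapped" flag set.
extract_unmapped() {
    samtools fastq -f 4 "$1" > "$2"
}

# Usage (hypothetical file names):
#   extract_mapped   reads_vs_plastid.bam mapped.fastq
#   extract_unmapped reads_vs_plastid.bam unmapped.fastq
```

The same commands apply to Illumina and nanopore alignments, since the flags are part of the SAM format rather than the sequencing platform.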
What puzzles me is whether the command for filtering Illumina reads is the same as for nanopore reads. Thanks!
FASTQ is a standard format. Only the FASTQ headers will be different for Illumina and nanopore.
If your reads are soft clipped (or not fully present in your alignment file), then you may want to gather just the headers of the two kinds of nanopore reads (mapped and unmapped) and then retrieve the full reads from the original data file.
Thanks for your useful answer. Could you show me how to gather the headers of the mapped reads and retrieve the full reads from the original data file?
Filter the reads using the strategy above, and then you could extract the read headers (they will start with ^@something, so grep for that pattern) and put them into a new file. Use the answer here to pull out those reads from the original file: A: Extracting specific sequences from FASTQ using Seqtk
Thanks for your help!!
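The header-gathering and retrieval steps described above could be sketched roughly as follows. This assumes standard four-line FASTQ records and uses hypothetical function and file names. It takes every fourth line instead of grepping for "@", because quality lines can also begin with "@". With seqtk installed, the retrieval step can instead be done with seqtk subseq, as in the linked answer.

```shell
# Print read names (header up to the first whitespace, without the leading "@")
# from a FASTQ file with exactly four lines per record.
read_names() {
    awk 'NR % 4 == 1 { print substr($1, 2) }' "$1"
}

# Print the full records from FASTQ file "$2" whose names are listed in "$1"
# (one name per line).
pull_reads() {
    awk 'NR == FNR { want[$1]; next }
         FNR % 4 == 1 { keep = (substr($1, 2) in want) }
         keep' "$1" "$2"
}

# Usage (hypothetical file names):
#   read_names mapped.fastq | sort -u > mapped_names.txt
#   pull_reads mapped_names.txt original.fastq > plastid_reads.fastq
# Alternative with seqtk:
#   seqtk subseq original.fastq mapped_names.txt > plastid_reads.fastq
```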