extract nanopore reads from the bam/sam file

6.1 years ago

kanshenglong • 0

Hi, after mapping the nanopore reads to the reference genome, I want to extract the raw reads from the BAM/SAM file. Can someone help me here? Thank you in advance!

genome alignment next-gen • 3.9k views

ADD COMMENT • link 6.1 years ago by kanshenglong • 0

samtools fastq should work. See inline help for options.

That said if you did the mapping yourself don't you have the original files? Or are you looking to get only mapped/unmapped reads?
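For readers wondering what such a conversion involves, here is a minimal Python sketch of the idea, not samtools itself. It assumes plain-text SAM lines, skips secondary and supplementary records, and restores reverse-strand reads to their original orientation. The function name `sam_line_to_fastq` and the placeholder quality character are illustrative assumptions.

```python
# Minimal sketch of turning one SAM record back into a FASTQ record.
# Assumption: plain-text SAM input; this is an illustration, not samtools.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def sam_line_to_fastq(line):
    """Return a 4-line FASTQ record for one SAM line, or None if skipped."""
    if line.startswith("@"):              # SAM header line, not a read
        return None
    fields = line.rstrip("\n").split("\t")
    name, flag = fields[0], int(fields[1])
    seq, qual = fields[9], fields[10]
    if flag & 0x900:                      # secondary (0x100) or supplementary (0x800)
        return None
    if seq == "*":                        # sequence not stored in this record
        return None
    if qual == "*":                       # qualities missing: arbitrary placeholder
        qual = "!" * len(seq)
    if flag & 0x10:                       # aligned to reverse strand: undo the flip
        seq = seq.translate(COMPLEMENT)[::-1]
        qual = qual[::-1]
    return "@{}\n{}\n+\n{}\n".format(name, seq, qual)
```

In practice `samtools fastq` is the tool to use. The sketch only shows why reverse-strand records need to be reverse-complemented to recover the read as sequenced.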

ADD REPLY • link 6.1 years ago by GenoMax 147k

Yes, I did the mapping myself, and I am looking to get only the mapped/unmapped reads. I want to use graphmap and samtools to keep the nanopore reads that match the plastid genome, so that I can avoid nuclear contamination when I assemble the plastid genome. Thanks for your help!

ADD REPLY • link 6.1 years ago by kanshenglong • 0

This information should have been in the original posting.

Take a look at: How To Filter Mapped Reads With Samtools
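The filtering in that kind of recipe rests on the SAM FLAG field: bit 0x4 is set when a read is unmapped, which is what `samtools view -F 4` (mapped) and `samtools view -f 4` (unmapped) select on. A rough Python sketch of the same test on plain-text SAM lines follows. The function name `split_by_mapping` is made up for illustration.

```python
# Sketch: split SAM alignment lines into mapped and unmapped by FLAG bit 0x4.
# Assumption: input is an iterable of plain-text SAM lines (headers allowed).
def split_by_mapping(sam_lines):
    mapped, unmapped = [], []
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        flag = int(line.split("\t")[1])   # FLAG is the second column
        if flag & 0x4:                    # 0x4 = read unmapped
            unmapped.append(line)
        else:
            mapped.append(line)
    return mapped, unmapped
```

For a plastid assembly, the mapped group is the one of interest when the reference is the plastid genome.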

ADD REPLY • link 6.1 years ago by GenoMax 147k

What puzzles me is whether the command for filtering Illumina reads is the same as the one for nanopore reads. Thanks!

ADD REPLY • link 6.1 years ago by kanshenglong • 0

FASTQ is a standard format, so the same commands work for both. Only the FASTQ headers differ between Illumina and nanopore.

If your reads are clipped (hard-clipped bases are not present in your alignment file), you may want to gather just the nanopore read headers for the two groups (mapped and unmapped) and then retrieve the full reads from the original data file.
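To check whether clipping is an issue, the CIGAR string of each record can be inspected: soft-clipped bases (`S`) stay in the SEQ column, while hard-clipped bases (`H`) are removed from it. A small Python sketch that counts both follows. The helper name `clipped_bases` is an assumption for illustration.

```python
import re

# Sketch: count soft-clipped (S) and hard-clipped (H) bases in a CIGAR string.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def clipped_bases(cigar):
    """Return {'S': n, 'H': n}; an unavailable CIGAR ('*') gives zeros."""
    counts = {"S": 0, "H": 0}
    for length, op in CIGAR_OP.findall(cigar):
        if op in counts:
            counts[op] += int(length)
    return counts
```

Any record with a nonzero `H` count does not carry the full read, so going back to the original FASTQ by read name is the safer route for those.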

ADD REPLY • link 6.1 years ago by GenoMax 147k

Thanks for your useful answer. Could you show me how to gather the headers of the mapped reads and retrieve the full reads from the original data file?

ADD REPLY • link 6.1 years ago by kanshenglong • 0

Filter the reads using the strategy above, then extract the read headers (they start with @, so grep for a pattern like ^@something) and put them into a new file. Note that FASTQ quality lines can also begin with @, so make the pattern specific to your read names. Then use the answer here to pull those reads out of the original file: A: Extracting specific sequences from FASTQ using Seqtk
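The same two steps can be sketched in Python for anyone who wants to see the logic spelled out. This is an illustration, not a replacement for grep and seqtk. It assumes plain-text SAM lines and a 4-line-per-record FASTQ, and the names `mapped_read_names` and `subset_fastq` are made up for this sketch.

```python
# Sketch: collect names of mapped reads from SAM, then pull the matching
# full-length records out of the original FASTQ.
# Assumptions: plain-text SAM lines; FASTQ with exactly 4 lines per record.
def mapped_read_names(sam_lines):
    names = set()
    for line in sam_lines:
        if line.startswith("@"):              # SAM header line
            continue
        fields = line.split("\t")
        if not int(fields[1]) & 0x4:          # keep reads that are mapped
            names.add(fields[0])
    return names

def subset_fastq(fastq_lines, names):
    """Yield FASTQ records (as one string each) whose read name is in names."""
    it = iter(fastq_lines)
    # Reading four lines at a time avoids mistaking a quality line
    # that begins with '@' for a header.
    for header, seq, plus, qual in zip(it, it, it, it):
        read_name = header[1:].split()[0]     # name ends at first whitespace
        if read_name in names:
            yield header + seq + plus + qual
```

Reading the FASTQ in groups of four lines is a deliberate choice here, since a quality string can itself start with `@`.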