Question

Number of matching bases in a set of sequences aligning to one genome

7.0 years ago

gtrwst9 • 0

Hello, I have metagenomic reads from the gut and the genome of one bacterium that I know is present. Is there a tool that reports how many bases of the genome are matched by the reads in the file? Note: BLASTing the reads against the genome and summing all hits would count duplicate hits and overlapping regions more than once.
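The double counting described here can be avoided by merging overlapping hit intervals before summing their lengths. The following is a minimal sketch, not a complete tool: it assumes the hits have already been parsed into 1-based, inclusive (start, end) coordinates on a single contig of the genome, and the function names `merge_intervals` and `covered_bases` are hypothetical. It counts genome positions covered by at least one hit, so mismatches inside an alignment are still counted as covered.

```python
from typing import Iterable, List, Tuple

# Assumption: 1-based, inclusive (start, end) positions on one contig.
Interval = Tuple[int, int]


def merge_intervals(hits: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent hit intervals."""
    # Minus-strand hits may be reported with start > end, so normalise first.
    ordered = sorted((min(s, e), max(s, e)) for s, e in hits)
    merged: List[Interval] = []
    for start, end in ordered:
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous block: extend that block.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def covered_bases(hits: Iterable[Interval]) -> int:
    """Count genome positions covered by at least one hit."""
    return sum(end - start + 1 for start, end in merge_intervals(hits))


if __name__ == "__main__":
    # Toy data: one duplicate hit, one overlap, one minus-strand hit.
    hits = [(1, 100), (50, 150), (50, 150), (300, 250), (400, 400)]
    print(covered_bases(hits))  # 150 + 51 + 1 = 202
```

For a genome with several contigs or plasmids, the same merge would need to be run per sequence and the results summed.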

alignment sequence genome • 1.1k views

score 2 · Accepted Answer · 2018-08-25

7.0 years ago

GenoMax 153k

What you are asking for is not feasible. Any method used will be imperfect since it can't uniquely assign every read (let alone bases) that belongs to your organism of choice, especially when it is present in a metagenome.

One option is to try Mash with RefSeq or just your genome. You could also use bbsplit.sh from BBMap to bin your reads so that the ones that map best to your genome can be separated. bbsplit gives you multiple options for handling reads that multi-map within and across reference genomes.
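To give a feel for the kind of number Mash reports, here is a minimal sketch of the Mash distance, D = -(1/k) ln(2j / (1 + j)), where j is the Jaccard index of the two k-mer sets and k is the k-mer size. This is a simplification under stated assumptions: it uses exact k-mer sets rather than MinHash sketches, ignores reverse complements, and does not compute a p-value. The helper names are hypothetical, and returning 1.0 when no k-mers are shared is a convention assumed here.

```python
import math


def kmers(seq: str, k: int) -> set:
    """Return the set of all k-mers in a sequence (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Exact Jaccard index of two k-mer sets (Mash estimates this from sketches)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mash_distance(j: float, k: int) -> float:
    """Mash distance from a Jaccard index j and k-mer size k."""
    if j <= 0.0:
        return 1.0  # assumed convention: no shared k-mers means maximal distance
    if j >= 1.0:
        return 0.0
    return -math.log(2.0 * j / (1.0 + j)) / k


if __name__ == "__main__":
    k = 4  # toy value; real analyses use much larger k
    a = kmers("ACGTACGTTGCA", k)
    b = kmers("ACGTACGATGCA", k)
    j = jaccard(a, b)
    print(j, mash_distance(j, k))
```

With metagenomic reads, a containment-style measure (how much of the genome's k-mer set is found in the reads) is usually more informative than a symmetric Jaccard index, because the read set contains many other organisms.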

Many thanks. I was thinking in too exact terms, but it's all about probability, so Mash and the p-value it returns suit me fine.

Reply • 7.0 years ago by gtrwst9 • 0