Removal of host sequences without reference genome
3.4 years ago

Dear all, suppose I have a collection of viral reads from NGS (Illumina) in FASTQ format. After the usual pre-processing step (done with fastp), I need to remove the host sequences (contaminants) without having the host reference genome, so I cannot map the reads with bowtie2 and filter them with samtools. I have read about some approaches, but I am still not sure which to use. Can someone suggest an appropriate strategy or starting point? Thanks for your support.

meta-genomic contaminants missing host reference genome viral • 1.5k views
3.4 years ago
Mensur Dlakic ★ 28k

You would help everyone by providing more details: what the host is, how large its genome is compared with the viral genome, which approaches you are considering, etc.

I have answered this question in several different contexts, so I will just give you links. I think it is worth reading all the posts on those pages.


Hello, the goal of my project is to assign the correct taxonomy to the viral reads I have. We always know the host of each sample (for example bat, rodent, human, or mosquito), even when its reference genome is not available.
