Hi,
Are there any good, straightforward pipelines/tools that can robustly remove bacterial sequences from eukaryotic genomes?
Suggestions please!
It depends somewhat on whether your main assembly is from a higher or lower eukaryote, but that is not a deal-breaker. Higher eukaryotes differ so much from prokaryotes in tetranucleotide usage that you can do simple tetranucleotide (4-mer) frequency binning with t-SNE. Prokaryotic contigs should be easy to separate; see an example here.
In the image below, three bacterial genomes are in the lower right corner and at 9 o'clock. Other small clusters are likely repetitive regions in eukaryotic DNA.
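For anyone who wants to try this, here is a rough sketch of the tetranucleotide-frequency step followed by a t-SNE embedding. It is not the code behind the plot above. The function names (`tetra_freqs`, `embed`), the choice to collapse each 4-mer with its reverse complement, and the default perplexity are all my own assumptions. The embedding step needs numpy and scikit-learn, and the perplexity must be smaller than the number of contigs.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, rc)

# The 136 canonical tetranucleotides, in a fixed order (assumed feature layout).
KMERS = sorted({canonical("".join(p)) for p in product("ACGT", repeat=4)})

def tetra_freqs(seq):
    """Normalized canonical 4-mer frequency vector for one contig."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        # Skip windows containing N or other ambiguity codes.
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in KMERS]

def embed(contigs, perplexity=30.0, seed=0):
    """2-D t-SNE embedding of contigs given as {name: sequence}.

    Requires numpy and scikit-learn; perplexity must be < number of contigs.
    """
    import numpy as np
    from sklearn.manifold import TSNE
    X = np.array([tetra_freqs(s) for s in contigs.values()])
    tsne = TSNE(n_components=2, perplexity=perplexity, random_state=seed)
    return tsne.fit_transform(X)
```

Plot the two columns returned by `embed` as a scatter plot and look for compact clusters sitting apart from the main cloud. Very short contigs give noisy frequency vectors, so it is probably worth dropping contigs below a few kilobases first.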
That is correct. I don't think there is any reason to expect "chimeric" contigs because it is unlikely that a eukaryote and a prokaryote have long stretches of near-identical DNA. This is especially the case for higher eukaryotes where coding density is very low. Even in complex environmental communities with many members that are considerably closer to each other than any eukaryote and any prokaryote could be, metagenomic assembly and binning are fairly successful.
Unless the bacterial sequences are present as separate contigs/pieces, there is no way you will be able to detect and remove them.
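If the contaminants do sit on their own contigs, removing them is just a matter of filtering the assembly FASTA by contig name. A minimal sketch, assuming you already have a set of flagged contig IDs from whatever screening you used; `read_fasta` and `drop_contigs` are hypothetical helper names, not part of any tool:

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the ID.
            name, chunks = (line[1:].split() or [""])[0], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def drop_contigs(lines, flagged):
    """Return the (name, sequence) records whose name is not in `flagged`."""
    return [(n, s) for n, s in read_fasta(lines) if n not in flagged]
```

For a real assembly you would pass an open file handle as `lines` and write the kept records back out in FASTA format.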
Then should I remove the contamination from the raw reads and redo the assembly after cleaning?
Starting again from step 1 would be very tedious.
Any suggestions, please?