Bacterial de novo genome assembly
2
1
7.6 years ago
Paul ▴ 80

I have paired-end FASTQ files from Illumina (Fast1.fastq and Fast2.fastq). How should I proceed with de novo genome assembly? I am completely new to genome assembly. Which open-source tools can be used?

Please point me to tools and tutorials that can be used for de novo assembly.

next-gen sequencing de-novo genome Assembly • 5.7k views
1

Hi, you can read this article to start: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0017915

Best

0

What have you tried so far? There are plenty of tutorials out there. What organism do you have? Bacteria?

0

Yes, it is for bacteria.

0

I think we need a few more details to help you. If you want to learn the general procedures, the literature is there for you, as it was for us!

0

Yes, the general procedures, along with the software available.

1

I spoke with a friend who works on metagenomic bacterial data, and according to him the most widely used software for bacterial assembly is SPAdes.

Best

5
7.6 years ago

If you download the BBMap package, there is a (hopefully) helpful guide in bbmap/docs/guides/PreprocessingGuide.txt

For isolate bacterial assembly with SPAdes, I recommend:

1) Adapter-trimming (you can do quality-trimming at the same time; I suggest a low cutoff, such as Q10)

2) Artificial contaminant filtering

3) Human contaminant removal

4) Error correction

5) Paired-read merging

...then assemble with SPAdes, using both the merged and unmerged reads.
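
For anyone who wants to see roughly what those five steps look like as commands, here is a minimal sketch in Python that only builds the command lines and does not run anything. It is not taken from the BBMap guide: the parameter values follow commonly quoted BBTools usage examples, all file names (`adapters.fa`, `phix.fa`, `hg19_masked`, the `prep/` directory) are hypothetical placeholders, and passing reads as a single interleaved file after the first step is my own simplification. The `--merged` option needs a reasonably recent SPAdes release, so check `spades.py --help` and the guide mentioned above before relying on any of it.

```python
import shlex


def preprocessing_commands(r1, r2, adapters, phix, human_index, outdir="prep"):
    """Return (label, argv) pairs for the five steps plus the SPAdes call.

    Nothing is executed here. File names are hypothetical placeholders and
    the output directory is assumed to exist already.
    """
    def out(name):
        return f"{outdir}/{name}"

    trimmed, filtered = out("trimmed.fq.gz"), out("filtered.fq.gz")
    clean, corrected = out("clean.fq.gz"), out("corrected.fq.gz")
    merged, unmerged = out("merged.fq.gz"), out("unmerged.fq.gz")

    return [
        # 1) Adapter trimming plus gentle quality trimming (Q10).
        #    A single out= file makes BBDuk write the pairs interleaved.
        ("trim", ["bbduk.sh", f"in1={r1}", f"in2={r2}", f"out={trimmed}",
                  f"ref={adapters}", "ktrim=r", "k=23", "mink=11", "hdist=1",
                  "tpe", "tbo", "qtrim=r", "trimq=10"]),
        # 2) Artificial contaminant filtering (e.g. a PhiX spike-in reference).
        ("filter", ["bbduk.sh", f"in={trimmed}", "interleaved=t",
                    f"out={filtered}", f"ref={phix}", "k=31", "hdist=1"]),
        # 3) Human contaminant removal: keep only reads that do NOT map
        #    to a (masked) human reference index.
        ("dehuman", ["bbmap.sh", f"path={human_index}", f"in={filtered}",
                     "interleaved=t", f"outu={clean}", "minid=0.95",
                     "maxindel=3", "bwr=0.16", "bw=12", "quickmatch", "fast",
                     "minhits=2"]),
        # 4) Error correction with Tadpole.
        ("correct", ["tadpole.sh", f"in={clean}", "interleaved=t",
                     f"out={corrected}", "mode=correct"]),
        # 5) Paired-read merging; unmerged pairs go to a separate file.
        ("merge", ["bbmerge.sh", f"in={corrected}", "interleaved=t",
                   f"out={merged}", f"outu={unmerged}"]),
        # Assembly with both the merged and the unmerged reads.
        ("assemble", ["spades.py", "--12", unmerged, "--merged", merged,
                      "-o", out("spades")]),
    ]


if __name__ == "__main__":
    steps = preprocessing_commands("Fast1.fastq", "Fast2.fastq",
                                   "adapters.fa", "phix.fa", "hg19_masked")
    for label, argv in steps:
        print(f"# {label}")
        print(shlex.join(argv))
```

Running the script prints one shell command per step, which you can inspect and adapt before executing them yourself.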

1

Check the assembly quality with QUAST.
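
One of the headline numbers in an assembly report is N50: the contig length such that contigs of that length or longer cover at least half of the total assembly. The sketch below is only a small Python illustration of that definition, not QUAST's own code; the function names and the toy FASTA lines are made up for the example.

```python
def contig_lengths(fasta_lines):
    """Yield the length of each sequence in an iterable of FASTA lines."""
    length = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return N50: the length L such that contigs >= L hold >= 50% of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


if __name__ == "__main__":
    toy = [">c1", "ACGTACGTAC", ">c2", "ACGT", "AC", ">c3", "ACG"]
    print(n50(contig_lengths(toy)))
```

For a real assembly you would pass an open file handle (for example the SPAdes `contigs.fasta`) instead of the toy list, but QUAST reports this and much more, so treat this only as a way to understand the metric.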

0

Thanks a lot for the detailed explanation. Is there a recommended tool for bacterial genome assembly on Windows?

1

You will have many more options if you use Linux. Windows is a great operating system, but not for bioinformatics.

0

Have a look at this thread: "Best software to assemble bacterial genomes". There is also a list of available tools here. You can check which of them work on Windows.

1

I don't know of many offhand. CLC Genomics Workbench will do assembly, but it isn't free. It is a decent one-stop shop for pretty much everything, however, and it is GUI-based if you are command-line averse.

2
7.6 years ago
Joe 21k

The general procedure I follow is:

  1. Quality control your data before you do anything else. FastQC is a go-to tool for this.
  2. Trim sequences to remove adapters etc. if necessary (see Seqtk, sickle, cutadapt, Trim Galore and similar tools).
    • It may make sense to filter the reads at this stage and discard anything that is garbage.
  3. Assemble (see SOAP, Velvet, SPAdes).
  4. Re-map the reads to get coverage statistics etc. (optional but advised; see Bowtie2, BWA, Qualimap).
  5. Optionally, if you have published reference genomes, you may wish to reorder contigs (See progressiveMauve/Mauve)
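
As a quick sanity check before and after the steps above, it can help to know how many reads and bases you have and roughly what depth that gives you. The following is a minimal Python sketch, not a replacement for FastQC or for the coverage you get from re-mapping: it assumes plain four-line FASTQ records with Phred+33 qualities, and `genome_size` is a guess you supply yourself (the function name and the toy records are hypothetical).

```python
import io


def fastq_summary(handle, genome_size):
    """Summarise a 4-line-per-record FASTQ stream (Phred+33 assumed).

    genome_size is the expected genome length in bases, supplied by the
    user; the depth returned is only bases / genome_size.
    """
    reads = bases = qual_sum = 0
    while True:
        header = handle.readline()
        if not header:          # end of file
            break
        if not header.strip():  # tolerate stray blank lines
            continue
        seq = handle.readline().strip()
        handle.readline()       # '+' separator line
        qual = handle.readline().strip()
        reads += 1
        bases += len(seq)
        qual_sum += sum(ord(ch) - 33 for ch in qual)
    return {
        "reads": reads,
        "bases": bases,
        "mean_quality": qual_sum / bases if bases else 0.0,
        "estimated_depth": bases / genome_size,
    }


if __name__ == "__main__":
    toy = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\n!!!!!!\n")
    print(fastq_summary(toy, genome_size=5))
```

With real data you would open each FASTQ file (or use `gzip.open(path, "rt")` for compressed files) and add the base counts of both pairs together before dividing by the genome size.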
1

Don't forget that you can use a reference-assisted assembly if a good reference genome is available for this purpose.

0

Indeed :) OP asked for de novo specifically, but if reference guided is desirable I believe programs like MIRA will do it.

0

Many other programs do it. In particular, Velvet and SOAP can use reference genomes to assist a de novo assembly.

In my hands, using a trusted reference genome to assist a de novo assembly gave better results than a de novo assembly that was processed with Mauve afterwards.
