How To Move After Denovo Contig Assembly Of Bacterial Seq
11.9 years ago
kanwarjag ★ 1.2k

I am relatively new to de novo assembly, so please bear with me if my question is very basic or naive. I am working on MiSeq data (2×150 bp) from a bacterial strain. I used Velvet (k-mer 51), and it gave me contigs that look like this:

>NODE_253250_length_121_cov_1.338843
TGCCTGCTCTTCTGCTTTTCTACCATGTTATGATGCAGTATGAACGCCCTTGCCAGAAGCTGCTGC
>NODE_253255_length_105_cov_1.000000
TGGAAGCCCCACTCTCAGTATTGACGTGCAAGTTCACAGTCTGGTTCCTGCCCCCGCGGT------

I also have a reference genome for the bacterium. Now I want to pinpoint which samples contain the bacterium and which do not. Based on my reading of the literature, I performed de novo assembly because the genome is small. However, how do I work out from the contigs above which ones are the best and most useful for showing that the bacterium is present? What parameter should I use to decide which contigs are more useful: contig length or something else? If I use the BLAST pairwise alignment against the reference, it takes a while and returns a "Bad Gateway" error, perhaps because the contig file is large (69523 words). Any suggestions or pointers would be highly appreciated.

denovo assembly miseq • 3.8k views

It's troubling that some of the contigs in this assembly have a coverage of 1. You should try VelvetOptimiser if you are comfortable on the command line. Alternatively, you could remove the low-coverage contigs (maybe anything with coverage < 10 and length < 150).
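As a rough illustration of that filtering step, here is a minimal Python sketch. It assumes Velvet-style headers of the form `NODE_<id>_length_<n>_cov_<x>`, reads the suggestion as "keep only contigs with coverage of at least 10 and a sequence of at least 150 bases", and uses hypothetical names (`read_fasta`, `filter_contigs`) that are not part of Velvet or any other tool. The cutoffs are the ballpark values from the comment, not validated thresholds.

```python
import re

# Assumed Velvet-style header, e.g. "NODE_12_length_121_cov_1.338843"
HEADER_RE = re.compile(r"NODE_(\d+)_length_(\d+)_cov_([\d.]+)")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(lines, min_cov=10.0, min_len=150):
    """Keep contigs whose header coverage and sequence length meet both cutoffs."""
    kept = []
    for header, seq in read_fasta(lines):
        match = HEADER_RE.search(header)
        if match is None:
            continue  # skip headers that do not follow the assumed format
        cov = float(match.group(3))
        # Length is measured on the sequence itself, not taken from the header.
        if cov >= min_cov and len(seq) >= min_len:
            kept.append((header, seq))
    return kept
```

In practice you would pass an open file handle, for example `filter_contigs(open("contigs.fa"))`, where `contigs.fa` stands in for your assembly file.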


Can you be a little clearer about what your research question is? Are you sequencing a pure culture of an unknown bacterium? (Why are you asking "now i want to pin point which sample bacteria is present or not"?)

Sounds like you are asking two questions here: one about your methodology and another about your problem with BLAST. I think you need to clearly define your methodology and research question first. Second, we can try to figure out why you are having a BLAST error.


No, it is one basic question: I sequenced the samples and want to see whether a particular bacterium is present or absent. I have done de novo assembly of the sequence data to generate contigs. How should I handle these contigs to determine which ones are significant and correspond to the reference bacterium?

11.9 years ago
Lee Katz ★ 3.2k

I think that you'd just

  1. Make a blast database of your assembly
  2. Create a fasta list of the genes that you are looking for (maybe from a reference genome--just something similar)
  3. BLAST against your database with your genes.

The assembly is done. This is a question of presence or absence, which you can get from BLAST. You should do this on a command line and not the pairwise BLAST web page.
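The three steps above could be sketched roughly as follows, assuming the NCBI BLAST+ command-line tools (`makeblastdb`, `blastn`) are installed and on your `PATH`. The file names, the function names, and the identity and query-coverage cutoffs (90% and 0.8) are placeholders chosen for illustration, not recommended values.

```python
import subprocess

# Tabular output with a custom column list; qlen is the query (gene) length.
OUTFMT = "6 qseqid sseqid pident length qlen evalue bitscore"


def makeblastdb_cmd(assembly_fasta, db_name):
    """Step 1: command that builds a nucleotide BLAST database from the assembly."""
    return ["makeblastdb", "-in", assembly_fasta, "-dbtype", "nucl", "-out", db_name]


def blastn_cmd(genes_fasta, db_name, out_tsv):
    """Step 3: command that searches the gene list (step 2) against the database."""
    return ["blastn", "-query", genes_fasta, "-db", db_name,
            "-outfmt", OUTFMT, "-evalue", "1e-10", "-out", out_tsv]


def genes_present(tsv_lines, min_identity=90.0, min_query_cov=0.8):
    """Return the query genes with at least one hit passing both cutoffs."""
    present = set()
    for line in tsv_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # skip blank or malformed rows
        qseqid, _sseqid, pident, length, qlen = fields[:5]
        # Alignment length includes gaps, so this ratio is only approximate.
        query_cov = int(length) / int(qlen)
        if float(pident) >= min_identity and query_cov >= min_query_cov:
            present.add(qseqid)
    return present


def run_pipeline(assembly_fasta, genes_fasta, db_name="assembly_db", out_tsv="hits.tsv"):
    """Run both BLAST+ commands, then summarise which genes were found."""
    subprocess.run(makeblastdb_cmd(assembly_fasta, db_name), check=True)
    subprocess.run(blastn_cmd(genes_fasta, db_name, out_tsv), check=True)
    with open(out_tsv) as handle:
        return genes_present(handle)
```

A call such as `run_pipeline("contigs.fa", "genes.fa")` would then return the set of gene IDs that have a qualifying hit in that sample's assembly; repeating it per sample gives a presence/absence table.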
