Combining two assemblies and keeping the desired contigs
9.4 years ago
seta ★ 1.9k

Hi all,

First of all, please accept my apologies if this question seems basic to you bioinformatics experts. It is a challenge for me as a biology student, so please be patient. I would like to compare and then combine two FASTA assembly files generated by two assemblers. To this end, I ran blastn with an e-value threshold of 1E-100 and an identity threshold of 98%. In the table below, A is the contig ID from assembly 1, B is the contig ID from assembly 2, D is the alignment length, M is the query sequence (assembly 1) length, and N is the subject sequence (assembly 2) length. If N < (M+200), I want to keep A (and replace it with its counterparts in the FASTA file generated by assembly 2); if D = N and (M+200) < N, I want to discard A and keep B. Could you please help me with this? Thanks so much in advance.

A                     B                       C          D        E           F            G             H           I               J             K           L          M      N
query Id              subject Id              Identity   length   mismatch    gap opening  query start   query end   subject start   subject end   e-value     bitscore   qlen   slen
contig10002|m.12543   c26528_g1_i1|m.14066    100        762      0           0            28            789         1               762           0           1408       789    762
contig10003|m.12544   c39648_g1_i1|m.25685    100        945      0           0            1             945         1               945           0           1746       945    945
contig10003|m.12545   c39648_g1_i1|m.25685    100        336      0           0            1             336         780             445           2.00E-177   621        336    945
contig10004|m.12546   c54250_g1_i3|m.62628    100        462      0           0            1             462         1               462           0           854        462    468
contig10005|m.12547   c54760_g1_i3|m.64975    100        564      0           0            1             564         1               564           0           1042       564    564
contig10006|m.12548   c64049_g2_i2|m.128345   100        526      0           0            188           713         236             761           0           972        729    1089
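
The selection rule described above can be sketched in plain Python. This is only a sketch under stated assumptions: the function name `classify_hits` is hypothetical, each line is assumed to hold the 14 whitespace-separated columns shown in the table (for example from blastn run with `-outfmt "6 std qlen slen"`), "keep A" is taken to mean simply recording the assembly 1 contig ID, and hits that satisfy neither condition are reported as unresolved because the question does not say what to do with them.

```python
MARGIN = 200  # length margin taken from the rule in the question


def classify_hits(lines, margin=MARGIN):
    """Sort BLAST hits into contigs to keep from assembly 1 (A) or 2 (B).

    Assumes 14 whitespace-separated columns per line, in the order of the
    table above: the 12 standard tabular fields followed by qlen and slen.
    Returns (keep_a, keep_b, unresolved).
    """
    keep_a, keep_b, unresolved = set(), set(), []
    for line in lines:
        fields = line.split()
        if len(fields) < 14:
            continue  # blank or truncated line
        try:
            d = int(fields[3])   # D: alignment length
            m = int(fields[12])  # M: query (assembly 1) length
            n = int(fields[13])  # N: subject (assembly 2) length
        except ValueError:
            continue  # header line or non-numeric field
        a, b = fields[0], fields[1]
        if n < m + margin:
            keep_a.add(a)
        elif d == n and m + margin < n:
            keep_b.add(b)
        else:
            # Neither condition holds; the rule gives no decision here.
            unresolved.append((a, b))
    return keep_a, keep_b, unresolved


# Two rows copied from the table above, used as a quick check.
sample = [
    "contig10003|m.12544 c39648_g1_i1|m.25685 100 945 0 0 1 945 1 945 0 1746 945 945",
    "contig10003|m.12545 c39648_g1_i1|m.25685 100 336 0 0 1 336 780 445 2.00E-177 621 336 945",
]
keep_a, keep_b, unresolved = classify_hits(sample)
print(sorted(keep_a), sorted(keep_b), unresolved)
```

The returned ID sets could then be used to pull the matching records out of each FASTA file. One thing that may be worth checking: D counts alignment columns, so D = N together with N > M+200 would require more than 200 gap columns on the query side, which should be rare at 98% identity. If the second condition was meant to be D = M, only the `elif` line needs to change.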
alignment Assembly sequencing RNA-Seq • 1.9k views

Instead of writing your own script, you could also try using GAM-NGS: http://www.ncbi.nlm.nih.gov/pubmed/23815503

It aligns reads to two similar genome assemblies and merges the two assemblies based on how well the reads align.

Thanks, but I am working with a transcriptome assembly, not a genome assembly. Does it also work well for transcriptome assemblies?

9.4 years ago
Sej Modha 5.3k

You can try using Transrate for transcriptome assembly comparison.

http://hibberdlab.com/transrate/metrics.html

My main goal is to combine the two assemblies rather than to compare them.
