Question

Mapping two WGS Data

0

7.9 years ago

deepakkumarbt • 0

Say I have two WGS datasets in FASTA format and I want to find the regions common to both. How should I proceed? Do I need to assemble the two WGS datasets separately before comparing them? Any help would be appreciated.

next-gen sequence blast Assembly • 1.8k views

updated 7.9 years ago by Vivek ★ 2.7k • written 7.9 years ago by deepakkumarbt • 0

1

What is a common region? How long should a sequence be to be considered long enough to be common?
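
One rough way to see why the length threshold matters is to count exact substrings of length k shared by two sequences: short k-mers match by chance, longer ones rarely do. The sketch below is only an illustration on toy strings. The function names `kmers` and `shared_kmers` are hypothetical, and it ignores the reverse strand and any mismatches, which a real comparison would have to handle.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all substrings of length k in seq (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(seq_a: str, seq_b: str, k: int) -> set[str]:
    """Return the k-mers that occur in both sequences (exact matches only)."""
    return kmers(seq_a, k) & kmers(seq_b, k)


if __name__ == "__main__":
    # Toy sequences, not real data.
    a = "ACGTACGT"
    b = "TTACGTAA"
    for k in (4, 6):
        print(k, sorted(shared_kmers(a, b, k)))
```

On these toy strings the shared set shrinks as k grows, which is the reason a minimum length has to be chosen before calling anything "common".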

7.9 years ago by WouterDeCoster 47k

0

I want to infer, through sequence comparison, whether any genes have been transferred between the two organisms. I have no prior idea of what the common regions are; the transferred genes would be the regions the two datasets have in common.

7.9 years ago by deepakkumarbt • 0

1

Try to include that information in your initial post the next time. Be as informative as possible!

What type of organism are you working on? Small prokaryotic genome? Assembling your reads would be the most informative, but that may or may not be possible with the data you have.

7.9 years ago by WouterDeCoster 47k

0

OK Wouter, thanks! I am interested in assembling microbiome data to a human genome. Could you please suggest a possible workflow?

7.9 years ago by deepakkumarbt • 0

0

That's important information. Is this a particular human genome or would any human genome do? I assume the microbiome data consist of many species? And you just want to check if parts of the microbiome match to the human genome?

7.9 years ago by WouterDeCoster 47k

score 1 · Answer 1 · 2016-12-29

1

7.9 years ago

Vivek ★ 2.7k

You need to give more information. What is the read length? Are there multiple insert size libraries? Your choice of assembler would depend on that.

Ideally you'd assemble them separately and align them using something like BLAT.

7.9 years ago by Vivek ★ 2.7k

1

I'm not sure why this answer was accepted. It's helpful, but as I understand it, not a complete solution. By accepting it, you signal that you consider this question solved.

7.9 years ago by WouterDeCoster 47k

0

Thanks Wouter for your comments... I would also like to have your suggestion please.

7.9 years ago by deepakkumarbt • 0

0

I added a comment with questions to your original post above. It's unclear what the purpose of your analysis is.

7.9 years ago by WouterDeCoster 47k

0

Thanks Vivek! Yes, the data contain multiple insert sizes, and the sequence lengths range from 50 to 140. Would Magic-BLAST work in my case, for mapping the two WGS datasets directly against each other without assembling them separately? Please correct me if I am wrong.

7.9 years ago by deepakkumarbt • 0

1

You still need a reference genome to align with Magic-BLAST, which you do not have when you are comparing two sets of sequencing reads against each other. If I were you, I'd assemble each read set with SOAPdenovo and then align the resulting assemblies with BLAT.
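
Once the two assemblies have been aligned, the BLAT output still has to be filtered down to candidate shared regions. A minimal sketch of that step, assuming BLAT's default tab-separated PSL output (21 columns): the function name `parse_psl`, the `Hit` record, and the cutoffs of 500 aligned bases and 95% identity are illustrative assumptions, not recommended values, and the identity here is a simplified ratio rather than the score the UCSC tools report.

```python
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    """One filtered alignment between a query contig and a target contig."""
    q_name: str
    q_start: int
    q_end: int
    t_name: str
    t_start: int
    t_end: int
    identity: float


def parse_psl(lines: Iterable[str],
              min_len: int = 500,            # assumed cutoff, tune for your data
              min_identity: float = 0.95     # assumed cutoff, tune for your data
              ) -> Iterator[Hit]:
    """Yield PSL alignments that pass simple length and identity filters."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the PSL header and any malformed lines.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matches, mismatches, rep_matches = (int(x) for x in fields[:3])
        aligned = matches + mismatches + rep_matches
        if aligned == 0:
            continue
        # Simplified identity: matching bases over aligned bases (gaps ignored).
        identity = (matches + rep_matches) / aligned
        if aligned >= min_len and identity >= min_identity:
            yield Hit(fields[9], int(fields[11]), int(fields[12]),
                      fields[13], int(fields[15]), int(fields[16]), identity)


if __name__ == "__main__":
    import sys
    # Usage: python filter_psl.py alignments.psl
    with open(sys.argv[1]) as handle:
        for hit in parse_psl(handle):
            print(*hit, sep="\t")
```

Hits that survive the filter are only candidates; whether they reflect gene transfer, contamination, or conserved genes is a separate question.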