Mapping two WGS data sets
7.9 years ago

Say I have two WGS data sets in FASTA format and I want to find the regions they have in common. How should I proceed? Do I need to assemble the two WGS files separately before going further? Any help would be appreciated.

next-gen sequence blast Assembly • 1.8k views

What is a common region? How long should a sequence be to be considered long enough to be common?
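
To make the question concrete, here is a minimal sketch of one possible definition: a region counts as "common" if both data sets contain the same stretch of `k` bases, on either strand. The function names and the choice of `k` are assumptions for illustration, not something the original poster specified.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (unknown bases become N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement.get(base, "N") for base in reversed(seq.upper()))


def canonical_kmers(seq, k):
    """Collect all k-mers of seq, each stored in a strand-independent form."""
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) - set("ACGT"):
            continue  # skip k-mers containing ambiguous bases
        # Keep the lexicographically smaller of the k-mer and its reverse
        # complement, so that both strands map to the same key.
        kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def shared_kmers(seqs_a, seqs_b, k):
    """Return the canonical k-mers present in both collections of sequences."""
    kmers_a = set().union(*(canonical_kmers(s, k) for s in seqs_a))
    kmers_b = set().union(*(canonical_kmers(s, k) for s in seqs_b))
    return kmers_a & kmers_b


if __name__ == "__main__":
    # Toy sequences: the second overlaps the reverse complement of the first.
    a = ["ACGTACGTGG"]
    b = ["TTCCACGTAC"]
    for k in (6, 9):
        print(k, sorted(shared_kmers(a, b, k)))
```

The answer changes with `k`, which is why the minimum length matters: short k-mers are shared by chance, long ones only by genuinely related sequence.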


I want to infer whether any genes have been transferred between two organisms by comparing their sequences. I have no precise definition of a common region yet, but any genes that were transferred should show up as regions shared by both data sets.


Try to include that information in your initial post the next time. Be as informative as possible!

What type of organism are you working on? Small prokaryotic genome? Assembling your reads would be the most informative, but that may or may not be possible with the data you have.


OK Wouter, thanks! I am interested in assembling microbiome data to a human genome. Could you please suggest a possible workflow?


That's important information. Is this a particular human genome, or would any human genome do? I assume the microbiome data consist of many species? And you just want to check whether parts of the microbiome match the human genome?

7.9 years ago
Vivek ★ 2.7k

You need to give more information. What is the read length? Are there multiple insert size libraries? Your choice of assembler would depend on that.
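
If you are unsure about the read length, a quick summary of the FASTA file answers that part. Below is a minimal sketch in plain Python; the function names are made up for illustration, and it assumes ordinary (possibly multi-line) FASTA records.

```python
def read_lengths(fasta_lines):
    """Return the length of every record in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def summarize(lengths):
    """Summarize read lengths as count, minimum, maximum and mean."""
    if not lengths:
        return None
    return {
        "reads": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": sum(lengths) / len(lengths),
    }


if __name__ == "__main__":
    example = [">read1", "ACGT", "AC", ">read2", "ACGTACGT"]
    print(summarize(read_lengths(example)))
```

For a real file you could pass an open file handle, for example `summarize(read_lengths(open("reads.fasta")))`, where `reads.fasta` is a placeholder name.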

Ideally you'd assemble them separately and align them using something like BLAT.
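
Once the two assemblies have been aligned, the shared regions can be pulled out of the BLAT output. The following is a minimal sketch that reads the tab-separated PSL format and keeps long, high-identity hits. The thresholds (`min_length`, `min_identity`) and the simple identity measure (matching bases over aligned bases, ignoring gaps) are assumptions for illustration, not recommended values.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    q_start: int
    q_end: int
    target: str
    t_start: int
    t_end: int
    identity: float


def parse_psl(lines, min_length=500, min_identity=0.9):
    """Return PSL alignments that pass simple length and identity filters."""
    hits = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # A PSL data row has 21 columns and starts with the match count;
        # anything else (for example the optional header) is skipped.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matches = int(fields[0])
        mismatches = int(fields[1])
        rep_matches = int(fields[2])
        aligned = matches + mismatches + rep_matches
        if aligned < min_length:
            continue
        identity = (matches + rep_matches) / aligned
        if identity < min_identity:
            continue
        hits.append(Hit(
            query=fields[9], q_start=int(fields[11]), q_end=int(fields[12]),
            target=fields[13], t_start=int(fields[15]), t_end=int(fields[16]),
            identity=identity,
        ))
    return hits


if __name__ == "__main__":
    # "assembly_a_vs_b.psl" is a placeholder file name.
    with open("assembly_a_vs_b.psl") as handle:
        for hit in parse_psl(handle):
            print(hit)
```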


I'm not sure why this answer was accepted. It's helpful, but as I understand it, not a complete solution. By accepting it, you signal that you consider this question solved.


Thanks for your comments, Wouter. I would also like to hear your suggestions, please.


I added a comment with questions to your original post above. The purpose of your analysis is unclear.


Thanks, Vivek! Yes, the data contain multiple insert sizes, and the sequence length ranges from 50 to 140. Could I use Magic-BLAST in my case to map the two WGS data sets against each other directly, without assembling them separately? Please correct me if I am wrong.


You still need a reference genome to align against with Magic-BLAST, and you do not have one if you are comparing two sets of sequencing reads against each other. If I were you, I'd assemble the reads using SOAPdenovo and then align the resulting assemblies using BLAT.
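
As a rough sketch of that workflow, the snippet below only builds and prints the command lines; it does not run anything. It assumes the SOAPdenovo2 binary `SOAPdenovo-63mer` and `blat` are installed, that a SOAPdenovo library config file already exists for each data set, and that the scaffolds end up in `<prefix>.scafSeq`. All file names, the k-mer size and the identity cutoff are placeholders, so check them against the tools' own help output before use.

```python
import shlex


def soapdenovo_command(config_file, out_prefix, kmer=31, threads=4):
    """Build a SOAPdenovo2 'all' command (config file describes the libraries)."""
    return [
        "SOAPdenovo-63mer", "all",
        "-s", config_file,   # library configuration (insert sizes, read files)
        "-K", str(kmer),     # k-mer size, kept below the shortest read length
        "-o", out_prefix,    # prefix for all output files
        "-p", str(threads),  # number of CPU threads
    ]


def blat_command(target_fasta, query_fasta, out_psl, min_identity=90):
    """Build a BLAT command aligning one assembly against the other."""
    return [
        "blat",
        f"-minIdentity={min_identity}",  # minimum percent identity
        "-noHead",                       # write PSL without the header lines
        target_fasta, query_fasta, out_psl,
    ]


if __name__ == "__main__":
    commands = [
        soapdenovo_command("sample_a.config", "sample_a"),
        soapdenovo_command("sample_b.config", "sample_b"),
        blat_command("sample_a.scafSeq", "sample_b.scafSeq", "a_vs_b.psl"),
    ]
    for cmd in commands:
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them; they could then be executed with `subprocess.run(cmd, check=True)` once they look right.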


Thanks, Vivek! :)
