Say I have two WGS datasets in FASTA format and I want to find the regions they have in common. How should I proceed? Do I need to assemble the two WGS datasets separately before comparing them? Any help would be appreciated.
You need to give more information. What is the read length? Are there multiple insert size libraries? Your choice of assembler would depend on that.
Ideally you'd assemble them separately and align them using something like BLAT.
What is a common region? How long should a sequence be to be considered long enough to be common?
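To make the length question concrete, here is a minimal Python sketch that reports stretches shared by two sequences only when they reach a chosen minimum length. It is a toy illustration, not what BLAT does: it finds exact matches on the forward strand only, keeps a full k-mer index in memory, and so is only reasonable for small assembled sequences. The names `read_fasta`, `shared_regions`, and `min_len`, and the default of 20 bases, are assumptions made up for this example.

```python
def read_fasta(path):
    """Parse a FASTA file into a dict {record_name: sequence}."""
    records = {}
    name = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Store the previous record before starting a new one
                if name is not None:
                    records[name] = "".join(chunks)
                name = line[1:].split()[0]
                chunks = []
            else:
                chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def shared_regions(seq_a, seq_b, min_len=20):
    """Return exact forward-strand matches of at least min_len bases.

    Each hit is a tuple (start_in_a, start_in_b, length), 0-based.
    min_len is the hypothetical "long enough to be common" threshold.
    """
    # Index every min_len-mer of seq_b by its start positions
    index = {}
    for j in range(len(seq_b) - min_len + 1):
        index.setdefault(seq_b[j:j + min_len], []).append(j)

    hits = []
    for i in range(len(seq_a) - min_len + 1):
        for j in index.get(seq_a[i:i + min_len], ()):
            # A seed that can be extended to the left is part of a
            # match that starts earlier, so skip it here
            if i > 0 and j > 0 and seq_a[i - 1] == seq_b[j - 1]:
                continue
            # Extend the seed to the right while the bases still agree
            length = min_len
            while (i + length < len(seq_a) and j + length < len(seq_b)
                   and seq_a[i + length] == seq_b[j + length]):
                length += 1
            hits.append((i, j, length))
    return hits
```

With two assemblies loaded through `read_fasta` (the file names would be your own), you could call `shared_regions` on each pair of contigs and see how the number of hits changes as `min_len` is raised or lowered. Real horizontally transferred genes will usually have diverged, so an aligner that tolerates mismatches and gaps is the better tool for actual data.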
I want to infer whether any genes have been transferred between two organisms by comparing their sequences. I don't know in advance what the common regions are; any transferred genes should show up as regions that the two genomes share.
Try to include that information in your initial post next time. Be as informative as possible!
What type of organism are you working on? Small prokaryotic genome? Assembling your reads would be the most informative, but that may or may not be possible with the data you have.
OK Wouter, thanks! I am interested in assembling microbiome data against a human genome. Could you please suggest a possible workflow?
That's important information. Is this a particular human genome, or would any human genome do? I assume the microbiome data consist of many species, and that you just want to check whether parts of the microbiome match the human genome?