How to run multiple alignment and SNP calling on WGS data in .gb and .fasta formats using Python, Ruby, Java, or any free software?
9.6 years ago

How can I run multiple alignment of WGS data in .gb and .fasta formats using Python, Ruby, or Java? Please recommend some packages and tutorials. I could not find a tutorial on multiple alignment and SNP calling using Python or Ruby. All I have managed so far is to use trial versions of DNAStar, RidomSeqSphere and NextGene. Is there any similar free software, and is there a way to do this with a modern language? Thank you, folks.

alignment sequencing software • 3.7k views

Google for "biopython", specifically tutorials related to multiple alignments, which I recall it can produce. SNP calling is normally a different matter, though you could parse the multiple alignment and report the columns that differ if you really wanted. There may not be a tutorial for that, so you may have to work it out yourself.
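
To make "parse the multiple alignment" concrete, here is a minimal sketch in plain Python (no Biopython required). It assumes the alignment is already available as aligned FASTA text and that one record serves as the reference; the names `read_aligned_fasta`, `variable_columns` and the toy sequences are made up for illustration. It only reports substitutions and ignores gaps and ambiguous bases, so it is not a substitute for a real SNP caller.

```python
def read_aligned_fasta(text):
    """Parse aligned FASTA text into a dict of {record name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def variable_columns(alignment, reference):
    """Return (column, ref_base, {sample: base}) for columns with substitutions.

    Columns are 1-based alignment positions, not genome coordinates.
    Gaps and ambiguous bases (anything other than A, C, G, T) are skipped.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; is this an alignment?")
    ref_seq = alignment[reference]
    snps = []
    for i, ref_base in enumerate(ref_seq):
        if ref_base not in "ACGT":
            continue
        diffs = {
            name: seq[i]
            for name, seq in alignment.items()
            if name != reference and seq[i] in "ACGT" and seq[i] != ref_base
        }
        if diffs:
            snps.append((i + 1, ref_base, diffs))
    return snps


# Toy alignment, invented for this example.
example = """>ref
ACGTACGT
>s1
ACGTTCGT
>s2
AC-TACGA
"""

aln = read_aligned_fasta(example)
for column, ref_base, diffs in variable_columns(aln, "ref"):
    print(column, ref_base, diffs)
```

On the toy data this reports column 5 (A in the reference, T in `s1`) and column 8 (T in the reference, A in `s2`); the gap in `s2` at column 3 is ignored.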


I have BioPython and BioRuby, but they are not enough. Can you suggest something more effective?


Are there any free analogues of the software I mentioned in my question?


The tools you mention are all GUI tools. You say you have BioPython and BioRuby but they are not enough, which is nearly impossible, seeing how they provide ways to work with almost all bioinformatics command-line tools. Quick question: how much programming experience do you have?


Yeah, if a GUI is needed then webtools should be used. There are web-based versions for many of the MSA tools.

Edit: Or there's Galaxy, which I presume also provides them.


But why use web-based tools in the first place when it's far more efficient and scalable to learn command line usage?


It's a question of how many times this needs to be done. If it's just a handful, then there's no point in bothering with any scripting or even the command line. If this needs to be done many times, then absolutely the command line or a specific script is needed.


Either of those should be sufficient. There are Biopython tutorials on creating MSAs. Anything after that you might have to code a bit yourself (or not, depending on what you want to do). Biopython itself uses freely available tools for all of this (it is just a convenient wrapper in this case).


Thanks, I know how to use Python and Ruby and the functions from their packages. I spent 7 years programming and studying computer science. I almost never ask questions on programming methodology and practice. For WGS data with 4.5 million nucleotide pairs, this does not seem to be the best option. Can you propose a better solution?


Ah, whole genome changes things completely. Most MSA programs are oriented toward proteins (that's what MSA was originally designed around). I'd be surprised if Biopython provided any facilities for things of that magnitude. You'll likely need to write your own wrappers. See this thread for pointers on where to start: Help With Multiple Whole Genome Alignment. Aligning Over 400 Whole Genomes


I have Bowtie, RBowtie, Mummer and Mugsy, but I can't say they are all good and easy enough to use for my goals. It seems that everything free is complete bullshit, or partially so. I need a free analogue of RidomSeqSphere and NextGene. That was my question. I don't want to drive nails with a violin and a flute instead of a heavy hammer.


Bowtie etc. are short-read aligners; you can't expect them to produce whole-genome MSAs. Please see the thread I linked to.
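
As a rough illustration of the "write your own wrappers" idea, here is a sketch in Python around MUMmer, which you say you already have. Note that `nucmer` produces a pairwise whole-genome alignment against one reference, not a multiple alignment, so this would be run once per genome. The function names and file names are hypothetical; the `--prefix`, `-C`, `-l`, `-r` and `-T` options are the ones I know from the MUMmer documentation, but check them against your installed version.

```python
import shutil
import subprocess
from pathlib import Path


def nucmer_commands(reference, query, prefix):
    """Build the two command lines: align with nucmer, then list SNPs."""
    # nucmer writes its alignment to <prefix>.delta
    align = ["nucmer", f"--prefix={prefix}", str(reference), str(query)]
    # -C: skip ambiguously mapped alignments, -l: include sequence lengths,
    # -r: sort by reference position, -T: tab-delimited output
    snps = ["show-snps", "-Clr", "-T", f"{prefix}.delta"]
    return align, snps


def call_snps(reference, query, prefix):
    """Run nucmer and show-snps; return the path of the SNP table."""
    for tool in ("nucmer", "show-snps"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH; is MUMmer installed?")
    align, snps = nucmer_commands(reference, query, prefix)
    subprocess.run(align, check=True)
    out_path = Path(f"{prefix}.snps")
    with out_path.open("w") as handle:
        # show-snps prints to stdout, so redirect it into a file.
        subprocess.run(snps, check=True, stdout=handle)
    return out_path


# Hypothetical usage, one call per query genome:
# call_snps("reference.fasta", "isolate_01.fasta", "isolate_01")
```

Looping `call_snps` over a directory of FASTA files and merging the resulting tables by reference position gives a crude SNP matrix, which is about as far as this kind of wrapper goes without a dedicated whole-genome MSA tool.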
