I recently learned that Pal2Nal is a good tool for aligning nucleotide sequences for downstream analyses of positive selection, but I don't know how to use it from the command line. The tool's web server requires uploading a protein alignment and the corresponding unaligned nucleotide sequences. I have thousands of sequences, so the web-based tool is not practical for me. My boss has suggested that I use the command-line version, write a Python script around it to do the alignment, and "include some lines in the script to feed the resulting codon alignment to 'codeml' program to calculate Ka, Ks values".
Could anyone please tell me how to proceed? I can use Python, but I don't know which arguments the command-line version of the tool accepts. Any resource or help will be appreciated. Thanks.
@Bioji, thanks very much for the lead. After downloading it, I first tried it on the command line to test whether it works, like so: [code] edson@samsung:~/pal2nal.v14$ perl pal2nal.pl test.aln test.nuc > -output paml -nogap [/code]
But it is not working yet, and this is what I get: [code] Can't open paml at pal2nal.pl line 335. [/code]
I think with your lead I'm a small step away, but there are one or a few subtle issues I am missing. Could you help?
Thanks.
P.S. As a follow-up, it seems there are issues with the pal2nal output options. The default is "clustal", and if you call the command line with the default it just works. If you call it with 'paml' or 'fasta', that's when the "Can't open paml at pal2nal.pl ..." or "Can't open fasta at pal2nal.pl ..." message comes out.
How do you fix that?
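I think you have an error in your command line. The '>' should come after you specify -output paml (and any other options), followed by the name of the output file. As written, the shell redirects the output to a file named -output and passes paml to pal2nal.pl as an extra argument, which would explain why it complains that it can't open paml. Check again the command I wrote in the answer.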
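Since you want to drive this from Python anyway, here is a rough sketch of how the call could be wrapped so that the redirection problem cannot happen. It only uses the two options from your own command (-output and -nogap). The function names, the default script path, and the codeml settings (pairwise mode, codon frequency model) are my assumptions for illustration, not something taken from the pal2nal or PAML documentation, so check them against the manuals before relying on them.

```python
import subprocess
from pathlib import Path


def build_pal2nal_cmd(aln, nuc, output_format="paml", nogap=True,
                      script="pal2nal.pl"):
    """Return the argument list for one pal2nal run.

    Input files come first and options after them. No '>' appears in the
    list, so no option can be mistaken for an output file name.
    The default script path is an assumption; point it at your copy.
    """
    cmd = ["perl", str(script), str(aln), str(nuc), "-output", output_format]
    if nogap:
        cmd.append("-nogap")
    return cmd


def run_pal2nal(aln, nuc, out_path, **kwargs):
    """Run pal2nal and write its standard output to out_path."""
    cmd = build_pal2nal_cmd(aln, nuc, **kwargs)
    with open(out_path, "w") as handle:
        # stdout=handle plays the role of '> out_path' in the shell.
        subprocess.run(cmd, stdout=handle, check=True)
    return Path(out_path)


def codeml_ctl_text(codon_aln, results):
    """Return the text of a minimal codeml control file.

    The values below are assumed settings for a pairwise Ka/Ks run;
    adjust them to your analysis after reading the PAML manual.
    """
    settings = [
        ("seqfile", codon_aln),  # codon alignment written by pal2nal
        ("outfile", results),    # main codeml result file
        ("runmode", -2),         # pairwise comparison (assumed choice)
        ("seqtype", 1),          # codon sequences
        ("CodonFreq", 2),        # assumed codon frequency model
        ("model", 0),
        ("NSsites", 0),
    ]
    return "".join(f"{key} = {value}\n" for key, value in settings)
```

For each gene you would call something like `run_pal2nal("test.aln", "test.nuc", "test.paml")`, write `codeml_ctl_text("test.paml", "test.codeml.out")` to a file such as codeml.ctl, and then run codeml on that control file with another `subprocess.run` call. Looping over thousands of alignment/sequence pairs is then just a matter of iterating over the file names.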