Hi! I have the following problem. I have a multiple alignment that looks something like this:
SpeciesA --GTACCTAGGTACCT
SpeciesB AGGT---TAGGTACCT
SpeciesC AGGTACCTAGGT----
What I want to get is the following:
SpeciesA GTACCTAGGT
SpeciesB GT---TAGGT
SpeciesC GTACCTAGGT
i.e. the gaps at the ENDS of the alignment trimmed off, the internal gaps preserved, and all the sequences the same length.
I've been looking everywhere for a tool to help me, but no luck so far. Most programmes such as Gblocks, trimAl, T-Coffee and similar can remove gapped columns, but they do so throughout the entire alignment. Because of that I lose the internal indels, and the resulting alignment looks like this:
SpeciesA GTTAGGT
SpeciesB GTTAGGT
SpeciesC GTTAGGT
Which I'm not interested in.
I know I can do it manually, but I have about 5,000 alignment files... If I spent a minute editing each of them, I'd need about 3.5 days of non-stop manual editing... That's why I'd appreciate any help VERY VERY much!
Cheers
You can edit the alignment and delete columns manually in Jalview and ClustalX.
Pretty sure Stack Overflow could give you a great awk one-liner in less than one minute. If they do, please post it back here as well :)
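Not an awk one-liner, but here is a minimal Python sketch of the trimming described in the question. It assumes the alignment is already loaded as a mapping of name to aligned sequence, that "-" is the gap character, and that "end gaps" means the longest run of leading gaps and the longest run of trailing gaps found in any sequence. The function name trim_terminal_gaps is made up for illustration; reading and writing the actual alignment files depends on their format and is left out.

```python
def trim_terminal_gaps(seqs, gap="-"):
    """Remove the columns covered by any sequence's leading or trailing gap run.

    seqs maps sequence name -> aligned sequence (all of equal length).
    Internal gaps are left untouched.
    """
    if not seqs:
        return {}
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    length = lengths.pop()

    # Longest run of gaps at the start and at the end of any sequence.
    lead = max(len(s) - len(s.lstrip(gap)) for s in seqs.values())
    trail = max(len(s) - len(s.rstrip(gap)) for s in seqs.values())

    start, end = lead, length - trail
    if start >= end:
        raise ValueError("nothing left after trimming terminal gaps")

    # Slice every sequence to the same column range.
    return {name: s[start:end] for name, s in seqs.items()}


alignment = {
    "SpeciesA": "--GTACCTAGGTACCT",
    "SpeciesB": "AGGT---TAGGTACCT",
    "SpeciesC": "AGGTACCTAGGT----",
}
for name, seq in trim_terminal_gaps(alignment).items():
    print(name, seq)
# SpeciesA GTACCTAGGT
# SpeciesB GT---TAGGT
# SpeciesC GTACCTAGGT
```

For the 5,000 files, the same function could be called in a loop over the input files once they are parsed into such a mapping. Note that an internal gap that happens to sit right at the new first or last column is kept, since only terminal gap runs decide where to cut.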