5.7 years ago
mittjohns
What’s the most efficient way (fast, with simple code) to convert a FASTA file of a chromosome, either part of the sequence or the whole thing, into a two-column, tab-delimited format of position and base? For example:
fasta file
>chr2
ATGCATTC...
converted pos-base file
1 A
2 T
3 G
4 C
…
I know we can write a script to do so. But this seems to be a task for a one-liner or some existing tools. Thanks!
With a multi-FASTA file, the positions will be numbered consecutively across all records, correct? OP should keep that in mind.
Thanks, Pierre, a brilliant use of grep -o and cat -n.
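For readers who want to try the idea, here is a minimal sketch of a grep -o based pipeline. It is not necessarily the exact command from the answer referred to above. The function names and the file name chr2.fa are hypothetical. It swaps cat -n for awk's NR, because cat -n pads the line number with leading spaces, which would break a strict two-column format. The second function is one possible way to handle the multi-FASTA caveat: it adds the record name as an extra column and restarts numbering at each header.

```shell
# Hypothetical helper: print "position<TAB>base" for every base in a FASTA file.
# Positions run consecutively across all records in the file.
fasta_to_pos_base() {
    # Drop header lines, emit one character per line, then number the lines.
    grep -v '^>' "$1" | grep -o . | awk '{ print NR "\t" $0 }'
}

# Hypothetical variant for multi-FASTA input: print "name<TAB>position<TAB>base"
# and restart the position counter at every header line.
fasta_to_name_pos_base() {
    awk '
        /^>/ { name = substr($1, 2); pos = 0; next }   # new record: remember its name
        {
            n = length($0)
            for (i = 1; i <= n; i++) {
                pos++
                print name "\t" pos "\t" substr($0, i, 1)
            }
        }
    ' "$1"
}

# Example usage (chr2.fa is an assumed file name):
#   fasta_to_pos_base chr2.fa > chr2.pos_base.tsv
```

To cut only part of the sequence, the output of the first function could be filtered on the position column, for example with awk '$1 >= 100 && $1 <= 200'. Both functions assume Unix line endings and treat every non-header character as a base.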