How to change scaffold numbering
16 months ago
chimerajit • 0

I used seqkit to rename the headers in my multi-FASTA file as follows:

seqkit replace -p .+ -r "scaffold_{nr}" input.fa -o rename.fa

This gives me an output file like this:

>scaffold_1 
ATCGTCGATACGCGA 
>scaffold_2 
GCGTACGATAC 
>scaffold_3
ACTATCTACTTCA 

etc...

How can I zero-pad the scaffold numbers, i.e. change 1 to 0001?

seqkit FASTA • 961 views

Thanks all for your kind help.

16 months ago
GenoMax 148k

You can also do:

seqkit replace -p .+ -r "scaffold_{nr}" --nr-width 4 input.fa -o rename.fa
16 months ago

Pipe the output into:

awk -F '_' '/^>/ {printf("%s_%04d\n",$1,$2);next;} {print}'

biostars want some text
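To see what the awk one-liner does, here is a small sketch that wraps it in a shell function and feeds it a made-up two-record input. The function name `pad_headers` and the sample sequences are hypothetical and only for illustration. It assumes the header contains exactly one underscore, directly before the number.

```shell
# Wrap the awk command from above in a function (the name is hypothetical).
# -F '_' splits each header at the underscore, so $1 is ">scaffold" and
# $2 is the number; %04d pads that number with zeros to at least 4 digits.
pad_headers() {
  awk -F '_' '/^>/ { printf("%s_%04d\n", $1, $2); next } { print }'
}

# Made-up sample input with a one-digit and a three-digit scaffold number.
printf '>scaffold_1\nATCG\n>scaffold_100\nGCGT\n' | pad_headers
```

Because `%04d` sets a minimum width rather than adding a fixed prefix, scaffold_100 becomes scaffold_0100 and not scaffold_000100. Anything after the number on the header line is dropped by this approach.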

16 months ago
bk11 ★ 3.0k

If you would like to use sed, you can do something like this:

cat output.txt
>scaffold_1 
ATCGTCGATACGCGA 
>scaffold_2 
GCGTACGATAC
>scaffold_3
ACTATCTACTTCA

cat output.txt | sed 's/>*_\([0-9]\)/_000\1/' 
>scaffold_0001
ATCGTCGATACGCGA 
>scaffold_0002 
GCGTACGATAC
>scaffold_0003
ACTATCTACTTCA 

This just prepends '000' to every number. I don't think this is what the OP wants (they want '0100', not '000100').
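If sed is preferred, one possible way to get a fixed width is to do it in two passes: prepend the zeros, then strip the surplus ones. This is only a sketch, not the original answer's method. The function name `pad_fixed_width` is hypothetical, and it assumes the number comes right after the first underscore in the header and that `sed -E` (extended regex) is available.

```shell
# Two-pass zero padding with sed (function name is hypothetical).
# Pass 1: prepend "000" to the number after the first underscore.
# Pass 2: drop surplus leading zeros, keeping at least four digits.
pad_fixed_width() {
  sed -E \
    -e 's/^(>[^_]*_)([0-9]+)/\1000\2/' \
    -e 's/^(>[^_]*_)0*([0-9]{4,})/\1\2/'
}

# Made-up sample input with a one-digit and a three-digit scaffold number.
printf '>scaffold_1\nATCG\n>scaffold_100\nGCGT\n' | pad_fixed_width
```

With this, scaffold_1 becomes scaffold_0001 and scaffold_100 becomes scaffold_0100. Sequence lines are untouched because both expressions only match lines starting with '>'.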
