9.0 years ago
waqasnayab
Hi,
I have a contig file:
>NODE_1_length_248_cov_3.157258
AAGGACTTGAGGGGCCTAACCTACCCTCAAGCATGCTCCCCGAAAGATTCCATCCATCCT
AGTCTTTTGAGGACAAATCCTACTGTGTAGACGAGTCATAGGGCAGACATTCGCGACGAA
TGGATCCGCCGGCCTCATCAGATAATTGAGACCGTCAACTGCCAGGTGCTCAAGAGGTTC
CTGGTTAAGTCTCCCTAGGCGTGGGAACTCTTTATGCATCGTTAACGTCCATCGGCTGAG
TGCCCACAGCGTTACTCAAGGCAGATTATACTGGGgag
>NODE_2_length_89_cov_4.494382
GTCGATAGATCTATGTGTTTAGACATGTAGATCAGTGGTCGTTGTGATGAGCGTAGCGCT
TGCGGAACGTGCACGAGTATACTATCACCGCCGGATTTTAATGCAGAGAGGTTCCCGAg
>NODE_3_length_79_cov_3.227848
and so on ........
I need to change the header in the following way:
>Contig1.1
AAGGACTTGAGGGGCCTAACCTACCCTCAAGCATGCTCCCCGAAAGATTCCATCCATCCT
AGTCTTTTGAGGACAAATCCTACTGTGTAGACGAGTCATAGGGCAGACATTCGCGACGAA
TGGATCCGCCGGCCTCATCAGATAATTGAGACCGTCAACTGCCAGGTGCTCAAGAGGTTC
CTGGTTAAGTCTCCCTAGGCGTGGGAACTCTTTATGCATCGTTAACGTCCATCGGCTGAG
TGCCCACAGCGTTACTCAAGGCAGATTATACTGGGgag
>Contig1.2
GTCGATAGATCTATGTGTTTAGACATGTAGATCAGTGGTCGTTGTGATGAGCGTAGCGCT
TGCGGAACGTGCACGAGTATACTATCACCGCCGGATTTTAATGCAGAGAGGTTCCCGAg
>Contig1.3
and so on........
I tried this awk command:
cat contig_1.fa | awk '{print (NR%4 == 1) ? ">Contig1." ++i : $0}' > contig_1_rename.fa
the output is:
contig_1_rename.fa
>Contig1.1
AAGGACTTGAGGGGCCTAACCTACCCTCAAGCATGCTCCCCGAAAGATTCCATCCATCCT
AGTCTTTTGAGGACAAATCCTACTGTGTAGACGAGTCATAGGGCAGACATTCGCGACGAA
TGGATCCGCCGGCCTCATCAGATAATTGAGACCGTCAACTGCCAGGTGCTCAAGAGGTTC
>Contig1.2
TGCCCACAGCGTTACTCAAGGCAGATTATACTGGGgag
>NODE_2_length_89_cov_4.494382
GTCGATAGATCTATGTGTTTAGACATGTAGATCAGTGGTCGTTGTGATGAGCGTAGCGCT
>Contig1.3
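It seems to write a new header on every fourth line (wherever NR%4 == 1) instead of replacing the existing header lines, so sequence lines get overwritten and the original NODE headers are left in place. How can I match the header lines by pattern and replace them in awk, rather than selecting lines by line number (NR)?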
Thanks,
Waqas.
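One possible approach, sketched under the assumption that every header line starts with ">" and no sequence line does: let awk match on the pattern /^>/ rather than on NR, so the number of sequence lines per record no longer matters. The sample file contents below are shortened, made-up sequences for illustration only; the file names are the ones used in the question.

```shell
# Build a small sample FASTA with a different number of sequence lines per record
# (illustrative data, not the real contigs).
cat > contig_1.fa <<'EOF'
>NODE_1_length_248_cov_3.157258
AAGGACTTGAGG
AGTCTTTTGAGG
TGCCCACAGgag
>NODE_2_length_89_cov_4.494382
GTCGATAGATCT
TGCGGAACGg
EOF

# /^>/ selects header lines only: print the new name with a running counter,
# then skip to the next input line. All other lines are printed unchanged.
awk '/^>/ { print ">Contig1." (++i); next } { print }' contig_1.fa > contig_1_rename.fa

cat contig_1_rename.fa
```

Because the counter only advances when a header is seen, records with one sequence line or fifty are handled the same way.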
Thanks, it's great, worked fine.