7.8 years ago
beacamara
Hi everybody! I am pretty new to the bioinformatics field. I hope you can help me with this matter.
I am working with a FASTA file with the following header format:
>r2044088.2 |SOURCES={KEY=a0ea3476...,fw,2795082-2795162}|ERRORS={8:G,40:C}|SOURCE_1="FN869568.1 Halomonas elongata DSM 2581, complete genome" (a0ea3476c9d169a3045db6ff30c679db40e694f8)
TGCGGTCAGCGTCAAGTCGAGCAGCACGCCGCGCAGCACCCCATCCTTGTCGTCCAGCGACAGGT
>r2044089.2 |SOURCES={KEY=a0ea3476...,fw,2675676-2675756}|ERRORS={}|SOURCE_1="FN869568.1 Halomonas elongata DSM 2581, complete genome" (a0ea3476c9d169a3045db6ff30c679db40e694f8)
TACTTGGGAAGCGCTGGGAGCCAATGCAACCCCCATGGCATGGACTGAAGTCTACACCGCCCTCC
What I would like to do is slightly change the header of each read entry with a Python script, so that the suffix of the read ID changes from .2 to .1, as follows:
>r2044088.1 |SOURCES={KEY=a0ea3476...,fw,2795082-2795162}|ERRORS={8:G,40:C}|SOURCE_1="FN869568.1 Halomonas elongata DSM 2581, complete genome" (a0ea3476c9d169a3045db6ff30c679db40e694f8)
TGCGGTCAGCGTCAAGTCGAGCAGCACGCCGCGCAGCACCCCATCCTTGTCGTCCAGCGACAGGT
>r2044089.1 |SOURCES={KEY=a0ea3476...,fw,2675676-2675756}|ERRORS={}|SOURCE_1="FN869568.1 Halomonas elongata DSM 2581, complete genome" (a0ea3476c9d169a3045db6ff30c679db40e694f8)
TACTTGGGAAGCGCTGGGAGCCAATGCAACCCCCATGGCATGGACTGAAGTCTACACCGCCCTCC
Any idea?
Thanks a lot in advance
Bea
Thank you so much, shenwei356, for your quick reply. I tried to paste the read entries as they appear in my FASTA file, but the ">" symbol did not show up. That's why I added " at the beginning and at the end of the read entries.
As you mentioned, what I am trying to do is change the version number. Thank you very much. I will try your suggestion.
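For reference, the version-number change described in the question could be sketched in Python roughly as follows. This is a minimal sketch, not the suggestion referred to above: it assumes every header starts with ">" followed by a read ID ending in ".<digits>", and the names set_read_version and rewrite_fasta are hypothetical helpers made up for this example.

```python
import re

# Matches the read ID at the very start of a header line, e.g. ">r2044088.2".
# Group 1 is the ID without its version, group 2 is the version digits.
# Anchoring at the start leaves later tokens such as "FN869568.1" untouched.
HEADER_ID = re.compile(r"^(>\S+?)\.(\d+)(?=\s|$)")


def set_read_version(line, new_version="1"):
    """Return the line with the read ID's version suffix replaced.

    Sequence lines (anything not starting with ">") are returned unchanged.
    """
    if not line.startswith(">"):
        return line
    return HEADER_ID.sub(lambda m: f"{m.group(1)}.{new_version}", line, count=1)


def rewrite_fasta(in_path, out_path, new_version="1"):
    """Copy a FASTA file line by line, rewriting only the header lines."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(set_read_version(line, new_version))
```

It could be called as rewrite_fasta("reads.fasta", "reads_v1.fasta"), where both file names are placeholders. Writing to a separate output file keeps the original intact in case the headers do not match the assumed pattern.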
I've edited your post using code formatting (the 101010 button) to better display the FASTA.