3.9 years ago
mm2568
I have file 1 (a FASTA file):
>dmel_X type=golden_path_region; loc=X:2270008..2271068; ID=X; dbxref=GB:AE014298,GB:AE014298,REFSEQ:NC_004354; release=r6.37; species=Dmel;
CTCTTTGTTAGCCTACGCTTTCTGCTGAGTTTGTTATTTTTGTCTGCTCCCCACAAGGATATTGTTACAGAGAAAAAGCT
CGAATTGAAGGGAAAATGGAGACAAATAAGAAAACCCATGACAAAGAGGAAAGTTTCAAATATGGGCAATCGAAAAAATC
GAGAAGTGAGCCAATTTTTTTTTCGCCGAGGCTCCACTGTTCCCAGCTGCATAACTGTTTTCCCTCGGCACCTCTCTTTT
>dmel_3L type=golden_path_region; loc=3L:20341634..20342694; ID=3L; dbxref=GB:AE014296,GB:AE014296,REFSEQ:NT_037436; release=r6.37; species=Dmel;
ATTAGTATATAGGCATATGCTTAAGTCTTAGGGTCTTATGGATATGTCACTATATATATATATAATTGCATAAATAGAGA
TATAATAATAGAGGGAGATAATATATTGAAAGCTTTTAATTGCTTCATACAAATTGATGACATCTCAATATCAAATACAA
TGTTGGATTACACACAAACCGTTTATGTCAATAAGAAAATAACTAAATGGGAAGATCTTTCTATATAAGAATATATAGAG
And I have file 2 (gene names):
CG2918
Spn77Bc
How can I replace the identifier after the ">" in each FASTA header (the "dmel_....." part) with the corresponding unique gene name from file 2? The real files are longer, but the output should look like:
>CG2918 type=golden_path_region; loc=X:2270008..2271068; ID=X; dbxref=GB:AE014298,GB:AE014298,REFSEQ:NC_004354; release=r6.37; species=Dmel;
CTCTTTGTTAGCCTACGCTTTCTGCTGAGTTTGTTATTTTTGTCTGCTCCCCACAAGGATATTGTTACAGAGAAAAAGCT
CGAATTGAAGGGAAAATGGAGACAAATAAGAAAACCCATGACAAAGAGGAAAGTTTCAAATATGGGCAATCGAAAAAATC
GAGAAGTGAGCCAATTTTTTTTTCGCCGAGGCTCCACTGTTCCCAGCTGCATAACTGTTTTCCCTCGGCACCTCTCTTTT
>Spn77Bc type=golden_path_region; loc=3L:20341634..20342694; ID=3L; dbxref=GB:AE014296,GB:AE014296,REFSEQ:NT_037436; release=r6.37; species=Dmel;
ATTAGTATATAGGCATATGCTTAAGTCTTAGGGTCTTATGGATATGTCACTATATATATATATAATTGCATAAATAGAGA
TATAATAATAGAGGGAGATAATATATTGAAAGCTTTTAATTGCTTCATACAAATTGATGACATCTCAATATCAAATACAA
TGTTGGATTACACACAAACCGTTTATGTCAATAAGAAAATAACTAAATGGGAAGATCTTTCTATATAAGAATATATAGAG
Thank you so much!
Are the gene names in the file in the same order as the FASTA entries they correspond to?
Yes, the gene names in the file are in the same order as in the FASTA!
What is the relationship between CG2918 and dmel_X? Are they simply in order, as @rpolicastro asked, or do you have some sort of mapping file?
See reply above, they are in the same order, so I was hoping to iterate line by line and replace the identifiers in the FASTA with the names from the file.
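Since the names are in the same order as the records, one possible approach is to walk through the FASTA line by line and, at every header, swap the first token (e.g. ">dmel_X") for the next name from file 2. Below is a minimal Python sketch of that idea, not a tested pipeline. The function names `rename_fasta_headers` and `rename_fasta_file` are made up for illustration, and it assumes one gene name per line in file 2 and no space between ">" and the identifier. It raises an error if the number of names and records differ, so a misaligned file does not silently produce wrong labels.

```python
def rename_fasta_headers(fasta_lines, names):
    """Yield FASTA lines with the first token of each header replaced
    by the next entry from `names` (assumed to be in record order)."""
    name_iter = iter(names)
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            try:
                new_id = next(name_iter)
            except StopIteration:
                raise ValueError("more FASTA records than gene names")
            # Split off the old identifier (">dmel_X") and keep the rest
            # of the header (type=...; loc=...; etc.) unchanged.
            parts = line.split(None, 1)
            rest = parts[1] if len(parts) > 1 else ""
            yield f">{new_id} {rest}".rstrip()
        else:
            # Sequence lines pass through untouched.
            yield line
    if next(name_iter, None) is not None:
        raise ValueError("more gene names than FASTA records")


def rename_fasta_file(fasta_path, names_path, out_path):
    """Hypothetical file-based wrapper around rename_fasta_headers."""
    with open(names_path) as handle:
        # One gene name per line; blank lines are skipped.
        names = [ln.strip() for ln in handle if ln.strip()]
    with open(fasta_path) as fasta, open(out_path, "w") as out:
        for line in rename_fasta_headers(fasta, names):
            out.write(line + "\n")
```

It could then be called as `rename_fasta_file("file1.fasta", "file2.txt", "renamed.fasta")`, where the three file names are placeholders for your own paths.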