Dear all, I have two files. The first is a GFF file containing a genome annotation, for example:
"
NW_004848299.1 RefSeq region 1 2133925 . + . ID=id0;Name=Unknown;Dbxref=taxon:
NW_004848299.1 Gnomon gene 255845 257824 . - . ID=gene0;Name=LOC101930845;Dbxref
NW_004848299.1 Gnomon mRNA 255845 257824 . - . ID=rna0;Name=XM_005278412.1 .... "
The second file contains the scaffold information, like this:
" Assembly Genome_name RefSeq_Accession GenBank_Accession NCBI_name
Chrysemys_picta_bellii-3.0.1 Group1 NW_004848299.1 JH584390.1 GPS_001879038.1
Chrysemys_picta_bellii-3.0.1 Group2 NW_004848300.1 JH584391.1 GPS_001879039.1
Chrysemys_picta_bellii-3.0.1 Group3 NW_004848301.1 JH584392.1 GPS_001879040.1
.. ..
"
The first column of the GFF file holds the scaffold name. The second file describes the same scaffolds, but under different names.
The first column of the GFF file corresponds to the third column (RefSeq_Accession) of the second file.
I want to replace the first column of the GFF file with the matching name from the second column (Genome_name) of the second file.
The result should look like this:
"
Group1 RefSeq region 1 2133925 . + . ID=id0;Name=Unknown;Dbxref=taxon:84
Group1 Gnomon gene 255845 257824 . - . ID=gene0;Name=LOC101930845;Dbxref
Group1 Gnomon mRNA 255845 257824 . - . ID=rna0;Name=XM_005278412.1
.... "
How can I do this using R, a Unix command, or a Perl script? Both files are tab-separated.
Thanks,
ZQ
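
One possible approach, sketched in Python rather than the R, Unix, or Perl asked for above, is to build a lookup table from the second file (third column to second column) and then rewrite the first column of each GFF line. This is only a minimal sketch under some assumptions: both inputs are strictly tab-separated, GFF comment lines start with "#", and scaffold names missing from the table should be left unchanged. The function names and the shortened sample lines are illustrative and not taken from the real files.

```python
def load_scaffold_map(lines):
    """Map column 3 (RefSeq_Accession) to column 2 (Genome_name)."""
    mapping = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            # The header row also lands here, but its key
            # ("RefSeq_Accession") never matches a GFF seqid.
            mapping[fields[2]] = fields[1]
    return mapping


def rename_gff(lines, mapping):
    """Yield GFF lines with column 1 replaced where a mapping exists."""
    for line in lines:
        if line.startswith("#"):
            # Assumption: comment/directive lines pass through untouched.
            yield line
            continue
        seqid, tab, rest = line.partition("\t")
        # Unknown scaffold names are kept as they are.
        yield mapping.get(seqid, seqid) + tab + rest


# Shortened, illustrative sample data modelled on the question.
scaffold_lines = [
    "Assembly\tGenome_name\tRefSeq_Accession\tGenBank_Accession\tNCBI_name\n",
    "Chrysemys_picta_bellii-3.0.1\tGroup1\tNW_004848299.1\tJH584390.1\tGPS_001879038.1\n",
]
gff_lines = [
    "##gff-version 3\n",
    "NW_004848299.1\tGnomon\tgene\t255845\t257824\t.\t-\t.\tID=gene0;Name=LOC101930845\n",
]

renamed = list(rename_gff(gff_lines, load_scaffold_map(scaffold_lines)))
print("".join(renamed), end="")
```

For real files, both functions accept open file handles as well as lists, so the two inputs can be opened with open() and the renamed lines written to a new file. Note that directive lines such as "##sequence-region" also contain scaffold names and are not renamed by this sketch.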