I logged in to the GISAID website and found that the sequence data there currently consists of four files: readme FASTA header format, allprot0104 (57MB), spikeprot0104 (6MB), and nextregions. The first three are under the Alignment and proteins menu, and nextregions is under the Genomic epidemiology menu. As I recall, the sequence data in GISAID previously consisted of six files: msa_0505 (132MB), allprot0506 (3MB), spikeprot0506 (455KB), nextmeta (155KB), nextfasta (13MB), and nextregions. The first three were under the Alignment and proteins menu, and the last three were under the Genomic epidemiology menu. Could anyone please explain what changed between these data updates?
@2001linana, you have asked many questions in quick succession which, connecting the dots, all appear to relate in some way to COVID-19 phylogenomics.
I think you would be better served by asking one detailed question that lays out exactly what you are trying to achieve, rather than leaving us to read between the lines of many separate, half-complete questions. It seems you are trying everything under the sun to parse files, download genomes, etc., and I worry that you are stumbling around in the dark and perhaps giving up on the right approach when you encounter challenges.
Did you look at the readme files that seem to be present? Judging by the file names you posted above, they should generally contain the information you are looking for (I don't have a GISAID account to check).