Hi guys, this is quite urgent and I really hope someone can help me out by explaining this to me TT. If we want to use human coronaviruses for some bioinformatics analysis, is it best to retrieve sequences/genomes from human hosts only? Or does the host not matter?
I ask because I got the deposited data file of all human coronavirus sequences from an article published in PNAS, and I realised that a small number of the sequences are not from a human host, or are from an unknown host. I am confused about whether human coronavirus genomes/sequences must come from a human host, or whether other animal hosts are okay too. Can any experts, or anyone who is familiar with this, explain it to me?
By the way, when I was downloading the sequences and their information from the NCBI database, I was able to check the host of each sequence/genome. That's how I found out that there are human coronavirus sequences from animal hosts.
P.S. Sorry if the question sounds a bit stupid, but I'm very new to this :'D
Thank you in advance for all the explanations, I really appreciate it :)
It depends on what you want to do. If you think the host of origin could be a factor in interpreting your analysis, then only analyze sequences from the same host. If not, it probably doesn't matter. Since you haven't told us what kind of analysis you are planning, we can't be more specific.
There are many different coronaviruses, including some for which humans can be considered the normal host (HCoV-229E, HCoV-OC43, HCoV-NL63 and HCoV-HKU1, which cause a form of the common cold). Also, human viruses are sometimes studied in animal models. A sequence reported from such an experiment may list the animal model as the host.
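If you do decide to separate sequences by host, here is a minimal sketch of one way to do it, assuming you downloaded the records in GenBank flat-file format, where the host (when the submitter provided one) appears as a `/host="..."` qualifier on the source feature. The function names, the three category labels and the `EXAMPLE` accessions below are all made up for illustration, and the rule for deciding what counts as "human" is an assumption: host values are free text, so check the values in your own file before trusting it.

```python
import re
from collections import defaultdict

# /host="..." qualifier on the source feature of a GenBank flat-file record.
HOST_RE = re.compile(r'/host="([^"]*)"')
ACCESSION_RE = re.compile(r'^ACCESSION\s+(\S+)', re.MULTILINE)


def split_records(text):
    """Split concatenated GenBank text on the '//' record terminator lines."""
    parts = re.split(r'^//[ \t]*$', text, flags=re.MULTILINE)
    return [p for p in parts if p.strip()]


def host_category(record):
    """Return 'human', 'non-human' or 'unknown' (labels are illustrative)."""
    match = HOST_RE.search(record)
    if match is None:
        return "unknown"  # no /host qualifier was submitted
    host = " ".join(match.group(1).split()).lower()
    # Assumption: human hosts are written "Homo sapiens ..." or "human".
    if host.startswith("homo sapiens") or host == "human":
        return "human"
    return "non-human"


def group_by_host(text):
    """Map each host category to the accessions that fall into it."""
    groups = defaultdict(list)
    for record in split_records(text):
        acc = ACCESSION_RE.search(record)
        groups[host_category(record)].append(acc.group(1) if acc else "?")
    return dict(groups)


# Made-up, heavily truncated records, only to show the shape of the input.
SAMPLE = '''LOCUS       EXAMPLE1
ACCESSION   EXAMPLE1
FEATURES             Location/Qualifiers
     source          1..100
                     /organism="Human coronavirus OC43"
                     /host="Homo sapiens"
//
LOCUS       EXAMPLE2
ACCESSION   EXAMPLE2
FEATURES             Location/Qualifiers
     source          1..100
                     /organism="Human coronavirus OC43"
                     /host="Mus musculus"
//
LOCUS       EXAMPLE3
ACCESSION   EXAMPLE3
FEATURES             Location/Qualifiers
     source          1..100
                     /organism="Human coronavirus 229E"
//
'''

if __name__ == "__main__":
    for category, accessions in group_by_host(SAMPLE).items():
        print(category, accessions)
```

With the groups in hand you can run your analysis on the human-host records only, or on everything, and see whether the result changes. It is also worth looking at the non-human and unknown records by hand, since some of them may come from animal-model experiments.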
Oh I see, I understand now. Thank you for the explanations, they help me a lot! :)