I wrote a script to retrieve all of the recorded full-length coronaviruses that infect the human lung from GenBank (NCBI Virus Variation database). After removing the very small files and the ones that didn't work, I'm left with ~200 FASTA files to work with. What are some analyses I can run with the new SARS-CoV-2 (COVID-19) genome? I made a BLAST database and ran some blastn queries, but I'm wondering what bioinformatic analyses are typically run on novel viruses.
My blastn query with a word size of 7 returned a few hits, but I'm not sure how to interpret these results or what to do with that data.
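For anyone wanting to dig into hits like these, here is a minimal sketch of one way to read them programmatically. It assumes blastn was run with tabular output (`-outfmt 6`), which has 12 default columns; the post does not say which output format was actually used. The function names, the thresholds (`max_evalue`, `min_length`), and the example rows are all made up for illustration and are not real results.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One row of BLAST+ tabular output (-outfmt 6, default 12 columns)."""
    qseqid: str      # query sequence ID
    sseqid: str      # subject (database) sequence ID
    pident: float    # percent identity over the aligned region
    length: int      # alignment length
    mismatch: int
    gapopen: int
    qstart: int
    qend: int
    sstart: int
    send: int
    evalue: float    # expected number of chance hits this good
    bitscore: float  # normalized alignment score (higher is better)


def parse_outfmt6(text: str) -> List[Hit]:
    """Parse tab-separated BLAST output, skipping blank and comment lines."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        if len(f) < 12:
            continue  # not a default 12-column row
        hits.append(Hit(f[0], f[1], float(f[2]), int(f[3]), int(f[4]),
                        int(f[5]), int(f[6]), int(f[7]), int(f[8]),
                        int(f[9]), float(f[10]), float(f[11])))
    return hits


def filter_hits(hits: List[Hit], max_evalue: float = 1e-5,
                min_length: int = 100) -> List[Hit]:
    """Keep long, statistically significant hits, best bit score first.

    The default thresholds are illustrative assumptions, not standards.
    """
    kept = [h for h in hits
            if h.evalue <= max_evalue and h.length >= min_length]
    return sorted(kept, key=lambda h: h.bitscore, reverse=True)


# Made-up rows for illustration only (not real BLAST results).
EXAMPLE = (
    "query1\tsubjB\t82.10\t900\t150\t8\t100\t999\t120\t1019\t3e-50\t800\n"
    "query1\tsubjA\t89.50\t2000\t200\t10\t1\t2000\t5\t2004\t0.0\t2500\n"
    "query1\tsubjC\t100.00\t12\t0\t0\t50\t61\t300\t311\t0.52\t24.3\n"
)

if __name__ == "__main__":
    for hit in filter_hits(parse_outfmt6(EXAMPLE)):
        print(hit.sseqid, hit.pident, hit.length, hit.evalue, hit.bitscore)
```

With a small word size, short exact matches like the 12 bp row in the made-up data can show up with 100% identity but a high e-value; those are usually chance matches, so the alignment length and e-value are worth checking before the percent identity.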
This is just for fun and educational purposes. I don't have much experience with viruses or comparative genomics.
Thanks
Thank you very much! I will report back with my findings (if I can figure it out :)).
We have some bioinformatics resources related to SARS-CoV-2 (including an MSA built with MUSCLE, as suggested above). Feel free to check them out; maybe you'll find something of interest: https://genexa.ch/sars2-bioinformatics-resources/