Hey all, this question is not very specific because I have no idea how to approach something like this.
We were recently able to assemble a few almost full-length (> 29 kb) contigs of SARS-CoV-2 from sequence reads originating from human tissue samples. Running NCBI BLAST with these sequences against the nt/nr database showed which database entry each contig was most closely related to and how many mismatches there were between them.
With new SARS-CoV-2 sequences like ours, I was wondering what downstream analyses we can do and which tools, software, or web apps are needed for them. It would be nice to see where our sequences fit in the coronavirus lineage, which strain they are most closely related to, and how the base changes (non-synonymous ones) could potentially impact protein function, etc. There is a HUGE amount of information on the web and it is really hard to work out where to start and how to go about this.
Any help will be truly appreciated.
Thanks in advance!
If you want to do this locally (that is, if you don't want to upload the sequence), then Nextstrain has a tutorial available.
Yes! I was going through the analysis and I'll give it a shot. In the meantime, do you know of any web apps where I can quickly upload my FASTA sequence and get some basic stats, figures, etc.?
Thanks a lot!!
You can try pangolin (LINK). There is a web application.
A search also brought up coronapp (LINK).
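If you just want a few basic numbers for each contig before uploading anything, here is a minimal sketch in plain Python (standard library only). It reports length, GC fraction, and the number of ambiguous N bases per FASTA record. The function names `read_fasta` and `contig_stats` and the example record are made up for illustration; this is not part of pangolin, coronapp, or Nextstrain, and it does no lineage assignment or variant calling.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def contig_stats(seq):
    """Return basic statistics for one nucleotide sequence."""
    counts = Counter(seq)
    length = len(seq)
    acgt = sum(counts[base] for base in "ACGT")
    gc = counts["G"] + counts["C"]
    return {
        "length": length,
        # GC fraction is computed over unambiguous bases only
        "gc_fraction": gc / acgt if acgt else 0.0,
        "n_count": counts["N"],
        "n_fraction": counts["N"] / length if length else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical toy record; replace with open("your_contigs.fasta")
    example = [">contig_1 hypothetical example", "ATGCNNAT", "GGCC"]
    for name, seq in read_fasta(example):
        print(name, contig_stats(seq))
```

To run it on a real file, pass an open file handle to `read_fasta` instead of the toy list. A contig with a high `n_fraction` is worth checking before you trust any lineage call made from it.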