Blastp against (only) human SARS-CoV-1: which taxid?
4.5 years ago
guillaume.rbt ★ 1.0k

Hi all,

I'm trying to blast peptides against human SARS-CoV-1 with blastp.

I'm not sure which taxid I should use to keep only results from SARS-CoV-1. When I search for the term "sars" in the organism field, it proposes "HCoV-SARS (taxid:694009)", but this taxid covers many different coronavirus strains, including SARS-CoV-2 and bat strains, which I don't want to keep. Is there a reference strain with its own taxid for SARS-CoV-1 that I could use?

Any advice would be very useful!

Thanks

blastp • 1.6k views

I think you have the correct taxid. You can double-check by entering this ID in the Taxonomy section of NCBI, which returns the following: https://www.ncbi.nlm.nih.gov/taxonomy/?term=694009. Have you tried the following: when you enter the organism taxid in the blastp settings, click the add button to add another organism, enter the taxid of SARS-CoV-2 in the new field, and tick the exclude option?
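The same include/exclude logic can also be written as an Entrez organism filter, which the BLAST web form accepts in its "Entrez Query" box and BLAST+ accepts via `-entrez_query` for `-remote` searches. Here is a minimal sketch of building such a string; the helper name `build_entrez_query` is made up for illustration, and 2697049 is the NCBI taxid for SARS-CoV-2. Note that this only removes SARS-CoV-2; any bat strains under 694009 would still be searched.

```python
def build_entrez_query(include_taxids, exclude_taxids=()):
    """Build an Entrez organism filter from taxids to include and exclude.

    Hypothetical helper for illustration; it only assembles the query string.
    """
    if not include_taxids:
        raise ValueError("at least one taxid to include is required")
    # Each taxid becomes a term of the form txid<ID>[ORGN].
    included = " OR ".join(f"txid{t}[ORGN]" for t in include_taxids)
    if len(include_taxids) > 1:
        included = f"({included})"
    query = included
    # Excluded taxids are appended as NOT terms.
    for t in exclude_taxids:
        query += f" NOT txid{t}[ORGN]"
    return query


# 694009 is the taxid from the question; 2697049 is SARS-CoV-2.
print(build_entrez_query([694009], [2697049]))
# prints: txid694009[ORGN] NOT txid2697049[ORGN]
```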


Thanks for your help. I have indeed checked this taxid in the Taxonomy section of NCBI. It includes sequences from a lot of strains, including bat coronavirus strains, and I would like to focus on human strains. Even if I exclude SARS-CoV-2, I will still have all the bat strains (if I understand correctly)?

4.5 years ago
GenoMax 147k

I would say the 333387 entry should be the correct one.

The RefSeq genome for the original SARS virus seems to cross-reference the top-level taxID that you have in your post, so you may just want to use this genome.


Thanks for your input. Is 333387 a bat strain? I would like to blast against a human SARS-CoV-1 strain.


Then use the RefSeq reference I linked above. Unfortunately, it seems to have been given taxID 694009, which is the generic ID for SARS-like coronaviruses. As for the bat/human issue, the strain originated in bats, correct? Human coronaviruses come in many categories.


Yes, it seems to come from bats, but if possible I would like to focus on human strains. I'm a bit confused by the fact that the taxid is the generic ID for SARS-like coronaviruses, which seems to include bat strains (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=694009). But at the same time, it is described here as human coronavirus. If I blast against this taxid, will it use the 50,013 protein sequences referenced? And will those sequences include those from bat strains?


If you want to be strict then use these 14 proteins from the RefSeq genome linked above.

"If I blast against this taxid, will it use the 50,013 protein sequences referenced?"

It probably will since you are limiting at taxID level.
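If the search itself cannot be restricted more tightly than the taxID, one way to be strict after the fact is to keep only hits whose subject accession belongs to the RefSeq protein set. The following is a minimal sketch, assuming default tabular output (`-outfmt 6`, where column 2 is the subject ID) with plain accessions in that column. The function name `filter_hits` and the accessions shown are made-up placeholders, not the actual 14 protein IDs.

```python
def filter_hits(tabular_text, allowed_accessions):
    """Keep BLAST tabular rows whose subject ID is in the allow-list.

    Assumes -outfmt 6 default columns (qseqid, sseqid, pident, ...).
    Version suffixes such as ".1" are ignored when comparing accessions.
    """
    allowed = {acc.split(".")[0] for acc in allowed_accessions}
    kept = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not a valid hit row
        subject = fields[1].split(".")[0]
        if subject in allowed:
            kept.append(fields)
    return kept


# Placeholder accessions and hits, invented for illustration only.
reference_proteins = ["XX_000001.1", "XX_000002.1"]
sample = "pep1\tXX_000001.1\t100.0\npep1\tYY_999999.1\t97.5\n"
for hit in filter_hits(sample, reference_proteins):
    print(hit[0], hit[1], hit[2])
```

In practice the allow-list would be filled with the protein accessions listed on the RefSeq genome record.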


OK, thanks! FYI, I tried a blast with a sequence from a bat strain, specifying the 694009 taxid, and I didn't get a perfect match; the best match was a sequence from the human reference. This leads me to think that sequences from bat strains are not included in the blast search.


That is curious, since the top few entries for that taxID seem to be from bats.

Severe acute respiratory syndrome-related coronavirus

    Bat coronavirus Cp/Yunnan2011   
    Bat coronavirus RaTG13   
    Bat coronavirus Rp/Shaanxi2011   
    Bat SARS coronavirus HKU3   
        Bat SARS coronavirus HKU3-1   
        Bat SARS coronavirus HKU3-10

I think I understand: I was using RefSeq and not nr as the database, and RefSeq didn't include the bat strain sequences.
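A quick way to check which organisms a given database actually contributed is to tally the organism names in the hit titles. This is a minimal sketch, assuming each title ends with the organism in square brackets (the usual layout of NCBI protein titles; nr titles that merge several identical proteins would need extra handling). The organism names are taken from the listing above, while the protein descriptions and the function name `count_organisms` are made up for illustration.

```python
import re
from collections import Counter

# Matches the last "[Organism name]" at the end of a hit title.
ORGANISM_RE = re.compile(r"\[([^\[\]]+)\]\s*$")


def count_organisms(titles):
    """Count hits per organism, using the trailing [organism] tag of each title."""
    counts = Counter()
    for title in titles:
        match = ORGANISM_RE.search(title)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


# Illustrative titles only; not real search results.
titles = [
    "example protein A [Bat SARS coronavirus HKU3]",
    "example protein A [Bat coronavirus RaTG13]",
    "example protein B [Bat SARS coronavirus HKU3]",
    "example protein without organism tag",
]
counts = count_organisms(titles)
bat_hits = sum(n for name, n in counts.items() if name.lower().startswith("bat"))
print(counts)
print("hits from bat strains:", bat_hits)
```

Running the same tally on results from RefSeq and from nr would show whether bat strains appear in one database but not the other.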
