Databases to blast against when checking for human sequences
5.6 years ago
Michael ▴ 270

I have several sets of relatively short DNA sequences ranging from 200bp to about 2000bp. They are stored as FASTA.

They are all supposed to be of bacterial origin.

However, I want to make sure that no human sequences are sneaking in. Some of them could also be only partially human (meaning that part of the sequence could be of human origin).

I would just run blastn against the whole human_genomic.*tar.gz and est_human.*.tar.gz databases. Speed is not much of an issue, so I do not need a solution like Centrifuge or read mapping. I would like to go with BLAST for its high sensitivity.
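
A minimal sketch of the search described above, written in Python only to assemble the command line. It assumes NCBI BLAST+ is installed and that the extracted database is reachable under the name human_genomic; the query file name, output file name and E-value cutoff are hypothetical placeholders, not values from this thread.

```python
import shlex


def build_blastn_command(query_fasta, db="human_genomic",
                         out_tsv="human_hits.tsv", evalue=1e-5):
    """Return the argument list for a blastn search with tabular output.

    The file names and the E-value cutoff are illustrative assumptions.
    """
    return [
        "blastn",
        "-query", query_fasta,    # FASTA file with the 200-2000 bp sequences
        "-db", db,                # BLAST database name (after extraction)
        "-outfmt", "6",           # tab-separated, one line per alignment
        "-evalue", str(evalue),   # hypothetical cutoff, tune as needed
        "-out", out_tsv,
    ]


cmd = build_blastn_command("bacterial_sequences.fasta")
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) once the paths are adjusted; running the same search a second time with db="est_human" would cover the other database mentioned.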

Do you have some more databases you would add to the table to search against?

blast human • 975 views

Use bbsplit.sh to bin your reads (see A: Tool to separate human and mouse RNA-seq reads). Use the human genome alone if you don't know which specific bacteria you want to include. Reads aligning to the human genome will go into one file, and the rest will be collected in a second.


Thank you! However, I am NOT asking for a tool to split my data. I know bbsplit.sh. I am asking whether you would add another database to the two I mentioned above, to make sure that very short stretches of human sequence get caught.


The human genome sequence should be a catch-all. There should be no need to add any other sequence. ESTs etc. are all a subset of the entire genome.

Some of them could also be just partially human (meaning a part of the entire sequence could be from human origin).

That is a tough criterion. If you want to enforce it, what minimum length are you thinking of using for hits? You may have small stretches of sequence identity between your data and the human genome by chance.
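
A minimal sketch of such a minimum-length filter, assuming the BLAST results were written in the default tabular format (-outfmt 6: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The 50 bp and 90% thresholds and the example lines are made up for illustration, not recommendations.

```python
def filter_hits(lines, min_length=50, min_identity=90.0):
    """Keep tabular BLAST hits that are long and similar enough.

    Returns tuples of (query id, subject id, percent identity,
    alignment length, query start, query end).
    Both thresholds are hypothetical defaults.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        pident = float(fields[2])   # percent identity
        length = int(fields[3])     # alignment length
        if length >= min_length and pident >= min_identity:
            kept.append((fields[0], fields[1], pident, length,
                         int(fields[6]), int(fields[7])))
    return kept


# Invented example lines, for illustration only
example = [
    "seq1\tchrA\t98.5\t120\t1\t0\t301\t420\t5000\t5119\t1e-50\t200",
    "seq2\tchrB\t100.0\t22\t0\t0\t10\t31\t700\t721\t0.5\t41.0",
]
hits = filter_hits(example)
print(hits)
```

The query start and end columns show which part of a sequence matched, which helps with the partially human case: a hit covering only positions 301 to 420 of a longer sequence flags just that stretch.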

5.6 years ago
Michael ▴ 270

Seems like the standard "human genomic" BLAST database is suitable for me.
