Get BUSCO gene descriptions
8.7 years ago
pbigbig ▴ 250

Hi everyone,

I am planning to design primers (for Sanger sequencing) to assess a de novo genome assembly. The primers could be chosen arbitrarily, but I would prefer the sequenced regions to be meaningful, so I ran BUSCO with the eukaryote set (~400 single-copy orthologs) on the assembly. The run reported ~60% complete single-copy BUSCOs, but how can I find the names and descriptions of those orthologs in the eukaryote set? The results contain only alignments and numeric codes for the matches. I would really appreciate any help.

Thank you very much in advance!

BUSCO de novo assembly • 6.7k views

I'm also very interested in this. Have you found an answer?


Sadly not yet. As a workaround, I took the orthologs' FASTA sequences from the BUSCO results and BLASTed them against the RefSeq database to get the best-hit accession IDs, then looked up the descriptive titles for that list of IDs (I used Batch Entrez: http://www.ncbi.nlm.nih.gov/sites/batchentrez).
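
The best-hit step of this workaround could be sketched roughly as below. This is only an illustration, assuming the BLAST results were saved in the standard 12-column tabular format (query ID in column 1, subject ID in column 2, bit score in column 12). The function name `best_hits` and the example rows are made up; the accessions are placeholders, not real search results.

```python
import csv
import io


def best_hits(blast_tsv):
    """Return {query_id: subject_id}, keeping the highest-bit-score hit per query."""
    best = {}
    for row in csv.reader(blast_tsv, delimiter="\t"):
        # Skip blank lines and comment lines (some tabular variants include them).
        if not row or row[0].startswith("#"):
            continue
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


# Made-up rows in the assumed 12-column layout (placeholder IDs and scores).
example = io.StringIO(
    "busco_1\tXP_000001.1\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t400\n"
    "busco_1\tXP_000002.1\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-50\t210\n"
    "busco_2\tNP_000003.2\t91.0\t150\t13\t0\t1\t150\t5\t154\t1e-70\t280\n"
)
hits = best_hits(example)

# One accession per line, which is the kind of list Batch Entrez accepts as a file.
for accession in hits.values():
    print(accession)
```

With a real results file, `open("results.tsv")` (a hypothetical file name) would take the place of the `io.StringIO` example.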

5.3 years ago
thackl ★ 3.0k

Just came across the same issue and came up with a solution. Most BUSCO data sets are generated from OrthoDB, so you can query OrthoDB via its API to map the BUSCO IDs and pull the information. I've posted a short R snippet that automates this and produces a nice table: https://thackl.github.io/BUSCO-gene-descriptions


Oh great, thank you very much! Although the post is from a long time ago, I think it is still very useful for other de novo genome projects.


Yeah, I was hoping you had moved on by now ;)

6.0 years ago

If you load the FASTA sequence into IGV, you can view the entire genome alongside the genes it codes for. From there you can look up the name and function of each of these genes.

2.3 years ago

For others: you can find function information about complete BUSCOs in the file full_table.tsv, in the output directory generated by BUSCO. To find the function of any BUSCO, navigate to the lineage directory you're using and grep for the BUSCO ID, as in the example below:

grep -A 1 <BUSCO id> ancestral

Then BLAST the resulting sequence. Maybe there's an easier way to find the functions of missing BUSCOs, but this is the only one I'm aware of.
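
For the complete BUSCOs, pulling the entries out of full_table.tsv could look roughly like the sketch below. The column layout is an assumption and may differ between BUSCO versions: here the file is taken to be tab-separated, with `#` comment lines, the BUSCO ID in column 1, the status in column 2, the sequence name in column 3, and a description in column 10 when that column exists. The function name `complete_buscos` and the example rows are invented for illustration.

```python
import io


def complete_buscos(lines):
    """Yield (busco_id, sequence, description) for rows whose status is 'Complete'."""
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Rows for missing BUSCOs are assumed to carry only the ID and the status.
        if len(fields) < 3 or fields[1] != "Complete":
            continue
        # Assumption: the description, if present, is the 10th column.
        description = fields[9] if len(fields) > 9 else ""
        yield fields[0], fields[2], description


# Made-up rows following the assumed layout (placeholder IDs, URLs, descriptions).
example = io.StringIO(
    "# Busco id\tStatus\tSequence\tGene Start\tGene End\tStrand\tScore\tLength\tOrthoDB url\tDescription\n"
    "id_A\tComplete\tscaffold_1\t100\t900\t+\t512.3\t260\thttps://example.org/id_A\thypothetical description A\n"
    "id_B\tMissing\n"
    "id_C\tDuplicated\tscaffold_2\t5\t700\t-\t300.1\t230\thttps://example.org/id_C\thypothetical description C\n"
)
rows = list(complete_buscos(example))
for busco_id, sequence, description in rows:
    print(busco_id, sequence, description, sep="\t")
```

If the table in your BUSCO version has no description column, the third element is simply empty and the grep-and-BLAST route above still applies.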
