BioMart mRNA Count Much Bigger When Pulling by XML Query Than Using Webpage
12.7 years ago
Tom • 0

Hi All,

I downloaded mRNA data for human using the following BioMart XML query:

<Query virtualSchemaName="default" formatter="FASTA" header="0" uniqueRows="1" count="" datasetConfigVersion="0.8">

<Dataset name = "hsapiens_gene_ensembl" interface = "default">
    <Filter name = "status" value = "KNOWN"/>
    <Filter name = "transcript_status" value = "KNOWN"/>
    <Filter name = "biotype" value = "protein_coding"/>
    <Attribute name = "ensembl_transcript_id"/>
    <Attribute name = "cdna"/>
</Dataset>

</Query>

When I check it on biomart.org the count shows 20467, but the file I am getting is huge and the count goes over 100k. I have tried setting datasetConfigVersion to 0.6, 0.7 and 0.8, and the result is always the same. Why am I getting so many sequences with the XML query? Even when I do not use the status, transcript_status and biotype filters, the total number of genes with a cDNA sequence is only about 50k. Also, I keep getting MySQL server errors saying the connection was lost. Is the server busy? Thanks.

Tom

biomart mrna human ensembl sequence • 2.5k views
12.7 years ago
Bert Overduin ★ 3.7k

Tom, the [Count] button in BioMart gives you the number of filtered genes, but you are exporting transcripts. Many genes have multiple transcripts annotated, hence the discrepancy in numbers you are observing.
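One way to check this on a downloaded file is to count distinct gene IDs and distinct transcript IDs in the FASTA headers. The sketch below is only an illustration under some assumptions: it supposes that `ensembl_gene_id` was added as an extra attribute ahead of `ensembl_transcript_id` in the query, that the header fields appear in that order separated by `|`, and the function name and the IDs in the sample are made up for the example.

```python
def count_genes_and_transcripts(fasta_text, sep="|"):
    """Count distinct gene and transcript IDs in FASTA header lines.

    Assumes each header looks like '>GENE_ID|TRANSCRIPT_ID' (an assumption
    about the export layout, not something confirmed in this thread).
    """
    genes = set()
    transcripts = set()
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # skip sequence lines
        fields = line[1:].strip().split(sep)
        if len(fields) < 2:
            continue  # header does not have the assumed two-field layout
        genes.add(fields[0])
        transcripts.add(fields[1])
    return len(genes), len(transcripts)


# Made-up sample: two genes, one of which has two transcripts.
example = """>ENSG00000000001|ENST00000000011
ACGT
>ENSG00000000001|ENST00000000012
ACGTAA
>ENSG00000000002|ENST00000000021
GGCC
"""

n_genes, n_transcripts = count_genes_and_transcripts(example)
print(n_genes, n_transcripts)
```

If the gene count from such a check is close to what the [Count] button reports while the transcript count is much higher, the difference comes from genes with several annotated transcripts.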
