How do I get Ensembl genome sizes with MySQL?
5.7 years ago
endrebak ▴ 980

To get genome sizes from UCSC I just use the following code:

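import MySQLdb
import pandas as pd


def ucsc(genome, query):
    # Public read-only UCSC MySQL server; the database name is the assembly (e.g. "hg38").
    host = "genome-mysql.cse.ucsc.edu"
    user = "genome"
    db = genome

    conn = MySQLdb.Connection(host=host, user=user, db=db)

    try:
        df = pd.read_sql(query, conn)
    finally:
        # Close the connection even if the query fails.
        conn.close()

    return df


def chromosome_lengths(genome):

    query = 'select chrom,size from chromInfo'

    df = ucsc(genome, query)

    # Use df["size"], not df.size: the attribute is the DataFrame's element count.
    return df.set_index("chrom")["size"].to_dict()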

But how do I get genome sizes from Ensembl through MySQL? For example, for GRCz11 and GRCm38.p6.

The MySQL command-line invocation alone would be enough :)

ensembl • 1.5k views
5.7 years ago
Emily 24k

The data is stored in the genome_statistics table of the species' core database.

/usr/local/mysql/bin/mysql -h ensembldb.ensembl.org -u anonymous -P 5306
use danio_rerio_core_95_11;
select * from genome_statistics where statistic = "ref_length";
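
For anyone who wants the same lookup from Python, as in the question, here is a minimal sketch. It reuses the host, user, port, database and query from the commands above. The helper names (`fetch_rows`, `genome_length`) are made up for illustration, and it assumes `genome_statistics` exposes `statistic` and `value` columns.

```python
# Connection details taken from the mysql command line above.
ENSEMBL_HOST = "ensembldb.ensembl.org"
ENSEMBL_USER = "anonymous"
ENSEMBL_PORT = 5306

# Assumption: genome_statistics has "statistic" and "value" columns.
GENOME_LENGTH_SQL = (
    "SELECT statistic, value FROM genome_statistics WHERE statistic = %s"
)


def fetch_rows(database, sql, params=()):
    """Run one query against an Ensembl core database and return all rows."""
    import MySQLdb  # imported here so the pure helper below works without it

    conn = MySQLdb.connect(
        host=ENSEMBL_HOST, user=ENSEMBL_USER, port=ENSEMBL_PORT, db=database
    )
    try:
        cur = conn.cursor()
        cur.execute(sql, params)
        return cur.fetchall()
    finally:
        conn.close()


def genome_length(rows):
    """Pick the ref_length value out of (statistic, value) rows."""
    for statistic, value in rows:
        if statistic == "ref_length":
            return int(value)
    raise KeyError("ref_length not found in genome_statistics rows")
```

Usage would look something like `genome_length(fetch_rows("danio_rerio_core_95_11", GENOME_LENGTH_SQL, ("ref_length",)))`. For another assembly, swap in the matching core database name.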
Do you also know how to get chromosome sizes? :)
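
One possible way to get per-chromosome sizes, sketched under assumptions: as far as I know the Ensembl core schema keeps sequence lengths in `seq_region` and links each one to a `coord_system` row, so a join filtered on the `chromosome` coordinate system should give name/length pairs. The table and column names here come from my reading of that schema, not from this thread, so check them against the release you use. The helper names are hypothetical.

```python
# Assumption: Ensembl core schema with seq_region(name, length, coord_system_id)
# and coord_system(coord_system_id, name, attrib).
CHROMOSOME_SQL = """
SELECT sr.name, sr.length
FROM seq_region AS sr
JOIN coord_system AS cs ON cs.coord_system_id = sr.coord_system_id
WHERE cs.name = 'chromosome'
  AND FIND_IN_SET('default_version', cs.attrib)
"""


def lengths_from_rows(rows):
    """Turn (name, length) rows into a {chromosome: length} dict."""
    return {name: int(length) for name, length in rows}


def ensembl_chromosome_lengths(database, host="ensembldb.ensembl.org",
                               user="anonymous", port=5306):
    """Fetch chromosome lengths from one Ensembl core database."""
    import MySQLdb  # imported here so lengths_from_rows works without it

    conn = MySQLdb.connect(host=host, user=user, port=port, db=database)
    try:
        cur = conn.cursor()
        cur.execute(CHROMOSOME_SQL)
        return lengths_from_rows(cur.fetchall())
    finally:
        conn.close()
```

Something like `ensembl_chromosome_lengths("danio_rerio_core_95_11")` would then mirror the UCSC `chromosome_lengths` helper from the question. Assemblies that are not assembled into chromosomes may need a different `cs.name` (for example a scaffold-level coordinate system).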
