How To Calculate Conservation Score Of Given Bed Regions
13.0 years ago
User 6762 • 0

Is there a simple way to extract the 44-way or 29-mammals conservation scores for a given set of genomic regions (BED format) in the human genome?

conservation • 7.0k views
13.0 years ago

The multi-species phastCons conservation scores (44-way for hg18, 46-way for hg19) are available as sets of WIG/bigWig files at UCSC. For example, see: ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/phastCons46way/primates/

You can query and convert those WIG/bigWig files using wigToBigWig and bigWigToBedGraph, both available at http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/

================================================================
========   bigWigToBedGraph   ====================================
================================================================
bigWigToBedGraph - Convert from bigWig to bedGraph format.
usage:
   bigWigToBedGraph in.bigWig out.bedGraph
options:
   -chrom=chr1 - if set restrict output to given chromosome
   -start=N - if set, restrict output to only that over start
   -end=N - if set, restict output to only that under end
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
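
As a rough sketch of how these options could be combined with a BED file, the hypothetical helper below prints one bigWigToBedGraph command per region. The function name `bed_to_bw_cmds`, the input file names, and the output naming scheme are assumptions made for illustration; only `-chrom`, `-start` and `-end` come from the usage text above.

```shell
# Hypothetical helper: print one bigWigToBedGraph command per BED region.
# Usage: bed_to_bw_cmds regions.bed scores.bw
bed_to_bw_cmds() {
    bed="$1"   # BED file: chrom, start, end (tab-separated)
    bw="$2"    # bigWig file holding the conservation scores
    awk -v bw="$bw" 'BEGIN { FS = "\t" }
        # skip blank lines, comments and track/browser header lines
        NF >= 3 && $1 !~ /^(#|track|browser)/ {
            printf "bigWigToBedGraph -chrom=%s -start=%d -end=%d %s %s_%d_%d.bedGraph\n",
                   $1, $2, $3, bw, $1, $2, $3
        }' "$bed"
}
```

The commands are only printed, so they can be inspected first and then piped to `sh` once they look right, for example `bed_to_bw_cmds regions.bed phastCons.bw | sh`.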

The 29 mammals data is not there. You can find it here, already in BED format: http://www.broadinstitute.org/scientific-community/science/projects/mammals-models/29-mammals-project-supplementary-info

Extracting the regions can probably be done with bedtools, I think.


I removed "29 mammals".


I used these steps to get my phastCons scores:

  1. Get the bigWig (bw) file from UCSC
  2. Convert the bigWig to bedGraph with bigWigToBedGraph, then the bedGraph to BED (awk '{print $1 "\t" $2 "\t" $3 "\t" $4}')
  3. bedtools intersect -a ChIP_peaks.bed -b phastCons.bed
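
One possible way to turn the intersect output from step 3 into a single score per peak is sketched below. It assumes the `-wo` option is added to `bedtools intersect` (so the number of overlapping bases is appended as the last column), that ChIP_peaks.bed has exactly three columns, and that phastCons.bed has four. The function name `mean_score_per_peak` is hypothetical.

```shell
# Hypothetical helper: read `bedtools intersect -wo` output on stdin and
# print the overlap-weighted mean score for each peak.
# Assumed columns: 1-3 peak (chrom, start, end), 4-6 score interval,
# 7 score, 8 number of overlapping bases.
mean_score_per_peak() {
    awk 'BEGIN { FS = OFS = "\t" }
        {
            key = $1 OFS $2 OFS $3
            if (!(key in bases)) order[++n] = key   # remember first-seen order
            sum[key]   += $7 * $8                   # score weighted by overlap
            bases[key] += $8
        }
        END {
            for (i = 1; i <= n; i++) {
                k = order[i]
                printf "%s\t%.4f\n", k, sum[k] / bases[k]
            }
        }'
}

# Example call (file names as in the steps above):
# bedtools intersect -wo -a ChIP_peaks.bed -b phastCons.bed | mean_score_per_peak
```

Bases inside a peak that have no score at all are not counted in the mean, and peaks with no overlapping score interval do not appear in the output.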

Is this the best way to do this? My final goal is to plot PhastCons score against distance from the binding site. Please advise.
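
For the score-versus-distance plot, a minimal sketch along the same lines is shown below. It assumes the input is `bedtools intersect -wo` output for a three-column peak file and a four-column score file, and it takes the peak midpoint as the binding site, which may not match the actual summit. The function name `score_by_distance` is hypothetical.

```shell
# Hypothetical helper: read `bedtools intersect -wo` output on stdin and
# print the mean score at each signed distance from the peak midpoint.
# Assumed columns: 1-3 peak, 4-6 score interval, 7 score, 8 overlap length.
score_by_distance() {
    awk 'BEGIN { FS = OFS = "\t" }
        {
            mid = int(($2 + $3) / 2)            # peak midpoint (0-based)
            lo = $2 + 0; if ($5 + 0 > lo) lo = $5 + 0   # clip score interval
            hi = $3 + 0; if ($6 + 0 < hi) hi = $6 + 0   # to the peak bounds
            for (p = lo; p < hi; p++) {         # every base in the clipped interval
                d = p - mid
                sum[d] += $7
                cnt[d]++
            }
        }
        END {
            for (d in sum) printf "%d\t%.4f\n", d, sum[d] / cnt[d]
        }' | sort -n
}

# Example call:
# bedtools intersect -wo -a ChIP_peaks.bed -b phastCons.bed | score_by_distance > profile.tsv
```

The output has two columns, distance and mean score, which can be loaded into any plotting tool. Nothing here is specific to human data, so it should apply to an insect genome as long as a phastCons track exists for it.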

P.S.: I am working with an insect genome; most of the available tools seem to be for human genomes.

13.0 years ago
Pascal ▴ 160

In the title you ask how to calculate the conservation score, but in the description you only ask how to extract conservation scores for multiple regions.

For extraction, you might want to take a look at biopieces. It includes a program for exactly that purpose: http://code.google.com/p/biopieces/wiki/get_genome_phastcons

Of course, the genome and conservation scores have to be downloaded from UCSC and set up for biopieces first.

12.9 years ago
User 6762 • 0

I found an easy way, using the CompleteMOTIFS tool. It gives the average PhastCons score of a given BED region. http://cmotifs.tchlab.org/cgi/pipeline_v3.cgi
