Hi,
I have a bedGraph file that contains chromosome, start, and end positions.
How can I get annotation for these genomic locations, such as gene names?
Thanks
First, you might download a set of gene annotations and use BEDOPS gtf2bed to make a BED file, e.g. for human (GENCODE v21):
$ wget -qO- ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_21/gencode.v21.annotation.gtf.gz \
| gunzip -c - \
| gtf2bed - \
| grep -w gene - \
> gencode.v21.genes.bed
Basically, you want a set of genomic ranges that have gene names associated with them, and you want them in BED format. The above is just one way to do this.
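As one more way to get there, here is a minimal sketch that builds a gene BED file with plain awk instead of gtf2bed, and puts the gene_name attribute (rather than the gene ID) in the name column. The file names toy.gtf and toy.genes.bed and the two records are made up for illustration; it assumes a standard 9-column, tab-separated GTF whose attribute column contains gene_name "...".

```shell
# Toy GTF with one gene record and one transcript record (tab-separated).
# These records are illustrative, not copied from a real annotation file.
printf 'chr1\tHAVANA\tgene\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972.5"; gene_name "DDX11L1";\n' > toy.gtf
printf 'chr1\tHAVANA\ttranscript\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_name "DDX11L1";\n' >> toy.gtf

# Keep only "gene" features, convert the 1-based GTF start to a 0-based BED start,
# and pull gene_name out of the attribute column for the BED name field.
awk -F'\t' -v OFS='\t' '
  $3 == "gene" {
    name = "."
    if (match($9, /gene_name "[^"]+"/)) {
      name = substr($9, RSTART + 11, RLENGTH - 12)
    }
    print $1, $4 - 1, $5, name, ".", $7
  }' toy.gtf | sort -k1,1 -k2,2n > toy.genes.bed

cat toy.genes.bed
```

For real data you would replace toy.gtf with the decompressed annotation file, and you may prefer sort-bed over sort so the ordering matches what bedmap expects.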
Second, use bedmap to map the elements in elements.bedgraph to the gene names in gencode.v21.genes.bed (bedmap expects both inputs to be sorted, e.g. with sort-bed), e.g.:
$ bedmap --echo --echo-map-id-uniq --delim '\t' elements.bedgraph gencode.v21.genes.bed > answer.bed
The file answer.bed will have your bedgraph ranges, along with a list of gene names for genes that overlap the bedgraph ranges by one or more bases. If you want more than just the gene ID or name, then take a look at bedmap's --echo-map-* options.
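To make the overlap rule concrete, here is a naive awk sketch of the same idea on toy data. It is not bedmap and will be far slower on real files; it only illustrates "overlap by one or more bases" and the general shape of the result (original row, a tab, then semicolon-separated IDs). The file names, gene names, and coordinates are all made up, and unlike --echo-map-id-uniq it does not remove duplicate IDs.

```shell
# Toy inputs (tab-separated); names and coordinates are illustrative only.
printf 'chr1\t100\t200\t1.5\nchr1\t500\t600\t0.2\nchr2\t50\t80\t3.0\n' > toy.bedgraph
printf 'chr1\t150\t400\tGENE_A\nchr1\t180\t250\tGENE_B\nchr2\t80\t120\tGENE_C\n' > toy.map.bed

# First file (genes) is loaded into arrays; for each bedGraph row we then collect
# the IDs of genes on the same chromosome whose half-open interval overlaps it.
awk -F'\t' -v OFS='\t' '
  NR == FNR { chrom[NR] = $1; s[NR] = $2 + 0; e[NR] = $3 + 0; id[NR] = $4; n = NR; next }
  {
    hits = ""
    for (i = 1; i <= n; i++)
      if (chrom[i] == $1 && s[i] < $3 + 0 && $2 + 0 < e[i])
        hits = hits (hits == "" ? "" : ";") id[i]
    print $0, hits
  }' toy.map.bed toy.bedgraph > toy.answer.bed

cat toy.answer.bed
```

In this toy run the first row picks up both GENE_A and GENE_B, while the chr2 row gets nothing because its range ends exactly where GENE_C starts, so they share no base.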
Which organism/build?
Try BioMart from Ensembl via the biomaRt R package: http://master.bioconductor.org/packages/release/bioc/html/biomaRt.html
There you can find all the information about gene annotation (e.g. coordinates, direction of transcription, etc.).