CNV Annotation Tools
10.6 years ago

Given a set of regions from a CNV analysis, what tools are people using to annotate them, particularly with genes, disease associations, and known CNVs? Google was not much help, though that may be down to my own search skills. CNVannotator is prominent, but it is web-only and limits each batch to 500 regions. Any other suggestions?


CNV annotation (with OMIM, DGV, 1000g, haploinsufficiency, TAD, ... and also with your own in-house information) can be easily automated!

You can look at this post describing the AnnotSV tool: Annotation for SV and CNV

10.6 years ago
Ryan D ★ 3.4k

We analyze a lot of CNVs, both those called with tools like Birdsuite or PennCNV and those imputed from GWAS data using reference panels such as the one from 1000 Genomes. In terms of tools, I can recommend the two steps we use: the first finds overlap with genes, using Bedtools; the second finds disease associations, using gene2mesh.

Start from a BED file of CNVs, such as this:

chr22    39378403    39388216    esv2666691    MERGED_DEL_2_106009
chr5    151514804    151518864    esv2666686    MERGED_DEL_2_32905

Then take a BED file containing all genes in the genome, or a subset of them. You may also use the individual exons of genes (contact me for such a file):

    chr1    11873   12227   DDX11L1
    chr1    12612   12721   DDX11L1
    chr1    13220   14409   DDX11L1
    chr1    14361   14829   WASH7P
    chr1    14969   15038   WASH7P
    chr1    15795   15947   WASH7P
...

Using Bedtools, run the intersection like so:

intersectBed -a cnvs.bed -b refseq_exons.bed -wb

This will give you output like so:

chr22   39387563     39388216   esv2666691    MERGED_DEL_2_106009     chr22   39387563        39394225        APOBEC3B-AS1
chr22   39358280     39388216   esv2666691    MERGED_DEL_2_106009     chr22   39353526        39388783        APOBEC3A_B
chr22   39358280     39359188   esv2666691    MERGED_DEL_2_106009     chr22   39353526        39359188        APOBEC3A
chr22   39378403     39388216   esv2666691    MERGED_DEL_2_106009     chr22   39378403        39388784        APOBEC3B
chr5    151514804    151518864  esv2666686    MERGED_DEL_2_32905      chr5    151338458       151650010       CTB-12O2.1
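
The gene symbols in the last column can be collected into a de-duplicated list before looking up disease associations. The following is only a minimal sketch: `overlaps.tsv` is a hypothetical file name, the three rows are copied from the output above, and column 9 is correct only for a five-column CNV file intersected with a four-column gene file.

```shell
# Assumption: overlaps.tsv is a hypothetical file holding the tab-separated
# intersectBed output (rows copied from the example output above).
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  chr22 39387563 39388216 esv2666691 MERGED_DEL_2_106009 chr22 39387563 39394225 APOBEC3B-AS1 \
  chr22 39378403 39388216 esv2666691 MERGED_DEL_2_106009 chr22 39378403 39388784 APOBEC3B \
  chr5 151514804 151518864 esv2666686 MERGED_DEL_2_32905 chr5 151338458 151650010 CTB-12O2.1 \
  > overlaps.tsv

# Column 9 is the gene name carried over from the -b file by -wb.
# sort -u keeps each symbol once, even if several exons of a gene overlap.
cut -f9 overlaps.tsv | LC_ALL=C sort -u > gene_list.txt
cat gene_list.txt
```

If your BED files have a different number of columns, adjust the field number passed to `cut` accordingly.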

For the second step, finding links with disease, I have found gene2mesh very helpful. It links genes to MeSH keywords; other resources, including OMIM, may also be useful. The following Perl script takes the output genes and gets the top MeSH terms for each from gene2mesh. Here, we put the genes from the intersectBed output above into a file called "gene_list.txt". The script, which we modified from the one on the website, looks like this:

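#!/usr/bin/perl
use strict;
use warnings;
use XML::XPath;
use XML::XPath::XMLParser;
use LWP::UserAgent;

my $ua   = LWP::UserAgent->new;
my $file = "gene_list.txt";

# One gene symbol per line; only the first whitespace-separated field is used.
open(my $fh, '<', $file) or die "Cannot open $file: $!";
while (my $line = <$fh>) {
    my ($gene) = split ' ', $line;
    next unless defined $gene;    # skip blank lines
    my $getf     = "http://gene2mesh.ncibi.org/fetch?genesymbol=${gene}&limit=5";
    my $response = $ua->get($getf);
    unless ($response->is_success) {
        warn "Request failed for $gene: " . $response->status_line . "\n";
        next;
    }
    my $xp = XML::XPath->new(xml => $response->content);
    # Note: the header says 30, but limit=5 in the URL restricts the listing.
    print "## Top 30 MeSH Terms from Gene2MeSH Associated with GeneID $gene  ##\n\n";
    # Print the name of every MeSH descriptor in the XML response.
    foreach my $g2mNode ($xp->find('//Descriptor/Name')->get_nodelist) {
        print $g2mNode->string_value . "\n";
    }
}
close($fh);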

Running it on a short list of genes gives:

## Top 30 MeSH Terms from Gene2MeSH Associated with GeneID APOBEC3B  ##

Cytidine Deaminase
vif Gene Products, Human Immunodeficiency Virus
Gene Products, vif
HIV-1
HIV Infections

For the "known CNVs", I would add that we use those discovered by 1000 Genomes, but other sources such as the DGV can also be used.

10.6 years ago
Min ▴ 90

Hi, guys. I am the author of CNVannotator. I provide the data for download on this page (http://bioinfo.mc.vanderbilt.edu/CNVannotator/download_datafile.cgi), so it is easy to annotate CNVs however you like.


Any chance that the resources used for annotation will be updated in the future?
