Peak annotation using ChIPseeker (bioconductor package)
3
0
Entering edit mode
8.3 years ago
s.singh ▴ 70

I want to do peak annotation of C.elegans ChIP data using ChIPseeker.

I have used this command peakAnno <- annotatePeak(files, tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Ce.eg.db")

I'm using 'org.Ce.eg.db' from BioConductor for this.

I get the following error:

Error in .testForValidKeys(x, keys, keytype, fks) : None of the keys entered are valid keys for 'ENSEMBL'. Please use the keys method to see a listing of valid arguments.

I would really appreciate if someone could help me resolve this error.

ChIP-Seq • 6.4k views
ADD COMMENT
0
Entering edit mode

What are you TxDb are you using?

ADD REPLY
0
Entering edit mode

I'm using TxDb.Celegans.UCSC.ce6.ensGene

ADD REPLY
0
Entering edit mode

Hmm, sounds like you're using the right datasets. Mind copy and pasting your code so I can take a look at it? Also put 'chipseeker' as a tag so that the author can get a notification about this question. He's pretty active around here.

ADD REPLY
0
Entering edit mode

require(ChIPseeker)

require(TxDb.Celegans.UCSC.ce6.ensGene)

txdb <-TxDb.Celegans.UCSC.ce6.ensGene

require(clusterProfiler)

files = "/home/swadha/Desktop/SFSU/c.elegans_peaks.bed"

peak <- readPeakFile(files)

covplot(peak, weightCol="V5")

promoter <- getPromoters(TxDb=txdb, upstream=3000, downstream=3000)

tagMatrix <- getTagMatrix(peak, windows=promoter)

tagHeatmap(tagMatrix, xlim=c(-3000, 3000), color="red")

peakHeatmap(files, TxDb=txdb, upstream=3000, downstream=3000, color="red")

plotAvgProf(tagMatrix, xlim=c(-3000, 3000), xlab="Genomic Region (5'->3')", ylab = "Read Count Frequency")

peakAnno <- annotatePeak(files, tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Ce.eg.db")

loading peak file... 2016-07-28 07:51:40 PM preparing features information... 2016-07-28 07:51:41 PM identifying nearest features... 2016-07-28 07:51:41 PM calculating distance from peak to TSS... 2016-07-28 07:51:42 PM assigning genomic annotation... 2016-07-28 07:51:42 PM adding gene annotation... 2016-07-28 07:51:43 PM

Error in .testForValidKeys(x, keys, keytype, fks) : None of the keys entered are valid keys for 'ENSEMBL'. Please use the keys method to see a listing of valid arguments.

How to tag someone on biostars??

ADD REPLY
1
Entering edit mode

Try:

peaks <- readPeakFile("/home/swadha/Desktop/SFSU/c.elegans_peaks.bed", as = "GRanges")
peaksAnno <- annotatePeak(peaks, TxDb=txdb, annoDb="org.Ce.eg.db")

Might be useful to include a few lines of your bed file as well so we can take a look at that.

ADD REPLY
0
Entering edit mode

not to tag someone, but the package name.

If you tag your post with chipseeker, I can receive email notification.

Please post a few lines of your bed file as suggested by @Sinji.

ADD REPLY
2
Entering edit mode
8.3 years ago
Guangchuang Yu ★ 2.6k

I download a bed file from GEO and test it. Finally I figure out what's going on here.

This is due to the TxDb.Celegans.UCSC.ce6.ensGene object.

> txdb = TxDb.Celegans.UCSC.ce6.ensGene
> txdb
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: UCSC
# Genome: ce6
# Organism: Caenorhabditis elegans
# Taxonomy ID: 6239
# UCSC Table: ensGene
# Resource URL: http://genome.ucsc.edu/
# Type of Gene ID: Ensembl gene ID
# Full dataset: yes
# miRBase build ID: NA
# transcript_nrow: 35019
# exon_nrow: 146833
# cds_nrow: 127881
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2015-10-07 18:15:25 +0000 (Wed, 07 Oct 2015)
# GenomicFeatures version at creation time: 1.21.30
# RSQLite version at creation time: 1.0.0
# DBSCHEMAVERSION: 1.1
> txidinfo <- transcripts(txdb, columns = c("tx_id", "tx_name", "gene_id"))
> head(txidinfo)
GRanges object with 6 ranges and 3 metadata columns:
      seqnames         ranges strand |     tx_id     tx_name         gene_id
         <Rle>      <IRanges>  <Rle> | <integer> <character> <CharacterList>
  [1]     chrI [11495, 16793]      + |         1  Y74C9A.2.2        Y74C9A.2
  [2]     chrI [11499, 16790]      + |         2  Y74C9A.2.3        Y74C9A.2
  [3]     chrI [11499, 16831]      + |         3  Y74C9A.2.1        Y74C9A.2
  [4]     chrI [11505, 16790]      + |         4  Y74C9A.2.4        Y74C9A.2
  [5]     chrI [11618, 16804]      + |         5  Y74C9A.2.5        Y74C9A.2
  [6]     chrI [43733, 44677]      + |         6    Y74C9A.1        Y74C9A.1
  -------
  seqinfo: 7 sequences (1 circular) from ce6 genome
>

Although the ID type recorded in the txdb metadata is ensemble, Type of Gene ID: Ensembl gene ID (also indicated by the ensGene part of the object name), it actually doesn't have gene ID available in the object.

As indicated in the above code, gene_id is actually transcript name but not ensembl gene ID.

The last step which throw error:

>> adding gene annotation... 2016-07-28 07:51:43 PM

Error in .testForValidKeys(x, keys, keytype, fks) : None of the keys entered are valid keys for 'ENSEMBL'. Please use the keys method to see a listing of valid arguments.

is trying to convert ENSEMBL ID toEntrezgene ID and gene Symbol using annoDb="org.Ce.eg.db" which of course will be failed since the ID is not ensemble but tx_name.

This issue actually had been fixed and you are using out-dated version of ChIPseeker.

Please follow my post when you got an issue.

In the latest release version of ChIPseeker, when annotatePeak fail to convert gene ID, it will throw warning and skip this step.

ADD COMMENT
0
Entering edit mode
8.3 years ago
s.singh ▴ 70

Hi,

The bed files were generated from the tool macs2 using following command: macs2 callpeak -t HTZ.S36A.Ex39.Br.map.ce10.gz -n HTZ.S36A. -g ce --bdg --keep-dup=auto --broad --broad-cutoff=0.01 --nomodel)

Here is the link of the screenshot of my bed file.

Thanks for replying to my query so actively. I upgraded ChIPseeker (version:1.8.9), but the same error is occurring. How can I overcome this error??

ADD COMMENT
0
Entering edit mode

just use:

peaksAnno <- annotatePeak(peaks, TxDb=txdb)
ADD REPLY
0
Entering edit mode
8.3 years ago
s.singh ▴ 70

That worked perfectly fine!! Thanks a ton Sinji and Guangchuang :)

ADD COMMENT
2
Entering edit mode
> peakAnno <- annotatePeak(peak, tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Ce.eg.db")
>> preparing features information...         2016-08-02 15:20:33
>> identifying nearest features...       2016-08-02 15:20:34
>> calculating distance from peak to TSS...  2016-08-02 15:20:38
>> assigning genomic annotation...       2016-08-02 15:20:38
>> adding gene annotation...             2016-08-02 15:20:45
Loading required package: org.Ce.eg.db

>> assigning chromosome lengths          2016-08-02 15:20:45
>> done...                   2016-08-02 15:20:45
Warning message:
In getGeneAnno(annoDb, peak.gr$geneId, type) :
  ID type not matched, gene annotation will not be added...

In ChIPseeker (version >=1.9.5), it works fine with this case.

ADD REPLY
0
Entering edit mode

Gene annotation was skipped in this case. How would you solve this problem?

ADD REPLY

Login before adding your answer.

Traffic: 2217 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6