S. pombe non-coding regions annotation
7.7 years ago
Lila M ★ 1.3k

Hi guys,

I'm wondering if anyone here knows how to get the S. pombe annotation for the non-coding regions. I am using ChIPseeker for the annotation:

library(GenomicFeatures)  # makeTxDbFromGFF
library(ChIPseeker)       # readPeakFile, annotatePeak

txdb_pombe <- makeTxDbFromGFF("Schizosaccharomyces_pombe.ASM294v2.34.gff3.gz", format="gff3", organism = "Schizosaccharomyces pombe", taxonomyId = 4896)

# pattern is a regular expression, not a shell glob
fileList <- list.files(pattern = "\\.BED$")
for (i in fileList){
  print(i)
  peak <- readPeakFile(i, header=FALSE)
  print(peak)
  peakAnno <- annotatePeak(peak, tssRegion=c(-2000, 2000), TxDb = txdb_pombe)
  print(peakAnno)
  # convert the annotation object to a data frame before writing it out
  write.table(as.data.frame(peakAnno), paste("annot_total_", i, sep=""), sep="\t", col.names=TRUE, row.names=FALSE, quote=FALSE)
}

As my gff3 file only has the coding regions for S. pombe, I can't get the intron annotation. I'd like to know how I can get the whole annotation for S. pombe, including the non-coding regions.

Thanks!

ChIP-Seq annotation genome ChIPseeker non-coding • 2.4k views
7.7 years ago

Hi !

The reference annotation of S. pombe is on PomBase. Go to Downloads > Datasets > Genome datasets > Feature coordinates.

Note that there is one .gff3 file per chromosome, so you might need to merge them.
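A minimal sketch of that merge step in base R, in case it helps. The helper name `merge_gff3`, the directory name and the output file name are made up for illustration. It assumes the per-chromosome files are plain GFF3 without an embedded `##FASTA` section, since every line starting with `#` is dropped and a single version directive is written back.

```r
# Hypothetical helper: concatenate per-chromosome GFF3 files into one file.
merge_gff3 <- function(files, out) {
  # Read every input file as plain text lines
  lines <- unlist(lapply(files, readLines), use.names = FALSE)
  # Drop comment/directive lines (starting with "#") and empty lines
  body <- lines[!grepl("^#", lines) & nzchar(lines)]
  # Write one GFF3 version directive followed by all feature lines
  writeLines(c("##gff-version 3", body), out)
  invisible(out)
}

# Example call (directory and file names are assumptions):
# chr_files <- list.files("gff3_dir", pattern = "\\.gff3$", full.names = TRUE)
# merge_gff3(chr_files, "schizosaccharomyces_pombe.chr.gff3")
```

The merged file can then be passed to makeTxDbFromGFF in place of the original one.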

Introns are annotated as a type of "biological_region", so you'll need to look for lines like this one:

chrom   source  feature start   end a   strand  b   c
I   .   biological_region   180528  180585  1   +   .   assembly_name=ASM294v2;external_name=SPAC13G6.04.1:intron:1;logic_name=intron

Hi! I used the last file at that link. Are the non-coding regions not included in that file?

Thank you!


Introns, 5' and 3' UTRs, ncRNAs, promoters ... they are all annotated in that file. Just search for "intron" and you will see :)


Yes, I know, but when I use ChIPseeker it only reports promoter, exon ... and not intron, downstream, distal intergenic ... Any idea what's going on? :/


So the annotation is fine; the issue now is how ChIPseeker reads it. Sadly I can't really help here because I don't use this package. Perhaps you should ask a new question or edit the title of this one.

PS: a wild guess: maybe ChIPseeker can't parse "biological_region" features. But if you change this

chrom   source  feature start   end a   strand  b   c
I   .   biological_region   180528  180585  1   +   .   assembly_name=ASM294v2;external_name=SPAC13G6.04.1:intron:1;logic_name=intron

into this:

chrom   source  feature start   end a   strand  b   c
I   .   intron   180528  180585  1   +   .   assembly_name=ASM294v2;external_name=SPAC13G6.04.1:intron:1;logic_name=intron

it might work. Or not.
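If you want to try that guess without editing the file by hand, here is a rough sketch in base R. The helper name `retype_introns` and the file names are made up; it only touches lines whose third column is "biological_region" and whose attributes contain `logic_name=intron`, and leaves everything else as it is. Whether ChIPseeker actually needs this is not confirmed.

```r
# Hypothetical helper: rewrite "biological_region" intron lines so that the
# feature type (column 3) reads "intron".
retype_introns <- function(gff_in, gff_out) {
  lines <- readLines(gff_in)
  # Feature lines are the non-empty lines that do not start with "#"
  is_feature <- !grepl("^#", lines) & nzchar(lines)
  fields <- strsplit(lines[is_feature], "\t", fixed = TRUE)
  fixed <- vapply(fields, function(f) {
    # Only retype 9-column lines tagged logic_name=intron in the attributes
    if (length(f) >= 9 && f[3] == "biological_region" &&
        grepl("(^|;)logic_name=intron(;|$)", f[9])) {
      f[3] <- "intron"
    }
    paste(f, collapse = "\t")
  }, character(1))
  lines[is_feature] <- fixed
  writeLines(lines, gff_out)
  invisible(gff_out)
}

# Example call (file names are assumptions):
# retype_introns("schizosaccharomyces_pombe.chr.gff3", "pombe_introns_retyped.gff3")
```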

7.7 years ago
Guangchuang Yu ★ 2.6k
> require(GenomicFeatures)
> x <- makeTxDbFromGFF("schizosaccharomyces_pombe.chr.gff3")
> sapply(intronsByTranscript(x)[1:10], length)
 1  2  3  4  5  6  7  8  9 10 
 0  0  0  0  1  0  1  1  0  0 
> intronsByTranscript(x)[5]
GRangesList object of length 1:
$5 
GRanges object with 1 range and 0 metadata columns:
      seqnames         ranges strand
         <Rle>      <IRanges>  <Rle>
  [1]        I [18307, 18348]      +

Indeed, makeTxDbFromGFF can parse introns from this GFF file, and ChIPseeker uses this information to annotate your peaks.


Thank you, it works, but how can I do the annotation using the non-coding regions? I mean, the non-coding regions don't appear in my output the way they do for human, for example. I think I'm missing something...
