Question

S. pombe non-coding regions annotation

1

8.3 years ago

Lila M ★ 1.3k

Hi guys,

I'm wondering if someone here knows how to get the S. pombe annotation for the non-coding regions. I am using ChIPseeker for the annotation:

library(GenomicFeatures)  # provides makeTxDbFromGFF
library(ChIPseeker)       # provides readPeakFile and annotatePeak

txdb_pombe <- makeTxDbFromGFF("Schizosaccharomyces_pombe.ASM294v2.34.gff3.gz", format="gff3", organism = "Schizosaccharomyces pombe", taxonomyId = "4896")

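# pattern is a regular expression, so match a literal ".BED" suffix
fileList <- list.files(pattern = "\\.BED$")
for (i in fileList){
  print(i)
  peak <- readPeakFile(i, header=FALSE)
  print(peak)
  peakAnno <- annotatePeak(peak, tssRegion=c(-2000, 2000), TxDb = txdb_pombe)
  print(peakAnno)
  # convert the annotation object to a data frame before writing it out
  write.table(as.data.frame(peakAnno), paste("annot_total_", i, sep=""), sep="\t", col.names=TRUE, row.names=FALSE, quote=FALSE)
}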

As my gff3 file only has the coding regions for S. pombe, I can't get the intron annotation. I'd like to know if anyone knows how I can get the whole annotation for S. pombe, including the non-coding regions.

Thanks!

ChIP-Seq annotation genome ChIPseeker non-coding • 2.7k views

ADD COMMENT • link updated 8.3 years ago by Guangchuang Yu ★ 2.6k • written 8.3 years ago by Lila M ★ 1.3k

score 2 · Answer 1 · 2017-03-08

2

8.3 years ago

Carlo Yague 9.0k

Hi !

The reference annotation of S. pombe is on PomBase. Go to downloads > datasets > genome-datasets > feature coordinates.

Note that there is one .gff3 file per chromosome, so you might need to merge them.
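For the merging step, here is a minimal sketch in base R. It assumes the per-chromosome files are uncompressed GFF3, that directive and comment lines start with `#`, and that the files carry no embedded FASTA section. The function name `merge_gff3` is made up for illustration.

```r
# Merge several GFF3 files into one (hypothetical helper, base R only).
# Assumptions: uncompressed GFF3 input, directive/comment lines start with "#",
# and no embedded FASTA section in any file.
merge_gff3 <- function(files, out) {
  merged <- "##gff-version 3"              # write the version directive once
  for (f in files) {
    lines <- readLines(f)
    # keep feature lines only; drop per-file directives, comments and blanks
    keep <- !startsWith(lines, "#") & nzchar(lines)
    merged <- c(merged, lines[keep])
  }
  writeLines(merged, out)
  invisible(out)
}
```

You would call it as something like `merge_gff3(c("chromosome1.gff3", "chromosome2.gff3", "chromosome3.gff3"), "pombe_all.gff3")`, where the file names are placeholders for whatever you downloaded.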

Introns are a type of "biological_region", so you'll need to look for lines like this:

seqid   source  type    start   end score   strand  phase   attributes
I   .   biological_region   180528  180585  1   +   .   assembly_name=ASM294v2;external_name=SPAC13G6.04.1:intron:1;logic_name=intron

ADD COMMENT • link 8.3 years ago by Carlo Yague 9.0k

0

Hi! I've used the last file from that link. Are the non-coding regions not included in that file?

Thank you!

ADD REPLY • link 8.3 years ago by Lila M ★ 1.3k

2

Introns, 5' and 3' UTRs, ncRNAs, promoters... they are all annotated in that file. Just search for "intron" and you will see :)

ADD REPLY • link 8.3 years ago by Carlo Yague 9.0k

0

Yes, I know, but when I use ChIPseeker, it only reports promoter, exon... and not introns, downstream, distal intergenic... Any idea what's going on? :/

ADD REPLY • link 8.3 years ago by Lila M ★ 1.3k

1

So the annotation is fine, the issue now is in how ChIPseeker reads it. Sadly I can't really help here because I don't use this package. Perhaps you should ask a new question or edit the title of this one.

PS: a wild guess: maybe ChIPseeker can't parse "biological_region" features. But if you change this

seqid   source  type    start   end score   strand  phase   attributes
I   .   biological_region   180528  180585  1   +   .   assembly_name=ASM294v2;external_name=SPAC13G6.04.1:intron:1;logic_name=intron

into this :

seqid   source  type    start   end score   strand  phase   attributes
I   .   intron   180528  180585  1   +   .   assembly_name=ASM294v2;external_name=SPAC13G6.04.1:intron:1;logic_name=intron

it might work. Or not.
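If you want to try that relabelling on a whole file rather than by hand, here is a rough sketch in base R. It assumes the real file is tab-separated (as GFF3 should be) and that intron rows are the `biological_region` features whose attributes contain `logic_name=intron`, as in the excerpt above. The function name `relabel_introns` is made up, and whether the change affects ChIPseeker's output is not verified here.

```r
# Relabel intron rows in a GFF3 file (hypothetical helper, base R only).
# Assumption: introns are "biological_region" features whose attribute
# column contains "logic_name=intron"; columns are tab-separated.
relabel_introns <- function(infile, outfile) {
  lines <- readLines(infile)
  # rows to change: not a comment, and tagged as intron in the attributes
  hit <- !startsWith(lines, "#") &
    grepl("logic_name=intron(;|$)", lines)
  # replace the third tab-separated field only when it is "biological_region"
  lines[hit] <- sub("^([^\t]*\t[^\t]*\t)biological_region\t",
                    "\\1intron\t", lines[hit])
  writeLines(lines, outfile)
  invisible(outfile)
}
```

All other lines, including other kinds of `biological_region`, are written back unchanged, so the original file can stay untouched while you test with the copy.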

ADD REPLY • link 8.3 years ago by Carlo Yague 9.0k

score 2 · Answer 2 · 2017-03-08

2

8.3 years ago

Guangchuang Yu ★ 2.6k

> require(GenomicFeatures)
> x <- makeTxDbFromGFF("schizosaccharomyces_pombe.chr.gff3")
> sapply(intronsByTranscript(x)[1:10], length)
 1  2  3  4  5  6  7  8  9 10 
 0  0  0  0  1  0  1  1  0  0 
> intronsByTranscript(x)[5]
GRangesList object of length 1:
$5 
GRanges object with 1 range and 0 metadata columns:
      seqnames         ranges strand
         <Rle>      <IRanges>  <Rle>
  [1]        I [18307, 18348]      +

Indeed, makeTxDbFromGFF can parse introns from this GFF file, and ChIPseeker uses this information to annotate your peaks.
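To check which feature classes actually end up in the annotated output, one option is to tabulate the annotation strings. This is only a sketch: it assumes you pass in the `annotation` column of `as.data.frame(peakAnno)` and that any detail follows the class name in parentheses. The helper name `annotation_classes` and the example strings are made up for illustration.

```r
# Count ChIPseeker annotation classes (hypothetical helper, base R only).
# Assumption: `annotation` is the "annotation" column of
# as.data.frame(peakAnno), with details in parentheses after the class name.
annotation_classes <- function(annotation) {
  # strip the trailing " (...)" detail so only the class name remains
  table(sub(" \\(.*$", "", annotation))
}

# Made-up example strings in the style of that column
example <- c("Promoter (<=1kb)", "Promoter (1-2kb)",
             "Intron (SPAC13G6.04.1, intron 1 of 2)", "Distal Intergenic")
print(annotation_classes(example))
```

If "Intron" never appears in the table for your real peaks, it may simply be that no peak overlaps an intron, so it is worth checking this before changing the annotation file.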

ADD COMMENT • link 8.3 years ago by Guangchuang Yu ★ 2.6k

1

Thank you, it works, but how can I do the annotation using the non-coding regions? I mean, the non-coding regions don't appear in my output as they do, for example, for human. I think I'm missing something...

ADD REPLY • link 8.3 years ago by Lila M ★ 1.3k