Question

Annotating BED files of crosslink-induced truncation sites from an iCLIP experiment (chromosome locations)

7.0 years ago

CrisMar ▴ 80

Hello, I have a BED file listing the chromosome positions of CITS (crosslink-induced truncation sites), so each entry spans a single nucleotide, as shown below. These sites come from an iCLIP experiment to identify the binding sites of a specific RNA-binding protein.

$ head CITS.bed

chr1    568974  568975  CITS_1[gene=chr1_f_c24][PH=12][PH0=0.29][P=1.01e-12]          12    +
chr1    2239149 2239150 CITS_2[gene=chr1_f_c1136][PH=7][PH0=0.40][P=2.21e-04]   7   +
chr1    2239899 2239900 CITS_3[gene=chr1_f_c1138][PH=6][PH0=0.21][P=3.56e-04]   6   +
chr1    2461199 2461200 CITS_4[gene=chr1_f_c1237][PH=5][PH0=0.17][P=1.46e-04]   5   +
chr1    6346493 6346494 CITS_5[gene=chr1_f_c1541][PH=18][PH0=1.19][P=3.68e-13]  18  +
chr1    8409692 8409693 CITS_6[gene=chr1_f_c2222][PH=6][PH0=0.21][P=1.45e-05]   6   +

I want to add a few more columns that annotate each nucleotide, e.g. transcript name, transcript type, and feature (exon, 3'UTR, 5'UTR).

I've tried HOMER annotatePeaks.pl, but this yields annotations near the TSS, which is not what I need (this is not ChIP-seq data).

I've also tried bedtools intersect using the GTF file for my genome, but none of the options seem to work: the output files look just like the BED file above.

$ bedtools intersect -a sample.bed -b annotations.gtfconverted2.bed > results.bed

BEDOPS worked best but missed a lot of annotations.

$ bedmap --echo --echo-map --delim '\t' sample.fw.bed annotations.gtfconverted2.fwd.bed > answer.fw.bed

I processed the reverse (rv) strand the same way and then merged the two outputs with:

$ bedops --everything answer.fw.bed answer.rv.bed > answer.bed

Any suggestions are appreciated!

bedtools intersect iCLIP CITS annotate • 2.1k views

score 7 · Accepted Answer · 2018-07-26

7.0 years ago

Pierre Lindenbaum 166k

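$ bedtools intersect -a sample.bed -b annotations.gtfconverted2.bed > results.bed

You're missing some options for bedtools intersect. By default it reports only the portion of each A feature that overlaps B, which for single-nucleotide sites is the A line itself. These options add the annotation: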

Options: 
    -wa Write the original entry in A for each overlap.

    -wb Write the original entry in B for each overlap.
        - Useful for knowing _what_ A overlaps. Restricted by -f and -r.
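
A minimal sketch of combining the two options on toy data. The file names, coordinates, and transcript labels below are made up for illustration and are not taken from the question. The -s flag (same-strand overlaps only) is an addition that was not part of the original suggestion; it assumes both files are BED6 with a strand column, and would stand in for splitting the files by strand.

```shell
# Toy inputs (hypothetical names and coordinates, for illustration only).
workdir=$(mktemp -d)
printf 'chr1\t568974\t568975\tCITS_1\t12\t+\n'          > "$workdir/cits.bed"
printf 'chr1\t568900\t569100\tENST_demo|exon\t0\t+\n'   > "$workdir/annot.bed"
printf 'chr1\t568900\t569100\tENST_rev|exon\t0\t-\n'   >> "$workdir/annot.bed"

# -wa writes the original CITS line, -wb appends the overlapping annotation line,
# -s keeps only overlaps where both features are on the same strand.
if command -v bedtools >/dev/null 2>&1; then
  bedtools intersect -wa -wb -s \
    -a "$workdir/cits.bed" -b "$workdir/annot.bed" > "$workdir/annotated.bed"
  cat "$workdir/annotated.bed"
else
  echo "bedtools not found on PATH" >&2
fi
```

Each output line holds the six CITS columns followed by the six columns of the overlapping annotation, so a site that overlaps several annotations appears once per overlap. Sites with no overlap are dropped; if they should be kept, -loj (left outer join) reports them with a NULL B feature instead.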

Ok that makes more sense now! I tried using those options but not together. Thanks! The file has everything I need now!

7.0 years ago by CrisMar ▴ 80

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

7.0 years ago by Pierre Lindenbaum 166k