How To Get Non-Overlapping Sequences Of Introns, UTRs, And Flanking Regions For A Gene?
11.0 years ago
liux.bio ▴ 360

Hello, Biostars. I am using Bioconductor to get the sequences of introns, UTRs, and flanking regions for a set of genes, and I want these sequences to be gene-centered. I am using packages such as Biostrings and GenomicFeatures. I can get the sequences of UTRs, flanking regions, and introns for each transcript of a gene, but I don't know how to remove the overlapping (repeated?) sequences. Maybe this is a naive question. Any suggestions? Many thanks and Happy New Year!

bioconductor utr • 3.8k views

Presuming you have a GRangesList containing exons with transcript information split by gene, have you tried reduce()?
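
To make the suggestion concrete, here is a minimal sketch of what reduce() does to a GRangesList that is split by gene. The coordinates, the gene name geneA, and the transcript labels are made up for illustration; they are not from the original poster's data.

```r
library(GenomicRanges)

# Toy data (hypothetical coordinates): exons from two transcripts of one gene.
exons <- GRanges(
  seqnames = "chr1",
  ranges = IRanges(start = c(100, 300, 100, 350),
                   end   = c(200, 400, 250, 450)),
  strand = "+",
  tx = c("tx1", "tx1", "tx2", "tx2")
)

# One list element per gene (here a single hypothetical gene, "geneA").
by_gene <- GRangesList(geneA = exons)

# reduce() is applied to each list element separately and merges
# overlapping or adjacent ranges within that element.
union_exons <- reduce(by_gene)
union_exons[["geneA"]]
```

Note that reduce() drops the metadata columns (such as tx here), since a merged range can come from several transcripts.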


Yes, I tried. I read the help page but couldn't understand it. For a GRangesList split by transcript, it removes all overlaps, both between transcripts and within transcripts, right? I will read the help more carefully. Thanks!


Ah, you have things split by transcript. You might instead split things by gene and then use reduce() (this is easy if you import the GFF/GTF file directly into GenomicRanges; I don't use GenomicFeatures, so I can't say how things work there). If things are split by gene, reduce() will merge overlapping exons from different transcripts into a "union gene model" (i.e., what you would get if you collapsed all of the transcripts together), which sounds like what you want. That way, there are no repeated regions.
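
A small sketch of the difference between the two groupings, using made-up coordinates and hypothetical tx_id / gene_id columns (a real GTF import may name them differently). The last step, taking the uncovered part of the gene span as gene-level "introns", is my own assumption about what a gene-centered intron set could mean, not something stated above.

```r
library(GenomicRanges)

# Hypothetical exons: two transcripts (tx1, tx2) of one gene (geneA).
exons <- GRanges(
  seqnames = "chr1",
  ranges = IRanges(start = c(100, 300, 100, 350),
                   end   = c(200, 400, 250, 450)),
  strand = "+",
  tx_id = c("tx1", "tx1", "tx2", "tx2"),
  gene_id = rep("geneA", 4)
)

# Split by transcript: reduce() only merges within each transcript,
# so ranges shared between tx1 and tx2 are still present twice.
by_tx <- reduce(split(exons, exons$tx_id))
n_by_tx <- sum(lengths(by_tx))

# Split by gene: overlapping exons from all transcripts are merged
# into a union gene model.
by_gene <- reduce(split(exons, exons$gene_id))
n_by_gene <- sum(lengths(by_gene))

# Assumption: gene-level "introns" are the part of the gene span
# that is not covered by any exon of any transcript.
gene_exons <- by_gene[["geneA"]]
gene_introns <- setdiff(range(gene_exons), gene_exons)
```

The resulting non-overlapping ranges could then be passed to something like Biostrings::getSeq() with a genome object to pull the sequences, though that step depends on which genome package you have installed.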

Edit: I'll add that you can unlist() a GRangesList that's split by transcript and then split() it by gene.
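
The unlist()/split() regrouping might look roughly like this. The tx2gene lookup vector is a hypothetical helper for illustration; in practice the transcript-to-gene mapping would come from your annotation.

```r
library(GenomicRanges)

# Hypothetical exons grouped by transcript.
by_tx <- GRangesList(
  tx1 = GRanges("chr1", IRanges(c(100, 300), c(200, 400)), strand = "+"),
  tx2 = GRanges("chr1", IRanges(c(100, 350), c(250, 450)), strand = "+"),
  tx3 = GRanges("chr1", IRanges(1000, 1200), strand = "-")
)

# Hypothetical transcript-to-gene lookup (not part of any package).
tx2gene <- c(tx1 = "geneA", tx2 = "geneA", tx3 = "geneB")

# Flatten to a single GRanges, remembering which transcript each exon came from.
flat <- unlist(by_tx, use.names = FALSE)
tx_of_exon <- rep(names(by_tx), lengths(by_tx))

# Regroup by gene, then merge overlapping exons within each gene.
by_gene <- split(flat, unname(tx2gene[tx_of_exon]))
union_model <- reduce(by_gene)
```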
