Question

Any advice on how to do isoform quantification?

10.0 years ago

cyril-cros ▴ 950

Hi, I have performed a de novo annotation of some genes of interest using https://github.com/shenkers/isoscm. It gives me a few different isoforms for each of about a dozen genes that are of particular interest to me, and I end up with a GTF file of my various transcripts.

Could you give me your advice on how to quantify their usage in a given alignment? I have already done some Northern blots and RT-qPCR to identify isoforms and estimate their relative abundance for a few specific genes.

What I would like to get now, for each of my genes, is a relative abundance along these lines: within a given gene, transcript x represents y% of all isoforms, plus or minus z%.
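One way to read "y% plus or minus z%" is as the mean and spread of per-replicate isoform fractions. The sketch below assumes you already have a per-transcript abundance estimate (for example TPM) for each replicate separately, plus a transcript-to-gene mapping taken from the GTF. The function names and the example identifiers are hypothetical, and using the standard deviation across replicates as z is only one possible choice.

```python
from collections import defaultdict
from statistics import mean, stdev


def isoform_fractions(abundances, tx2gene):
    """Return {transcript: share of its gene's total abundance} for one sample."""
    gene_totals = defaultdict(float)
    for tx, value in abundances.items():
        gene_totals[tx2gene[tx]] += value
    fractions = {}
    for tx, value in abundances.items():
        total = gene_totals[tx2gene[tx]]
        # A gene with no signal at all gets 0.0 for every isoform.
        fractions[tx] = value / total if total > 0 else 0.0
    return fractions


def summarize(replicates, tx2gene):
    """Mean and sample standard deviation of each isoform fraction across replicates."""
    per_replicate = [isoform_fractions(rep, tx2gene) for rep in replicates]
    summary = {}
    for tx in tx2gene:
        values = [fractions.get(tx, 0.0) for fractions in per_replicate]
        # The spread is undefined with a single replicate.
        spread = stdev(values) if len(values) > 1 else float("nan")
        summary[tx] = (mean(values), spread)
    return summary


# Made-up numbers, purely for illustration.
tx2gene = {"geneA.long_utr": "geneA", "geneA.short_utr": "geneA"}
replicates = [
    {"geneA.long_utr": 30.0, "geneA.short_utr": 70.0},
    {"geneA.long_utr": 40.0, "geneA.short_utr": 60.0},
    {"geneA.long_utr": 50.0, "geneA.short_utr": 50.0},
]
for tx, (y, z) in summarize(replicates, tx2gene).items():
    print(f"{tx}: {100 * y:.1f}% +/- {100 * z:.1f}%")
```

With only three replicates the spread is a rough indication at best, and it says nothing about the quantifier's own uncertainty for a single sample.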

I have noticed this listing of quantification software (http://omictools.com/quantification-c354-p1.html), and it is a bit overwhelming, to say the least.

I won't use the reference annotation (GTF) or gene regulation (GFF) files, because I work on olfactory genes, which are not well annotated. I know Cufflinks is a possible solution, but I want to use my own transcripts (which I checked using IGV).

It seems to me that Salmon might be the easiest solution...

PS: as a last remark, I have masked most of my alignment file since very few genes interest me. Computational requirements won't be an issue here.

RNA-Seq isoform transcriptome • 3.4k views

ADD COMMENT • link updated 2.4 years ago by Ram 45k • written 10.0 years ago by cyril-cros ▴ 950

Note that Cufflinks allows you to supply your own custom GTF when run in ref-only or ref-guided mode.

ADD REPLY • link updated 5.6 years ago by Ram 45k • written 10.0 years ago by ethan.kaufman ▴ 380

Yes, it does. However, one of my issues is that I mainly want to study the UTRs [Logic: there are virtually no introns in the CDS of olfactory genes]. Cufflinks sometimes does weird things with the 3'UTR, and I am not sure its quantification method is very good for isoforms...

ADD REPLY • link 10.0 years ago by cyril-cros ▴ 950

I have had a similar problem quantifying 3'UTR isoforms. My solution was the following:

1. poly-d(T) RT cDNA

2. Size select on 8% PAGE for 75-125 bp

3. Stranded library prep kit

4. Paired-end reads, 2x50

5. Filter for reads that have a seed mapping to AAAAAAAA and any 15mer mapping to the 3'UTR region of my gene of interest (zero allowed mismatches to AAAAAAAA and 1 allowed mismatch to the 15mer)

The resulting file gave all candidate reads.
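The filtering step could be sketched roughly as follows. This is not the poster's actual script: the function names are hypothetical, reads are assumed to be plain sequence strings already oriented in the sense of the mRNA (reverse-complement them first if your library reads the other strand), and "seed mapping" is interpreted here as an exact occurrence of AAAAAAAA anywhere in the read.

```python
def within_mismatches(a, b, max_mismatches):
    """True if equal-length strings a and b differ at no more than max_mismatches positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


def has_polya_seed(read, seed="AAAAAAAA"):
    """Exact match only: zero mismatches are allowed in the poly(A) seed."""
    return seed in read


def has_utr_kmer(read, utr, k=15, max_mismatches=1):
    """True if any k-mer of the read matches some k-mer of the UTR within max_mismatches."""
    utr_kmers = [utr[i:i + k] for i in range(len(utr) - k + 1)]
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if any(within_mismatches(kmer, u, max_mismatches) for u in utr_kmers):
            return True
    return False


def is_candidate(read, utr):
    """A read is kept if it carries both the poly(A) seed and a UTR-derived 15mer."""
    return has_polya_seed(read) and has_utr_kmer(read, utr)


def filter_reads(reads, utr):
    """Return the subset of reads that pass both criteria."""
    return [read for read in reads if is_candidate(read, utr)]
```

The brute-force comparison is quadratic in sequence length, which is acceptable for a handful of UTRs but would need an index for a genome-wide search. A-rich stretches inside the UTR itself can also produce false positives, so the candidates still need manual inspection.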

ADD REPLY • link updated 5.6 years ago by Ram 45k • written 10.0 years ago by justin.gardin ▴ 10

I have been using published RNA-Seq data as a first approach, so I can't easily switch to 3'UTR-specific sequencing methods. My reads are 75 bp long, paired-end, unstranded, and of good quality. I am working on mouse olfactory epithelia, merging 3 biological replicates for each sex. I end up with 140M reads per sex, of which ~120M are uniquely aligned (using STAR). The unstranded protocol is a bit annoying...

Thanks for your suggestion though.

ADD REPLY • link updated 5.6 years ago by Ram 45k • written 10.0 years ago by cyril-cros ▴ 950