Question

Why do I get DE non-polyadenylated genes in RNA-seq with a polyA library?

16 months ago

ev97 ▴ 40

I have recently received some RNA-seq data (polyA library prep) and I am wondering whether I can trust the expression of genes that are not described as polyadenylated. I have come across some genes that are not polyadenylated yet show expression, and some of them are even significantly differentially expressed between control and condition.

*FYI: I have checked some long non-coding RNAs and their sequences, verifying that they don't have a polyA tail. I have also checked their expression (they do have counts).

I know that ribodepletion is recommended if you are interested in lncRNAs. But having used a polyA library for RNA-seq, I was wondering how it is possible to get counts/expression for genes that are not polyadenylated and so should not be enriched (or should not appear at all?). *I used STAR for the alignment (plus --quantMode) and did not specify any special treatment for lncRNAs.

Can I trust the expression of those genes, and the differentially expressed genes that I get from DESeq2?

Any feedback, opinions or even papers that address this would be really appreciated.

Thanks very much in advance!

lncRNA PoliA DGE RNAseq DESeq2 • 1.2k views

ADD COMMENT • link updated 15 months ago by i.sudbery 21k • written 16 months ago by ev97 ▴ 40

long non-coding RNAs and their sequences, verifying that they don't have a polyA tail

In addition to what @ATpoint says about polyA selection being a statistical enrichment rather than an absolute filter, I'm not sure you can tell whether a lncRNA is polyadenylated by looking at its sequence. Many lncRNAs are polyadenylated. Even those that generally aren't, such as NEAT1, have minor isoforms that are polyadenylated. Further, some gene families that are well known to use alternative termination pathways, such as replication-dependent histones, will use cleavage and polyadenylation for transcription events that escape normal termination. The sequence requirements to get at least some cleavage and polyadenylation are pretty minimal, and it's fairly likely that any transcribing polymerase will hit such a sequence sooner or later if it manages to escape, say, hairpin termination.
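
To illustrate how minimal those sequence requirements are, here is a rough Python sketch that scans a sequence for the canonical polyadenylation signal hexamer (AAUAAA) and its most common variant (AUUAAA). The function name and the example sequence are made up for illustration. A hit only means the motif is present, not that the site is actually used, and real cleavage sites depend on more context than the hexamer alone.

```python
import re

# Canonical polyadenylation signal hexamer and its most common variant,
# written in the DNA alphabet (AAUAAA / AUUAAA in RNA).
PAS_HEXAMERS = ("AATAAA", "ATTAAA")


def find_pas_hexamers(seq, hexamers=PAS_HEXAMERS):
    """Return sorted (0-based start, hexamer) pairs for every match.

    Overlapping matches are reported as well.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    hits = []
    for hexamer in hexamers:
        # A zero-width lookahead lets finditer report overlapping matches.
        for match in re.finditer(f"(?={hexamer})", seq):
            hits.append((match.start(), hexamer))
    return sorted(hits)


# Made-up sequence, for illustration only.
example = "GGCAATAAAGCTTGACCATTAAAGG"
print(find_pas_hexamers(example))
```

Running something like this over any long transcript will usually return several hits, which is the point: the motif is short enough to occur frequently.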

ADD REPLY • link 15 months ago by i.sudbery 21k

It's enrichment, not perfect selection. Stretches of polyT can attract polyA binding. Beyond that, how did you check that these genes are not polyadenylated?
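
To make the "enrichment, not perfect selection" point concrete, here is a toy calculation in Python. All numbers (copy numbers and capture efficiencies) are hypothetical and chosen only for illustration; they are not measurements from any protocol. It shows that a small carry-over of non-polyadenylated transcripts still yields molecules to sequence, and that if the carry-over fraction is the same in both conditions, the fold change between them is unchanged.

```python
def captured(copies, capture_efficiency):
    """Expected number of molecules retained after the selection step."""
    return copies * capture_efficiency


# Hypothetical efficiencies, for illustration only.
POLYA_EFFICIENCY = 0.90  # assumed retention of polyadenylated transcripts
LEAK_EFFICIENCY = 0.02   # assumed carry-over of non-polyadenylated transcripts

# A hypothetical non-polyadenylated transcript that is 4x higher in treatment.
control = captured(10_000, LEAK_EFFICIENCY)
treated = captured(40_000, LEAK_EFFICIENCY)

# A polyadenylated transcript at the same starting level, for comparison.
polya_control = captured(10_000, POLYA_EFFICIENCY)

print("non-polyA molecules retained (control, treated):", control, treated)
print("fold change after selection:", treated / control)
print("polyA molecules retained (control):", polya_control)
```

In practice the carry-over fraction may differ between samples, so treat this only as a sketch of why counts and differential expression can show up at all.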

ADD REPLY • link 16 months ago by ATpoint 88k

Thanks very much for your reply!

"Stretches of polyT": did you mean a particular step of the polyA enrichment? (I tried to search for it, since I didn't understand what you meant, but I didn't find anything in particular.)

Re checking the genes: I looked at the sequences of some particular transcripts on Ensembl (cDNA section) and NCBI (section: NCBI Reference Sequences (RefSeq), RNA sequence). Depending on whether they had a polyA tail (or at least several As at the end), I assumed they were polyadenylated or not.

ADD REPLY • link 16 months ago by ev97 ▴ 40

Polyadenylation is a post-transcriptional modification; it's not encoded in the DNA and therefore not annotated in these files. You cannot select like that. By "stretches" I mean regions in the transcript that are rich in Ts.
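
If you want to look for such regions yourself, a sliding-window scan is one simple way to do it. The Python sketch below is only an illustration: the function name, window size and threshold are arbitrary assumptions, not values from any protocol. The base to scan for is a parameter, because it depends on which strand your sequence represents: oligo(dT) hybridizes to A-rich stretches in the RNA itself, which show up as T-rich stretches on the complementary strand.

```python
def rich_windows(seq, base="A", window=20, min_fraction=0.8):
    """Return merged (start, end) intervals (0-based, half-open) in which
    `base` makes up at least `min_fraction` of a window of length `window`.
    """
    seq = seq.upper().replace("U", "T")    # accept RNA or DNA input
    base = base.upper().replace("U", "T")
    if len(seq) < window:
        return []

    regions = []
    count = seq[:window].count(base)       # occurrences in the first window
    for start in range(len(seq) - window + 1):
        if start > 0:
            # Slide by one: add the base entering, drop the base leaving.
            count += seq[start + window - 1] == base
            count -= seq[start - 1] == base
        if count >= min_fraction * window:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# Made-up sequence: 20 As flanked by GC repeats.
example = "GC" * 10 + "A" * 20 + "GC" * 10
print(rich_windows(example))
```

A hit here only flags a candidate region; it does not show that the transcript was actually captured through it.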

ADD REPLY • link 16 months ago by ATpoint 88k

I am sorry for my ignorance. I didn't check the sequences of known polyadenylated genes (they don't have the polyA tail in those files either); had I done so, I would have seen that my approach was not valid.

Do you know if there is a way to check which genes/transcripts are polyadenylated and which are not? Maybe an up-to-date and reliable database that I could use? Thanks again for your help.

ADD REPLY • link 15 months ago by ev97 ▴ 40