Detection of ncRNA in RNAseq transcript expression Data: lncRNA vs lincRNA
5.1 years ago
manninm • 0

Hello, I have been asked to identify and quantify the number of ncRNA transcripts in some datasets from my lab. FYI, the data is poly-A enriched RNA-seq (we aren't doing any analysis, just attempting to document what exists in the dataset). I found some very helpful hints here at C: Non-coding RNA detection, suggesting that I awk the third column for "ncRNA". I am using Homo_sapiens.GRCh38.86.gtf. Further inspection of my GTF file shows that lincRNA entries are the only strings returned if I grep "ncRNA", and that they appear in the biotype attribute rather than in the feature column. Would I be remiss in assuming that lincRNA and lncRNA are the same thing, and that searching for this string will be representative of ALL lncRNA? Any constructive criticism on this approach would be appreciated. Thanks!
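
To illustrate why matching on the third column finds nothing here, below is a minimal Python sketch that splits one GTF record into its nine tab-separated columns and parses the attribute pairs in the ninth. The record is synthetic (the gene ID and name are made up), and `parse_gtf_line` is a hypothetical helper, not part of any library. It assumes every attribute value is double-quoted, as in Ensembl GTF files.

```python
import re

# GTF has nine tab-separated columns; the ninth holds `key "value";` pairs.
GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attributes"]
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

def parse_gtf_line(line):
    """Split one GTF record into columns and parse its attribute pairs."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(GTF_COLUMNS, fields))
    # Assumes quoted attribute values; unquoted values are not handled.
    record["attributes"] = dict(ATTR_RE.findall(record["attributes"]))
    return record

# Synthetic record for illustration only (the gene ID and name are made up).
example = "\t".join([
    "1", "havana", "gene", "1000", "2000", ".", "+", ".",
    'gene_id "ENSG_EXAMPLE"; gene_name "EXAMPLE"; gene_biotype "lincRNA";',
])

rec = parse_gtf_line(example)
print(rec["feature"])                     # gene
print(rec["attributes"]["gene_biotype"])  # lincRNA
```

The feature column holds the record type (gene, transcript, exon, ...), while the biotype label sits inside the attributes column, so a filter on column 3 alone cannot select by biotype.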

RNA-Seq sequencing • 1.1k views

Further investigation into my GTF file has revealed that lincRNAs are not the only ncRNA subtype included in the gene_biotype column. Using rtracklayer for R, I imported my GTF file, converted it to a data frame, and filtered by type == "gene". I then extracted the gene_biotype column as a vector, grepped for 'RNA', and ran unique. The returned values are below.

 [1] "miRNA"                         "lincRNA"
 [3] "snRNA"                         "misc_RNA"
 [5] "snoRNA"                        "scaRNA"
 [7] "rRNA"                          "3prime_overlapping_ncRNA"
 [9] "bidirectional_promoter_lncRNA" "scRNA"
[11] "sRNA"                          "vaultRNA"
[13] "macro_lncRNA"                  "Mt_tRNA"
[15] "Mt_rRNA"
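
The steps described above (keep gene records, pull out gene_biotype, keep labels containing 'RNA') can also be sketched in plain Python, with per-label counts added. This is a rough equivalent under stated assumptions, not the rtracklayer code that was actually run: `count_rna_biotypes` and `_gene` are hypothetical helpers, and the demo records are synthetic.

```python
import re
from collections import Counter

BIOTYPE_RE = re.compile(r'gene_biotype "([^"]+)"')

def count_rna_biotypes(lines, pattern="RNA"):
    """Count gene records per gene_biotype whose label contains `pattern`."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):          # skip header/comment lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue                      # keep only gene-level records
        match = BIOTYPE_RE.search(fields[8])
        if match and pattern in match.group(1):
            counts[match.group(1)] += 1
    return counts

def _gene(gene_id, biotype, feature="gene"):
    """Build a synthetic GTF record (IDs are made up for illustration)."""
    attrs = f'gene_id "{gene_id}"; gene_biotype "{biotype}";'
    return "\t".join(["1", "havana", feature, "1", "100", ".", "+", ".", attrs])

demo = [
    "#!genome-build GRCh38",
    _gene("G1", "lincRNA"),
    _gene("G2", "lincRNA"),
    _gene("G3", "miRNA"),
    _gene("G4", "protein_coding"),
    _gene("G5", "antisense"),
    _gene("G1", "lincRNA", feature="transcript"),
]

rna_counts = count_rna_biotypes(demo)
print(sorted(rna_counts.items()))  # [('lincRNA', 2), ('miRNA', 1)]
```

On a real file the same function could be called as `count_rna_biotypes(open("Homo_sapiens.GRCh38.86.gtf"))`. Note that a substring match on 'RNA' skips any biotype label that does not contain those letters (the `antisense` record in the demo, for example), so it is worth listing all unique gene_biotype values before settling on a filter.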

If you truly need all non-coding RNA, then use all of these:

Abundant and functionally important types of non-coding RNAs include transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs), as well as small RNAs such as microRNAs, siRNAs, piRNAs, snoRNAs, snRNAs, exRNAs, scaRNAs and the long ncRNAs such as Xist and HOTAIR.

5.1 years ago
colin.kern ★ 1.1k

LincRNA means "long intergenic non-coding RNA", whereas lncRNA is just "long non-coding RNA". Many people use them interchangeably. The only scenario I can think of where there might be a lncRNA that's not a lincRNA would be a lncRNA that overlaps a coding gene on the antisense strand, but I have a feeling that in the GTF annotation it would still just be labeled as a lincRNA, if that's the term they're using. I think there may also be non-coding isoforms of coding genes, but I don't know how those are defined in the annotation. I don't think those are all that common, and the studies that found them may have just found transcriptional noise (polymerase run-on or something like that). Because your library is poly-A selected, I wouldn't expect you to be able to detect any types of ncRNA other than lncRNA, and there's still uncertainty about the extent to which lncRNAs are polyadenylated.
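
One way to keep lincRNA and the broader lncRNA class apart is to group biotype labels explicitly instead of relying on a single string. The Python sketch below is illustrative only. The small/structural set reuses the gene_biotype values listed earlier in this thread; the long non-coding set is an assumption based on how Ensembl/GENCODE releases of that era grouped biotypes (including labels such as `antisense` for genes on the opposite strand of a coding gene) and should be checked against the biotype documentation for the release in use. `classify_biotype` is a hypothetical helper.

```python
# Labels taken from the gene_biotype values listed earlier in this thread.
SMALL_OR_STRUCTURAL = {
    "miRNA", "snRNA", "misc_RNA", "snoRNA", "scaRNA", "rRNA",
    "scRNA", "sRNA", "vaultRNA", "Mt_tRNA", "Mt_rRNA",
}

# ASSUMPTION: labels treated as long non-coding in Ensembl/GENCODE releases
# of that period. Verify against the documentation for your own release.
LONG_NONCODING = {
    "lincRNA", "3prime_overlapping_ncRNA", "bidirectional_promoter_lncRNA",
    "macro_lncRNA", "antisense", "sense_intronic", "sense_overlapping",
    "processed_transcript",
}

def classify_biotype(biotype):
    """Map a gene_biotype label to a coarse ncRNA class."""
    if biotype in LONG_NONCODING:
        return "lncRNA"
    if biotype in SMALL_OR_STRUCTURAL:
        return "other ncRNA"
    return "unclassified"

for label in ["lincRNA", "antisense", "miRNA", "protein_coding"]:
    print(label, "->", classify_biotype(label))
```

Under this grouping lincRNA is one member of the lncRNA class rather than a synonym for it, so counting only lincRNA would likely undercount lncRNA genes if the other labels are present in the file.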


My thoughts exactly about not detecting other types of ncRNA in poly-A selected data, but the point is to determine which, if any, are present and to catalog them if they are. Thank you for your help!
