Hello
We have exosome sequencing data (from plasma). In the raw read counts file, I see 72650 gene names.
This is what my read count file looks like.
I have created a percentage bar chart of the RNA categories annotated in this exosome-seq data.
Which category (RNA type) should I consider as long non-coding RNA (lncRNA)?
Can I consider the observed 24% long intergenic non-coding RNA (lincRNA) (sense + antisense) as long non-coding RNA (lncRNA)?
But as I have read, "Generally speaking we don’t expect much lncRNA/mRNA in plasma, and much of that will be heavily fragmented, which makes it very difficult to sequence." So how am I seeing 24% lincRNAs?
If this were your data, which types of RNA here would you consider as long non-coding RNA (lncRNA)?
Thanks for any intuition
Whatever is annotated as lncRNA at RNAcentral. Download the lncRNAs for your organism and cross-check the identifiers in your list.
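A minimal sketch of that cross-check, assuming you already have the identifiers from the count table and the downloaded lncRNA identifiers as two lists in the same naming scheme. The function name `split_by_lncrna` and the identifiers are made up for illustration.

```python
def split_by_lncrna(count_ids, lncrna_ids):
    """Split count-table identifiers into lncRNA hits and everything else."""
    lncrna_set = set(lncrna_ids)  # set lookup stays fast for ~70k identifiers
    hits = [gene for gene in count_ids if gene in lncrna_set]
    others = [gene for gene in count_ids if gene not in lncrna_set]
    return hits, others


# Hypothetical identifiers, for illustration only.
count_ids = ["GENE_A", "GENE_B-AS1", "GENE_C"]
lncrna_ids = ["GENE_B-AS1", "GENE_D"]

hits, others = split_by_lncrna(count_ids, lncrna_ids)
print(f"{len(hits)} of {len(count_ids)} identifiers are annotated as lncRNA")
```

This only works if both lists use the same kind of identifier (for example gene symbols on both sides); otherwise the identifiers need to be mapped first.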
Thank you so much
I see this
In the Rfam part I could not find any lncRNAs.
Am I searching in the right place?
If you use the search link I provided above you can narrow down the genome you are interested in and then use the Download button at top right to download the filtered data, e.g. these are lncRNA for mouse.
Yes, thank you so much
Now I am following you
I downloaded the results from my search
The file looks like this. For instance, for URS00000E9EFC_9606, in my raw read counts I will see MUC5B.
Is there any way to download the results with the identifiers used in my raw read counts, or should I convert what I have downloaded to my identifiers?
Better than that, get this file that maps GENCODE IDs.
AS1 in gene names stands for antisense RNA.
Leaving this as a future reference: there is an id_mapping file in this directory that should help.
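A rough sketch of how such a mapping file could be used, assuming it is tab-separated with six columns in this order: RNAcentral ID, database, external ID, taxon ID, RNA type, gene name. That column layout, the `lncRNA` type label, and the function name `load_lncrna_gene_names` are assumptions to verify against the actual file; the rows below are invented.

```python
import csv
import io


def load_lncrna_gene_names(handle, taxid="9606"):
    """Return {gene_name: 'URS_taxid'} for rows typed as lncRNA in one species."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 6:
            continue  # skip malformed or truncated lines
        urs, _database, _external_id, row_taxid, rna_type, gene = row[:6]
        if row_taxid == taxid and rna_type == "lncRNA" and gene:
            mapping[gene] = f"{urs}_{row_taxid}"
    return mapping


# Invented rows in the assumed layout, for illustration only.
example = io.StringIO(
    "URS_EXAMPLE_1\tGENCODE\tENST_EXAMPLE_1\t9606\tlncRNA\tGENE_B-AS1\n"
    "URS_EXAMPLE_2\tGENCODE\tENST_EXAMPLE_2\t9606\tmiRNA\tGENE_M\n"
    "URS_EXAMPLE_3\tGENCODE\tENST_EXAMPLE_3\t10090\tlncRNA\tGene_x\n"
)
lncrna_by_gene = load_lncrna_gene_names(example)
print(lncrna_by_gene)
```

With a real file you would pass `open(path, newline="")` instead of the `StringIO` object, then look up each gene name from the count table in the resulting dictionary.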
Thank you so much GenoMax
My boss has asked me whether there is any RNA in my exosome-seq data.
I suppose he means mRNA.
If so, how do I know which of the 72650 identifiers in my raw read counts are mRNA?
If you are analyzing mRNA sequence data then you should only see mRNAs. I don't understand the question since your raw counts are for genes for which there was annotation.
Note: Kit used for the prep may be relevant to check on.
Thank you so much GenoMax, you actually resolved my question thoroughly
I have a question which I have not been able to solve yet.
If I am only interested in lncRNAs related to the Angiogenesis GO term, how do I download such a list, please? I searched for Angiogenesis in the search bar and some results came up; for instance, I see 123 lncRNAs there and I want to download them.
Sorry GenoMax, I have another question, please.
How can I add a GO annotation filter to my downloads?
I mean downloading only ncRNAs that have a GO annotation.
Another quick way is to check the length of the RNA. lncRNAs are generally > 200 nt in length. This article may help you to identify and filter lncRNA from your dataset.
Filtering by length is really confusing: when I subtracted start from stop, more than 80% of my data came out longer than 200 nt, while my exosomes are from plasma and I don't expect too many lncRNAs.
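One possible reason for the high percentage, if start and stop are genomic coordinates: stop minus start gives the genomic span, which includes introns, not the length of the RNA itself. Most mRNAs are also longer than 200 nt, so length alone cannot separate lncRNA from mRNA. A small sketch of the difference, assuming exon coordinates are 1-based and inclusive (as in GTF); the function names and coordinates are made up.

```python
def genomic_span(exons):
    """Length from first exon start to last exon end, introns included."""
    start = min(s for s, _ in exons)
    end = max(e for _, e in exons)
    return end - start + 1


def spliced_length(exons):
    """Sum of exon lengths, i.e. the length of the processed transcript."""
    return sum(e - s + 1 for s, e in exons)


def passes_length_cutoff(exons, min_len=200):
    """Apply the > 200 nt rule of thumb to the spliced length."""
    return spliced_length(exons) > min_len


# Hypothetical two-exon transcript with a long intron.
exons = [(1000, 1080), (5000, 5090)]
print(genomic_span(exons), spliced_length(exons), passes_length_cutoff(exons))
```

Here the genomic span is well over 200 nt while the spliced transcript is not, so the same record passes or fails the cutoff depending on which length is used.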