I've performed exploratory RNA sequencing on the extracellular RNA from a single organoid cell culture. The goal of this experiment was to see whether we could detect biologically relevant gene fragments in the organoid culture media, and whether the RNA-seq profile could be distinguished from that of the control samples (culture media without organoids). Positive cellular controls were included.
- Lib prep kit: NEB Single Cell/Low Input RNA Library Prep Kit for Illumina
- Seq kit: NextSeq 2x100 bp high output
- Data processing: Cutadapt, STAR aligner, and featureCounts
- DE and data exploration: DESeq2
In the culture media samples, a large share of the reads map uniquely but do not overlap any annotated feature:
Gene Counts: https://ibb.co/d615Zsq
Alignment Counts: https://ibb.co/thnP2S8
My question is: are there tools available to further classify or investigate the "No feature" reads? Maybe they are all rRNA fragments or unannotated non-coding RNA sequences. Any advice is appreciated!
P.S. the gene fraction indeed contains biologically relevant sequences for the cell type, which is encouraging!
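To put a number on the "No feature" share per sample, a minimal Python sketch along these lines could help. It assumes the standard tab-separated featureCounts `.summary` layout (a `Status` column followed by one column per BAM file); the sample names and counts below are made up for illustration and are not taken from the screenshots.

```python
import csv
import io


def summary_fractions(handle):
    """Return {sample: {status: fraction}} from a featureCounts .summary table."""
    rows = [r for r in csv.reader(handle, delimiter="\t") if r]
    header, body = rows[0], rows[1:]
    samples = header[1:]
    result = {}
    for i, sample in enumerate(samples):
        counts = {r[0]: int(r[i + 1]) for r in body}
        total = sum(counts.values())
        # Guard against empty samples to avoid division by zero.
        result[sample] = {k: (v / total if total else 0.0) for k, v in counts.items()}
    return result


# Hypothetical example data (NOT the real counts from this experiment).
example = (
    "Status\tmedia_A.bam\tCON_1.bam\n"
    "Assigned\t200\t50\n"
    "Unassigned_NoFeatures\t700\t40\n"
    "Unassigned_MultiMapping\t100\t10\n"
)
fractions = summary_fractions(io.StringIO(example))
for sample, per_status in fractions.items():
    print(sample, round(per_status["Unassigned_NoFeatures"], 3))
```

For real data, the `io.StringIO(example)` part would be replaced by an open file handle on the `.summary` file that featureCounts writes next to its count table.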
Can you explain the setup in a bit more detail? So you have organoid cultures, and these produce vesicles, is that correct? You isolated these vesicles and then extracted RNA? What are the controls here? You say "no organoids", but what does this mean in practice? Seeing these large portions of unassigned reads, I wonder whether you might simply have sequenced ambient RNA or some kind of degraded RNA, but a better description of the experiment would be necessary to judge that. You can include rRNAs in the GTF you quantify against; that at least tells you whether this is rRNA.
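Before adding anything, it may be worth checking whether the GTF already carries rRNA genes. The sketch below tallies gene biotypes from GTF lines; it assumes the attribute is called `gene_biotype` (Ensembl style) or `gene_type` (GENCODE style), and the example records are invented purely for illustration.

```python
import re
from collections import Counter

# Ensembl GTFs use gene_biotype, GENCODE GTFs use gene_type.
BIOTYPE_RE = re.compile(r'\b(?:gene_biotype|gene_type) "([^"]+)"')


def gene_biotype_counts(lines):
    """Count 'gene' records per biotype in an iterable of GTF lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # only count one record per gene
        match = BIOTYPE_RE.search(fields[8])
        counts[match.group(1) if match else "unknown"] += 1
    return counts


# Invented records for illustration only.
example_gtf = [
    "#!genome-build example\n",
    '1\tensembl\tgene\t100\t900\t.\t+\t.\tgene_id "G1"; gene_biotype "protein_coding";\n',
    '1\tensembl\texon\t100\t300\t.\t+\t.\tgene_id "G1"; gene_biotype "protein_coding";\n',
    'MT\tensembl\tgene\t650\t1600\t.\t+\t.\tgene_id "G2"; gene_biotype "Mt_rRNA";\n',
    '2\tensembl\tgene\t50\t170\t.\t-\t.\tgene_id "G3"; gene_biotype "rRNA";\n',
]
counts = gene_biotype_counts(example_gtf)
rrna_like = {k: v for k, v in counts.items() if "rRNA" in k}
print(rrna_like)
```

On a real annotation you would pass an open file handle instead of `example_gtf`. If rRNA biotypes are already present and the reads still land in "No feature", rRNA annotated in that GTF is unlikely to be the explanation.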
For sure. You are correct: culture, collect media, extract RNA, library prep, sequence.
The control media was cultured under the same conditions but never contained organoids. I still get some "ambient" RNAs from these samples, but the gene expression profiles are markedly different and the read counts are much lower.
How do I include rRNAs into my GTF prior to STAR?
Thanks for the helpful discussion.
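On the GTF question, one possible approach is to append extra records for known rRNA loci to the annotation used for counting. As far as I know, STAR uses the GTF mainly for splice junctions and aligns to the whole genome either way, so the extra records mostly matter at the featureCounts step, which by default counts `exon` features grouped by `gene_id`. The Python sketch below converts BED-style intervals (0-based, half-open) into such GTF lines (1-based, inclusive). The coordinates, names, and the `custom_rRNA` source label are hypothetical placeholders; real intervals would have to come from a trusted source such as a repeat annotation track.

```python
def bed_to_gtf_lines(bed_rows, source="custom_rRNA"):
    """Turn (chrom, start0, end, name, strand) tuples into GTF exon lines."""
    lines = []
    for chrom, start0, end, name, strand in bed_rows:
        attrs = (
            f'gene_id "{name}"; transcript_id "{name}"; '
            f'gene_biotype "rRNA";'
        )
        # BED start is 0-based; GTF start is 1-based. The end stays the same.
        fields = [chrom, source, "exon", str(start0 + 1), str(end),
                  ".", strand, ".", attrs]
        lines.append("\t".join(fields))
    return lines


# Hypothetical intervals, for illustration only.
rrna_intervals = [
    ("chrUn_example", 0, 1869, "rRNA_18S_example", "+"),
    ("chrUn_example", 3000, 3157, "rRNA_5_8S_example", "+"),
]
gtf_lines = bed_to_gtf_lines(rrna_intervals)
print("\n".join(gtf_lines))
```

The printed lines could then be appended to a copy of the original GTF before rerunning featureCounts. Chromosome names have to match those in the genome FASTA used for alignment.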
The distribution of A-C compared to CONs looks fairly similar to me. How do you assess "gene expression profiles are markedly different"? I mean, should you even get RNA from plain medium, isn't that rather some leftovers from the FBS or something like that?