Differential expression analysis of non-coding RNA-Seq data
7.2 years ago
Assa Yeroslaviz ★ 1.9k

Hi,

I'm working on a data set from worm (C. elegans). I would like to analyse possible differential expression of ncRNA in the samples. I have five different conditions in triplicates.

For the mapping I downloaded the ncRNA sequences from Ensembl: Caenorhabditis_elegans.WBcel235.ncrna.fa.

After mapping (STAR) and quantification (featureCounts), I would like to run a DE analysis using the DESeq2 or the edgeR packages.

I was wondering whether the analysis can be done with the same parameters as a regular RNA-Seq analysis, or whether some of the parameters need to be changed.

Thanks in advance, Assa

deseq2 edger ncRNA • 2.8k views

You should map against both coding and non-coding transcripts. Otherwise some reads will align to non-coding transcripts only because their true, better-matching region is missing from the reference.


This I did. In the first mapping run I have the complete gtf file containing all genes/transcripts including the ncRNA transcripts.

How can I then do a DE analysis on the ncRNA transcripts? Or should I include them in the complete analysis? When I count the features for the total RNA-Seq using featureCounts, I count the genes with this command:

featureCounts -T 16 -b -a Cel.WBcel235.gtf  -t exon -g gene_id -o OUT.txt *.bam

Does it make sense to extract all the ncRNA entries from the gtf file and then count at the transcript level?


Something like this:

featureCounts -T 15 -b -a Cel.WBcel235_subset.gtf -f -O -t exon -g transcript_id -o OUT.txt *.bam

where Cel.WBcel235_subset.gtf is a file containing only the ncRNA rows from the original gtf file.
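
One way such a subset file could be produced is sketched below. It is a minimal sketch, not the poster's actual method: the function name subset_gtf_by_biotype is made up for illustration, and it assumes the gtf carries a gene_biotype attribute in column 9 (as Ensembl gtf files usually do). The biotype names in the usage comment are assumptions and should be checked against the values actually present in the file.

```shell
# Hypothetical helper: keep comment/header lines plus every row whose
# gene_biotype attribute matches one of the given biotypes.
#   $1 = input gtf
#   $2 = biotypes to keep, separated by "|" (used as a regular expression)
subset_gtf_by_biotype() {
  awk -F'\t' -v pat="gene_biotype \"(${2})\"" '
    /^#/      { print; next }   # keep header lines such as #!genome-build
    $9 ~ pat  { print }         # column 9 holds the attribute string
  ' "$1"
}

# Example call (biotype names are assumptions, check your own gtf):
# subset_gtf_by_biotype Cel.WBcel235.gtf 'ncRNA|miRNA|snoRNA|lincRNA' > Cel.WBcel235_subset.gtf
```

Note that featureCounts' -f option reports counts per feature (each exon row) rather than summed per -g group, so it may be worth dropping -f if one count per transcript_id is wanted.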


I think you're good to go, there shouldn't be a difference from coding RNA.


Can I use the regular gtf file for the quantification of the counts?


I thought that you mapped against the ncRNA genes only. Is the gtf you are referring to aligned with the fasta file, i.e. does it describe the same sequences? If so, then you should use it.


Yes, I aligned only against the ncRNA, using the file mentioned above. But I don't have a gtf file for it, so I was going to use the gtf file from the RNA-Seq analysis. When I use the original gtf file, however, I can't seem to find any counts: featureCounts reports 0% assigned reads for all samples.

e.g.

Process BAM file L18548_Track-45795.sorted.bam...                          
||    Single-end reads are included.                                          
||    Assign reads to features...                                             
||    Total reads : 8095454                                                   
||    Successfully assigned reads : 0 (0.0%)                                  
||    Running time : 0.07 minutes

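Makes sense; see @h.mon's remark below. The genome gtf uses chromosome coordinates, while your reads are aligned to the ncRNA sequences themselves, so nothing overlaps. The corresponding gtf file, which you should generate, will basically contain every contig in your fasta file, each as one feature running from start to end.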
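
A gtf of that kind could be generated roughly as follows. This is a sketch under stated assumptions: the function name fasta_to_gtf and the "fasta" source label are made up for illustration, the contig name is taken as the fasta header up to the first whitespace (which should match the reference names in the BAM header, but is worth verifying), and the same ID is reused as both gene_id and transcript_id for simplicity.

```shell
# Hypothetical helper: emit one full-length "exon" row per fasta record.
#   $1 = fasta file the reads were aligned against
fasta_to_gtf() {
  awk '
    # Print one gtf row for the record collected so far (if any).
    function emit() {
      if (id != "" && len > 0)
        printf "%s\tfasta\texon\t1\t%d\t.\t+\t.\tgene_id \"%s\"; transcript_id \"%s\";\n", id, len, id, id
    }
    /^>/ { emit(); id = substr($1, 2); len = 0; next }  # new record: name = header up to first space
         { len += length($0) }                          # sequence line: accumulate length
    END  { emit() }                                     # flush the last record
  ' "$1"
}

# Example call, followed by counting against the generated annotation:
# fasta_to_gtf Caenorhabditis_elegans.WBcel235.ncrna.fa > ncrna_from_fasta.gtf
# featureCounts -T 16 -a ncrna_from_fasta.gtf -t exon -g transcript_id -o OUT.txt *.bam
```

Each sequence is placed on the + strand because reads are aligned to the transcript sequence as given in the fasta file; with featureCounts' default unstranded counting the strand column does not affect assignment.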
