3.3 years ago
bart
Hi all,
I'm trying to find differentially expressed lncRNAs between two groups of humans. To find lncRNAs, I wanted to use the following pipeline: Trimmomatic --> StringTie/Cufflinks --> Cuffmerge/StringTie merge --> FEELnc. To find differentially expressed transcripts, I want to use the following pipeline: Trimmomatic --> StringTie/Cufflinks --> Cuffmerge/StringTie merge --> Cuffdiff/Ballgown. However, I am somewhat confused about the following:
- In which step are the lncRNA GTF files produced by FEELnc used?
- Is it possible to skip FEELnc and use the GRCh38 lncRNA files from the GENCODE page instead? https://www.gencodegenes.org/human/. What is the benefit of using FEELnc over just annotating against the GENCODE lncRNA GTFs? Is it only that it finds novel lncRNAs in my samples?
Thanks!
Wouldn't a more standard DE analysis pipeline, e.g. featureCounts or salmon/kallisto followed by DESeq2, do the same or a better job? Especially given that many human ncRNAs are already well annotated. You could add your de novo detected lncRNAs to the GFF/GTF file and then run the DE analysis. Whatever you do, I would include all genes in the analysis, including the protein-coding ones, as this might lead to a more accurate library size normalization.
Thanks for your response. So if I understand it correctly, you would suggest the following? Trimmomatic --> HISAT2/STAR etc. --> StringTie/Cufflinks --> Cuffmerge/StringTie merge --> FEELnc to find novel lncRNAs and write them to a GTF file. Then: combine the FEELnc GTF file with the GENCODE human annotation file, which includes all transcripts (including protein-coding ones) --> use this new GTF file in featureCounts or salmon quant/kallisto quant --> DESeq2.
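The "combine the FEELnc GTF with the GENCODE annotation" step could be sketched roughly as below. This is only a minimal illustration in Python, not a FEELnc or GENCODE tool: the helper names `transcript_ids` and `combine_gtf` are hypothetical, the toy records are made up, and the only assumption about the input is that records carry a standard `transcript_id "..."` attribute. It keeps every reference record and appends novel records unless their transcript_id already exists in the reference.

```python
import re

# Matches the transcript_id attribute in the 9th GTF column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcript_ids(gtf_lines):
    """Collect transcript_id values from GTF records (comments and blanks are skipped)."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        match = TRANSCRIPT_ID.search(line)
        if match:
            ids.add(match.group(1))
    return ids


def combine_gtf(reference_lines, novel_lines):
    """Return reference lines followed by novel records with unseen transcript_ids."""
    reference_lines = list(reference_lines)
    known = transcript_ids(reference_lines)
    # Keep the reference as-is (including its header comments), minus blank lines.
    combined = [line.rstrip("\n") for line in reference_lines if line.strip()]
    for line in novel_lines:
        if line.startswith("#") or not line.strip():
            continue  # drop comments/blank lines from the novel file
        match = TRANSCRIPT_ID.search(line)
        if match and match.group(1) in known:
            continue  # ID collision with the reference: keep the reference copy only
        combined.append(line.rstrip("\n"))
    return combined


# Toy records (hypothetical IDs and coordinates, for illustration only).
reference = [
    "##description: toy reference annotation",
    'chr1\tHAVANA\ttranscript\t100\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
]
novel = [
    'chr1\tStringTie\ttranscript\t100\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tStringTie\ttranscript\t900\t1500\t.\t-\t.\tgene_id "MSTRG.1"; transcript_id "MSTRG.1.1";',
]
combined = combine_gtf(reference, novel)
```

Matching on transcript_id alone only catches exact ID collisions; de novo transcripts that duplicate an annotated transcript under a different ID would still need a coordinate-based comparison.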
In case others have similar questions: I received some excellent answers and suggestions on this topic from the developers of FEELnc, which can be found on their GitHub page: https://github.com/tderrien/FEELnc/issues/49
Hi, theoretically, I think so, yes. However, I would also try to look for and filter out (near-)identical overlapping ncRNAs between the de novo set and GENCODE. The cutoff and parameters for calling two transcripts identical may need some additional thought. It might be informative to make a precision/recall plot of your de novo transcripts vs. all annotated ncRNA transcripts.
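One possible way to make the "near-identical" cutoff and the precision/recall idea concrete is sketched below in Python. Everything here is an assumption for illustration: the similarity measure (Jaccard index over exonic bases), the helper names, the input layout (a dict of transcript ID to chromosome, strand, and exon list), and the toy coordinates. Other definitions of "identical", such as matching intron chains, would be equally reasonable.

```python
def interval_overlap(a, b):
    """Number of bases shared by two closed intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def exon_jaccard(exons_a, exons_b):
    """Jaccard index of exonic bases; assumes exons within a transcript do not overlap."""
    shared = sum(interval_overlap(a, b) for a in exons_a for b in exons_b)
    total = sum(e - s + 1 for s, e in exons_a) + sum(e - s + 1 for s, e in exons_b)
    union = total - shared
    return shared / union if union else 0.0


def precision_recall(de_novo, annotated, cutoff):
    """Both inputs: dict of transcript ID -> (chrom, strand, [(start, end), ...]).

    precision = fraction of de novo transcripts matching some annotated transcript
    recall    = fraction of annotated transcripts matched by some de novo transcript
    """
    matched_novel, matched_annot = set(), set()
    for n_id, (n_chrom, n_strand, n_exons) in de_novo.items():
        for a_id, (a_chrom, a_strand, a_exons) in annotated.items():
            if (n_chrom, n_strand) != (a_chrom, a_strand):
                continue  # only compare transcripts on the same chromosome and strand
            if exon_jaccard(n_exons, a_exons) >= cutoff:
                matched_novel.add(n_id)
                matched_annot.add(a_id)
    precision = len(matched_novel) / len(de_novo) if de_novo else 0.0
    recall = len(matched_annot) / len(annotated) if annotated else 0.0
    return precision, recall


# Toy transcripts (hypothetical IDs and coordinates).
annotated = {
    "A1": ("chr1", "+", [(100, 199), (300, 399)]),
    "A2": ("chr1", "-", [(1000, 1099)]),
}
de_novo = {
    "N1": ("chr1", "+", [(100, 199), (300, 389)]),
    "N2": ("chr2", "+", [(50, 149)]),
}

# Sweeping the cutoff gives the points of a precision/recall curve.
curve = [(c, *precision_recall(de_novo, annotated, c)) for c in (0.5, 0.9, 0.99)]
```

The all-against-all loop is fine for a toy example but would be slow on a full annotation; an interval index or a dedicated comparison tool would be the practical choice there.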