Hello.
This question is similar to "how to get fasta or fastaQ sequences of the differentially expressed gene which i got using edgeR". I have an RNA-seq dataset (bacterial, in case that matters). It was processed with Cufflinks (including cuffdiff) and visualized with cummeRbund, and everything went well except for one thing. The following code
> gene_gene_data1 <- subset(gene_diff_data, (p_value<0.0001))
gave me a table of differentially expressed genes (DEGs), and now I need to obtain the sequences of all these genes to analyze them. What should I do to make my table look like the one at this link?
https://www.frontiersin.org/articles/10.3389/fmicb.2020.01808/full#supplementary-material
Alternatively, how can I make a separate list of the DEG sequences? gffread is not what I want, because it extracts all transcript sequences, and I need only the sequences of the DEGs.
Thank you.
Which bacterial species is it? It would be important to know since you would need a genome assembly for it (in some format).
Gordonia, and I have its genome assembly. I used it as the reference during transcriptome assembly.
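
One possible approach, sketched in base R: produce a FASTA with all sequences once (for example with gffread), then keep only the records whose ID appears in the filtered table. This is a minimal sketch under assumptions, not a tested pipeline. It assumes the table has a `gene_id` column and that each FASTA header starts with that same ID. gffread headers are typically transcript IDs, so a gene-to-transcript mapping step may be needed first. The helper `filter_fasta`, the toy data, and the file names in the comments are all made up for illustration.

```r
# Hypothetical helper: keep only the FASTA records whose ID is in `ids`.
# Assumes the record ID is the first whitespace-delimited token after ">".
filter_fasta <- function(fasta_lines, ids) {
  is_header <- startsWith(fasta_lines, ">")
  # Record index of every line: 0 before the first header, then 1, 2, ...
  record <- cumsum(is_header)
  header_ids <- sub("^>([^[:space:]]+).*$", "\\1", fasta_lines[is_header])
  keep_record <- header_ids %in% ids
  # Prepend FALSE so that lines before the first header are dropped.
  fasta_lines[c(FALSE, keep_record)[record + 1]]
}

# Toy data standing in for the real inputs (made up for illustration).
gene_diff_data <- data.frame(
  gene_id = c("geneA", "geneB", "geneC"),
  p_value = c(0.00001, 0.2, 0.00005)
)
fasta_lines <- c(
  ">geneA some description", "ATGAAACGT", "TTGA",
  ">geneB", "ATGCCC",
  ">geneC", "ATGGGGTAA"
)
# With real data (hypothetical file name): fasta_lines <- readLines("all_genes.fa")

# Same filter as in the question, then subset the FASTA by the DEG IDs.
deg <- subset(gene_diff_data, p_value < 0.0001)
deg_fasta <- filter_fasta(fasta_lines, deg$gene_id)

writeLines(deg_fasta)
# To save instead (hypothetical file name): writeLines(deg_fasta, "deg_sequences.fa")
```

Multi-line sequences are handled because every line is assigned to the record of the header above it. If the IDs in the table and in the FASTA headers differ (for example cufflinks `XLOC_` IDs versus locus tags), they have to be matched through the annotation first.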