Hi. I'm 3 weeks into trying to analyze gene expression from .fq files. After at least 20 attempts I keep getting high p and q values (compared to the results the company gave me), and I don't know what to do. I have an 8-step workflow:
- FastQC to check how to clean raw reads.
- Trimmomatic to clean them.
- FastQC again to check if it worked. If not, go back to point 2.
- Indexing the genome with STAR based on .fa and .gtf files from Ensembl
- Mapping with STAR
- Counting and preparing input for ballgown with StringTie, using the .gtf file
- Using prepDE.py3 to create gene count table
- Analysis of results with ballgown and DESeq2 in R.
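For reference, the first seven steps above could be written down roughly as follows. This is only a sketch that builds and prints the command lines without running anything. All file names, the thread count, the read length of 150, and the `trimmomatic` wrapper name are assumptions, not details from the post. The Trimmomatic settings are the example values from its manual, not a recommendation, and `--outSAMstrandField intronMotif` is only relevant if the library is unstranded, which the post doesn't say.

```python
import shlex

READ_LEN = 150  # assumption: check the actual read length in the FastQC report
THREADS = 8     # hypothetical


def fastqc(fastqs, out_dir):
    """Steps 1 and 3: quality reports for raw or trimmed reads."""
    return ["fastqc", "-o", out_dir, *fastqs]


def trimmomatic_pe(r1, r2, prefix, adapters):
    """Step 2: paired-end trimming (parameter values from the manual's example)."""
    return ["trimmomatic", "PE", r1, r2,
            f"{prefix}_1P.fq.gz", f"{prefix}_1U.fq.gz",
            f"{prefix}_2P.fq.gz", f"{prefix}_2U.fq.gz",
            f"ILLUMINACLIP:{adapters}:2:30:10", "SLIDINGWINDOW:4:15", "MINLEN:36"]


def star_index(fasta, gtf, index_dir):
    """Step 4: genome index; sjdbOverhang is read length minus 1."""
    return ["STAR", "--runMode", "genomeGenerate", "--runThreadN", str(THREADS),
            "--genomeDir", index_dir, "--genomeFastaFiles", fasta,
            "--sjdbGTFfile", gtf, "--sjdbOverhang", str(READ_LEN - 1)]


def star_align(index_dir, r1, r2, prefix):
    """Step 5: mapping to a coordinate-sorted BAM."""
    return ["STAR", "--runThreadN", str(THREADS), "--genomeDir", index_dir,
            "--readFilesIn", r1, r2, "--readFilesCommand", "zcat",
            "--outSAMtype", "BAM", "SortedByCoordinate",
            "--outSAMstrandField", "intronMotif",  # only for unstranded libraries
            "--outFileNamePrefix", prefix]


def stringtie_quant(bam, gtf, sample, out_dir="ballgown"):
    """Step 6: -e restricts to reference transcripts, -B writes ballgown tables."""
    return ["stringtie", "-e", "-B", "-p", str(THREADS), "-G", gtf,
            "-o", f"{out_dir}/{sample}/{sample}.gtf", bam]


def prepde(ballgown_dir="ballgown"):
    """Step 7: count matrices; -l should match the real read length."""
    return ["python3", "prepDE.py3", "-i", ballgown_dir,
            "-g", "gene_count_matrix.csv", "-t", "transcript_count_matrix.csv",
            "-l", str(READ_LEN)]


def pipeline(sample, r1, r2, fasta, gtf, adapters):
    """Return the seven command lines for one sample, in workflow order."""
    trimmed = f"trimmed/{sample}"
    p1, p2 = f"{trimmed}_1P.fq.gz", f"{trimmed}_2P.fq.gz"
    star_prefix = f"star/{sample}_"
    bam = f"{star_prefix}Aligned.sortedByCoord.out.bam"  # STAR's output name
    return [
        fastqc([r1, r2], "qc_raw"),
        trimmomatic_pe(r1, r2, trimmed, adapters),
        fastqc([p1, p2], "qc_trimmed"),
        star_index(fasta, gtf, "star_index"),
        star_align("star_index", p1, p2, star_prefix),
        stringtie_quant(bam, gtf, sample),
        prepde(),
    ]


# Hypothetical sample and file names, printed for inspection only.
for cmd in pipeline("sampleA", "sampleA_R1.fq.gz", "sampleA_R2.fq.gz",
                    "genome.fa", "genes.gtf", "TruSeq3-PE.fa"):
    print(shlex.join(cmd))
```

Writing the steps out like this makes it easier to spot settings that have to agree with each other, such as the read length used for `--sjdbOverhang` and for `-l` in prepDE.py3.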
So far I've tried a lot of different options and values in every program used, and nothing has really worked out well. Maybe I'm just dumb, that's one possibility. But I hope someone can help me understand what I'm doing wrong. I have Illumina NovaSeq 6000 paired-end reads.
You should find a local expert to consult/work with. There are too many points where things can go wrong for us to guess at your issue(s) based on the little info provided.
Just double checking here:
Low p and q values would be good (meaning significantly DE genes). I guess you mean you're getting few significant results?
Yes, that's what I meant. Thanks for correcting me. I've edited the post so it doesn't confuse anyone.
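To illustrate the p versus q point: a minimal sketch, assuming the q values here are Benjamini-Hochberg adjusted p-values (the thread doesn't say which method was used). The p-values below are made up for illustration. It shows how genes can pass a 0.05 cutoff on raw p-values and still fail it after adjustment, which is one way to end up with few significant results.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


pvals = [0.001, 0.008, 0.039, 0.041, 0.6]  # made-up values
qvals = bh_adjust(pvals)

n_p = sum(p < 0.05 for p in pvals)
n_q = sum(q < 0.05 for q in qvals)
print(n_p, n_q)  # prints "4 2"
```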