Hello everyone,
I built a pipeline to analyze RNA-Seq data from FASTQ to FPKM or gene counts. It is working fine, but I really need a reference dataset so I can compare my results with published results. The problem is that NCBI GEO entries do not include any information about trimming or alignment parameters etc. I think results can change with the software or its version. Most people use TopHat, but I will use HISAT2 because it is faster on my computer and I have many datasets. I always get results (and the results make sense), but I really need to be sure.
My question is: is there any public tutorial or example dataset that I can use to verify that my pipeline is working correctly? It does not matter whether it reports FPKM or gene counts. I can produce both.
Thank you
Your results are likely going to be (at least slightly) different from published ones if you are not using the same aligner and settings. Pick any dataset that you are familiar with and run it through your pipeline. Don't be surprised if the results don't overlap 100% (most likely they won't). But the major conclusions (top 10 DE genes etc.) should be similar, if not identical.
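If it helps, the "top 10 DE genes should be similar" check can be scripted. Below is a rough sketch, not a standard tool: it assumes you have already loaded both result sets into plain dicts mapping gene ID to adjusted p-value. The function names (top_genes, top_overlap) and the toy values are made up purely for illustration.

```python
def top_genes(results, n=10):
    """Return the n gene IDs with the smallest adjusted p-values."""
    ranked = sorted(results.items(), key=lambda item: item[1])
    return [gene for gene, _ in ranked[:n]]


def top_overlap(mine, published, n=10):
    """Fraction of the published top-n genes that also appear in my top-n."""
    mine_top = set(top_genes(mine, n))
    pub_top = set(top_genes(published, n))
    shared = mine_top & pub_top
    # Guard against an empty published list to avoid dividing by zero.
    fraction = len(shared) / max(len(pub_top), 1)
    return fraction, sorted(shared)


# Toy placeholder values, NOT real data: gene ID -> adjusted p-value.
mine = {"geneA": 0.001, "geneB": 0.002, "geneC": 0.5, "geneD": 0.03}
published = {"geneA": 0.0005, "geneB": 0.04, "geneC": 0.01, "geneD": 0.9}

fraction, shared = top_overlap(mine, published, n=2)
print(fraction, shared)  # 0.5 ['geneA']
```

A high fraction for n=10 or n=50 suggests the pipeline reproduces the main conclusions. What counts as "high enough" is a judgment call and depends on the dataset and on how different your aligner and settings are.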