Hi. I'm new to bioinformatics and am trying to process fastq files to get a raw read count matrix.
I downloaded fastq files from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63452
I used fasterq-dump to download the fastq files for the SRR accessions
Aligned the fastq files, without any trimming, using the ENSEMBL files Homo_sapiens.GRCh38.104.chr.gtf (annotation) and Homo_sapiens.GRCh38.dna_rm.primary_assembly.fa (genome)
Extracted the raw count matrix from the BAM files using featureCounts
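For reference, a featureCounts table can be loaded for the later steps roughly as follows. This is a minimal Python sketch, assuming the default tab-separated output (a leading `#` comment line, then the columns Geneid, Chr, Start, End, Strand, Length, followed by one count column per BAM file) and integer counts. The function name `read_featurecounts`, the sample names, and all values are made up for illustration.

```python
import csv
import io


def read_featurecounts(handle):
    """Parse a featureCounts table into (sample_names, {gene_id: [counts]})."""
    # Drop the leading comment line(s) that start with '#'.
    data_lines = (line for line in handle if not line.startswith("#"))
    rows = csv.reader(data_lines, delimiter="\t")
    header = next(rows)
    # Columns 0-5 are Geneid, Chr, Start, End, Strand, Length.
    samples = header[6:]
    # Assumes integer counts (no fractional counting was requested).
    counts = {row[0]: [int(x) for x in row[6:]] for row in rows if row}
    return samples, counts


# Toy input with made-up values, shaped like featureCounts output.
toy = io.StringIO(
    "# Program:featureCounts; Command:...\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsampleA.bam\tsampleB.bam\n"
    "GENE_A\t1\t100\t900\t+\t801\t10\t0\n"
    "GENE_B\t2\t50\t450\t-\t401\t90\t50\n"
)
samples, counts = read_featurecounts(toy)
print(samples)
print(counts)
```

With a real file, the `io.StringIO` object would be replaced by an open file handle.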
To check whether my data were processed correctly, I normalized my read count matrix to CPM,
since a normalized data matrix is available from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63452.
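For reference, CPM (counts per million) divides each gene's count by the sample's total count and multiplies by one million. The sketch below is a minimal Python illustration, not the exact code used here: the function name `cpm`, the gene names other than A1BG, and all numbers are made-up toy values, not taken from GSE63452.

```python
def cpm(counts):
    """Convert raw counts to counts per million.

    counts: dict mapping gene -> list of raw counts, one entry per sample.
    Returns a dict with the same shape holding CPM values.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size = total counts per sample (column sum).
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [
            row[j] * 1e6 / lib_sizes[j] if lib_sizes[j] else 0.0
            for j in range(n_samples)
        ]
        for gene, row in counts.items()
    }


# Toy raw counts for two samples (made-up values).
raw = {"A1BG": [10, 0], "GENE_B": [90, 50], "GENE_C": [900, 50]}
normalized = cpm(raw)
print(normalized["A1BG"])
```

Each sample's CPM values sum to one million, which is a quick sanity check. It only applies to the other matrix if that matrix is also in CPM units.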
I compared my data with normalized count data from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63452,
but the results are quite different from what I expected.
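One way to make such a comparison concrete is to restrict both tables to the genes they share and compute a per-gene log2 ratio for one sample. The sketch below is a hedged Python illustration under those assumptions: the function name `log2_ratios`, the pseudocount of 1, and all values are hypothetical and not taken from either dataset.

```python
import math


def log2_ratios(mine, reference, pseudocount=1.0):
    """Per-gene log2((mine + pc) / (reference + pc)) over genes found in both tables."""
    shared = sorted(set(mine) & set(reference))
    return {
        gene: math.log2((mine[gene] + pseudocount) / (reference[gene] + pseudocount))
        for gene in shared
    }


# Toy normalized values for a single sample (made-up numbers).
my_values = {"A1BG": 7.0, "GENE_B": 3.0, "ONLY_MINE": 5.0}
ref_values = {"A1BG": 0.0, "GENE_B": 3.0, "ONLY_REF": 1.0}

ratios = log2_ratios(my_values, ref_values)
# List genes from the largest to the smallest absolute difference.
for gene, ratio in sorted(ratios.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(gene, round(ratio, 2))
```

Genes missing from one table are skipped, so it is worth checking how many genes are shared before reading anything into the ratios.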
I expected the results to differ a little, since I used different tools to get my result, but
look at some of the results:
The left one is my data and the right one is the normalized data from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63452. When you look at the A1BG gene, for example, there is a huge difference between the two datasets.

What can I do to fix this problem? It does not seem reasonable to have to use the exact same tools every time I try to extract raw counts from fastq files.