Hi,
After aligning my RNA sequences and getting BAM files, I used bam-readcount to get the total read counts per position and sample. My data is paired-end RNA-seq, and from what I have read I should compute FPKM to get normalized read counts per gene, so that I can compare the expression of each gene within each sample. For this I used the fpkm function from DESeq2. To do so, I created a data frame with one row per gene and one column per sample, where each cell holds the total number of reads summed over all positions in that gene. For samples without any reads in a gene I put 1 instead of 0 to prevent any error in DESeq2.
For the rowRanges part, I used the code below:
listRanges <- lapply(genes_df$gene, function(gene) {
  row.cur <- genes_df[genes_df$gene == gene, ]
  GRanges("NC_045512.2", IRanges(row.cur$start, row.cur$end))
})
I am not sure whether this is correct. I then ran the fpkm command, and the output looks weird to me: some counts that were very small became very large after normalization.
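To check my understanding of what FPKM should give, here is a small base-R sketch of the formula as I understand it: count * 1e9 / (gene length in bp * total reads in the sample). The data are hypothetical toy values, not my real counts, and `fpkm_manual`, `gene_len` and `lib_size` are names I made up for illustration. I believe DESeq2's fpkm uses size factors by default (robust = TRUE) rather than plain column sums, so this is only the simple, non-robust version.

```r
# Hypothetical toy data: two genes and three samples (not my real counts).
genes_df <- data.frame(
  gene  = c("geneA", "geneB"),
  start = c(1, 1001),
  end   = c(1000, 3000)
)
counts <- matrix(
  c(10,  20,  30,
    90, 180, 270),
  nrow = 2, byrow = TRUE,
  dimnames = list(genes_df$gene, c("s1", "s2", "s3"))
)

# Gene length in base pairs, counting both ends (same as the width of the range).
gene_len <- genes_df$end - genes_df$start + 1

# Total counted reads per sample, used as the library size in this simple version.
lib_size <- colSums(counts)

# FPKM = count * 1e9 / (gene length in bp * library size).
# sweep() divides each column by its library size; dividing by gene_len
# then recycles down the rows, so each row is divided by its own gene length.
fpkm_manual <- sweep(counts, 2, lib_size, "/") / gene_len * 1e9
print(fpkm_manual)
```

With these toy numbers a count of 10 in a 1000 bp gene, in a sample with only 100 reads in total, becomes an FPKM of 100000. So if my per-sample totals are small, large FPKM values might be expected rather than a sign of a bug, but I am not sure this is what is happening with my data.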
You can see below the first 3 samples with their original read counts:
and these are the same rows and columns after normalization:
I suspect I am doing something wrong. Is this output expected, or is there a mistake in my approach?
Any ideas or guidance would be appreciated.