I would like to get TPM values for the GSE102073 study. The raw data I downloaded from GEO are featureCounts output files.
First part of the file:
# Program:featureCounts v1.4.3-p1; Command:"/data/NYGC/Software/Subread/subread-1.4.3-p1-Linux-x86_64/bin/featureCounts" "-s" "2" "-a" "/data/NYGC/Resources/ENCODE/Gencode/gencode.v18.annotation.gtf" "-o" "/data/analysis/LevineD/Project_LEV_01204_RNA_2014-01-30/Sample_JB4853/featureCounts/Sample_JB4853_counts.txt" "/data/analysis/LevineD/Project_LEV_01204_RNA_2014-01-30/Sample_JB4853/STAR_alignment/Sample_JB4853_Aligned.out.WithReadGroup.sorted.bam"
Geneid Chr Start End Strand Length /data/analysis/LevineD/Project_LEV_01204_RNA_2014-01-30/Sample_JB4853/STAR_alignment/Sample_JB4853_Aligned.out.WithReadGroup.sorted.bam
ENSG00000223972.4 chr1;chr1;chr1;chr1 11869;12595;12975;13221 12227;12721;13052;14412 +;+;+;+ 1756 0
ENSG00000227232.4 chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1 14363;14970;15796;16607;16854;17233;17498;17602;17915;18268;24734;29321;29534 14829;15038;15947;16765;1705
How can I convert this into TPM values?
I tried the method from this post, but it requires a counts file, which I don't have access to. I also looked at this post, but I am confused about how to use tximport to get the TPM values, and about where the input variables featureLength and meanFragmentLength should come from.
Thank you.
This file is your counts file, isn't it?
featureCounts file
Yes, I thought the featureCounts file is your counts file.
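To make that concrete: the featureCounts table already has what a basic TPM calculation needs, namely the Length column and the count column (the last one, named after the BAM file). Below is a minimal Python sketch, assuming the file is tab-delimited as featureCounts writes it and holds a single sample. The function name featurecounts_to_tpm and the toy genes are made up for illustration. Note that Length here is the summed exon length per gene, not an effective length corrected for fragment size, so the result is an approximation rather than what tximport would give from transcript-level quantification.

```python
import csv
import io


def featurecounts_to_tpm(handle):
    """Return {gene_id: TPM} from a single-sample featureCounts table."""
    # Skip the leading '# Program:featureCounts ...' comment line.
    lines = (line for line in handle if not line.startswith("#"))
    rows = csv.reader(lines, delimiter="\t")

    header = next(rows)
    length_col = header.index("Length")
    count_col = len(header) - 1  # last column holds the counts for the BAM file

    # Step 1: reads per kilobase of gene length.
    rates = {}
    for row in rows:
        if not row:
            continue
        length_kb = float(row[length_col]) / 1000.0
        rates[row[0]] = float(row[count_col]) / length_kb

    # Step 2: scale so that the values sum to one million.
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in rates}
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Toy input with hypothetical genes, laid out like a featureCounts file.
toy = "\n".join([
    "# Program:featureCounts (toy example)",
    "\t".join(["Geneid", "Chr", "Start", "End", "Strand", "Length", "sample.bam"]),
    "\t".join(["geneA", "chr1", "1", "1000", "+", "1000", "10"]),
    "\t".join(["geneB", "chr1", "2001", "4000", "+", "2000", "20"]),
    "\t".join(["geneC", "chr2", "1", "500", "-", "500", "0"]),
]) + "\n"

tpm = featurecounts_to_tpm(io.StringIO(toy))
for gene, value in tpm.items():
    print(gene, value)
```

For a real file, open it with open(path, newline="") and pass the handle to the function in place of the io.StringIO object. geneA and geneB come out equal in the toy example because geneB has twice the reads but is also twice as long.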