I have 60 samples, and my PI wants a table (Excel spreadsheet) containing all expressed genes across the 60 samples. I have used HISAT and StringTie. I don't know how to create such a table unless I code something, and my coding skills are rudimentary. The only alternative I can think of is to manually extract the expression values and paste them into an Excel file 60 times using VLOOKUP.
Is there a tool that I can use for this?
Additional information: I have geneabundance.tab files and .gtf files from StringTie.
Geneabundance.tab
Gene ID Gene Name Reference Strand Start End Coverage FPKM TPM
ENSG00000228794.9 LINC01128 chr1 + 860226 868202 0.000000 0.000000 0.000000
ENSG00000230368.2 FAM41C chr1 - 868071 876903 0.000000 0.000000 0.000000
ENSG00000234711.1 TUBB8P11 chr1 + 873292 874349 0.108696 0.021078 0.030533
.gtf file
chr1 HAVANA transcript 860226 866720 . + . gene_id "ENSG00000228794.9"; transcript_id "ENST00000671208.1"; ref_gene_name "LINC01128"; cov "0.0"; FPKM "0.000000"; TPM "0.000000";
chr1 HAVANA exon 860226 861273 . + . gene_id "ENSG00000228794.9"; transcript_id "ENST00000671208.1"; exon_number "1"; ref_gene_name "LINC01128"; cov "0.0";
You will need to tell us what format the data you currently have is.
I have geneabundance.tab and .gtf files from StringTie (updated in the original post).
Just show us a snippet of each file. The files should be tabular/delimited anyway most likely. What are you defining as 'expressed' in this data?
Hopefully it is clear. There aren't commas in the real file; I added them to make it a bit clearer. By expressed, the PI wants anything 0 and above (so all genes, sorry for the confusion).
Please use the formatting bar (especially the code option) to present your post better. You should not need to add anything that is not present in the original data in that case.
Thank you, fixed it up now
You can use either featureCounts or multiBamSummary from deepTools to generate the gene expression table.
Thanks, I'll give it a crack
Are all the tab files in the same folder? If not, how are the files stored? It's important to be specific since the code would need to loop through the files.
I have both, I thought that having each file in the same directory might make it easier. I'll be sure to clarify in future posts.
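If all the gene abundance files sit in one folder, the loop could look roughly like the sketch below. It is only an illustration under some assumptions: the files are tab-delimited with the header shown in the question (`Gene ID`, `Gene Name`, ..., `TPM`), each file name (minus the extension) is used as the sample name, and the folder name `abundance_files` and the output name `all_samples_TPM.tsv` are made up. If a gene ID appears more than once in a file, this sketch sums the values, and a gene missing from a sample is filled with 0; both are choices of the sketch, not something StringTie prescribes.

```python
import csv
from pathlib import Path


def read_abundance(path, value_col="TPM"):
    """Return {(gene_id, gene_name): value} for one gene abundance file."""
    values = {}
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            key = (row["Gene ID"], row["Gene Name"])
            # Assumption of this sketch: repeated gene IDs are summed.
            values[key] = values.get(key, 0.0) + float(row[value_col])
    return values


def merge_abundance(files, value_col="TPM"):
    """Merge several files into (header, rows), one column per sample."""
    # The sample name is taken from the file name without its extension.
    per_sample = {Path(f).stem: read_abundance(f, value_col) for f in sorted(files)}
    genes = sorted({gene for values in per_sample.values() for gene in values})
    header = ["Gene ID", "Gene Name"] + list(per_sample)
    rows = [
        # Genes absent from a sample get 0.0 in that sample's column.
        list(gene) + [per_sample[s].get(gene, 0.0) for s in per_sample]
        for gene in genes
    ]
    return header, rows


def write_table(header, rows, out_path):
    """Write the merged table as a tab-delimited file that Excel can open."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(header)
        writer.writerows(rows)


if __name__ == "__main__":
    # Hypothetical folder name; point this at wherever the .tab files live.
    files = sorted(Path("abundance_files").glob("*.tab"))
    if files:
        header, rows = merge_abundance(files)
        write_table(header, rows, "all_samples_TPM.tsv")
    else:
        print("No .tab files found")
```

Passing `value_col="FPKM"` to `merge_abundance` would build the same table from the FPKM column instead. The output is a plain tab-delimited text file, which Excel opens directly.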
See if the answer here is of any help:
A: Unable to create file for tximport