Hi!
Can someone explain the difference between the *.egenes.txt.gz and *.signif_variant_gene_pairs.txt.gz files? They are from the GTEx v8 single-tissue eQTL data. I would also like to know what ma_count and ma_samples mean.
eGene and significant variant-gene associations based on permutations. The archive contains a *.egenes.txt.gz and *.signif_variant_gene_pairs.txt.gz file for each tissue. Note that the *.egenes.txt.gz files contain data for all genes tested; to obtain the list of eGenes, select the rows with 'qval' ≤ 0.05.
[source: https://gtexportal.org/home/datasets]
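As a rough illustration of the filtering step described in that note, here is a minimal sketch using only the Python standard library. The toy table and its values are made up; the column names `gene_id`, `variant_id` and `qval` are the ones I would expect in the egenes files, so check them against the header of your own download.

```python
import csv
import io

# Toy stand-in for a *.egenes.txt.gz file (all values are made up).
# For a real file, pass gzip.open(path, "rt") instead of io.StringIO(...).
TOY_EGENES = (
    "gene_id\tvariant_id\tqval\n"
    "ENSG_A\tchr1_100_A_G_b38\t0.001\n"
    "ENSG_B\tchr1_200_C_T_b38\t0.30\n"
    "ENSG_C\tchr2_300_G_A_b38\t0.05\n"
)


def read_tsv(handle):
    """Parse a tab-separated text handle into a list of dicts keyed by header."""
    return list(csv.DictReader(handle, delimiter="\t"))


def select_egenes(rows, qval_threshold=0.05):
    """Keep only the genes whose q-value is at or below the threshold."""
    return [row for row in rows if float(row["qval"]) <= qval_threshold]


rows = read_tsv(io.StringIO(TOY_EGENES))
egenes = select_egenes(rows)
egene_ids = [row["gene_id"] for row in egenes]
print(egene_ids)
```

The egenes file has one row per tested gene, so the filtered list has at most one row per gene.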
How you use these files will depend on your downstream analyses. Take a look at the contents and you should be able to infer what each represents.
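Regarding ma_count and ma_samples:
ma_samples: number of samples carrying the minor allele
ma_count: total number of minor alleles across individuals
[source: https://storage.googleapis.com/gtex_analysis_v8/single_tissue_qtl_data/README_eQTL_v8.txt]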
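To make the distinction concrete, here is a small sketch that computes both quantities from per-sample genotype dosages. It takes ma_samples as the number of samples carrying at least one copy of the minor allele and ma_count as the total number of minor alleles across samples, which is my reading of the README linked above. The function name and the toy dosages are hypothetical, and this is not necessarily how the GTEx pipeline computes them.

```python
def minor_allele_stats(alt_dosages):
    """Summarise minor-allele counts from ALT-allele dosages (0, 1 or 2 per sample)."""
    n_samples = len(alt_dosages)
    alt_freq = sum(alt_dosages) / (2 * n_samples)
    # If ALT is the major allele, count REF copies instead.
    if alt_freq <= 0.5:
        minor_dosages = list(alt_dosages)
    else:
        minor_dosages = [2 - d for d in alt_dosages]
    ma_count = sum(minor_dosages)                        # total minor alleles
    ma_samples = sum(1 for d in minor_dosages if d > 0)  # carriers of the minor allele
    maf = ma_count / (2 * n_samples)
    return {"ma_samples": ma_samples, "ma_count": ma_count, "maf": maf}


# Ten hypothetical samples: two heterozygotes and one minor-allele homozygote.
dosages = [0, 1, 2, 0, 0, 1, 0, 0, 0, 0]
stats = minor_allele_stats(dosages)
print(stats)
```

A homozygous carrier adds one to ma_samples but two to ma_count, so ma_count is always at least as large as ma_samples.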
Kevin
I noticed several different beta values for the eQTL between ENSG00000008128 and rs28544273. How should I interpret this? Thanks.
molecular_trait_id chromosome position ref alt variant ma_samples maf pvalue beta se type ac an r2 molecular_trait_object_id gene_id median_tpm rsid
ENSG00000008128.grp_1.contained.ENST00000356200 1 815963 T A chr1_815963_T_A 163 0.144112 0.84493 0.0240479 0.122892 SNP 178 1138 0.45909 ENSG00000008128.contained ENSG00000008128 5.626 rs28544273
ENSG00000008128.grp_1.contained.ENST00000356937 1 815963 T A chr1_815963_T_A 163 0.144112 0.777514 -0.0337367 0.119338 SNP 178 1138 0.45909 ENSG00000008128.contained ENSG00000008128 5.626 rs28544273
ENSG00000008128.grp_1.contained.ENST00000358779 1 815963 T A chr1_815963_T_A 163 0.144112 0.135332 0.178658 0.119457 SNP 178 1138 0.45909 ENSG00000008128.contained ENSG00000008128 5.626 rs28544273
ENSG00000008128.grp_1.contained.ENST00000378633 1 815963 T A chr1_815963_T_A 163 0.144112 0.355045 -0.114477 0.123676 SNP 178 1138 0.45909 ENSG00000008128.contained ENSG00000008128 5.626 rs28544273
Maybe this is caused by the different ENST (transcript) IDs?
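One quick way to check that idea is to split the molecular_trait_id values from the rows above and see whether each beta belongs to a different transcript. The sketch below assumes that the last dot-separated field of molecular_trait_id is the transcript ID, which is only inferred from how the rows look. The helper name is hypothetical.

```python
# (molecular_trait_id, beta) pairs copied from the four rows quoted above.
ROWS = [
    ("ENSG00000008128.grp_1.contained.ENST00000356200", 0.0240479),
    ("ENSG00000008128.grp_1.contained.ENST00000356937", -0.0337367),
    ("ENSG00000008128.grp_1.contained.ENST00000358779", 0.178658),
    ("ENSG00000008128.grp_1.contained.ENST00000378633", -0.114477),
]


def transcript_of(molecular_trait_id):
    """Return the last dot-separated field (assumed to be the transcript ID)."""
    return molecular_trait_id.rsplit(".", 1)[-1]


beta_by_transcript = {transcript_of(trait): beta for trait, beta in ROWS}
for transcript, beta in beta_by_transcript.items():
    print(transcript, beta)
```

All four rows share the same gene and variant but carry distinct ENST identifiers, which is consistent with one effect estimate per transcript rather than per gene.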
I have noticed that the *.signif_variant_gene_pairs file contains all the genes from the *.egenes file with qval ≤ 0.05. But I do not understand why there are many variants for the same gene in the *.signif_variant_gene_pairs file, when the *.egenes file lists only one variant per gene.
I understand. I think that the egenes file is merely a sort of 'annotation reference', and that the *.signif_variant_gene_pairs.txt.gz file is the main one that you should use.
Look at it another way: if you filter the egenes file for qval ≤ 0.05, then you will arrive at the list of genes that have at least one statistically significant association. You then have to look in the other file to determine the list of SNPs that comprise this statistically significant association. You should confirm with the website, though.
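The two-step lookup described above can be sketched with the standard library alone. The toy tables and their values are made up, and the column names (`gene_id`, `variant_id`, `qval`, `pval_nominal`) are what I would expect in the GTEx v8 files, so verify them against your own headers.

```python
import csv
import io
from collections import defaultdict

# Toy stand-ins for the two files (all values are made up).
# For real files, pass gzip.open(path, "rt") instead of io.StringIO(...).
TOY_EGENES = (
    "gene_id\tvariant_id\tqval\n"
    "ENSG_A\tchr1_100_A_G_b38\t0.001\n"
    "ENSG_B\tchr1_200_C_T_b38\t0.30\n"
    "ENSG_C\tchr2_300_G_A_b38\t0.02\n"
)
TOY_PAIRS = (
    "variant_id\tgene_id\tpval_nominal\n"
    "chr1_100_A_G_b38\tENSG_A\t1e-12\n"
    "chr1_150_T_C_b38\tENSG_A\t3e-9\n"
    "chr2_300_G_A_b38\tENSG_C\t2e-7\n"
)


def read_tsv(handle):
    """Parse a tab-separated text handle into a list of dicts keyed by header."""
    return list(csv.DictReader(handle, delimiter="\t"))


def variants_per_egene(egene_rows, pair_rows, qval_threshold=0.05):
    """Map each eGene (qval <= threshold) to its significant variants."""
    egene_ids = {
        row["gene_id"] for row in egene_rows if float(row["qval"]) <= qval_threshold
    }
    by_gene = defaultdict(list)
    for row in pair_rows:
        if row["gene_id"] in egene_ids:
            by_gene[row["gene_id"]].append(row["variant_id"])
    return dict(by_gene)


egene_rows = read_tsv(io.StringIO(TOY_EGENES))
pair_rows = read_tsv(io.StringIO(TOY_PAIRS))
significant = variants_per_egene(egene_rows, pair_rows)
print(significant)
```

In this toy example ENSG_A has a single row in the egenes table but two rows in the pairs table, which mirrors the one-variant-per-gene versus many-variants-per-gene pattern asked about above.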