I work with RNA-seq data and have found a few differentially expressed genes in a particular tissue sample. I have now been asked to work with GTEx data to look at differentially expressed genes across different tissue samples.
To get started with the GTEx data set, I first need to understand their sample codes. For example, which tissue does this sample come from?
GTEX-N7MS-0007-SM-2D7W1
I searched for an explanation of the GTEx barcodes but haven't found one. Could anyone give me some idea of how to decode the GTEx barcodes, and of how to perform this kind of analysis? I am sorry if this question is silly; I am completely new to the field of NGS.
The sample ID for an RNA-Seq or genotype sample is made up of the following 3 components separated by dashes, as illustrated by the example GTEX-14753-1626-SM-5NQ9L:
GTEX-YYYYY (e.g., GTEX-14753) represents the GTEx donor ID. This ID should be used to link the various RNA-Seq and genotype samples that come from the same donor.
YYYY (e.g., 1626) mostly refers to the tissue site, BUT we do not recommend using it for tissue site designation. Sometimes sample mix-ups occur and are corrected; however, this part of the ID does not change when that happens. The accurate tissue site designation for all samples can be obtained from the "Tissue Site Detail" field (encoded as "SMTSD") in the Sample Attributes file [Datasets->Download->GTEx_Data_V6_Annotations_SampleAttributesDS.txt].
SM-YYYYY (e.g., SM-5NQ9L) is the RNA or DNA aliquot ID used for sequencing.
Y stands for any number or capital letter.
SOURCE: https://sites.google.com/broadinstitute.org/gtex-faqs/home
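As a rough illustration of the ID structure described above, here is a minimal Python sketch that splits a sample ID into its three components and looks up the tissue from the SMTSD column. Several things are assumptions rather than anything stated in the FAQ: the function and class names are hypothetical, the Sample Attributes file is assumed to be tab-delimited with the sample ID in a column named SAMPID, and everything between the donor ID and the SM- aliquot ID is treated as the site code. The tissue values in the toy table are placeholders, not real GTEx annotations.

```python
import csv
import io
from typing import Dict, NamedTuple, TextIO


class GtexSampleId(NamedTuple):
    """Hypothetical container for the three parts of a GTEx sample ID."""
    donor_id: str    # e.g. "GTEX-14753"; links samples from the same donor
    site_code: str   # e.g. "1626"; NOT reliable for tissue designation
    aliquot_id: str  # e.g. "SM-5NQ9L"; RNA or DNA aliquot used for sequencing


def parse_sample_id(sample_id: str) -> GtexSampleId:
    """Split an ID such as GTEX-14753-1626-SM-5NQ9L into its components.

    Assumption: the ID starts with "GTEX-<donor>" and ends with "SM-<aliquot>";
    whatever lies in between is returned as the site code.
    """
    parts = sample_id.split("-")
    if len(parts) < 5 or parts[0] != "GTEX" or parts[-2] != "SM":
        raise ValueError(f"Unexpected GTEx sample ID format: {sample_id!r}")
    return GtexSampleId(
        donor_id="-".join(parts[:2]),
        site_code="-".join(parts[2:-2]),
        aliquot_id="-".join(parts[-2:]),
    )


def load_tissue_by_sample(handle: TextIO) -> Dict[str, str]:
    """Map sample ID -> tissue site detail (SMTSD).

    Assumption: the file is tab-delimited and has SAMPID and SMTSD columns.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["SAMPID"]: row["SMTSD"] for row in reader}


# Toy stand-in for the Sample Attributes file; tissue names are placeholders.
toy_attributes = io.StringIO(
    "SAMPID\tSMTSD\n"
    "GTEX-14753-1626-SM-5NQ9L\tPLACEHOLDER_TISSUE_A\n"
    "GTEX-N7MS-0007-SM-2D7W1\tPLACEHOLDER_TISSUE_B\n"
)
tissue_by_sample = load_tissue_by_sample(toy_attributes)

parsed = parse_sample_id("GTEX-N7MS-0007-SM-2D7W1")
tissue = tissue_by_sample["GTEX-N7MS-0007-SM-2D7W1"]
```

With the real annotation file, you would open the downloaded GTEx_Data_V6_Annotations_SampleAttributesDS.txt and pass the file handle to load_tissue_by_sample instead of the toy table. The tissue is looked up by the full sample ID rather than derived from the site code, which follows the FAQ's recommendation.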
Dear all,
May I ask whether the file
phs000424.v7.pht002742.v7.p2.c1.GTEx_Subject_Phenotypes.GRU.txt
is the same as https://storage.googleapis.com/gtex_analysis_v7/annotations/GTEx_v7_Annotations_SubjectPhenotypesDS.txt?
Among the phenotype files I cannot find one with a name like phsXXX. Or is this file part of the protected data? Thank you very much for your guidance! Best!