5.5 years ago
a.james
Hello All,
I have downloaded exon datasets (aligned BAM files). I now need the metadata for the same samples. How can I download or extract it, given the manifest information for all samples?
I have read through the GDC API documentation; however, I am not clear on how I could get the metadata as a TSV file.
I saw the following shell script using curl.
curl --request POST --header "Content-Type: application/json" --data @Payload.txt 'https://api.gdc.cancer.gov/files' > File_metadata.txt
However, I don't understand where to supply the list or set of UUIDs from the manifest file.
My questions:
- I have UUIDs in my manifest files for all the samples or datasets I have downloaded; now I need a metadata file in TSV format.
- How can I download it? Is there a shell or Python script for this?
Any help/suggestions are appreciated !
What information exactly are you looking for in the metadata? I hope the following code gives you some ideas (make sure you have jq installed):
Which returns:
Other metadata that may be available (it varies from UUID to UUID):
Thank you. I need the following information in the metadata file:
I can also get this from the BAM header, but I was looking for a programmatic way.
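For anyone landing here with the same question, one programmatic route can be sketched in Python. This is a minimal sketch under stated assumptions, not the code from the answer above. It assumes the manifest is the tab-separated file produced by the GDC portal with the UUIDs in a column named `id`, that the UUIDs go into the POST body as an `in` filter on `files.file_id`, and that `"format": "TSV"` asks the `/files` endpoint for tab-separated output. The helper names (`read_manifest_uuids`, `build_payload`, `fetch_metadata_tsv`) and the example field list are illustrative choices; please check the field names against the GDC documentation before relying on them.

```python
import csv
import io
import json
import urllib.request

GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def read_manifest_uuids(manifest_text):
    """Return the 'id' column of a tab-separated GDC manifest (assumed layout)."""
    reader = csv.DictReader(io.StringIO(manifest_text), delimiter="\t")
    return [row["id"] for row in reader if row.get("id")]


def build_payload(uuids, fields):
    """Build the JSON body for POST /files: filter on file UUIDs, request TSV."""
    uuids = list(uuids)
    return {
        "filters": {
            "op": "in",
            "content": {"field": "files.file_id", "value": uuids},
        },
        "format": "TSV",
        "fields": ",".join(fields),
        # Ask for as many records as there are UUIDs in the filter.
        "size": len(uuids),
    }


def fetch_metadata_tsv(payload):
    """POST the payload to the GDC files endpoint and return the response text."""
    request = urllib.request.Request(
        GDC_FILES_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Offline demonstration with placeholder UUIDs (not real GDC identifiers).
example_manifest = (
    "id\tfilename\tmd5\tsize\tstate\n"
    "uuid-1\tsample_a.bam\tplaceholder\t1\treleased\n"
    "uuid-2\tsample_b.bam\tplaceholder\t2\treleased\n"
)
example_fields = ["file_id", "file_name", "cases.submitter_id"]
example_payload = build_payload(read_manifest_uuids(example_manifest), example_fields)
print(json.dumps(example_payload, indent=2))
```

The printed JSON is the kind of content that would go into `Payload.txt` for the curl command quoted in the question, which is where the UUIDs from the manifest are supplied. Alternatively, passing the payload built from a real manifest to `fetch_metadata_tsv` and writing the returned text to a file should give the same result without curl.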
This thread continued here: C: Sample names for TCGA data from GDC-legacy archive
Thanks for the cross-reference, Kevin!