Bulk converting superseded TCGA UUIDs
5 months ago
loughrae ▴ 90

Hi everyone.

I have a set of about 10,000 superseded UUIDs that I need to convert to their current versions (apparently from release 12 (2018) to release 32 (2022)). UUIDhistory() from the TCGAutils R package works nicely, but only for one ID at a time, and looping over all the IDs is fairly slow because it has to make a separate query each time. Is there a better way to do this, or a table I can download that maps UUID versions?

Thanks!

R TCGAutils TCGA • 509 views

Tagging: Zhenyu Zhang

5 months ago
Zhenyu Zhang ★ 1.2k

You can use the GDC API versions endpoint.

For examples, search for "EXAMPLE OF RETRIEVING FILE VERSION INFORMATION" in the GDC API documentation: https://docs.gdc.cancer.gov/API/Users_Guide/Search_and_Retrieval/#example-of-retrieving-file-version-information.

Alternatively, if you already have the old UUIDs in a manifest, GDC-client has an option (probably on by default) that pulls the latest version of each file and also tells you which files were updated.
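For a long list of UUIDs, the first option can be batched so that each request carries many IDs instead of one. The following is a minimal Python sketch, not a tested client. It assumes the endpoint path and the comma-separated form shown in the linked documentation, assumes the response is a JSON list whose records carry `id` and `latest_id` fields, and uses an arbitrary batch size of 100; the helper names are made up for illustration.

```python
import json
import urllib.request

# Assumed base URL, taken from the GDC API documentation linked above.
VERSIONS_URL = "https://api.gdc.cancer.gov/files/versions"


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_url(uuids):
    """Comma-join one batch of UUIDs into a single versions-endpoint URL."""
    return f"{VERSIONS_URL}/{','.join(uuids)}"


def fetch_versions(uuids, batch_size=100):
    """Query the endpoint batch by batch and collect the returned records."""
    records = []
    for batch in chunked(uuids, batch_size):
        with urllib.request.urlopen(build_url(batch)) as response:
            payload = json.load(response)
        # Assumption: the response is a JSON list with one record per UUID.
        records.extend(payload if isinstance(payload, list) else [payload])
    return records


def map_to_latest(records):
    """Map each queried id to its latest id (field names are assumed)."""
    return {rec["id"]: rec.get("latest_id") for rec in records}
```

With 10,000 UUIDs this would issue about 100 requests rather than 10,000. Calling `map_to_latest(fetch_versions(old_uuids))` would then give an old-to-new lookup table, provided the field names match what the API actually returns.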


Thanks. I see the first option is to concatenate all the IDs within the curl command, which I'd like to avoid with 10,000 of them. The alternative seems to work only with manifests, not with a plain text file containing a list of UUIDs. Is there a way to convert a list of UUIDs to a manifest, or to format my UUIDs so they can be interpreted as one?


You can feed a manifest to the files API. I don't expect the API to check any columns other than "id". How about you just create a one-column text file with your UUIDs, with "id" as the first row?
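The one-column file suggested here could be produced and submitted along these lines. This is a sketch under assumptions: the manifest URL and the `text/tsv` content type follow the manifest example in the GDC API documentation, whether the endpoint accepts a manifest with only an "id" column is the guess made above and has not been verified, and the function names are hypothetical.

```python
import urllib.request

# Assumed endpoint for manifest uploads, per the GDC API documentation.
MANIFEST_URL = "https://api.gdc.cancer.gov/files/versions/manifest?format=tsv"


def write_manifest(uuids, path):
    """Write a one-column manifest: an "id" header, then one UUID per row."""
    with open(path, "w", newline="") as handle:
        handle.write("id\n")
        for uuid in uuids:
            cleaned = uuid.strip()
            if cleaned:  # skip blank lines from the input list
                handle.write(f"{cleaned}\n")


def post_manifest(path):
    """POST the manifest file and return the response body as text."""
    with open(path, "rb") as handle:
        body = handle.read()
    request = urllib.request.Request(
        MANIFEST_URL,
        data=body,
        headers={"Content-Type": "text/tsv"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

If the endpoint rejects the single-column file, the missing manifest columns would need to be filled in, which this sketch does not attempt.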
