Hello,
I got my data from UK Biobank, for 502536 subjects. I would like to determine which subjects have diabetes-related complications, in order to distinguish cases from controls and perform a GWAS on that data.
Right now I can load my data in R:
library(ukbtools)
my_ukb_data <- ukb_df("ukb31212")
To look up ICD10 codes by keyword I can use this:
ukb_icd_keyword("diabetes", icd.version = 10)
and I get about 20 codes listed with their descriptions. But when I then try, for example, the E13 code:
> ukb_icd_prevalence(my_ukb_data, icd.version = 10, icd.diagnosis = "E13")
Error in ukb_icd_prevalence(my_ukb_data, icd.version = 10, icd.diagnosis = "E13") :
unused argument (icd.diagnosis = "E13")
Is this an issue with the ukbtools software, or are there no subjects in my dataset associated with E13? Is there any other software you would recommend for exploring/assessing diabetic complications in UK Biobank data?
Thanks
The UKB-supplied programs (in particular ukbconv, https://biobank.ctsu.ox.ac.uk/crystal/download.cgi) allow you to decrypt and convert the data to any format you prefer. You are free to use R, Python, Stata or whatever statistical software you are most comfortable with to analyse the data.
I wrote the R package ukbtools (https://kenhanscombe.github.io/ukbtools/index.html) to remove the upfront data wrangling required to marry the separate pieces of data into a single dataframe and begin analysis. It includes functionality to query disease diagnoses and demographics. It is fully documented here: https://kenhanscombe.github.io/ukbtools/reference/index.html
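On the error itself: "unused argument" means that ukb_icd_prevalence does not accept an argument called icd.diagnosis, so it says nothing about whether any subject has E13. The function reference appears to name that argument icd.code, so it is worth checking the documented signature and retrying with that name. If per-subject case/control flags are needed rather than a single prevalence, the base R sketch below shows one possible approach on a made-up toy data frame. It assumes that the ICD10 diagnosis columns have names starting with "diagnoses" and containing "icd10", and that the ID column is called eid. The helper has_icd10 and the toy values are hypothetical and are not part of ukbtools.

```r
# Toy stand-in for the data frame returned by ukb_df(); all values are made up.
# Assumption: ICD10 diagnosis columns have names matching "^diagnoses.*icd10".
toy_ukb_data <- data.frame(
  eid = c(1001, 1002, 1003, 1004),
  diagnoses_main_icd10_f41202_0_0 = c("E119", NA, "I251", "E136"),
  diagnoses_secondary_icd10_f41204_0_0 = c(NA, "E132", NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for every row in which at least one
# diagnosis column matches the regular expression `pattern`.
has_icd10 <- function(data, pattern) {
  icd_cols <- grep("^diagnoses.*icd10", names(data), value = TRUE)
  hits <- vapply(
    data[icd_cols],
    function(column) grepl(pattern, as.character(column)),  # NA counts as no match
    logical(nrow(data))
  )
  # Force a rows-by-columns matrix so rowSums also works with a single column
  rowSums(matrix(hits, nrow = nrow(data))) > 0
}

# Flag subjects with any code beginning with E13 (E130, E132, ...)
toy_ukb_data$e13_case <- has_icd10(toy_ukb_data, "^E13")

table(toy_ukb_data$e13_case)  # counts of controls (FALSE) and cases (TRUE)
mean(toy_ukb_data$e13_case)   # proportion of cases in the toy data
```

The same call should work on the real data frame if its column names follow the assumed pattern. A broader regular expression such as "^E1[0-4]" would flag all diabetes mellitus codes rather than E13 alone.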