Reading multiple txt files and performing two-sample Mendelian randomization in R
3.7 years ago

I am trying to perform two-sample Mendelian randomization on over 300 text files (exposure data). All the files have the same column names (Bacteria, chr, bp, rsID, ref.allele, eff.allele, beta, SE, Pvalue). I have kept all the txt files in one folder.

How can I load the txt files in the folder and perform two-sample Mendelian randomization on each of them with a single R script? I tried a for loop, but I don't know how to substitute the file name inside the loop. My ultimate goal is to run the analysis on each txt file and keep the result of each analysis.

setwd("D:/Project")
data_files <- list.files("p less than 5e-8")   # Identify file names in the folder "p less than 5e-8"
data_files                                      # Print all the file names

library(TwoSampleMR)
ao <- available_outcomes()
for(i in 1:length(data_files)) {                # Head of for-loop
  exposure_dat <- read_exposure_data(
    filename = data_files[i],
    sep = '\t',
    snp_col = 'rsID',
    beta_col = 'beta',
    se_col = 'SE',
    effect_allele_col = 'eff.allele',
    phenotype_col = '',
    units_col = '',
    other_allele_col = 'ref.allele',
    eaf_col = '',
    samplesize_col = '',
    ncase_col = '',
    ncontrol_col = '',
    gene_col = '',
    pval_col = 'P.value'
  )
  exposure_dat <- clump_data(exposure_dat)
  exposure_dat
  outcome_dat <- extract_outcome_data(exposure_dat$SNP, c('ieu-a-1058'), proxies = 1, rsq = 0.8, align_alleles = 1,
                                      palindromes = 1, maf_threshold = 0.3)
  dat <- harmonise_data(exposure_dat, outcome_dat, action = 2)
  mr_results <- mr(dat)
  mr_results
}

The above is my for-loop R script. However, it does not work.

If I input the files one by one using the following script, where filename = '2_family.Bifidobacteriaceae.txt' specifies the file to analyse, it works perfectly.

library(TwoSampleMR)
ao <- available_outcomes()
exposure_dat <- read_exposure_data(
   filename = '2_family.Bifidobacteriaceae.txt',
   sep = '\t',
   snp_col = 'rsID',
   beta_col = 'beta',
   se_col = 'SE',
   effect_allele_col = 'eff.allele',
   phenotype_col = '',
   units_col = '',
   other_allele_col = 'ref.allele',
   eaf_col = '',
   samplesize_col = '',
   ncase_col = '',
   ncontrol_col = '',
   gene_col = '',
   pval_col = 'Pvalue'
)
exposure_dat <- clump_data(exposure_dat)
outcome_dat <- extract_outcome_data(exposure_dat$SNP, c('ieu-a-1058'), proxies = 1, rsq = 0.8, align_alleles = 1, 
palindromes = 1, maf_threshold = 0.3)
dat <- harmonise_data(exposure_dat, outcome_dat, action = 2)
mr_results <- mr(dat)
3.7 years ago
Sam ★ 4.8k

One way to do it is to use files <- list.files(pattern = "txt$"), which stores all the matching file names in the files object.

You can then loop over them with:

for(i in files){
    exposure_dat <- read_exposure_data( filename = i, .....)
    .......
}
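
To fill in the elided parts above, here is a minimal sketch, not a tested pipeline. It assumes the folder name and column names given in the question, and it only uses the TwoSampleMR calls that already appear there. The wrapper name run_mr_for_file and the results list are hypothetical. Two details differ from the loop in the question: list.files() is called with full.names = TRUE so that each path includes the folder, and pval_col is 'Pvalue' to match the column list in the question.

```r
# Sketch only: wraps the steps from the question in one function per file.
# `run_mr_for_file` is a hypothetical helper name, not part of TwoSampleMR.
run_mr_for_file <- function(path, outcome_id = "ieu-a-1058") {
  exposure_dat <- TwoSampleMR::read_exposure_data(
    filename = path,
    sep = "\t",
    snp_col = "rsID",
    beta_col = "beta",
    se_col = "SE",
    effect_allele_col = "eff.allele",
    other_allele_col = "ref.allele",
    pval_col = "Pvalue"   # column name as listed in the question
  )
  exposure_dat <- TwoSampleMR::clump_data(exposure_dat)
  outcome_dat <- TwoSampleMR::extract_outcome_data(
    snps = exposure_dat$SNP, outcomes = outcome_id,
    proxies = TRUE, rsq = 0.8, align_alleles = 1,
    palindromes = 1, maf_threshold = 0.3
  )
  if (is.null(outcome_dat)) return(NULL)   # no SNPs found for this outcome
  dat <- TwoSampleMR::harmonise_data(exposure_dat, outcome_dat, action = 2)
  TwoSampleMR::mr(dat)
}

# full.names = TRUE returns "folder/file.txt" rather than the bare file name
folder <- "p less than 5e-8"
files <- list.files(folder, pattern = "\\.txt$", full.names = TRUE)

# Keep one result per file; files that fail are reported and left out
results <- list()
for (f in files) {
  results[[basename(f)]] <- tryCatch(
    run_mr_for_file(f),
    error = function(e) {
      message("Failed for ", f, ": ", conditionMessage(e))
      NULL
    }
  )
}
```

After the loop, results[["2_family.Bifidobacteriaceae.txt"]] would hold the mr() output for that file, provided the file was processed without error.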

Vectorization is better than loops.

Use lapply(files, read_exposure_data, sep="\t", snp_col='rsID',...)

Thank you, Sam. I tried your method and I think it should work. However, TwoSampleMR prints the message 'Please look at vignettes for options on running this locally if you need to run many instances of this command.'

Thank you, Ram. I agree that lapply would be better and faster. The usage is read_exposure_data(filename, sep = "\t", ......); I am trying to work out how to call it through lapply.

It's not intuitive, but you can pass all arguments that you'd pass to read_exposure_data directly to lapply instead. lapply will pass them to each call of read_exposure_data. Example:

lapply(files, read.table, header=TRUE, sep="\t", stringsAsFactors=FALSE)
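
A small self-contained sketch of that argument forwarding, using base R only. The temporary folder, file names and values are made up for illustration; the same pattern should carry over to read_exposure_data with its own arguments.

```r
# Create two small tab-separated files in a temporary folder (made-up values)
tmp <- file.path(tempdir(), "lapply_demo")
dir.create(tmp, showWarnings = FALSE)
for (nm in c("a.txt", "b.txt")) {
  write.table(
    data.frame(rsID = c("rs1", "rs2"), beta = c(0.1, -0.2)),
    file.path(tmp, nm), sep = "\t", quote = FALSE, row.names = FALSE
  )
}

files <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)

# Arguments after the function name are forwarded to every read.table() call
tables <- lapply(files, read.table, header = TRUE, sep = "\t",
                 stringsAsFactors = FALSE)
names(tables) <- basename(files)

# Stack the per-file tables, adding a column that records the source file
combined <- do.call(
  rbind,
  Map(function(d, n) cbind(file = n, d), tables, names(tables))
)
```

Naming the list elements after the files makes it easy to trace each result back to its input, which matters with 300 or more files.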