Question

GWAS summary statistic

4.7 years ago

brendaumoh6 ▴ 10

Can someone please tell me how to use already-downloaded summary statistics with a .txt.gz extension in MTAR for a GWAS study?

summary statistic • 3.3k views

ADD COMMENT • link 4.7 years ago by brendaumoh6 ▴ 10

Please describe the data that you have in more detail. 'Summary statistic' is quite general. Where did you download it from, and what does it contain?

ADD REPLY • link 4.7 years ago by Kevin Blighe 88k

I downloaded it from GWAS (FTP download). It is for a specific trait of interest.

ADD REPLY • link 4.7 years ago by brendaumoh6 ▴ 10

ADD REPLY • link 4.7 years ago by Kevin Blighe 88k

I guess you're not a Linux user. Is this question IT-related or biology-related?

ADD REPLY • link 4.7 years ago by Asaf 10k

I am using PuTTY. The question is related to computational biology.

ADD REPLY • link 4.7 years ago by brendaumoh6 ▴ 10

I uncompressed the summary statistics and am trying to read the file into R, but it keeps giving error messages.

This is the command I used:

setwd("C:/Users/Hp/Downloads/DBPannotated.csv")
data = read.table("DBPannotated.txt", sep = "\t")

This is the output:

Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'DBPannotated.txt': No such file or directory

Please, I need help reading the data into R.

ADD REPLY • link updated 4.7 years ago by Ram 44k • written 4.7 years ago by brendaumoh6 ▴ 10

Thank you. Why are you running this:

setwd("C:/Users/Hp/Downloads/DBPannotated.csv")

Do you have a directory named DBPannotated.csv or is that the file that you need to read [into R]?

Why do you then try to read a file with a 'txt' extension (DBPannotated.txt)?

A few questions:

what is the file that you want to read, and where is it located (full path)?
what is the output of getwd() when you open R?

Thank you!

ADD REPLY • link 4.7 years ago by Kevin Blighe 88k
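
One way to answer these questions from inside R is to ask R whether a path is a directory or a file before passing it to setwd() or read.table(). The following is only a minimal sketch: the temporary folder and the file name `example_stats.txt` are made up for illustration and stand in for the real Downloads folder and summary statistics file.

```r
# Hypothetical stand-ins for the real folder and file.
work_dir <- file.path(tempdir(), "gwas_example")
dir.create(work_dir, showWarnings = FALSE)
stats_file <- file.path(work_dir, "example_stats.txt")
writeLines(c("SNP\tP", "rs1\t0.01"), stats_file)

# setwd() needs a directory; read.table() needs a file.
dir.exists(work_dir)     # TRUE: safe to pass to setwd()
dir.exists(stats_file)   # FALSE: this is a file, not a directory
file.exists(stats_file)  # TRUE: safe to pass to read.table()

# Reading with the full path avoids depending on the working directory.
stats <- read.table(stats_file, sep = "\t", header = TRUE)
print(stats)
```

If file.exists() returns FALSE for the name being passed to read.table(), the "cannot open file ... No such file or directory" error is expected, and list.files() on the folder shows the names R can actually see.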

Yes, I have a directory called DBPannotated.csv (I have just removed the .csv extension from the directory name).

The file initially had a .txt.gz extension, so I used WinZip to uncompress it, which produced a .txt file.

1) The file I want to read was DBPannotated.txt (renamed to DBPannotated.csv), and its full path is "C:/Users/Hp/Downloads/DBPannotated.csv", where the directory DBPannotated.csv has since been renamed to DBPannotated.

2). getwd()

C:/Users/Hp/Downloads/DBPannotated

NB: Even after I renamed both the directory and the file extension, it still gave me the same error. I need assistance on how to go about it.

ADD REPLY • link updated 4.7 years ago by Ram 44k • written 4.7 years ago by brendaumoh6 ▴ 10

I see!

How about this command:

list.files('.', full = TRUE)

What does it output?

ADD REPLY • link 4.7 years ago by Kevin Blighe 88k

> list.files('.', full = TRUE)
 [1] "./Circular-Manhattan.P.jpg"                   
 [2] "./Circular-Manhattan.trait1.trait2.trait3.jpg"
 [3] "./DBPannotated.csv.Rproj"                     
 [4] "./DBPannotated.csv.txt"                       
 [5] "./DBPannotated.txt.gz"                        
 [6] "./MANHATTAN TUTORIAL.R.RData"                 
 [7] "./Rplot.png"                                  
 [8] "./Rplot01.png"                                
 [9] "./Rplot02.png"                                
[10] "./Rplot03.png"                                
[11] "./Rplot04.png"                                
[12] "./Rplot05.png"                                
[13] "./Rplot06.png"                                
[14] "./SBPannotated.txt"                           
[15] "./SBPannotated.txt.gz"                        
>

Those are my files in that directory.

ADD REPLY • link updated 4.7 years ago by Ram 44k • written 4.7 years ago by brendaumoh6 ▴ 10

I see... in that case, this will work:

data = read.table("DBPannotated.csv.txt", sep = "\t")

ADD REPLY • link 4.7 years ago by Kevin Blighe 88k

OK. I tried:

data = read.table("SBPannotated.txt",sep = "\t")

and it worked. It is currently loading, with no more error messages. Thanks a lot.

I have another challenge. I tried generating a Manhattan plot using the MTAR package example data, but the script I have for Manhattan plots is not compatible with that dataset: it needs SNP, CHR and BP columns, and the MTAR package dataset doesn't have those columns. Any idea how to generate Manhattan plots from MTAR results?

ADD REPLY • link updated 4.7 years ago by Ram 44k • written 4.7 years ago by brendaumoh6 ▴ 10

Which columns do you have?

ADD REPLY • link 4.7 years ago by Kevin Blighe 88k

I can't view my columns because the file is too large to open, so I wanted to read it into R first before viewing it. Or do you have an idea of how to view such large files?

ADD REPLY • link 4.7 years ago by brendaumoh6 ▴ 10

But you have already read the file into R, as the object data? So, just type:

colnames(data)

Let me know if that works!

ADD REPLY • link 4.7 years ago by Kevin Blighe 88k
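
For a file that is too large to open in an editor, one option is to look at only the first few lines instead of loading everything. This is a rough sketch on a tiny made-up file; the path `big_file` and its columns (SNP, CHR, BP, P) are hypothetical, and header = TRUE assumes that the first line of the real file holds the column names.

```r
# Hypothetical small file standing in for a multi-gigabyte summary statistics file.
big_file <- tempfile(fileext = ".txt")
writeLines(c("SNP\tCHR\tBP\tP",
             "rs1\t1\t1000\t0.5",
             "rs2\t1\t2000\t0.01",
             "rs3\t2\t1500\t0.2"), big_file)

# Look at the raw text of the first lines without parsing anything.
readLines(big_file, n = 2)

# Parse only the first few rows; nrows limits how much of the file is read.
preview <- read.table(big_file, sep = "\t", header = TRUE, nrows = 2)
colnames(preview)
str(preview)
```

Without header = TRUE, read.table() would treat the header line of a file like this as data and name the columns V1, V2, and so on, so it is worth checking the raw first line with readLines() before deciding.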

It is still loading. I don't know why it is taking so long.

ADD REPLY • link 4.7 years ago by brendaumoh6 ▴ 10

colnames(data)
NULL

ADD REPLY • link 4.7 years ago by brendaumoh6 ▴ 10

str(data)

?

ADD REPLY • link 4.7 years ago by Kevin Blighe 88k

str(data)
function (..., list = character(), package = NULL, lib.loc = NULL, verbose = getOption("verbose"), 
    envir = .GlobalEnv, overwrite = TRUE)

ADD REPLY • link updated 4.7 years ago by GenoMax 147k • written 4.7 years ago by brendaumoh6 ▴ 10

So, it was not even able to read in the data (to R).

What was the output of

data = read.table("SBPannotated.txt",sep = "\t")

It looks like you are using Windows? - Windows is terrible for working with large datasets.

ADD REPLY • link 4.7 years ago by Kevin Blighe 88k
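
The str(data) output shown earlier is the signature of R's built-in data() function from the utils package, which suggests that no object named data had been created yet when it was run. A minimal sketch of how this can be checked; the object name `sumstats` and the tiny temporary file are chosen here purely for illustration.

```r
# With no user object called `data`, the name resolves to the function utils::data.
is.function(utils::data)   # TRUE
colnames(utils::data)      # NULL, matching the output seen in the thread

# Using a distinct name makes it obvious whether the read has finished.
tmp <- tempfile(fileext = ".txt")
writeLines(c("SNP\tP", "rs1\t0.01"), tmp)
sumstats <- read.table(tmp, sep = "\t", header = TRUE)

exists("sumstats")         # TRUE once the assignment has completed
is.data.frame(sumstats)    # TRUE
```

Choosing a name other than data for the table avoids this ambiguity: if the read has not finished or has failed, R reports that the object is not found instead of silently falling back to the built-in function.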

Yes, you are right. Windows is very poor for large data, and my data is around 7 GB in size. I have moved to Linux. Hopefully it will run faster.

ADD REPLY • link 4.7 years ago by brendaumoh6 ▴ 10

I am an MSc student in Bioinformatics, currently doing my Master's project on a GWAS analysis to detect SNPs that are associated with hypertension. I would appreciate your email address or Skype details so that I can reach you more easily, because there is a limit on the number of messages per day on Biostars. I need a supervisor who is good at bioinformatics; my own supervisor is not within reach, and I only get through to him once a week. I would appreciate your assistance. Thanks.

ADD REPLY • link 4.7 years ago by brendaumoh6 ▴ 10

GenoMax · Answer 1 · 2020-03-14

4.7 years ago

Kevin Blighe 88k

Please take a look at the examples in the vignette: Multi-trait analysis of rare-variant association summary statistics using MTAR

Use gunzip to uncompress your *.txt.gz file, and then read it into R via read.table(), read.csv(), or some other read function. Note that, depending on your version of R, these basic read functions can detect compressed data and uncompress it 'on the fly' for you.

Kevin

ADD COMMENT • link 4.7 years ago by Kevin Blighe 88k
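
As a rough illustration of the 'on the fly' route, the sketch below reads a gzip-compressed, tab-separated file through base R's gzfile() connection without uncompressing it first. The file is a tiny made-up example written to a temporary path; its column names (SNP, CHR, BP, P) and the object name `sumstats` are assumptions, not the layout of any real download.

```r
# Hypothetical gzip-compressed file standing in for a downloaded *.txt.gz.
gz_path <- tempfile(fileext = ".txt.gz")
con <- gzfile(gz_path, "w")
writeLines(c("SNP\tCHR\tBP\tP",
             "rs1\t1\t1000\t0.5",
             "rs2\t2\t2000\t0.01"), con)
close(con)

# gzfile() opens a decompressing connection; read.table() reads from it directly.
sumstats <- read.table(gzfile(gz_path), sep = "\t", header = TRUE)
print(sumstats)
```

Reading straight from the .gz file also sidesteps the renaming problems that can appear when an archive tool unpacks the file under an unexpected name.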

I uncompressed the file and tried reading it into R again, but it still shows the same error message:

data = read.csv("SBPannotated",sep = "\t")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'SBPannotated': No such file or directory

ADD REPLY • link updated 4.7 years ago by GenoMax 147k • written 4.7 years ago by brendaumoh6 ▴ 10