I cannot read data into R (read.affy)
8.6 years ago
anm17 ▴ 20

Hello community

I am facing the following problem.

I installed Bioconductor in R, downloaded the raw data of a GEO (NCBI) Series through R, unzipped it, and saved it in a new folder called data. To read the CEL files, I created a phenodata.txt file in that folder from the terminal (OS X: ls data/*.CEL > data/phenodata.txt). The file was created without problems; I then edited it in Excel so that it has three columns and looks like this:

Name        FileName    Target
GSM692115   GSM692115   control
GSM692116   GSM692116   rhinovirus
GSM692117   GSM692117   cigarette smoke extract

I made sure to use only tabs, not spaces, and then saved the file as .txt (tab-delimited text).

Now, when I try to read this file into R, I get the following error, which I don't understand because I have just three columns with three column names:

celfiles <- read.affy(covdesc="phenodata.txt", path="data")
Fehler in read.table(filename, sep = sep, header = header, quote = quote, :
  more columns than column names

(Fehler = Error in German ^^)

I appreciate any answers,

M

R read.affy phenodata • 4.4k views

Your data doesn't fit what the read.table function expects. Perhaps your header, or a value containing spaces, is causing the problem. Change the spaces (" ") to underscores ("_") and try again. You can also post a snippet of your data table, and check the help page with ?read.affy.
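One way to check this suggestion is to count the fields on each line of the file. The sketch below rebuilds a small file from the rows shown in the question (the temporary file names are made up for illustration) and compares tab splitting with whitespace splitting using base R's count.fields(). Whether read.affy() splits on whitespace depends on the package version, so treat that part as an assumption and check the function's source if in doubt.

```r
# Recreate a small phenodata file like the one in the question (tab-delimited).
# The temporary file is only for illustration.
pheno_file <- tempfile(fileext = ".txt")
writeLines(c(
  "Name\tFileName\tTarget",
  "GSM692115\tGSM692115\tcontrol",
  "GSM692116\tGSM692116\trhinovirus",
  "GSM692117\tGSM692117\tcigarette smoke extract"
), pheno_file)

# Fields per line when only tabs separate the columns.
fields_tab <- count.fields(pheno_file, sep = "\t")

# Fields per line when any whitespace separates the columns (sep = "").
fields_ws <- count.fields(pheno_file, sep = "")

print(fields_tab)
print(fields_ws)

# Replace spaces (tabs are left alone) with underscores and count again.
fixed_file <- tempfile(fileext = ".txt")
writeLines(gsub(" ", "_", readLines(pheno_file), fixed = TRUE), fixed_file)
fields_fixed <- count.fields(fixed_file, sep = "")
print(fields_fixed)
```

If the whitespace count for some line is larger than the count for the header line, a reader that splits on whitespace will see more columns than column names, which would match the error in the question.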

8.6 years ago

Looks like another Excel bug to me...

Sometimes Excel saves empty cells into the .txt file, creating extra tabs that confuse R's read.table. A quick way to prevent this is to reopen the file in Excel and copy your data (and no other cells) into a new, clean sheet. Save that sheet as .txt again (tab-delimited), and you should be free of those stray tabs.
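If you would rather check for stray tabs in R than redo the sheet in Excel, a minimal sketch could look like this. The file content is simulated and the file names are hypothetical; it only handles tabs at the end of a line, not empty columns in the middle.

```r
# Simulate an Excel export where empty cells left trailing tabs on one line.
excel_file <- tempfile(fileext = ".txt")
writeLines(c(
  "Name\tFileName\tTarget",
  "GSM692115\tGSM692115\tcontrol\t\t",
  "GSM692116\tGSM692116\trhinovirus"
), excel_file)

# Lines with trailing tabs report more tab-separated fields than the header.
before <- count.fields(excel_file, sep = "\t")
print(before)

# Strip any run of tabs at the end of each line and write a cleaned copy.
clean_file <- tempfile(fileext = ".txt")
clean_lines <- sub("\t+$", "", readLines(excel_file))
writeLines(clean_lines, clean_file)

after <- count.fields(clean_file, sep = "\t")
print(after)
```

After cleaning, every line should report the same number of fields as the header.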

8.6 years ago
anm17 ▴ 20

Thank you for your answers,

I don't know if I should repost this as a new question, but I tried to solve the problem by creating the phenodata file in R:

pheno_data <- data.frame(
  c( # Name
    "GSM692115.CEL", "GSM692116.CEL", ....),
  c( # FileName
    "GSM692115.CEL", "GSM692116.CEL", ...),
  c( # Target
    "control", "rhinovirus", "cigarette smoke extract", "rhinovirus and cigarette smoke extract", .....))
colnames(pheno_data) <- c("Name", "FileName", "Target")
write.table(pheno_data, "datafile/phenodata.txt", row.names=FALSE, quote=FALSE)
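Note that write.table() separates fields with a single space by default, so a multi-word Target value ends up indistinguishable from several columns. Here is a hedged sketch of the same idea with a tab separator and underscores in the Target column. The three rows are taken from the example in the question, the output path is a temporary file used only for illustration, and the read-back with whitespace splitting is an assumption about how the file is parsed later.

```r
# Build the phenodata table with named columns (rows taken from the question).
pheno_data <- data.frame(
  Name     = c("GSM692115.CEL", "GSM692116.CEL", "GSM692117.CEL"),
  FileName = c("GSM692115.CEL", "GSM692116.CEL", "GSM692117.CEL"),
  Target   = c("control", "rhinovirus", "cigarette smoke extract"),
  stringsAsFactors = FALSE
)

# Replace spaces inside Target values so each value is a single token.
pheno_data$Target <- gsub(" ", "_", pheno_data$Target, fixed = TRUE)

# Write a tab-delimited file; the default separator would be a single space.
out_file <- tempfile(fileext = ".txt")
write.table(pheno_data, out_file, sep = "\t", row.names = FALSE, quote = FALSE)

# Read it back, splitting on any whitespace, to confirm it parses as 3 columns.
check <- read.table(out_file, header = TRUE, sep = "", stringsAsFactors = FALSE)
print(check)
```

If the read-back shows three columns and one row per CEL file, the file should no longer trigger the column-count error for this reason.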

I made sure that the folder 'datafile' contains the same number of .cel files, with the same file names, as the phenodata file, and I deleted all the other .chp files. But I still get the same error.

Interestingly, I tried the same procedure with another GEO Series whose raw data included only the .cel files, and it worked.

So I downloaded just the .cel files from the first Series I'm investigating, but I got the same error: more columns than column names.

I really don't see why it is not working for my Series :/


Maybe you can try passing fill=TRUE to the read.affy() function and see what it does.


Unfortunately it did not help, but thank you very much!
