Question

Create A Tab-Delimited Annotation Table From Csv Annotation File

0

11.8 years ago

Jason ▴ 20

Hi,

I want to create a tab-delimited annotation table from an Affymetrix SNP6 CSV annotation file. I only want the annotation table that sits in the middle of the CSV file, without the extra lines around it.

I tried using the "AffyCompatible" package, but it downloads the latest "na" version, i.e. 33, whereas I am only interested in the older version "na30".

I am following the vignette from here

When I try to give a user-defined value to the affxUrl argument, i.e.

URL <- "http://media.affymetrix.com/analysis/downloads/na30/genotyping/GenomeWideSNP_6.na30.annot.csv.zip"
df <- readAnnotation(rsrc, annotation=anno, affxUrl=URL)

I get this error:

Error in read.table(file = file, header = header, sep = sep, quote = quote,  : unused argument(s) (affxUrl = "http://media.affymetrix.com/analysis/downloads/na30/genotyping/GenomeWideSNP_6.na30.annot.csv.zip")

Any help would be great!

Thanks

Jason

affymetrix r • 4.3k views

ADD COMMENT • link updated 11.8 years ago by Neilfws 49k • written 11.8 years ago by Jason ▴ 20

2

Can you paste some lines from the input file and give an example of how your output file should look?

ADD REPLY • link 11.8 years ago by Ashutosh Pandey 12k

1

I downloaded the file that you mentioned above. The archive contains two files: one explains the format and the number of columns in the big file. Can you tell me which columns you need? You probably also want to remove the header from the big file, but I don't see any unnecessary lines at the bottom of the file.

ADD REPLY • link 11.8 years ago by Ashutosh Pandey 12k

1

I have split that big file into 12 smaller files, each with 50,000 rows. I think you can open them in Excel and process them the way you want. You can download them using the link I have sent you.

ADD REPLY • link 11.8 years ago by Ashutosh Pandey 12k

0

Thanks a bunch! I'm sure it's going to do the job. Still, it would be great if you could tell me how you did it.

ADD REPLY • link 11.8 years ago by Jason ▴ 20

0

Actually, I should have given you the command in the first place. Anyway, here is the command to split the file into smaller files in Unix.

split -a 2 -l 50000 -d input.txt split.file.

-l sets the maximum number of lines in each output file, -a 2 makes the suffixes two characters long, and -d makes them numeric (00, 01, ...). The last string, 'split.file.', is the prefix of all the new files that will be produced.
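
To see what the flags do without touching the real annotation file, here is a small sketch on a throwaway input. The temporary directory, the 120-line file, and the chunk size of 50 are made up for the demo, and -d assumes a split that supports numeric suffixes (GNU coreutils does).

```shell
# Build a throwaway 120-line input (a hypothetical stand-in for input.txt).
workdir=$(mktemp -d)
seq 1 120 > "$workdir/input.txt"

# Same flags as the command above, with a smaller chunk size for the demo:
#   -l 50  at most 50 lines per output file
#   -a 2   two-character suffixes
#   -d     numeric suffixes (00, 01, ...) instead of aa, ab, ...
split -a 2 -l 50 -d "$workdir/input.txt" "$workdir/split.file."

# 120 lines in chunks of 50 gives three pieces:
# split.file.00 (50 lines), split.file.01 (50 lines), split.file.02 (20 lines)
wc -l "$workdir"/split.file.*
```

Note that split cuts purely by line count, so only the first piece contains the header lines.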

ADD REPLY • link 11.8 years ago by Ashutosh Pandey 12k

score 0 · Answer 1 · 2013-11-12

0

11.8 years ago

Neilfws 49k

You'll note that the vignette is almost 6 years old. Bioconductor changes rapidly; code is unlikely to work now.
The error indicates that you are passing "affxUrl =" as a parameter to read.table(). This is not a recognised argument to that function, hence the error.
You could extract the required lines from the CSV file using grep or awk, extract fields using cut, and substitute tabs for commas using sed.
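
A rough sketch of that kind of pipeline follows. It rests on assumptions the thread does not confirm: that the unwanted lines start with '#', that every field is wrapped in double quotes, and that fields are separated by "," with no escaped quotes inside them. The function name csv_to_tsv and the sample rows are made up for illustration. It uses awk rather than a plain comma-to-tab sed substitution so that commas inside quoted fields are left alone.

```shell
# csv_to_tsv: hypothetical helper, not part of any Affymetrix tool.
# Assumes: non-table lines start with '#'; every field is double-quoted;
# fields are separated by "," ; no field contains a tab or an escaped quote.
csv_to_tsv() {
    awk '!/^#/ && NF {
        sub(/\r$/, "")        # drop a Windows carriage return, if present
        sub(/^"/, "")         # strip the opening quote of the first field
        sub(/"$/, "")         # strip the closing quote of the last field
        gsub(/","/, "\t")     # turn each "," field boundary into a tab
        print
    }' "$1"
}

# Made-up sample; the real file has many more columns and rows.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# made-up comment line
"Probe Set ID","Chromosome","Note"
"SNP_A-0000001","1","alpha, beta"
"SNP_A-0000002","X","gamma"
# made-up trailing comment
EOF

csv_to_tsv "$sample"               # whole table, tab-delimited
csv_to_tsv "$sample" | cut -f1,2   # only the first two columns
```

On the real file, the call would look something like csv_to_tsv GenomeWideSNP_6.na30.annot.csv > annot.tsv, assuming the CSV has already been unzipped under that name.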

ADD COMMENT • link 11.8 years ago by Neilfws 49k

0

You're right, the code is quite old. But then again, I don't have many alternatives, or at least none that I'm aware of. affxUrl is an argument to readAnnotation(), which itself calls read.table() implicitly. The help file states that affxUrl usually need not be overridden by the user, but in this case I need to, as I require a different (na30) file from a different path. The first 22 rows of the CSV file are not required; I identified them using Excel. There are similar unnecessary lines at the end of the file, but I can't tell how many, because the file doesn't load completely into Excel and tail() on it shows only a limited number of them.
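
Since the file is too large for Excel, one way to find out where the table starts and ends is to let awk report every line that does not look like a table row. This is only a sketch: it assumes that non-table lines either start with '#' or have a different number of "," separated fields than the first table line, and the function name and sample are hypothetical.

```shell
# report_non_table_lines: hypothetical helper.
# Prints "line number: text" for every line that is a '#' comment, is empty,
# or has a different field count than the first table line (the header).
report_non_table_lines() {
    awk -F'","' '
        /^#/ || NF == 0 { print NR ": " $0; next }
        ncol == 0       { ncol = NF }            # first table line sets the width
        NF != ncol      { print NR ": " $0 }
    ' "$1"
}

# Made-up sample with one leading and one trailing non-table line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# made-up leading comment
"Probe Set ID","Chromosome"
"SNP_A-0000001","1"
"SNP_A-0000002","X"
# made-up trailing comment
EOF

report_non_table_lines "$sample"
# prints:
# 1: # made-up leading comment
# 5: # made-up trailing comment
```

If this prints nothing beyond the leading block on the real file, there are no extra lines at the end. A plain tail -n 100 on the file is another quick check.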

ADD REPLY • link 11.8 years ago by Jason ▴ 20