Hi,
I want to create a tab-delimited annotation table from an Affymetrix SNP6 CSV annotation file. I don't want the extra lines from the CSV file, only the annotation table that sits in the middle of it.
I tried using the "AffyCompatible" package, but it downloads the latest "na" version, i.e. 33, whereas I am only interested in the older version "na30".
I am following the vignette from here
When I try to give a user-defined value to the affxUrl argument, i.e.
URL <- "http://media.affymetrix.com/analysis/downloads/na30/genotyping/GenomeWideSNP_6.na30.annot.csv.zip"
df <- readAnnotation(rsrc, annotation=anno, affxUrl=URL)
I get this error:
Error in read.table(file = file, header = header, sep = sep, quote = quote, : unused argument(s) (affxUrl = "http://media.affymetrix.com/analysis/downloads/na30/genotyping/GenomeWideSNP_6.na30.annot.csv.zip")
Any help would be great!
Thanks
Jason
Can you paste some lines from the input file and give an example of what your output file should look like?
I downloaded the file that you mentioned above. The archive contains two files. One of them explains the format and the number of columns in the big file. Can you tell me which columns you need? You probably also want to remove the header from the big file, but I don't see any unnecessary lines at the bottom of the file.
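If the header lines you want to get rid of are the ones that start with '#', a minimal sketch like the one below would drop them. That '#' prefix is an assumption on my part, so look at the first lines of the real file before relying on it. The function name, file names, and file contents here are made up for illustration.

```shell
# Assumption: the unwanted metadata lines start with '#'.
# Check the real file first, e.g. with: head -n 30 yourfile.csv
strip_comment_lines() {
    # Print every line of file "$1" that does not begin with '#'.
    grep -v '^#' "$1"
}

# Tiny stand-in file (made-up contents) just to show the effect.
demo_dir=$(mktemp -d)
printf '%s\n' \
    '#%example-key=example-value' \
    '#%another-key=another-value' \
    '"Probe Set ID","Chromosome"' \
    '"SNP_A-0000001","1"' > "$demo_dir/demo.annot.csv"

# Keep only the column header row and the data rows.
strip_comment_lines "$demo_dir/demo.annot.csv" > "$demo_dir/demo.table.csv"
cat "$demo_dir/demo.table.csv"
```

Note that this only removes the extra lines. The result is still comma-separated, and turning it into a tab-delimited table is better done with a CSV-aware tool, since quoted fields can contain commas.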
I have split that big file into 12 smaller files, each with 50,000 rows. I think you can open them in Excel and process them the way you want. You can download them using the link I have sent you.
Thanks a bunch! I'm sure it's going to do the job. Still, it would be great if you could tell me how you did it.
Actually, I should have given you the command in the first place. Anyway, here is the command to split the file into smaller files in Unix.
split -a 2 -l 50000 -d input.txt split.file.
-l sets the maximum number of lines allowed in each output file, -a 2 makes the suffix two characters long, and -d uses numeric suffixes instead of letters. The last string 'split.file' will be a prefix to all the new files that will be produced.
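To see what the command does without waiting on the full annotation file, here is a small sketch that runs it on a generated 25-line file. The temporary directory, the file contents, and the chunk size of 10 are made up for illustration. Also note that -d is a GNU split option and may not be available in every split implementation.

```shell
# Work in a throwaway directory so nothing else is touched.
split_dir=$(mktemp -d)

# Made-up input: 25 lines containing the numbers 1..25.
seq 1 25 > "$split_dir/input.txt"

# Same flags as the command above, but with 10 lines per chunk.
split -a 2 -l 10 -d "$split_dir/input.txt" "$split_dir/split.file."

# Three files are produced: two with 10 lines and a last one with 5.
ls "$split_dir"/split.file.*
wc -l "$split_dir"/split.file.*
```

Concatenating the pieces in order with cat gives back the original file, which is a quick way to check that nothing was lost.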