Split excel column into different files
3
1
Entering edit mode
6.4 years ago
goh ▴ 10

Hi,

I have been trying to use multiple command i can find online to spilt the columns into different files but couldnt.

I have an 10 columns excel file ( first 2 col is paramaters, the remaining 8 col is the data). I would like to split it into 8 separate files which all 8 contains first 2 col paramaters and 1 data col.

May i know does anyone have any idea to get around this ?

Thank you.

excel • 2.1k views
ADD COMMENT
1
Entering edit mode

By the time you found the solution you would have easily done this manually. Or is this a task which you have to do repeatedly?

The link with bioinformatics is not immediately obvious. Please elaborate.

Excel is not our 'favorite' tool but if you would convert the file to a text file you would have plenty of options.

ADD REPLY
0
Entering edit mode

Its a task i have to do it repeatedly on many different files. I have posted the example above.

I dont mind to convert the excel file into other types of file to get the job done, and then convert back to excel.

ADD REPLY
1
Entering edit mode

@OP: You don't have to convert excel into other formats if your excel files are manageable in size and consists of uniform format. Both R and Python are capable of handling (input and output) excel files.

ADD REPLY
0
Entering edit mode

Excel is most of the time not appropriate for data analysis. Only if you understand very well what you are doing...

For example it is not reproducible and doesn't track what you are doing. You cannot figure out what you did to a dataset months ago and you can silently change data without ever noticing. Keep your data in a normal file and use bash/R/Python/... and save your commands. That's reproducible and you can easily explain what happened and how you obtained a result.

ADD REPLY
2
Entering edit mode
6.4 years ago

Here is the code in R: Make sure that all for the files (excel files with .xls or .xlsx) are in the same format as in link furnished above (2nd column is mz, 5 is RT and 11th column onward samples). Code would create one excel file per sample and name of the excel file would be sample_"sample name".xls. It will have three columns: mz, RT and sample name.

library(readxl)
library(WriteXLS)
df=data.frame(read_xls("example.xls"))

for (i in 11:ncol(df)){
    temp_df=df[,c(2,5,i)]
    WriteXLS(temp_df, ExcelFileName = paste0("sample_",names(df)[i],".xls"))
}

if you have multiple xls files in the folder (and only xls files of interest):

setwd("~/Desktop/test") # change this to directory of interest
fn=list.files(pattern = "\\.xls") # lists files with .xls extension
library(readxl)
library(WriteXLS)

for (i in fn){
    df=data.frame(read_xls(i))
    for (j in 11:ncol(df)){
        temp_df=df[,c(2,5,j)]
        WriteXLS(temp_df, ExcelFileName = paste0(sub('\\.xls$', '', i) ,"_","sample_",names(df)[j],".xls"))
    }
    }

output files will have "xlsfilename_samplename.xls"

ADD COMMENT
4
Entering edit mode
6.4 years ago

Good description of data. Please post some example data and expected output.@OP Export the text as tab separated file and install shell (bash preferably), cut in windows or wsl if you are using windows 10, try running following code with exported test.txt:

$ for i in $(seq 3 10); do  cut -f1,2,$i test.txt > file_$i.txt;done

using parallel (one needs to install gnu-parallel for this):

$ parallel  'cut -f1,2,{} test.txt > test_{}.txt' ::: {3..10}
ADD COMMENT
0
Entering edit mode

Hi,

Thanks. will try it out. The example is here. column B and E is my parameters. column K until R is my data which i would like to separate it into individual files. Thanks!

ADD REPLY
2
Entering edit mode
6.4 years ago

in gnu-linux, one can do this from shell (and with the help of libreoffice):

for i in *.xls;
    do libreoffice --headless --convert-to csv $i --outdir .;
    nf=$(awk -F"," '{print NF;exit}' ${i%.*}.csv);
    for j in $(seq 11 18);
        do nf_head=$(awk -v j=$j -F"," '{print $j;exit}' ${i%.*}.csv); 
    cut -d"," -f 3,5,$j ${i%.*}.csv > ${i%.*}_$nf_head.csv;
    libreoffice --headless --convert-to xls ${i%.*}_$nf_head.csv  --outdir .;
    rm ${i%.*}_$nf_head.csv;
    done;
done

Since it starts with *.xls and writes output to xls files, be careful in running the script second time if there are any issues.

ADD COMMENT

Login before adding your answer.

Traffic: 1964 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6