Pandas issue: To write multiple csv in loop
2.5 years ago
Nai ▴ 50

I have table.csv with plant varieties linked to SNP markers (400). One marker occurs in 20 varieties. I would like to make a separate csv file for each marker, containing the variety and the other columns. I am new to Python and am trying to use pandas. I wrote the following:

import os
from io import open
import pandas as pd
dfs = pd.read_csv('/home/System/Variety_Marker.csv', sep='\t', encoding='latin-1', low_memory=False)

for i in dfs.groupby('MARKER'):
    tables = i 
    df = pd.DataFrame(i) # tuple change into dataframe
    df.to_csv(f"/home/System/table_{i}.csv")

OSError: [Errno 36] File name too long:

The command is using the values, headers, and all other information as the name of the csv file. Thank you in advance; please help me resolve this issue.

Pandas Python R • 4.0k views

Try printing the result of f"/home/System/table_{i}.csv" to see which filename it is trying to write to.


Why is this tagged with R?


In case I can get a solution in R as well.

2.5 years ago

Hi, the main problem is that you are passing i (a tuple of the MARKER value and a pd.DataFrame) to the format string. f"{i}" returns the string representation of that tuple, which includes part of your data. This is why you're getting File name too long.

What you should do:

for i, df in dfs.groupby('MARKER'):
    df.to_csv(f"/home/System/table_{i}.csv")
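To make the tuple unpacking concrete, here is a minimal sketch on made-up data. The toy table, its column values, and the temporary output folder are assumptions for illustration only; they stand in for Variety_Marker.csv and /home/System.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical toy data standing in for Variety_Marker.csv
dfs = pd.DataFrame({
    "MARKER": ["M1", "M1", "M2"],
    "VARIETY": ["C61", "C62", "C61"],
    "AGE": [10, 12, 10],
})

# Temporary folder used here instead of /home/System
out_dir = Path(tempfile.mkdtemp())

written = []
for marker, group in dfs.groupby("MARKER"):
    # marker is the group key (a short string such as "M1");
    # group is the sub-DataFrame holding only that marker's rows
    out_path = out_dir / f"table_{marker}.csv"
    group.to_csv(out_path, index=False)
    written.append(out_path.name)

print(written)
```

Because only the key goes into the f-string, each filename stays short, and each file contains just the rows for one marker.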

Thank you Massa. I would like to know about the df variable in the for loop. I created multiple csv files. Now I would like to read all of them. I have done this by:

path = '/home/System/PCA_1/'
filenames = glob.glob(path + "/*.csv")

# loop over the list of csv files
for f in filenames:
    # read the csv file
    df_1 = pd.read_csv(f, sep=';')

if df1.groupby("VARIETY").get_value("C61") OR .groupby("AGE").get_value("10"):

         df.to_csv(f"/home/System/Marker_filter.csv")

I would like to apply conditions on 5 columns in each file separately and write the result to a new file, MARKER_filter.csv. I cannot work out how to write an if statement on multiple columns across multiple csv files.
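One possible way to express conditions like these is a boolean mask rather than an if statement. The following is only a sketch under assumptions: the helper name filter_tables, the toy files, and the temporary folders are hypothetical, and it uses two of the columns (VARIETY and AGE) with an OR between them. More columns could be added to the mask in the same way, using | for OR and & for AND.

```python
import glob
import os
import tempfile

import pandas as pd


def filter_tables(folder, sep=";"):
    """Hypothetical helper: read every csv in folder and keep matching rows."""
    kept = []
    for f in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        df = pd.read_csv(f, sep=sep)
        # Row-wise condition: VARIETY is "C61" OR AGE is 10
        mask = (df["VARIETY"] == "C61") | (df["AGE"] == 10)
        kept.append(df[mask])
    # Note: pd.concat raises ValueError if the folder has no csv files
    return pd.concat(kept, ignore_index=True)


# Toy input files in a temporary folder (stand-ins for the per-marker files)
in_dir = tempfile.mkdtemp()
pd.DataFrame({"VARIETY": ["C61", "C62", "C63"], "AGE": [12, 10, 15]}).to_csv(
    os.path.join(in_dir, "table_M1.csv"), sep=";", index=False
)
pd.DataFrame({"VARIETY": ["C61", "C64"], "AGE": [10, 20]}).to_csv(
    os.path.join(in_dir, "table_M2.csv"), sep=";", index=False
)

result = filter_tables(in_dir)

# Write to a separate folder so the output is not re-read as input later
out_dir = tempfile.mkdtemp()
result.to_csv(os.path.join(out_dir, "Marker_filter.csv"), sep=";", index=False)
print(result)
```

Collecting the filtered pieces in a list and writing once at the end avoids overwriting Marker_filter.csv on every pass through the loop.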
