Question

How to read a CSV file while skipping the first column using Python

0


6.3 years ago

mandimunari • 0

I'm checking for genes that are present in at least 95% of the analyzed bacteria, and to do this I need to read a CSV file using Python. The file contains 15 columns corresponding to the names of the bacteria, and each row records the presence (value >= 1) or absence (value <= 0) of a gene. I need to skip the first column, go through each row of each remaining column, and report whether the value is <= 0 or >= 1. I'm stuck on how to skip the first column using the csv library in Python.

Thanks

genome python • 27k views

updated 6.3 years ago by steve ★ 3.5k • written 6.3 years ago by mandimunari • 0

1


You'd be best off working on this data with the pandas module. What you're asking is an XY problem: you don't need to ignore the first column specifically, you just need to query the relevant columns of the data table, e.g. print all rows of the dataframe where column X <= 0 or X >= 1.

You should give this a go first, as it's not a difficult problem with the right tools, and being able to perform queries like this on a dataframe is a very powerful skill.

Make an attempt, show us what code you come up with, and we'll be happy to help you further.
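
As a rough sketch of the pandas approach suggested here: this assumes the first column holds gene names and every other column is one bacterium. The sample data, column names and variable names are made up for illustration and are not from the original post.

```python
import io

import pandas as pd

# Hypothetical data: first column is the gene name, the rest are bacteria.
csv_text = """gene,bact_a,bact_b,bact_c
geneA,1,2,1
geneB,0,1,0
geneC,1,0,3
"""

# index_col=0 turns the first column into row labels, so it is not
# treated as data and never has to be skipped by hand.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# True where the gene is present (value >= 1), False where it is absent.
present = df >= 1

# Fraction of bacteria in which each gene is present.
fraction_present = present.mean(axis=1)

# Genes present in at least 95% of the bacteria.
core_genes = fraction_present[fraction_present >= 0.95].index.tolist()
print(core_genes)
```

With a real file, `io.StringIO(csv_text)` would be replaced by the file path.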

6.3 years ago by Joe 22k

0


Please show us what you tried. This is also more programming than bioinformatics and might get closed for that reason.

6.3 years ago by WouterDeCoster 48k

score 2 · Answer 1 · 2019-01-10

To skip the first column:

$ printf 'foo,bar\n1,2\n' > test.csv

>>> import csv
>>> with open('test.csv') as fin:
...     reader = csv.reader(fin)
...     for row in reader:
...         print(row[1:])
...
['bar']
['2']
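
Applied to the presence/absence question, the same `row[1:]` slice might be used like this. It is only a sketch in Python 3: the data is invented, it assumes there is no header row, and it assumes every value after the first column is numeric.

```python
import csv
import io

# Hypothetical rows: gene name first, then one value per bacterium.
csv_text = "geneA,1,2,1\ngeneB,0,1,0\n"

calls_by_gene = {}
for row in csv.reader(io.StringIO(csv_text)):
    # row[1:] drops the first column; csv yields strings, so convert first.
    calls_by_gene[row[0]] = [
        "present" if float(value) >= 1 else "absent" for value in row[1:]
    ]

print(calls_by_gene)
```

`csv.reader` accepts any iterable of lines, so an open file handle can be passed in place of the `io.StringIO` object.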

To skip the first row:

$ python
Python 2.7.10 (default, Feb  7 2017, 00:08:15)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import csv
>>> rows_to_keep = []
>>> with open('test.csv') as fin:
...     next(fin)
...     reader = csv.reader(fin)
...     for row in reader: rows_to_keep.append(row)
...
'foo,bar\n'
>>> print(rows_to_keep)
[['1', '2']]

The built-in next() function used here works in both Python 2 and 3. The difference is in the method form, which is fin.next() in Python 2 and fin.__next__() in Python 3. Either way, this is the gist of it.
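
Putting the two pieces together for the original question, one possible Python 3 sketch is below. The function name `core_genes`, the 0.95 default and the sample data are illustrative assumptions, and it assumes a header row followed by rows of gene name plus numeric values.

```python
import csv
import io


def core_genes(lines, threshold=0.95):
    """Return genes present (value >= 1) in at least `threshold` of the columns."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    kept = []
    for row in reader:
        values = row[1:]  # skip the first column (assumed to be the gene name)
        if not values:
            continue  # ignore blank rows or rows with no data columns
        n_present = sum(1 for value in values if float(value) >= 1)
        if n_present / len(values) >= threshold:
            kept.append(row[0])
    return kept


# Hypothetical data standing in for the real file.
csv_text = "gene,bact_a,bact_b\ngeneA,1,2\ngeneB,0,1\n"
print(core_genes(io.StringIO(csv_text)))
```

For a file on disk, something like `with open('test.csv', newline='') as fin: core_genes(fin)` should behave the same way.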

See also: https://stackoverflow.com/questions/14257373/skip-the-headers-when-editing-a-csv-file-using-python