In this post, I will summarize the most convenient way to read and write CSV files (with header) in Python.
Write CSV files
Python has a built-in csv module for working with CSV files. To write a file in CSV format, we first build a CSV writer and then write rows to the file with this writer. I will give a simple example below:
import csv

lines = [['Bob', 'male', '27'],
         ['Smith', 'male', '26'],
         ['Alice', 'female', '26']]
header = ['name', 'gender', 'age']

with open("test.csv", "w", newline='') as f:
    writer = csv.writer(f, delimiter=',')
    writer.writerow(header)  # write the header
    # write the actual content line by line
    for line in lines:
        writer.writerow(line)
    # or write all the rows at once
    # writer.writerows(lines)
In the above code snippet, the newline parameter of the open function is important. If you do not use newline='', there will be an extra blank line after each row on Windows. The delimiter parameter sets the character that separates the items in each line of the CSV file.
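To illustrate the delimiter parameter, here is a minimal sketch that writes similar rows with a semicolon as the separator and then reads the raw text back. The file name test_semicolon.csv is hypothetical and chosen only for this example.

```python
import csv

header = ['name', 'gender', 'age']
lines = [['Bob', 'male', '27'], ['Smith', 'male', '26']]

# test_semicolon.csv is a hypothetical file name used only for this sketch
with open("test_semicolon.csv", "w", newline='') as f:
    writer = csv.writer(f, delimiter=';')
    writer.writerow(header)
    writer.writerows(lines)

# read the raw text back to see which separator was written
with open("test_semicolon.csv", "r", newline='') as f:
    content = f.read()

print(content)
```

A file written this way has to be read with the same delimiter, otherwise each line comes back as a single item.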
Read CSV files
Use the CSV reader
The csv module provides a CSV reader, which we can use to read CSV files. The CSV reader is an iterable object that yields each row as a list of strings. We can use the following snippet to read CSV files:
import csv

with open("test.csv", "r", newline='') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        print(row)  # row will be a Python list
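Since the file in this post starts with a header, it is often handy to separate the header from the data rows. The following is a minimal sketch of one way to do that with the same reader. It uses an in-memory string as a stand-in for test.csv so that it runs on its own, and the names data and records are only for illustration.

```python
import csv
import io

# in-memory stand-in for test.csv, so the sketch runs on its own
data = "name,gender,age\r\nBob,male,27\r\nSmith,male,26\r\nAlice,female,26\r\n"

with io.StringIO(data, newline='') as f:
    reader = csv.reader(f, delimiter=',')
    header = next(reader)  # the first row is the header
    rows = list(reader)    # the remaining rows are the data

# pair each value with its column name
records = [dict(zip(header, row)) for row in rows]

print(header)
print(records[0])
```

The standard library also offers csv.DictReader, which does this pairing for you by using the first row as the keys.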
Use Pandas
We can also use Pandas to read CSV files. For example, to read the above test.csv file, we can use the following code:
import pandas as pd

df = pd.read_csv('test.csv', delimiter=',')
# df is a Pandas DataFrame
df in the above code is a Pandas DataFrame object. To get a certain column, we use the column name as the key:
col0 = df['name']  # col0 is a Pandas Series object
print(col0.tolist())  # use the tolist() method to make a list
The tolist() method in the above code converts a Pandas Series to a plain Python list.
To access a certain row, we can use the loc indexer with the row label, which is the row number when the DataFrame has the default index.
row0 = df.loc[0]  # row0 is a Pandas Series object
print(row0.tolist())  # use the tolist() method to make a list
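Because loc looks rows up by label rather than by position, the two only coincide with the default index. The sketch below shows the difference next to iloc, which is position-based. It reads the same data from an in-memory string instead of test.csv so that it is self-contained, and the variable names are only for illustration.

```python
import io

import pandas as pd

# in-memory stand-in for test.csv, so the sketch runs on its own
data = "name,gender,age\nBob,male,27\nSmith,male,26\nAlice,female,26\n"
df = pd.read_csv(io.StringIO(data))

by_label = df.loc[0]      # row whose index label is 0
by_position = df.iloc[0]  # first row, regardless of the labels

# after re-indexing by name, the labels are no longer row numbers
named = df.set_index('name')
alice = named.loc['Alice']  # label-based lookup
first = named.iloc[0]       # position-based lookup still works

print(by_label.tolist())
print(alice.tolist())
```

If you want the n-th row no matter how the DataFrame is indexed, iloc is usually the safer choice.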
License CC BY-NC-ND 4.0