Programmer's Python Data - Text Files & CSV

Written by Mike James

Tuesday, 10 June 2025

The CSV Module

CSV files are very common and are used by many programs, spreadsheets for example, to export data. To make it easy to work with CSV files, Python has a csv module that will work with the different dialects of CSV. Using it is very easy and it really isn’t worth trying to create your own CSV implementation.

You first need to open the file in text mode, but with end-of-line translation turned off, because the csv module is going to be in charge of line endings. Next you need to create either a csv.reader or a csv.writer. These objects determine the exact CSV dialect in use and hence the format and handling of the CSV file. Finally, you can use the reader or writer object to read or write records.

For example, to write and read the person record in the previous section you would use something like:

import pathlib
import dataclasses
import csv

@dataclasses.dataclass
class person:
    name:str=""
    id:int=0
    score:float=0.0

me=person("mike",42,3.145)
path=pathlib.Path("myTextFile.csv")
# newline="" leaves line endings to the csv module
with path.open(mode="wt",newline="") as f:
    peopleWriter=csv.writer(f)
    for i in range(43):
        peopleWriter.writerow([me.name,i,me.score])

# the csv documentation recommends newline="" for reading as well
with path.open(mode="rt",newline="") as f:
    peopleReader=csv.reader(f)
    for row in peopleReader:
        print(row)

You can see that the file is opened with newline="" to turn off the end-of-line translation that text mode would otherwise apply. The csv writer supplies its own line endings, so if you don’t do this then, on platforms where text mode expands line endings, the file will contain blank records due to the additional line-ending characters added by the text file handling. Next we create a writer object, which uses the Excel spreadsheet dialect of CSV by default. Finally we write 43 records to the file, changing the id value for each one for demonstration purposes. If you open the file in an Excel-compatible spreadsheet then you should see the data correctly entered into columns.
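
To see why newline="" matters, it can help to look at exactly what the writer emits. The following is a minimal sketch that uses an in-memory io.StringIO buffer in place of a real file, which is an assumption made purely for illustration:

```python
import csv
import io

# An in-memory text buffer standing in for a file opened with newline=""
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["mike", 42, 3.145])

# The default (Excel) dialect terminates each record with "\r\n" itself
raw = buffer.getvalue()
print(repr(raw))   # 'mike,42,3.145\r\n'
```

Because the writer already supplies the carriage return and line feed, a text file that translated each "\n" into "\r\n" again would end up with "\r\r\n", which is what appears as a blank row between records.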

Reading the file back in is just as easy: open the file, create a reader and read the rows. If you look at the output you will see that each row is returned as a list. For example, the final row is:

['mike', '42', '3.145']

You can see that all of the fields are returned as strings so there is still some processing for you to do.
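
One way to do that processing is to convert each field as the row is read. This is only a sketch: row_to_person is a hypothetical helper, not part of the csv module, and the data comes from an in-memory buffer rather than a file on disk:

```python
import csv
import dataclasses
import io

@dataclasses.dataclass
class person:
    name: str = ""
    id: int = 0
    score: float = 0.0

def row_to_person(row):
    # Hypothetical helper: convert each string field to the type
    # declared in the dataclass
    name, id_text, score_text = row
    return person(name, int(id_text), float(score_text))

# In-memory data standing in for a CSV file of person records
data = io.StringIO("mike,0,3.145\r\nmike,1,3.145\r\n", newline="")
people = [row_to_person(row) for row in csv.reader(data)]
print(people[1])   # person(name='mike', id=1, score=3.145)
```

A malformed field would make int() or float() raise a ValueError, so real code would probably want to decide how to handle bad rows.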

Using a dictionary as the representation of the record has the advantage that you can specify field names. To do this you need to use the DictWriter and DictReader objects and specify the field names.

For example, to write the person record data you would use:

with path.open(mode="wt",newline="") as f:
    peopleWriter=csv.DictWriter(f,fieldnames=["name",
                                         "id","score"])
    for i in range(43):
        peopleWriter.writerow({"name":me.name,"id":i,
                                      "score":me.score})

The field names have to be specified as a sequence, such as a list, and the keys of each dictionary that you write have to be drawn from those field names. To read the data back you also have to specify the field names, unless they are written into the file as the first record:

with path.open(mode="rt",newline="") as f:
    peopleReader=csv.DictReader(f,fieldnames=["name",
                                          "id","score"])
    for row in peopleReader:
        print(row)

A typical row is displayed as:

{'name': 'mike', 'id': '42', 'score': '3.145'}

Again all of the values are strings.

You can determine what happens if the dictionary to be written has a key that isn’t listed in the field names using the extrasaction parameter. By default it is set to "raise", which raises a ValueError exception, but you can set it to "ignore" to simply discard the extra values. On reading, if a row has fewer fields than there are field names, you can specify the restval parameter to provide a default value for the missing ones. If a row has additional fields, they are collected into a list stored under the key given by the restkey parameter.
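
These parameters are easier to follow in action. The sketch below again uses in-memory buffers instead of files, and the "age" key and the "extra" restkey name are made up for the demonstration:

```python
import csv
import io

fields = ["name", "id", "score"]
record = {"name": "mike", "id": 1, "score": 3.145, "age": 30}

# Writing: a dictionary with an unlisted key raises ValueError by default
strict = csv.DictWriter(io.StringIO(newline=""), fieldnames=fields)
try:
    strict.writerow(record)
    raised = False
except ValueError:
    raised = True

# extrasaction="ignore" drops the unlisted key instead
out = io.StringIO(newline="")
lenient = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
lenient.writerow(record)
written = out.getvalue()

# Reading: the first row is missing a field, the second has two extra
data = io.StringIO("mike,1\r\nmike,2,3.145,x,y\r\n", newline="")
reader = csv.DictReader(data, fieldnames=fields,
                        restval="0.0", restkey="extra")
rows = list(reader)

print(raised, repr(written))
print(rows[0])
print(rows[1])
```

The short row has its score filled in with the restval string, and the long row gains an "extra" entry holding the list ['x', 'y'].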

If you don’t specify the fields in the DictReader then the first row of the file has to specify them. You can force the DictWriter to write an initial header row with field names using the writeheader() method.

For example:

with path.open(mode="wt",newline="") as f:
    peopleWriter=csv.DictWriter(f,fieldnames=["name",
                                    "id","score"])
    peopleWriter.writeheader()
    for i in range(43):
        peopleWriter.writerow({"name":me.name,"id":i,
                                     "score":me.score})
        
with path.open(mode="rt",newline="") as f:
    peopleReader=csv.DictReader(f)
    for row in peopleReader:
        print(row)
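
If you want to check which field names a DictReader has picked up from the header row, you can inspect its fieldnames attribute. A small sketch, using an in-memory buffer in place of the file as an illustrative assumption:

```python
import csv
import io

# Write a header row and one record into an in-memory buffer
out = io.StringIO(newline="")
writer = csv.DictWriter(out, fieldnames=["name", "id", "score"])
writer.writeheader()
writer.writerow({"name": "mike", "id": 0, "score": 3.145})
text = out.getvalue()

# With no fieldnames given, the reader takes them from the first row
reader = csv.DictReader(io.StringIO(text, newline=""))
header = reader.fieldnames
rows = list(reader)

print(repr(text))
print(header)
print(rows)
```

The header row is consumed as the field names, so it does not appear among the data rows.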


Last Updated ( Tuesday, 10 June 2025 )
