Master Python File I/O: Read & Write CSV, Excel, JSON, and Databases

This article provides a comprehensive guide to Python's file handling capabilities, covering built‑in functions, the csv module, numpy's load and save methods, pandas' extensive read/write utilities, Excel libraries such as xlrd and openpyxl, and popular database connectors like pymysql and sqlalchemy.

Python Crawling & Data Mining

1. read, readline, readlines

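read(): reads the entire file at once and returns it as a single string (or bytes in binary mode). For large files, read(size) is recommended: each call returns at most size characters, so memory use stays bounded.

readline(): reads one line at a time and returns an empty string at end of file; useful when memory is limited.

readlines(): reads the whole file and returns a list of lines for easy iteration. Like read(), it holds the entire file in memory.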
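
The difference between the three methods is easiest to see side by side. The following is a minimal sketch; the file name demo.txt and its contents are made up for illustration and are written to a temporary directory.

```python
import os
import tempfile

# Hypothetical sample file, created in a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('alpha\nbeta\ngamma\n')

# read(): the whole file as one string.
with open(path, encoding='utf-8') as f:
    whole = f.read()

# read(size): at most `size` characters per call, so memory use stays bounded.
chunks = []
with open(path, encoding='utf-8') as f:
    while True:
        chunk = f.read(4)
        if not chunk:  # an empty string signals end of file
            break
        chunks.append(chunk)

# readline(): one line per call, trailing newline included.
with open(path, encoding='utf-8') as f:
    first_line = f.readline()

# readlines(): every line at once, as a list of strings.
with open(path, encoding='utf-8') as f:
    all_lines = f.readlines()

# Iterating over the file object itself reads lazily, one line at a time.
with open(path, encoding='utf-8') as f:
    stripped = [line.rstrip('\n') for line in f]

print(repr(first_line), all_lines, stripped)
```

For line-oriented processing of large files, iterating over the file object is usually the simplest option, since only one line is held in memory at a time.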

2. Built‑in csv module

Python includes the csv module for reading and writing comma‑separated files, a common format in data science.

Example of reading a CSV file:

import csv
with open('test.csv', 'r', newline='') as myFile:  # newline='' lets the csv module handle line endings
    lines = csv.reader(myFile)
    for line in lines:
        print(line)

Example of writing a CSV file:

import csv
with open('test.csv', 'w', newline='') as myFile:  # newline='' avoids blank rows on Windows
    myWriter = csv.writer(myFile)
    myWriter.writerow([7, 8, 9])
    myWriter.writerow([8, 'h', 'f'])
    myList = [[1, 2, 3], [4, 5, 6]]
    myWriter.writerows(myList)

3. numpy library

loadtxt : reads text files (including .csv) and compressed .gz/.bz2 files, assuming each row has the same number of values.

Example:

import numpy as np
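# Without delimiter=',' loadtxt splits on whitespace, so each line comes back as one string
np.loadtxt('test.csv', dtype=str)
# out: array(['1,2,3', '4,5,6', '7,8,9'], dtype='<U5')
# Pass delimiter=',' to split each row into separate fields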
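
The call above relies on the default whitespace delimiter, which is why each row comes back as a single string. A minimal sketch of a comma-aware round trip with savetxt and loadtxt follows; the file name numbers.csv and the sample values are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

# Hypothetical numeric CSV written to a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), 'numbers.csv')
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# savetxt writes one row per line; fmt='%d' keeps the values as integers.
np.savetxt(path, data, delimiter=',', fmt='%d')

# delimiter=',' makes loadtxt split each row into separate fields.
loaded = np.loadtxt(path, delimiter=',', dtype=int)

print(loaded.shape)  # (3, 3)
print(loaded)
```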

load : loads .npy, .npz, or pickled files.

import numpy as np
np.save('test.npy', np.array([[1,2,3],[4,5,6]]))
np.load('test.npy')
# out: array([[1, 2, 3],
#             [4, 5, 6]])

fromfile : reads simple text or binary data saved with tofile; user must specify dtype and reshape if needed.

import numpy as np
x = np.arange(9).reshape(3,3)
x.tofile('test.bin')
np.fromfile('test.bin', dtype=x.dtype)  # np.int was removed in NumPy 1.24; reuse the array's own dtype
# out: array([0, 1, 2, 3, 4, 5, 6, 7, 8])
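
Because tofile writes only the raw values, the result of fromfile is always flat. A short sketch of restoring the original shape, assuming the dtype and shape are still known to the caller (the file name demo.bin is hypothetical):

```python
import os
import tempfile

import numpy as np

# Hypothetical binary file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'demo.bin')

x = np.arange(9).reshape(3, 3)
x.tofile(path)

# Neither dtype nor shape is stored in the file, so both are supplied by hand.
restored = np.fromfile(path, dtype=x.dtype).reshape(x.shape)

print(restored)
```

When the shape and dtype need to travel with the data, np.save and np.load are usually the safer pair, since the .npy format records both.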

4. pandas library

pandas is a widely used data‑analysis library that can read many file formats and returns a DataFrame.

read_csv : reads CSV files.

read_excel : reads .xlsx, .xls, .xlsm files.

read_table : reads delimited text files; the separator defaults to a tab and can be changed with sep.

read_json : reads JSON data.

read_html : parses HTML tables.

read_clipboard : reads data from the clipboard.

read_pickle : loads pickled objects.

read_sql : executes an SQL query and returns a DataFrame.

read_hdf : reads HDF5 files, suitable for large datasets.

read_parquet : reads Parquet files.

read_sas , read_stata , read_gbq : read SAS, Stata, and Google BigQuery data respectively (read_gbq is deprecated in recent pandas releases in favor of the pandas-gbq package).

Example of reading a CSV file:

import pandas as pd
df = pd.read_csv('test.csv')
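
Most of the readers listed above have a matching DataFrame writer (to_csv, to_json, and so on). The following is a minimal round-trip sketch that uses in-memory buffers instead of files; the column names and values are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({'name': ['Ann', 'Bob'], 'score': [90, 85]})

# CSV round trip through an in-memory buffer instead of a file on disk.
csv_text = df.to_csv(index=False)  # index=False leaves out the row labels
from_csv = pd.read_csv(io.StringIO(csv_text))

# JSON round trip; orient='records' produces a list of row objects.
json_text = df.to_json(orient='records')
from_json = pd.read_json(io.StringIO(json_text))

print(from_csv)
print(from_json)
```

Passing a file path instead of a StringIO object works the same way for both the readers and the writers.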

5. Working with Excel files

Beyond pandas, several libraries specialize in Excel I/O:

xlrd : reads .xls files; since version 2.0 it no longer reads .xlsx, so use openpyxl for that format.

xlwt : writes .xls files (does not support .xlsx).

xlutils : modifies existing Excel files using xlrd/xlwt.

openpyxl : reads and writes .xlsx files.

xlwings : reads/writes .xlsx, .xls, .xlsm and can manipulate Excel via the COM interface.

xlsxwriter : creates .xlsx files with formatting, charts, etc. (write‑only).

Microsoft Excel API (pywin32) : communicates directly with the Excel application for full feature support.
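
As an illustration of one of these libraries, here is a minimal openpyxl sketch that writes a small .xlsx file and reads it back. It assumes openpyxl is installed; the file name, sheet title, and rows are hypothetical.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Hypothetical workbook path in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'demo.xlsx')

wb = Workbook()
ws = wb.active                # a new workbook starts with one sheet
ws.title = 'scores'
ws.append(['name', 'score'])  # append() adds one row at the bottom
ws.append(['Ann', 90])
ws.append(['Bob', 85])
wb.save(path)

# Read the file back; values_only=True yields plain tuples of cell values.
wb2 = load_workbook(path)
rows = list(wb2['scores'].iter_rows(values_only=True))

print(rows)
```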

6. Database interaction

Python can connect to most relational and non‑relational databases. Common modules include:

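pymysql : MySQL driver.

sqlalchemy : SQL toolkit and ORM that works with MySQL and most other relational databases through a driver such as pymysql.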

cx_Oracle : Oracle (its successor is published as python-oracledb).

sqlite3 : built‑in SQLite.

pymssql : Microsoft SQL Server.

pymongo : MongoDB.

redis (the redis-py client) / pyredis : Redis.

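After establishing a connection to a relational database, standard SQL statements can be used for CRUD operations. MongoDB and Redis are not SQL databases; their clients expose their own query methods and commands instead.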
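
For a relational database, the connect, execute, and commit cycle can be sketched with the built-in sqlite3 module, which needs no server. The table and column names below are hypothetical, and other drivers follow a similar pattern but may use a different placeholder style.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(':memory:')
cur = conn.cursor()

# Hypothetical table used only for this demo.
cur.execute('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)')

# Create: '?' placeholders let the driver handle quoting safely.
cur.executemany('INSERT INTO users (name, age) VALUES (?, ?)',
                [('Ann', 30), ('Bob', 25)])

# Update and delete.
cur.execute('UPDATE users SET age = ? WHERE name = ?', (31, 'Ann'))
cur.execute('DELETE FROM users WHERE name = ?', ('Bob',))
conn.commit()

# Read.
rows = cur.execute('SELECT name, age FROM users ORDER BY id').fetchall()
print(rows)

conn.close()
```

Using placeholders rather than string formatting keeps user-supplied values from being interpreted as SQL.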

Written by

Python Crawling & Data Mining
