Master Python File I/O: Read & Write CSV, Excel, JSON, and Databases
This article provides a comprehensive guide to Python's file handling capabilities, covering built‑in functions, the csv module, numpy's load and save methods, pandas' extensive read/write utilities, Excel libraries such as xlrd and openpyxl, and popular database connectors like pymysql and sqlalchemy.
1. read, readline, readlines
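read(): reads the entire file into memory at once; for large files, calling read(size) repeatedly to process the content in chunks is recommended, since a larger size means more memory and a longer read per call.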
readline(): reads one line at a time; useful when memory is limited.
readlines(): reads the whole file and returns a list of lines for easy iteration.
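The three methods can be compared side by side with a small sketch. The file name sample.txt and its contents are made up for illustration, and a temporary directory is used so nothing is left behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample file, created only for this demonstration
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("alpha\nbeta\ngamma\n")

    # read(): the whole file as a single string
    with open(path, encoding="utf-8") as f:
        whole = f.read()

    # read(size): at most `size` characters per call
    with open(path, encoding="utf-8") as f:
        first_chunk = f.read(4)

    # readline(): one line per call, trailing newline included
    with open(path, encoding="utf-8") as f:
        first_line = f.readline()

    # readlines(): every line, collected into a list
    with open(path, encoding="utf-8") as f:
        all_lines = f.readlines()

    # Iterating over the file object reads lazily, one line at a time
    with open(path, encoding="utf-8") as f:
        stripped = [line.rstrip("\n") for line in f]

print(repr(first_chunk), repr(first_line), all_lines, stripped)
```

Iterating directly over the file object, as in the last step, is usually the simplest way to process a large file line by line without loading it all into memory.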
2. Built‑in csv module
Python includes the csv module for reading and writing comma‑separated files, a common format in data science.
Example of reading a CSV file:
import csv
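# newline='' lets the csv module handle line endings itself
with open('test.csv', 'r', newline='') as myFile:
    lines = csv.reader(myFile)
    for line in lines:
        print(line)  # each row is a list of strings
Example of writing a CSV file: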
import csv
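# newline='' prevents blank lines between rows on Windows
with open('test.csv', 'w', newline='') as myFile:
    myWriter = csv.writer(myFile)
    # writerow writes a single row
    myWriter.writerow([7, 8, 9])
    myWriter.writerow([8, 'h', 'f'])
    # writerows writes several rows at once
    myList = [[1, 2, 3], [4, 5, 6]]
    myWriter.writerows(myList)
3. numpy library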
loadtxt : reads text files (including .csv) and compressed .gz/.bz2 files, assuming each row has the same number of values.
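Example (the default delimiter is whitespace, so each comma-separated line is read as one string; pass delimiter=',' to split the columns):
import numpy as np
np.loadtxt('test.csv', dtype=str)
# out: array(['1,2,3', '4,5,6', '7,8,9'], dtype='<U5')
load : loads .npy, .npz, or pickled files (pickled data requires allow_pickle=True).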
import numpy as np
np.save('test.npy', np.array([[1,2,3],[4,5,6]]))
np.load('test.npy')
# out: array([[1, 2, 3],
#             [4, 5, 6]])
fromfile : reads simple text or binary data saved with tofile; the user must specify dtype and reshape if needed.
import numpy as np
x = np.arange(9).reshape(3,3)
x.tofile('test.bin')
# np.int was removed in NumPy 1.24; reuse the dtype the array was written with
np.fromfile('test.bin', dtype=x.dtype)
# out: array([0, 1, 2, 3, 4, 5, 6, 7, 8])
4. pandas library
pandas is a widely used data‑analysis library that can read many file formats and returns a DataFrame.
read_csv : reads CSV files.
read_excel : reads .xlsx, .xls, .xlsm files.
read_table : reads general delimited text files; the separator defaults to a tab and can be changed with sep.
read_json : reads JSON data.
read_html : parses HTML tables.
read_clipboard : reads data from the clipboard.
read_pickle : loads pickled objects.
read_sql : executes an SQL query and returns a DataFrame.
read_hdf : reads HDF5 files, suitable for large datasets.
read_parquet : reads Parquet files.
read_sas , read_stata , read_gbq : read SAS, Stata, and Google BigQuery data respectively.
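Most of the readers listed above have a matching DataFrame writer (to_csv, to_json, and so on). The sketch below is a minimal round trip through CSV and JSON; the file names people.csv and people.json and the sample columns are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data used only for this demonstration
df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "people.csv")
    json_path = os.path.join(tmp, "people.json")

    # CSV round trip; index=False keeps the row index out of the file
    df.to_csv(csv_path, index=False)
    from_csv = pd.read_csv(csv_path)

    # JSON round trip; orient="records" stores one object per row
    df.to_json(json_path, orient="records")
    from_json = pd.read_json(json_path, orient="records")

print(from_csv)
print(from_json)
```

Passing the same orient value to both to_json and read_json avoids surprises, since different orientations lay out the same data differently.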
Example of reading a CSV file:
import pandas as pd
df = pd.read_csv('test.csv')
5. Working with Excel files
Beyond pandas, several libraries specialize in Excel I/O:
xlrd : reads .xls files (versions before 2.0 also read .xlsx; that support was removed in 2.0).
xlwt : writes .xls files (does not support .xlsx).
xlutils : modifies existing Excel files using xlrd/xlwt.
openpyxl : reads and writes .xlsx files.
xlwings : reads/writes .xlsx, .xls, .xlsm and can manipulate Excel via the COM interface.
xlsxwriter : creates .xlsx files with formatting, charts, etc. (write‑only).
Microsoft Excel API (pywin32) : communicates directly with the Excel application for full feature support.
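As one illustration of the libraries above, here is a minimal sketch that writes and re-reads an .xlsx file with openpyxl. It assumes openpyxl is installed; the file name demo.xlsx, the sheet name, and the sample rows are made up for illustration.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Hypothetical sample rows: a header followed by two records
rows = [("id", "name"), (1, "a"), (2, "b")]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.xlsx")

    # Write: create a workbook, append rows to the active sheet, save
    wb = Workbook()
    ws = wb.active
    ws.title = "data"
    for row in rows:
        ws.append(row)
    wb.save(path)

    # Read: load the file and pull plain cell values row by row
    wb2 = load_workbook(path)
    read_back = list(wb2["data"].iter_rows(values_only=True))

print(read_back)
```

For files that only need to be loaded into a DataFrame, pandas' read_excel is usually shorter; a dedicated library is more useful when cell-level control is needed.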
6. Database interaction
Python can connect to most relational and non‑relational databases. Common modules include:
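pymysql : MySQL.
sqlalchemy : a database toolkit and ORM that works with MySQL and many other relational databases through drivers such as pymysql.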
cx_Oracle : Oracle.
sqlite3 : built‑in SQLite.
pymssql : Microsoft SQL Server.
pymongo : MongoDB.
redis / pyredis : Redis.
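After establishing a connection to a relational database, standard SQL statements can be used for CRUD operations; MongoDB and Redis are accessed through their own client APIs instead of SQL.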
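A minimal sketch of the CRUD cycle using the built-in sqlite3 module and an in-memory database. The users table and its columns are made up for illustration; other DB-API drivers follow the same connect, cursor, execute pattern, though their placeholder syntax may differ.

```python
import sqlite3

# ":memory:" creates a throwaway database that lives only in RAM
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define a table and insert rows using ? placeholders
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
cur.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

# Update and delete, again with parameters instead of string formatting
cur.execute("UPDATE users SET name = ? WHERE name = ?", ("carol", "bob"))
cur.execute("DELETE FROM users WHERE name = ?", ("alice",))
conn.commit()

# Read: fetch what is left
rows = cur.execute("SELECT id, name FROM users ORDER BY id").fetchall()
print(rows)  # [(2, 'carol')]

conn.close()
```

Passing values as parameters rather than building the SQL string by hand lets the driver handle quoting and helps guard against SQL injection.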
Python Crawling & Data Mining