Exporting Large Datasets to Excel Using Python (openpyxl and xlwt)
This article explains how to export massive amounts of data from Python to Excel by installing openpyxl, providing complete code examples for reading and writing both .xlsx and .xls files, comparing xlrd/xlwt with openpyxl, and recommending the best approach for large‑scale data export.
Today we share practical techniques for exporting large volumes of data from Python to Excel, enabling seamless data interchange and offering valuable reference for developers who need to handle extensive datasets.
Problem description: Excel files are often preferable to plain text for data presentation, but Python users may wonder how to export data efficiently, especially when dealing with massive amounts of records.
Step 1 – Install openpyxl: Use the command pip install openpyxl==2.2.6 (or a suitable version) to ensure compatibility across Windows and CentOS environments.
Step 2 – Code implementation (both xlwt and openpyxl versions):
# coding:utf-8 ''' # 希望对大家有帮助哈,请多提问题 create by yaoyz date: 2017/01/24 ''' import xlrd import xlwt from openpyxl.workbook import Workbook from openpyxl.writer.excel import ExcelWriter from openpyxl.utils import get_column_letter from openpyxl.reader.excel import load_workbook class HandleExcel(): """Excel相关操作类""" def __init__(self): self.head_row_labels = [u'学生ID', u'学生姓名', u'联系方式', u'知识点ID', u'知识点名称'] def read_from_file(self, filename): res_list = [] file_obj = open(filename, "r") for line in file_obj.readlines(): res_list.append(line) file_obj.close() return res_list def read_excel_with_openpyxl(self, excel_name="testexcel2007.xlsx"): wb = load_workbook(filename=excel_name) print("Worksheet range(s):", wb.get_named_ranges()) print("Worksheet name(s):", wb.get_sheet_names()) sheetnames = wb.get_sheet_names() ws = wb.get_sheet_by_name(sheetnames[0]) print("Work Sheet Title:", ws.title) print("Work Sheet Rows:", ws.get_highest_row()) print("Work Sheet Cols:", ws.get_highest_column()) row_num = ws.get_highest_row() col_num = ws.get_highest_column() data_dic = {} sign = 1 for row in ws.rows: temp_list = [] for cell in row: temp_list.append(cell.value) data_dic[sign] = temp_list sign += 1 print(data_dic) return data_dic def write_to_excel_with_openpyxl(self, records, head_row, save_excel_name="save.xlsx"): wb = Workbook() ew = ExcelWriter(workbook=wb) dest_filename = save_excel_name.decode('utf-8') ws = wb.worksheets[0] ws.title = "range names" for h_x in range(1, len(head_row) + 1): h_col = get_column_letter(h_x) ws.cell('%s%s' % (h_col, 1)).value = '%s' % (head_row[h_x - 1]) i = 2 for record in records: record_list = str(record).strip().split("\t") for x in range(1, len(record_list) + 1): col = get_column_letter(x) ws.cell('%s%s' % (col, i)).value = '%s' % (record_list[x - 1].decode('utf-8')) i += 1 ew.save(filename=dest_filename) def set_style(self, name, height, bold=False): style = xlwt.XFStyle() font = xlwt.Font() font.name = name font.bold = bold font.color_index = 4 font.height = height borders = xlwt.Borders() borders.left = borders.right = borders.top = borders.bottom = 6 style.font = font style.borders = borders return style def write_to_excel(self, dataset, save_excel_name, head_row): f = xlwt.Workbook() sheet1 = f.add_sheet(u'sheet1', cell_overwrite_ok=True) for p in range(len(head_row)): sheet1.write(0, p, head_row[p], self.set_style('Times New Roman', 250, True)) default = self.set_style('Times New Roman', 200, False) count = 1 for line in dataset: row_list = str(line).strip("\n").split("\t") for pp in range(len(row_list)): sheet1.write(count, pp, row_list[pp].decode('utf-8'), default) count += 1 f.save(save_excel_name) def run_main_save_to_excel_with_openpyxl(self): print("测试读写2007及以后的excel文件xlsx,以方便写入文件更多数据") dataset_list = self.read_from_file("test_excel.txt") head_row_label = self.head_row_labels save_name = "test_openpyxl.xlsx" self.write_to_excel_with_openpyxl(dataset_list, head_row_label, save_name) print("3. 执行完毕,由txt格式文件保存为Excel文件的任务") def run_main_save_to_excel_with_xlwt(self): print(" 4. 把txt文件读入到内存中,以list对象存储") dataset_list = self.read_from_file("test_excel.txt") head_row_label = self.head_row_labels save_name = "test_xlwt.xls" self.write_to_excel(dataset_list, head_row_label, save_name) print("6. 执行完毕,由txt格式文件保存为Excel文件的任务") if __name__ == '__main__': print("create handle Excel Object") obj_handle_excel = HandleExcel() obj_handle_excel.run_main_save_to_excel_with_openpyxl() obj_handle_excel.run_main_save_to_excel_with_xlwt()
Extended reading: The Python ecosystem provides two families of Excel libraries: the older xlrd/xlwt/xlutils which only support the legacy .xls format (max 256 columns, 65,536 rows), and the newer openpyxl which handles modern .xlsx/.xlsm files.
Three options for large‑scale export:
Save data as CSV instead of Excel.
Use openpyxl (recommended for Excel 2007+).
Use Windows‑only win32 COM automation.
Given the need for better data presentation, the article recommends using openpyxl and provides its documentation link: https://openpyxl.readthedocs.io/en/stable/ .
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.