Fundamentals 6 min read

Find Nearest Geographic Points in Large Excel Files with Python

This article demonstrates a step‑by‑step Python workflow for reading two Excel sheets, calculating haversine distances between latitude‑longitude pairs, identifying the nearest point for each record, and exporting the results, complete with code snippets, progress display, and performance advantages over traditional GIS tools.

Python Crawling & Data Mining
Python Crawling & Data Mining
Python Crawling & Data Mining
Find Nearest Geographic Points in Large Excel Files with Python

Introduction

Hello, I'm Cui Yanfei. In many work scenarios we need to match the nearest latitude‑longitude point from table B to each point in table A. Traditional GIS tools like MapInfo are slow on large datasets, but Python can handle it efficiently and show progress.

Project Goal

Implement nearest‑point calculation between two Excel tables using Python.

Setup

Software: PyCharm

Required libraries: pandas, xlrd, os

Analysis

1) Selecting and reading the Excel files

Use os and xlrd to locate and load the target Excel files.

2) Calculating distance between two latitude‑longitude points

Read both tables with pandas and define a function (e.g., haversine) to compute distances.

3) Looping to find and store the nearest point

Iterate over rows of the first table, compare distances to all rows of the second table, and keep the record with the smallest distance.

4) Saving the results

Use DataFrame.to_excel to write the nearest‑point data to a new Excel file.

Implementation

Step 1: Import libraries

import pandas as pd
import xlrd
import os

Step 2: Choose and read the Excel file

path = "D:/a/"
# Get all Excel filenames in the folder
xlsx_names = [x for x in os.listdir(path) if x.endswith(".xlsx")]
xlsx_names1 = xlsx_names[0]               # first Excel file
aa = path + xlsx_names1
first_file_fh = xlrd.open_workbook(aa)
first_file_sheet = first_file_fh.sheets()

Step 3: Loop to compute nearest point and store it

for i in range(h1):
    w1 = df1.loc[i, '纬度']
    j1 = df1.loc[i, '经度']
    d0 = 1e25
    print("原小区第%d个。" % (i+1))
    test_dict = {'距离': [d0]}
    d3 = pd.DataFrame(test_dict)

    for l in range(h2):
        w2 = df2.loc[l, '纬度']
        j2 = df2.loc[l, '经度']
        d = haversine(j1, w1, j2, w2)
        if d < d0:
            d0 = d
            d2 = df2.loc[l, :]
            test_dict = {'距离': [d0]}
            d3 = pd.DataFrame(test_dict)
        else:
            continue

Step 4: Save the computed results

resultdata1.to_excel(excel_writer=writer, sheet_name='原小区', encoding="utf-8", index=False)
resultdata2.to_excel(excel_writer=writer, sheet_name='最近小区', encoding="utf-8", index=False)
resultdata3.to_excel(excel_writer=writer, sheet_name='距离', encoding="utf-8", index=False)
writer.save()
writer.close()

Results

Before processing:

Processing progress display:

Final result:

Conclusion

This guide shows how to compute haversine distances between latitude‑longitude pairs and perform nearest‑point matching across two tables using Python. Compared with MapInfo, the Python solution handles large datasets quickly, offers progress monitoring, and can be further optimized (e.g., chunk processing) for even better performance.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

ExcelGeospatialdistance-calculationdata-processing
Python Crawling & Data Mining
Written by

Python Crawling & Data Mining

Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.