Fundamentals 8 min read

How to Scrape and Analyze Holiday Tourist Spot Data with Python

This tutorial walks you through using Python to collect tourism data from Qunar, extract key fields such as name, price, and rating, store the results in Excel with pandas, and visualize sales and popularity trends using pyecharts, including a simple recommendation algorithm.

MaGe Linux Operations
MaGe Linux Operations
MaGe Linux Operations
How to Scrape and Analyze Holiday Tourist Spot Data with Python

Goal

Use Python to analyze National Day holiday tourist attractions that are fun, cheap, and less crowded.

Data Acquisition

Official tourism data was hard to find, so we resorted to scraping ticket information from Qunar, which reflects attraction popularity.

1. Scrape Single Page Data

Search Qunar's ticket page for "National Day tourist attractions" to obtain JSON data containing name, region, popularity, sales, price, rating, coordinates, etc.

Open the browser's developer tools (F12) to locate the data URL; the response is already JSON.

Use the requests library to send a GET request and retrieve the page data.

2. Extract Relevant Information

Inspect the JSON structure and extract fields such as id, name, star rating, score, ticket price, sales volume, region, coordinates, and description.

3. Save to Excel

After extracting the needed data, save it with pandas.

pip install xlrd pip install openpyxl pip install numpy pip install pandas

4. Batch Scraping

Identify the pagination parameter (e.g., page ) from the network requests and loop over page numbers to collect all pages.

Continue fetching until the returned item count drops to zero.

Data Analysis

With the data saved, perform several analyses using the pyecharts visualization library:

Ticket sales ranking

Revenue ranking (price × sales)

Number of attractions per province and rating level (unfinished)

Sales heatmap on a map

Recommendation of attractions

1. Ticket Sales Ranking

Create a pivot table, sort by sales, and plot a bar chart. Disney ranks first.

2. Revenue Ranking

Calculate revenue as price × sales, add it to the dataframe, sort, and visualize. Disney again leads.

4. Sales Heatmap

Use Baidu Map's open API to draw a heatmap. Apply for a Baidu Map API key (browser type) and replace the key and JSON data in the demo HTML.

5. Recommendation of Attractions

Recommend spots that have high scores, low sales, and cheap prices. The simple recommendation coefficient is:

Recommendation Score = Score / (Sales * Price) * 1000

The top‑20 results include many foreign attractions, especially in Japan, showing that domestic spots are crowded during the holiday.

Feel free to improve the algorithm and experiment with the code.

Source code: https://github.com/pig6/qunar_spider

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Pythondata analysisWeb ScrapingPyechartsTourism
MaGe Linux Operations
Written by

MaGe Linux Operations

Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.