How to Scrape and Analyze Holiday Tourist Spot Data with Python
This tutorial walks you through using Python to collect tourism data from Qunar, extract key fields such as name, price, and rating, store the results in Excel with pandas, and visualize sales and popularity trends using pyecharts, including a simple recommendation algorithm.
Goal
Use Python to analyze National Day holiday tourist attractions that are fun, cheap, and less crowded.
Data Acquisition
Official tourism data was hard to find, so we resorted to scraping ticket information from Qunar, which reflects attraction popularity.
1. Scrape Single Page Data
Search Qunar's ticket page for "National Day tourist attractions" to obtain JSON data containing name, region, popularity, sales, price, rating, coordinates, etc.
Open the browser's developer tools (F12) to locate the data URL; the response is already JSON.
Use the requests library to send a GET request and retrieve the page data.
2. Extract Relevant Information
Inspect the JSON structure and extract fields such as id, name, star rating, score, ticket price, sales volume, region, coordinates, and description.
3. Save to Excel
After extracting the needed data, save it with pandas.
pip install xlrd pip install openpyxl pip install numpy pip install pandas
4. Batch Scraping
Identify the pagination parameter (e.g., page ) from the network requests and loop over page numbers to collect all pages.
Continue fetching until the returned item count drops to zero.
Data Analysis
With the data saved, perform several analyses using the pyecharts visualization library:
Ticket sales ranking
Revenue ranking (price × sales)
Number of attractions per province and rating level (unfinished)
Sales heatmap on a map
Recommendation of attractions
1. Ticket Sales Ranking
Create a pivot table, sort by sales, and plot a bar chart. Disney ranks first.
2. Revenue Ranking
Calculate revenue as price × sales, add it to the dataframe, sort, and visualize. Disney again leads.
4. Sales Heatmap
Use Baidu Map's open API to draw a heatmap. Apply for a Baidu Map API key (browser type) and replace the key and JSON data in the demo HTML.
5. Recommendation of Attractions
Recommend spots that have high scores, low sales, and cheap prices. The simple recommendation coefficient is:
Recommendation Score = Score / (Sales * Price) * 1000
The top‑20 results include many foreign attractions, especially in Japan, showing that domestic spots are crowded during the holiday.
Feel free to improve the algorithm and experiment with the code.
Source code: https://github.com/pig6/qunar_spider
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
MaGe Linux Operations
Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
