How to Scrape Qyer Travel City Data into CSV with Python
This tutorial walks you through using Python's requests, lxml, and fake_useragent libraries to crawl Qyer travel city pages, extract city names, images, and hotspots, and store the results in a CSV file while handling pagination and request headers.
Project Background
Qyer.com provides original travel guides, community features, and online services. This tutorial shows how to fetch city information from Qyer and write it to a CSV file using Python.
Project Goal
Retrieve each city's name, image link, and hotspot text, and save them in a CSV file.
Libraries and Target Site
Target URL pattern: https://place.qyer.com/south-korea/citylist-0-0-{}
Required libraries: requests, lxml, fake_useragent (third-party); random, time, csv (standard library).
Project Analysis
Pagination URLs change only the number part (citylist-0-0-{page}). Use a loop to request multiple pages.
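As a minimal sketch of that loop, the snippet below builds the list of page URLs up front. The helper name build_page_urls is an assumption made for illustration; the template string is the one the tutorial's class uses.

```python
# URL template from the tutorial; {} is the page-number slot.
URL_TEMPLATE = "https://place.qyer.com/south-korea/citylist-0-0-{}/"


def build_page_urls(start_page, end_page):
    """Return the list-page URLs for start_page..end_page, inclusive."""
    if start_page < 1 or end_page < start_page:
        raise ValueError("expected 1 <= start_page <= end_page")
    return [URL_TEMPLATE.format(page) for page in range(start_page, end_page + 1)]


if __name__ == "__main__":
    # Show the URLs for the first three pages.
    for url in build_page_urls(1, 3):
        print(url)
```

Building the list first keeps the range check in one place and makes the request loop a plain iteration over URLs.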
Implementation
Class Definition
import csv
import random
import time

import requests
from fake_useragent import UserAgent
from lxml import etree


class Travel(object):
    def __init__(self):
        # {} is replaced with the page number
        self.url = "https://place.qyer.com/south-korea/citylist-0-0-{}/"

    def main(self):
        pass


if __name__ == '__main__':
    spider = Travel()
    spider.main()
Random UserAgent
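To see the idea in isolation, here is a standard-library sketch that draws a User-Agent from a fixed pool each time headers are needed. The pool entries are hypothetical sample strings and random_headers is an illustrative helper, not part of the original script.

```python
import random

# Hypothetical sample strings for illustration only; a real pool would hold
# current browser User-Agent values.
USER_AGENT_POOL = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36",
]


def random_headers(pool=USER_AGENT_POOL, rng=random):
    """Build a headers dict with a User-Agent drawn at random from pool."""
    if not pool:
        raise ValueError("user-agent pool is empty")
    return {"User-Agent": rng.choice(pool)}
```

Calling random_headers() for every request varies the header from page to page, whereas choosing once in __init__ fixes a single User-Agent for the whole run.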
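        # Inside Travel.__init__: build a pool of Chrome User-Agent strings.
        # verify_ssl is accepted by older fake_useragent releases; if your
        # installed version rejects it, call UserAgent() instead.
        self.film_list = []
        ua = UserAgent(verify_ssl=False)
        for i in range(1, 50):
            self.film_list.append(ua.chrome)
        # Pick one User-Agent at random for the request headers
        self.Hostreferer = {'User-Agent': random.choice(self.film_list)}
Multi‑page Requests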
        # Inside main(): ask for the page range, then build each page URL
        startPage = int(input("Start page: "))
        endPage = int(input("End page: "))
        for page in range(startPage, endPage + 1):
            url = self.url.format(page)
Data Request Method
    def get_page(self, url):
        # Download the page and decode the response body as UTF-8
        html = requests.get(url=url, headers=self.Hostreferer).content.decode("utf-8")
        # Hand the HTML to the parsing method
        self.page_page(html)
Parsing with XPath
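The XPath expressions used in this step can be tried offline against a small fragment. The markup below is hypothetical and only mimics the structure those expressions expect; first_text and parse_cities are illustrative helpers. The sketch also guards against a missing field, since indexing an empty XPath result with [0] raises IndexError.

```python
from lxml import etree

# Hypothetical markup shaped like the list the tutorial's XPath targets.
SAMPLE_HTML = """
<ul class="plcCitylist">
  <li>
    <h3><a href="#">Seoul</a></h3>
    <p class="beento">1234 visitors</p>
    <p class="pics"><a href="#"><img src="https://example.com/seoul.jpg"></a></p>
  </li>
  <li>
    <h3><a href="#">Busan</a></h3>
    <p class="pics"><img src="https://example.com/busan.jpg"></p>
  </li>
</ul>
"""


def first_text(node, xpath, default=""):
    """Return the first XPath string result, stripped, or default if none."""
    results = node.xpath(xpath)
    return results[0].strip() if results else default


def parse_cities(html):
    """Extract (name, hotspot, image) tuples from a city-list page."""
    tree = etree.HTML(html)
    rows = []
    for li in tree.xpath('//ul[@class="plcCitylist"]/li'):
        name = first_text(li, './/h3//a/text()')
        hotspot = first_text(li, './/p[@class="beento"]//text()')
        image = first_text(li, './/p[@class="pics"]//img/@src')
        rows.append((name, hotspot, image))
    return rows
```

The second list item has no beento paragraph, so its hotspot field comes back as an empty string instead of stopping the crawl.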
    def page_page(self, html):
        parse_html = etree.HTML(html)
        # One <li> per city in the city list
        image_src_list = parse_html.xpath('//ul[@class="plcCitylist"]/li')
        for i in image_src_list:
            b = i.xpath('.//h3//a/text()')[0].strip()  # city name
            c = i.xpath('.//p[@class="beento"]//text()')[0].strip()  # hotspot text
            d = i.xpath('.//p[@class="pics"]//img/@src')[0].strip()  # image link
Saving to CSV
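One variant worth considering opens the file once per batch of rows and writes a header the first time the file is created. This is a sketch under assumptions: the column names (city, hotspot, image) and the helper append_rows are hypothetical, and it uses UTF-8 rather than the encoding in the tutorial's own snippet.

```python
import csv
import os

# Hypothetical column names; the original script writes no header row.
FIELDNAMES = ["city", "hotspot", "image"]


def append_rows(path, rows):
    """Append rows to path, writing the header only if the file is new or empty."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline="" lets the csv module control line endings itself.
    with open(path, "a", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(FIELDNAMES)
        writer.writerows(rows)
```

Calling append_rows once per page keeps the number of file opens equal to the number of pages rather than the number of cities.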
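            # Still inside the for loop: append one row per city.
            # newline='' stops csv from writing blank lines on Windows.
            # gbk suits Excel on Chinese-locale Windows; characters outside
            # GBK would raise UnicodeEncodeError.
            with open('scrape.csv', 'a', encoding='gbk', newline='') as csv_file:
                csv_writer = csv.writer(csv_file)
                csv_writer.writerow([b, c, d])
Main Loop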
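The crawl loop can also be written so that one failed page does not end the run. The sketch below is an assumption-laden illustration: crawl_pages is a hypothetical helper, and fetch stands for any callable that downloads and processes one URL (for example the class's get_page method).

```python
import time


def crawl_pages(urls, fetch, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch(url) for each URL with a pause between requests.

    Returns (results, failures): results holds (url, value) pairs and
    failures holds (url, exception) pairs for pages that raised.
    """
    results, failures = [], []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # pause before every request except the first
        try:
            results.append((url, fetch(url)))
        except Exception as exc:  # broad on purpose for this sketch
            failures.append((url, exc))
    return results, failures
```

Passing sleep as a parameter is a design choice for testability: a test can substitute a recorder and run without waiting.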
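    def main(self):
        # Pages 1 to 24; use the start/end page prompts instead to choose the range at run time
        for i1 in range(1, 25):
            url = self.url.format(i1)
            self.get_page(url)  # get_page parses and saves; it returns nothing
            time.sleep(2)  # pause between requests
            print("Page %d" % i1)
Result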
Run the script and enter the start and end pages. The program prints each page number as it finishes, and the extracted rows are appended to scrape.csv.
Conclusion
Keep the request volume low so the crawl does not put unnecessary load on the server. The project demonstrates basic CSV handling and web-scraping techniques with Python.
