How to Scrape Qyer Travel City Data into CSV with Python

This tutorial walks you through using Python's requests, lxml, and fake_useragent libraries to crawl Qyer travel city pages, extract city names, images, and hotspots, and store the results in a CSV file while handling pagination and request headers.

Python Crawling & Data Mining

Jun 8, 2020

Project Background

Qyer.com provides original travel guides, community features, and online services. This tutorial shows how to fetch city information from Qyer and write it to a CSV file using Python.

Project Goal

Retrieve each city's name, image link, and hotspot text, and save them to a CSV file.

Libraries and Target Site

Target URL pattern: https://place.qyer.com/south-korea/citylist-0-0-{}/ (the {} placeholder is the page number)

Required libraries: requests, lxml, and fake_useragent (third-party), plus time, csv, and random from the standard library.

Project Analysis

Only the trailing number in the URL changes from page to page (citylist-0-0-{page}), so a loop over page numbers is enough to request every page.
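
The page-number substitution can be sketched as follows. The helper name build_page_urls is hypothetical and not part of the original script; the URL template is the one used by the Travel class in this tutorial.

```python
# Sketch: expand the page-number placeholder into concrete list URLs.
URL_TEMPLATE = "https://place.qyer.com/south-korea/citylist-0-0-{}/"


def build_page_urls(start_page, end_page):
    """Return the list-page URLs for start_page..end_page, inclusive."""
    return [URL_TEMPLATE.format(page) for page in range(start_page, end_page + 1)]


for page_url in build_page_urls(1, 3):
    print(page_url)
```

An empty list comes back when start_page is greater than end_page, so a mistyped range results in no requests rather than an error.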

Implementation

Class Definition

import csv
import random
import time

import requests
from fake_useragent import UserAgent
from lxml import etree


class Travel(object):
    def __init__(self):
        # {} is replaced with the page number
        self.url = "https://place.qyer.com/south-korea/citylist-0-0-{}/"

    def main(self):
        pass  # filled in under "Main Loop"


if __name__ == '__main__':
    spider = Travel()
    spider.main()

Random UserAgent

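# Inside __init__: build a pool of Chrome User-Agent strings.
# If your fake_useragent version rejects verify_ssl, call UserAgent() instead.
self.ua_list = []
ua = UserAgent(verify_ssl=False)
for _ in range(1, 50):
    self.ua_list.append(ua.chrome)
# Chosen once here, so every request from this instance sends the same header.
self.Hostreferer = {'User-Agent': random.choice(self.ua_list)}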
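
If a different User-Agent per request is wanted, one option is to build the headers at request time instead of once in the constructor. The sketch below is only an illustration: random_headers and UA_POOL are hypothetical names, and the strings in the pool are placeholders standing in for the values fake_useragent would supply.

```python
import random

# Placeholder User-Agent strings, for illustration only. In the tutorial
# the pool is filled from fake_useragent instead.
UA_POOL = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15) ExampleBrowser/2.0",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/3.0",
]


def random_headers(ua_pool, rng=random):
    """Return a fresh headers dict with a User-Agent picked from ua_pool."""
    return {"User-Agent": rng.choice(ua_pool)}


# Calling this before every request varies the header between requests.
headers = random_headers(UA_POOL)
print(headers)
```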

Multi‑page Requests

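# Inside main(): ask for the page range, then request each page in turn.
startPage = int(input("Start page: "))
endPage = int(input("End page: "))
for page in range(startPage, endPage + 1):
    url = self.url.format(page)
    self.get_page(url)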

Data Request Method

def get_page(self, url):
    # Download the page, decode it as UTF-8, and hand the HTML to the parser.
    html = requests.get(url=url, headers=self.Hostreferer).content.decode("utf-8")
    self.page_page(html)
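
The request above has no timeout and does not check the HTTP status, so a stalled connection or an error page would go unnoticed. A slightly more defensive variant might look like the sketch below. The function name fetch_html and the 10-second default are assumptions, not part of the original script; the session argument is expected to be a requests.Session (or any object with a compatible get method).

```python
def fetch_html(session, url, headers, timeout=10):
    """Fetch url through session and return the body decoded as UTF-8.

    Raises whatever exception the response's raise_for_status() raises
    when the server answers with a 4xx or 5xx status.
    """
    response = session.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()  # fail loudly instead of parsing an error page
    return response.content.decode("utf-8")
```

Inside the Travel class this could be called as fetch_html(requests.Session(), url, self.Hostreferer), with the returned string passed on to the parsing method.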

Parsing with XPath

def page_page(self, html):
    parse_html = etree.HTML(html)
    # One <li> element per city in the list
    image_src_list = parse_html.xpath('//ul[@class="plcCitylist"]/li')
    for i in image_src_list:
        b = i.xpath('.//h3//a/text()')[0].strip()                # city name
        c = i.xpath('.//p[@class="beento"]//text()')[0].strip()  # hotspot text
        d = i.xpath('.//p[@class="pics"]//img/@src')[0].strip()  # image link
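
Each xpath() call returns a list, and indexing it with [0] raises IndexError when a city entry lacks the matching element. A small guard, sketched here with the hypothetical helper name first_text, keeps one incomplete entry from aborting the whole page:

```python
def first_text(results, default=""):
    """Return the first XPath result with whitespace stripped,
    or default when the result list is empty."""
    return results[0].strip() if results else default


# XPath text and attribute queries yield lists of strings, so plain
# lists are enough to show the behaviour.
print(first_text(["  Seoul \n"]))
print(first_text([], default="N/A"))
```

In the parsing loop this would replace the pattern xpath(...)[0].strip() with first_text(i.xpath(...)).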

Saving to CSV

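# Inside the parsing loop: append one row per city.
# newline='' stops the csv module from emitting blank lines on Windows.
# gbk raises UnicodeEncodeError for characters outside GBK; use 'utf-8' if that happens.
with open('scrape.csv', 'a', encoding='gbk', newline='') as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow([b, c, d])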
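
Because the file is opened in append mode, it never receives a header row. One way to add one, sketched below under assumptions, is to write the header only when the file is new or empty. The function name append_rows, the column names in FIELDNAMES, and the UTF-8 default are all illustrative choices rather than part of the original script.

```python
import csv
import os

FIELDNAMES = ["city", "hotspot", "image"]  # assumed column names


def append_rows(path, rows, encoding="utf-8"):
    """Append rows to the CSV at path, writing the header row first
    when the file does not exist yet or is empty."""
    need_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", encoding=encoding, newline="") as csv_file:
        writer = csv.writer(csv_file)
        if need_header:
            writer.writerow(FIELDNAMES)
        writer.writerows(rows)
```

Collecting the rows for a whole page and calling append_rows once per page also avoids reopening the file for every city.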

Main Loop

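def main(self):
    startPage = int(input("Start page: "))
    endPage = int(input("End page: "))
    for i1 in range(startPage, endPage + 1):
        url = self.url.format(i1)
        self.get_page(url)  # hands the HTML to the parser and returns nothing
        time.sleep(2)       # pause between requests
        print("Page %d" % i1)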
Result

Run the script and enter the start and end pages. The program prints each page number as it is processed and appends the extracted rows to scrape.csv.

Conclusion

Keep the request rate low so that the crawl does not put undue load on the server. The project demonstrates basic web scraping and CSV handling with Python.
