Tag: crawling

Python Programming Learning Circle
Aug 31, 2021 · Backend Development

Python Web Crawler for Downloading Drama Links from cn163.net

This article describes how to build a Python web crawler that automatically generates numeric URLs, checks their validity, extracts download links for TV dramas from cn163.net, saves them to text files, and discusses practical challenges such as regex parsing, filename restrictions, and multithreading performance.

Python, Requests, Web Scraping
7 min read
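The steps named in the summary above (generating numeric URLs, pulling download links out of a page with a regex, and working around filename restrictions) can be sketched roughly as follows. This is a minimal illustration, not the article's code: the URL template, the link pattern, and the helper names `candidate_urls`, `extract_links`, and `safe_filename` are all assumptions, and no network request is made.

```python
import re

# Hypothetical URL template: the listing does not give the site's real URL scheme.
BASE_URL = "http://example.invalid/archives/{page_id}/"

# Characters that Windows does not allow in file names.
_FORBIDDEN = re.compile(r'[\\/:*?"<>|]')

# Illustrative link pattern (ed2k and magnet URIs); an assumption, not the article's regex.
_LINK = re.compile(r'(?:ed2k://|magnet:\?)[^\s"<>]+')


def candidate_urls(start, stop):
    """Yield one candidate URL per numeric ID in [start, stop)."""
    for page_id in range(start, stop):
        yield BASE_URL.format(page_id=page_id)


def extract_links(html):
    """Return download links found in the page source, in order, without duplicates."""
    seen = []
    for link in _LINK.findall(html):
        if link not in seen:
            seen.append(link)
    return seen


def safe_filename(title, replacement="_"):
    """Replace characters that are illegal in file names and append .txt."""
    cleaned = _FORBIDDEN.sub(replacement, title).strip()
    return (cleaned or "untitled") + ".txt"
```

A real crawler would fetch each candidate URL, skip the ones that do not return a valid page, and write the extracted links to the file named by `safe_filename`.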
Python Programming Learning Circle
Dec 8, 2020 · Backend Development

Browser Spoofing Techniques for Web Scraping: Principles and CSDN Example

This article explains why web servers block crawlers, how to identify a browser's User-Agent (using Chrome as an example), and demonstrates step‑by‑step how to disguise a scraper as a browser to retrieve the CSDN homepage and its article list.

backend-development, browser-spoofing, crawling
3 min read
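The disguise described in the summary above comes down to sending a browser-like User-Agent header with the request. A minimal sketch using the standard library's `urllib.request` is shown below; whether the article itself uses `urllib` or another library is not stated in this listing, the User-Agent string is only an example value, and `build_request` and `fetch` are hypothetical helper names.

```python
import urllib.request

# Example Chrome-style User-Agent string; copy the real one from your own browser.
CHROME_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
)


def build_request(url, user_agent=CHROME_UA):
    """Create a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Download the page body as text (performs a real network request)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling `fetch` with the CSDN homepage URL would then return the page source, from which the article list could be extracted.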
Python Programming Learning Circle
Nov 16, 2020 · Backend Development

Popular Python Web Scraping Frameworks and Tools

This article introduces eight widely used Python web scraping frameworks (Scrapy, PySpider, Crawley, Portia, Newspaper, Beautiful Soup, Grab, and Cola), describing their main features and typical use cases and linking to their project repositories.

Data Extraction, Frameworks, Web Scraping
4 min read
Architect
Nov 15, 2015 · Big Data

An Introduction to Search Engine Architecture and Core Technologies

This article provides a comprehensive overview of search engine fundamentals, including inverted indexing, tokenization, ranking, high‑concurrency infrastructure, caching, crawling strategies, query understanding, keyword rewriting, personalization, and knowledge‑base construction, and highlights the technical challenges that separate modern search engines like Google from simpler implementations.

Big Data, Indexing, Ranking
14 min read
Qunar Tech Salon
Oct 10, 2015 · Fundamentals

Overview of Search Engine Architecture and Core Technologies

This article provides a comprehensive overview of search engine evolution, core technologies such as crawling, indexing, retrieval and link analysis, platform foundations including cloud storage and computing, and techniques for improving search results through anti‑spam, user‑intent analysis, deduplication and caching.

Indexing, cloud computing, crawling
15 min read