Tagged articles

Web Scraping

472 articles · Page 4 of 5

Dec 14, 2020 · Backend Development

Create a Python iQIYI Movie Scraper with GUI – Full Step‑by‑Step Guide

Learn how to build a Python web scraper that extracts iQIYI movie titles, actors, and scores, parses the data with regex or BeautifulSoup, displays results in a Tkinter GUI with a combobox, and saves the information to a file—all explained with code snippets and screenshots.

GUI · Python · Tkinter

0 likes · 7 min read

Python Crawling & Data Mining

Dec 13, 2020 · Backend Development

How to Scrape and Analyze Beijing Dank Apartment Data with Python

This article demonstrates how to crawl 6,025 Beijing Dank Apartment listings using Python, clean and enrich the data with Pandas, and visualize distribution, price, size, floor, and subway proximity through charts, revealing key market insights and correlation patterns.

Pandas · Python · Web Scraping

0 likes · 16 min read

Python Crawling & Data Mining

Dec 11, 2020 · Backend Development

How to Scrape Weibo Comments with Python: A Step‑by‑Step Guide

This article explains how to locate Weibo comment APIs, work around rate limits by using the mobile site, extract required parameters, and implement a Python script that handles cookies, pagination, emoji cleaning, deduplication, and scheduled execution to collect comments efficiently.

Automation · Python · Web Scraping

0 likes · 5 min read

Python Programming Learning Circle

Dec 8, 2020 · Backend Development

Scrapy Crawl Template for Automatically Extracting JD.com Product Information

This article provides a step‑by‑step guide on using Scrapy’s crawl template to automatically scrape product details such as ID, title, shop name, shop link, and price from JD.com, including source analysis, project setup, code snippets, and result verification.

Backend Development · JD.com · Python

0 likes · 4 min read

Python Crawling & Data Mining

Dec 5, 2020 · Big Data

How to Build a Python Web Scraper for Zhihu Answers and Generate Word Clouds

This article walks through the complete process of designing a Python web scraper to collect Zhihu answer data, parse author, ID and excerpt fields, store the results in CSV files, and finally visualize the text with a word‑cloud, including all necessary code snippets and explanations.

CSV · Python · Web Scraping

0 likes · 17 min read

Python Crawling & Data Mining

Dec 2, 2020 · Artificial Intelligence

Scrape, Clean, and Visualize Tencent Video Comments with Python – A Full Guide

This article walks through using Python to crawl Tencent Video's "Offer" season 2 comments, merge and clean the CSV data, perform exploratory analysis, generate visualizations and word clouds, and apply Baidu's open‑source NLP model for sentiment scoring, providing complete code snippets for each step.

Python · Sentiment Analysis · Web Scraping

0 likes · 16 min read

Python Crawling & Data Mining

Nov 30, 2020 · Backend Development

Build a Python Movie Scraper: Download Films from FilmSky with Ease

This guide walks you through setting up a Python environment, installing required libraries, constructing a FilmSky scraper class, handling pagination, parsing HTML with regex, and saving movie titles and download links, enabling you to browse and download movies from the FilmSky website efficiently.

Python · Web Scraping · movie downloader

0 likes · 6 min read

Huawei Cloud Developer Alliance

Nov 26, 2020 · Backend Development

How to Crawl Real-Time Data with Python WebSocket: A Step‑by‑Step Guide

This article explains how crawler engineers can fetch real‑time data such as sports scores, stock quotes, or cryptocurrency prices by comparing polling and WebSocket approaches, introducing the aiowebsocket library, and providing complete Python code to perform handshake, subscription, and continuous data streaming.

Python · Real-time Data · Web Scraping

0 likes · 10 min read

Python Crawling & Data Mining

Nov 16, 2020 · Backend Development

How to Crawl Next‑Page Articles with Scrapy: A Step‑by‑Step Guide

This tutorial shows how to locate the "next page" link on a website, extract its URL using Scrapy selectors, add proper checks, and integrate the pagination logic into a Scrapy spider so that all article pages are crawled automatically.

Crawler · Python · Scrapy

0 likes · 6 min read

Python Crawling & Data Mining

Nov 13, 2020 · Backend Development

Master Scrapy Requests: Download Pages and Trigger Callbacks Efficiently

This tutorial explains how to use Scrapy's Request objects to feed article detail URLs into the crawler, configure callbacks for parsing, handle relative URLs with urljoin, and yield requests so Scrapy can download pages, completing the core data extraction workflow.

Python · Scrapy · Web Scraping

0 likes · 5 min read

Python Crawling & Data Mining

Nov 12, 2020 · Big Data

Scrape and Visualize the Hurun Rich List (2015‑2020) with Python & Pyecharts

This article demonstrates how to scrape the Hurun Rich List from 2015 to 2020 using Python, clean the data, and create interactive visualizations of the top 20 wealth holders, wealth trends over six years, and industry shifts with Pyecharts.

Big Data · Python · Wealth Analysis

0 likes · 4 min read

MaGe Linux Operations

Oct 27, 2020 · Backend Development

Build a Distributed Scrapy Crawler in Minutes with RabbitMQ and RedisBloom

This guide walks you through installing Scrapy-Distributed, setting up RabbitMQ and RedisBloom containers, creating a sitemap spider, configuring the distributed scheduler and dupefilter, and running the spider, while explaining why this non‑intrusive solution improves over existing Scrapy‑Redis and scrapy‑rabbitmq approaches.

Python · RabbitMQ · RedisBloom

0 likes · 7 min read

Python Crawling & Data Mining

Oct 24, 2020 · Backend Development

Master Scrapy: Extract Likes, Comments, and Content with XPath

This article continues a Scrapy tutorial by showing how to extract like counts, comment counts, and full article content using XPath selectors, regular expressions, and debugging techniques, providing step‑by‑step code examples and screenshots to help Python developers automate web data collection.

Python · Scrapy · Web Scraping

0 likes · 6 min read

Python Crawling & Data Mining

Oct 20, 2020 · Backend Development

Master Web Scraping with XPath: A Step‑by‑Step Scrapy Tutorial

This tutorial shows how to apply XPath expressions within the Scrapy framework to extract titles, publication dates, tags, content, likes, favorites, and comments from a sample website, providing practical code snippets and tips for reliable web data collection.

Python · Scrapy · Web Scraping

0 likes · 5 min read

Python Crawling & Data Mining

Oct 10, 2020 · Backend Development

How to Batch Download 4K Wallpapers from Wallhaven Using Python

This tutorial explains how to use Python to scrape Wallhaven for 4K wallpapers, automatically iterate through pages, download images in bulk, and handle request headers and delays to avoid blocking, providing complete code and result screenshots.

4K Wallpapers · Automation · Python

0 likes · 6 min read

MaGe Linux Operations

Oct 9, 2020 · Backend Development

Python Web Scraping Essentials: GET/POST, Proxies, Cookies, and Multithreading

Learn how to efficiently build Python web scrapers by mastering basic GET and POST requests, configuring proxy IPs, handling cookies, spoofing browser headers, enabling gzip compression, and leveraging multithreaded concurrency to accelerate data extraction.

Python · Web Scraping · cookies

0 likes · 4 min read

Python Crawling & Data Mining

Oct 2, 2020 · Backend Development

How to Bypass Captchas with Selenium and Tesseract: A Step‑by‑Step Python Guide

This tutorial walks through using Selenium to handle pop‑ups and simple numeric captchas on a web portal: capturing the captcha image, applying binary thresholding, recognizing the text with Tesseract OCR, and submitting the login credentials, with retry logic for failed recognitions.

Automation · Python · Selenium

0 likes · 9 min read

Python Crawling & Data Mining

Oct 1, 2020 · Backend Development

How to Scrape and Download All King of Glory Hero Skins with Python

This tutorial walks you through using Python's requests, lxml, and JSON parsing to automatically crawl the King of Glory website, extract hero skin URLs, and download the images into organized folders while displaying progress in the console.

Image Download · Python · Web Scraping

0 likes · 9 min read

MaGe Linux Operations

Sep 28, 2020 · Backend Development

Build a Scalable Python Web Scraper for 3000+ Companies

This article walks through creating a Python web scraper that extracts financial data for over three thousand listed companies, starting from a simple pandas script and progressively adding error handling, MySQL storage, and multiprocessing to build a robust, production‑ready tool.

Multiprocessing · Python · Web Scraping

0 likes · 7 min read

Python Crawling & Data Mining

Sep 17, 2020 · Big Data

How I Scraped 4,400 Taobao 'Big Pants' Listings and Uncovered Market Insights

Using Python and Selenium, the author collected 4,403 Taobao listings for men's shorts, cleaned the data, and visualized regional sales, price distribution, top shops, and product characteristics, identifying the best-selling items and revealing market trends.

Python · Selenium · Taobao

0 likes · 7 min read

Python Crawling & Data Mining

Sep 10, 2020 · Backend Development

How to Query Chinese Courier Tracking Info with Python and Kuaidi100 API

This tutorial shows how to use Python's urllib and json libraries to call the Kuaidi100 API, retrieve real‑time logistics data for various Chinese courier companies, and display the tracking timeline, while explaining how to discover the correct request URL via browser dev tools.

API · Python · Web Scraping

0 likes · 6 min read

Python Crawling & Data Mining

Sep 4, 2020 · Big Data

How to Scrape and Visualize 3,000 Chinese Recipes with Python

This article demonstrates how to use Python to crawl 3,032 Chinese recipe entries from Douguo.com, clean the data with Pandas, and create insightful visualizations—including rating distributions, cuisine comparisons, and ingredient word clouds—using pyecharts, providing complete code snippets and analysis of the results.

Chinese Cuisine · Pandas · Pyecharts

0 likes · 15 min read

Python Crawling & Data Mining

Aug 17, 2020 · Big Data

What Bilibili Viewers Really Talk About: Python Scraping & Danmu Trend Analysis

This article uses Python to crawl over 200,000 Bilibili comments and analyze popular memes across different content categories, then presents a step‑by‑step guide to extracting danmu via network requests, with code examples for practical data mining.

Bilibili · Danmu · Python

0 likes · 11 min read

MaGe Linux Operations

Aug 14, 2020 · Backend Development

Master Python's Requests Library: Quick Guide to GET, POST, Proxies, Sessions & SSL

This tutorial introduces Python's Requests library, covering installation, basic GET and POST requests, adding headers and parameters, handling proxies, cookies, sessions, and disabling SSL verification, with clear code examples for each feature.

API · HTTP · Python

0 likes · 7 min read

Python Crawling & Data Mining

Aug 12, 2020 · Backend Development

Run Obfuscated JavaScript Directly in Python with ExecJS – A Quick Guide

This article explains why Python developers often need to run JavaScript—especially obfuscated code—during web scraping, demonstrates how to use the ExecJS library to execute both simple and heavily obfuscated JavaScript from Python, and shows that the approach works seamlessly without de‑obfuscation.

ExecJS · JavaScript · Python

0 likes · 7 min read

vivo Internet Technology

Aug 5, 2020 · Frontend Development

Using Puppeteer for Emoji Scraping, Headless Chrome, and Front‑End Automation Testing

The article demonstrates how to use Puppeteer—a Node.js API built on the Chrome DevTools Protocol—to run headless Chrome for tasks such as scraping Google emoji images, generating screenshots or PDFs, and automating front‑end tests by launching a browser, navigating pages, handling cookies, simulating user input, capturing responses, and saving results.

Automation testing · Headless Chrome · Node.js

0 likes · 15 min read

Python Crawling & Data Mining

Aug 4, 2020 · Backend Development

4 Ways to Run JavaScript from Python for Web Scraping

This tutorial explains four practical methods—PyExecJS, js2py, Node.js, and PyV8—to execute JavaScript code from Python, providing code examples and tips for handling encrypted parameters during web crawling.

JavaScript · Node.js · PyV8

0 likes · 7 min read

Python Crawling & Data Mining

Jul 29, 2020 · Backend Development

How to Build a Python Web Scraper that Automatically Downloads and Organizes Images

This tutorial walks you through creating a Python web scraper that extracts images from a target site, handles anti‑scraping measures, saves them into categorized folders, and logs the download results, while explaining the required libraries, code structure, and best practices.

Image Download · Python · Web Scraping

0 likes · 6 min read

MaGe Linux Operations

Jul 28, 2020 · Fundamentals

Top 8 Python Tools Every Programmer and Student Should Know

This article reviews eight essential Python tools—including IDLE, Scikit‑learn, Theano, Selenium, TestComplete, BeautifulSoup, Pandas, and PuLP—explaining their main features, typical use cases, and why they are valuable for developers and students across web, data science, automation, and optimization tasks.

Python · Web Scraping · data science

0 likes · 5 min read

Python Crawling & Data Mining

Jul 25, 2020 · Backend Development

How to Scrape Upcoming Movies from Maoyan with Python: A Step‑by‑Step Guide

Learn how to build a Python web scraper that fetches upcoming movie details from Maoyan.com, covering environment setup, URL pagination, random User‑Agent handling, HTML parsing with XPath, data extraction, and result display, while highlighting best practices and limitations.

Maoyan · Python · Web Scraping

0 likes · 6 min read

Python Crawling & Data Mining

Jul 15, 2020 · Backend Development

How to Scrape Meituan Food Data with Python: Step-by-Step Guide

This tutorial explains how to analyze Meituan food page URLs, use browser developer tools to locate AJAX JSON responses, construct Python requests with proper headers, extract restaurant information via regular expressions, and save the results to a local file.

AJAX · Python · Web Scraping

0 likes · 8 min read

Efficient Ops

Jul 13, 2020 · Operations

What 13,966 Ops Job Listings Reveal About Salary, Skills, and Hot Cities

This article analyzes 13,966 Chinese operations‑engineer job postings scraped from 51job, cleaning the data with Python and Pandas, then visualizing industry demand, city concentration, salary ranges, education requirements, company size distribution, and keyword trends to guide job seekers and recruiters.

Data Visualization · Operations · Pandas

0 likes · 14 min read

Python Crawling & Data Mining

Jul 11, 2020 · Operations

What 13,966 Ops Job Listings Reveal About Salary, Skills, and Hot Cities?

This article analyzes 13,966 Chinese operations engineering job postings collected from 51job, detailing scraping methods, data cleaning steps, and visualizations that uncover top hiring industries, city demand, salary ranges, required education, company size distribution, and keyword trends for the ops market.

Job market · Operations · Python

0 likes · 13 min read

Python Crawling & Data Mining

Jul 9, 2020 · Big Data

How to Build a Python Web Scraper for Job Listings and Bypass Anti‑Scraping Measures

This tutorial explains how to crawl 58.com job listings with Python, extract location, company, and salary information, handle anti‑scraping defenses using realistic headers and random User‑Agents, and save the results into a text file.

Python · Web Scraping · anti-scraping

0 likes · 7 min read

Python Crawling & Data Mining

Jul 8, 2020 · Backend Development

Master Python’s Requests: From Basics to Advanced Web Scraping Techniques

This tutorial introduces Python’s Requests library, covering installation, core methods like GET, POST, PUT, PATCH, DELETE, detailed parameters, session handling, exception management, header customization, proxy usage, and practical code examples to empower effective web scraping.

API · HTTP · Python

0 likes · 14 min read

Python Crawling & Data Mining

Jun 29, 2020 · Big Data

Uncovering 200k iQiyi Danmu: Python Scraping & Insightful Analysis of “The Bad Kids”

The article demonstrates how to scrape over 200,000 iQiyi danmu comments for the drama “The Bad Kids” using Python, then analyzes user activity, episode popularity, top comments, actor mentions, and visualizes the results with word clouds and charts.

Danmu · Python · Web Scraping

0 likes · 9 min read

Python Programming Learning Circle

Jun 28, 2020 · Backend Development

Scraping iQiyi Bullet Comments and Generating a Word Cloud with Python

This article demonstrates how to scrape bullet comments from iQiyi for the first episode of a popular mystery series, decode the binary files, extract the text, and use Python's jieba and wordcloud libraries to clean the data and generate a visual word cloud of audience sentiments.

Python · Web Scraping · data processing

0 likes · 7 min read

Python Programming Learning Circle

Jun 27, 2020 · Big Data

Analyzing the BI Engineer Job Market on Zhaopin.com with Python and FineBI

This article demonstrates how to scrape BI engineer job listings from Zhaopin.com using Python, clean and enrich the data, and create visual analytics with FineBI to reveal salary trends, city distribution, experience levels, and education requirements in the Chinese job market.

BI · FineBI · Job market

0 likes · 9 min read

Python Crawling & Data Mining

Jun 26, 2020 · Backend Development

What Hidden Secrets Do 4,277 WeChat IDs Reveal? A Python Scraping Study

Using Python to scrape 4,277 Zhihu answers about changing WeChat IDs, this article shows how the data was collected, presents the extraction code, and uncovers six surprising patterns in users' chosen usernames, illustrating social trends behind the recent feature update.

Python · Social Media · Web Scraping

0 likes · 8 min read

Python Crawling & Data Mining

Jun 20, 2020 · Artificial Intelligence

Essential Python Libraries for Data Acquisition, Cleaning, Visualization & Modeling

The article provides a comprehensive guide to Python libraries essential for data analysis, detailing tools for data acquisition (Selenium, Scrapy, Beautiful Soup), cleaning (spaCy, NumPy, pandas), visualization (Matplotlib, Pyecharts), modeling (scikit‑learn, PyTorch, TensorFlow), model inspection (LIME), audio (Librosa), image processing (OpenCV, scikit‑image), database access (PyMongo) and web deployment (Flask, Django).

Python · Web Scraping · libraries

0 likes · 12 min read

Python Crawling & Data Mining

Jun 15, 2020 · Backend Development

Build a Python Video Downloader with Web Scraping and Progress Bar

This tutorial walks through creating a Python script that scrapes video URLs from a web page, downloads them in batches with proper buffering, and displays a real‑time progress bar, covering page analysis, code implementation, and practical tips.

Automation · Python · Web Scraping

0 likes · 6 min read

Python Crawling & Data Mining

Jun 13, 2020 · Fundamentals

How to Scrape Recipes from XiaChuFang with Python: A Step‑by‑Step Guide

This tutorial walks you through building a Python web scraper that extracts recipe names, ingredients, and download links from the XiaChuFang cooking website, handling anti‑scraping measures with custom headers and fake user agents, and saves the collected data into a Word document for future use.

Automation · Python · Web Scraping

0 likes · 6 min read

Python Programming Learning Circle

Jun 6, 2020 · Information Security

Understanding CSS Sprites and Techniques to Bypass Sprite‑Based Anti‑Scraping

This article explains the concept and benefits of CSS sprites, analyzes their drawbacks for web performance and security, and provides a step‑by‑step Python‑based method—including code snippets—to extract and sum numbers hidden behind sprite images used as an anti‑scraping measure.

Front-end · Sprite · Web Scraping

0 likes · 9 min read

Python Crawling & Data Mining

Jun 5, 2020 · Backend Development

Build a Python Image Scraper for 51miz.com in Minutes

This tutorial walks you through creating a Python web scraper that fetches image URLs from 51miz.com using requests and lxml, filters them with regular expressions, downloads the images, and demonstrates the complete workflow with code snippets and screenshots.

Python · Web Scraping · XPath

0 likes · 5 min read

Python Crawling & Data Mining

Jun 1, 2020 · Backend Development

Scrape Lianjia Real‑Estate Listings with Python and Export to Word

This guide walks through building a Python web‑scraper that fetches house names, prices and popularity from Lianjia, formats the data into a Word template, and generates individual Word documents while demonstrating key libraries, URL handling, HTML parsing, and best‑practice considerations.

Lianjia · Python · Web Scraping

0 likes · 6 min read

Python Programming Learning Circle

May 30, 2020 · Backend Development

Analyzing Bilibili Comment API and Building a Search Tool

This article describes how to discover Bilibili's comment XML APIs, decode the embedded CRC32‑hashed user IDs, choose an appropriate database schema, and implement a Python‑PHP tool that retrieves, filters, and displays comments based on video CID and keyword.

API · Bilibili · CRC32

0 likes · 6 min read

Python Crawling & Data Mining

May 28, 2020 · Backend Development

Multithreaded Python Crawl of Xiaomi App Store Games

This tutorial demonstrates how to use Python's requests, threading, and queue modules to build a multithreaded crawler that extracts game names, download links, and execution time from the Xiaomi App Store, complete with code examples and performance tips.

Python · Web Scraping · Xiaomi App Store

0 likes · 7 min read

Python Crawling & Data Mining

May 27, 2020 · Fundamentals

Master Python Regular Expressions: From Basics to Real-World Scraping

This article introduces Python's re module, explains core regex functions such as match, search, sub (replace), and compile, demonstrates practical code examples with images, and shows how to apply regular expressions for web‑scraping tasks.

Pattern Matching · Python · Web Scraping

0 likes · 9 min read

Liangxu Linux

May 26, 2020 · Frontend Development

How to Bypass Copy Restrictions and Extract Text from Web Pages

This guide explains several techniques—including using browser developer tools, console commands, and a Windows utility—to copy protected text from websites and download documents like Baidu Docs, while noting their limitations and required steps.

Baidu Docs · Browser Extension · Web Scraping

0 likes · 6 min read

Python Programming Learning Circle

May 19, 2020 · Game Development

Creating a Photo Mosaic from League of Legends Skin Images Using Python Web Scraping

This tutorial explains how to crawl all League of Legends hero skin images with a Python script, decode the URL pattern, download the assets, and then assemble them into a large photo mosaic using third‑party mosaic software, providing full code and step‑by‑step instructions.

League of Legends · Web Scraping · image-mosaic

0 likes · 9 min read

Python Crawling & Data Mining

May 18, 2020 · Backend Development

How to Scrape Images and Videos from Baidu Tieba Using Python

This tutorial explains how to build a Python web‑scraper that searches Baidu Tieba by keyword, bypasses anti‑crawling measures, extracts image and video URLs with XPath, and saves the media files locally, complete with code examples and setup instructions.

Baidu Tieba · Python · Web Scraping

0 likes · 8 min read

Python Crawling & Data Mining

May 10, 2020 · Big Data

Uncovering Viewer Sentiment on "Dragon Tomb Cave" – A Data‑Driven Douban Analysis

This article examines Douban ratings and 500 user comments for the Chinese web drama "Dragon Tomb Cave", visualizes rating distribution, comment timing, geographic origins, and compares sentiment with other series using Python web‑scraping and data‑visualization techniques.

Chinese drama · Data Visualization · Douban analysis

0 likes · 8 min read

Python Crawling & Data Mining

May 7, 2020 · Backend Development

Download Any Online Video with One Python Command: A Step‑by‑Step Guide

This tutorial shows how to use the Python‑based you‑get tool to download videos, images, and music from dozens of popular sites with a single command, adjust format and quality, and install the utility via pip, Git, or Homebrew.

Web Scraping · command-line · video download

0 likes · 5 min read

Python Crawling & Data Mining

May 3, 2020 · Backend Development

How to Build a Python Web Scraper for Downloading Movies Step‑by‑Step

This guide walks you through setting up a Python environment, installing required libraries, writing a FilmSky class with request handling, parsing HTML using regular expressions, and saving movie titles and download links, providing a practical example of web crawling for movie sites.

Python · Web Scraping · backend

0 likes · 6 min read

Python Crawling & Data Mining

Apr 29, 2020 · Backend Development

How to Batch Download Images with Python: From XPath Extraction to Automated Saving

This tutorial walks you through extracting image URLs from a webpage using XPath, constructing full URLs, and automating batch downloads with Python's requests, lxml, and fake_useragent libraries, including code snippets and practical tips for handling files and headers.

Image Download · Python · Web Scraping

0 likes · 6 min read

Full-Stack Internet Architecture

Apr 26, 2020 · Backend Development

Scrapy Tutorial: Installation, Components, Project Setup, Code Implementation, and Data Storage

This article provides a comprehensive step‑by‑step guide to installing Scrapy, understanding its core components and processing flow, creating a weather‑data crawling project, writing items, settings, middlewares, spiders, running the crawler, exporting results, and storing the scraped data into MongoDB.

Crawler · MongoDB · Python

0 likes · 15 min read

Python Crawling & Data Mining

Apr 24, 2020 · Backend Development

Can You Predict Stock Bottoms? A Python Scraper Analyzes Fund Success Rates

This article explores the probability of successful bottom‑buying in the stock market by building a data model, using Python to scrape fund data, analyzing individual and multiple funds, and visualizing results to reveal that only about one in four attempts succeed.

Web Scraping · finance · funds

0 likes · 5 min read

Python Crawling & Data Mining

Apr 21, 2020 · Big Data

Master Web Scraper: Build Complete Chrome-Based Crawlers Without Code

Learn how to install the Web Scraper Chrome extension, configure sitemaps and selectors, and handle pagination, scrolling, and click‑load scenarios to build robust, multi‑level web crawling projects without writing code, covering selector types, multi‑page navigation, and data export to CSV.

Chrome Extension · Web Scraping · data extraction

0 likes · 13 min read

360 Tech Engineering

Mar 30, 2020 · Backend Development

Using Go and chromedp to Control Headless Chrome for Screenshots, PDF Export, and Device Emulation

This article demonstrates how to use Go with the chromedp library to drive headless Chrome for element screenshots, full‑page captures, PDF generation, and device emulation, offering a lightweight alternative to Selenium‑based solutions.

Headless Chrome · PDF export · Web Scraping

0 likes · 9 min read

Python Programming Learning Circle

Mar 27, 2020 · Backend Development

Accessing Login‑Protected Pages with Cookies, urllib, requests, and Selenium in Python

This guide explains four practical methods—using a known cookie, simulating login with urllib or requests, maintaining a session, and employing a headless Selenium browser—to programmatically retrieve pages that require user authentication, complete with step‑by‑step instructions and code examples.

HTTP · Selenium · Web Scraping

0 likes · 14 min read

Python Programming Learning Circle

Mar 24, 2020 · Big Data

Analyzing Jianshu Platform Data with Python Crawling and FineBI Visualization

This article details how to use Python to crawl user and article data from the Jianshu platform, then apply FineBI for business intelligence analysis and visualizations, covering author contracts, follower distribution, popular articles, and engagement metrics.

BI · Data Visualization · FineBI

0 likes · 6 min read

TAL Education Technology

Mar 6, 2020 · Backend Development

DIY Technical News Acquisition: Framework, Practices, and Code Samples

This article explains why personalized tech‑news gathering is valuable, proposes a DIY framework for controlling sources, collection, filtering, reading experience and iteration, and demonstrates three concrete Node.js scraping examples—HTML pages, API data, and WeChat public accounts—plus extended thoughts on building a simple product.

Node.js · Puppeteer · Web Scraping

0 likes · 17 min read

Python Crawling & Data Mining

Mar 6, 2020 · Big Data

Bypass Anti‑Scraping Limits with Free Proxy IPs in Python

This tutorial explains how to obtain free proxy IPs, extract their addresses using Python's requests and BeautifulSoup, and continuously validate them to overcome anti‑scraping restrictions when crawling sites such as Baidu Baike for data mining tasks.

Python · Web Scraping · data mining

0 likes · 5 min read

Python Programming Learning Circle

Feb 24, 2020 · Fundamentals

Beginner's Python Web Crawler for High‑Resolution Game Skins and Hero Background Stories

This tutorial introduces a lightweight Python web crawler that, without relying on third‑party scraping frameworks, fetches high‑resolution skin images and all hero background stories from a popular game, explaining required libraries, implementation steps, and showing the resulting outputs.

Web Scraping · beginner tutorial · game data

0 likes · 3 min read

Python Programming Learning Circle

Feb 21, 2020 · Backend Development

Introduction to Python Web Scraping: Basics, HTTP/HTTPS, Requests Library, Proxies, and Data Extraction

This article provides a comprehensive introduction to Python web scraping, covering the fundamental concepts of spiders, HTTP/HTTPS protocols, the Requests library usage, custom headers, proxies, cookies, and various data extraction techniques such as JSON parsing, XPath, and regular expressions.

HTTP · Web Scraping · data extraction

0 likes · 9 min read

21CTO

Feb 10, 2020 · Backend Development

Top 8 PHP Libraries for Efficient Web Scraping

This article reviews eight PHP web‑scraping libraries—Goutte, Simple HTML DOM, htmlSQL, cURL, Request, HTTPful, Buzz, and Guzzle—detailing their features, requirements, licensing, and documentation to help developers choose the right tool for their backend data‑extraction projects.

Goutte · Guzzle · PHP

0 likes · 9 min read

MaGe Linux Operations

Jan 3, 2020 · Backend Development

Master Web Scraping with Scrapy: A Complete Python Guide

This guide introduces Scrapy, a powerful Python web‑scraping framework, explains its architecture and components, walks through installation, project creation, spider development, query syntax, recursive crawling, and item pipelines, providing practical code examples for building robust crawlers.

Crawler · Python · Scrapy

0 likes · 8 min read

Python Programming Learning Circle

Dec 28, 2019 · Backend Development

How to Build a Self‑Healing Dynamic Proxy Pool with Scrapy and Redis

This article explains how to build a self‑healing dynamic proxy pool for 24/7 web crawling using Scrapy and Redis, covering requirements, design, implementation details, deployment steps, and a reusable Scrapy middleware example.

Redis · Scrapy · Web Scraping

0 likes · 7 min read

Python Programming Learning Circle

Dec 27, 2019 · Backend Development

How to Bypass Anti‑Scraping Measures: Delays, Headers, Proxies & Distributed Crawling

This guide explains practical techniques to avoid IP bans and 403 errors when web‑scraping, covering explicit and implicit waiting, User‑Agent spoofing, proxy usage, IP pools, and distributed crawling architectures.

Python · Selenium · Web Scraping

0 likes · 8 min read

Liangxu Linux

Dec 25, 2019 · Backend Development

How Python Bots Beat 12306 Ticket Crashes: Open‑Source Tools & Features

When the Chinese railway ticketing system 12306 crashes under heavy load, developers turn to open‑source Python bots that simulate user behavior, query seat availability, and automate order submission, with detailed feature lists, repository links, and real‑world log examples.

12306 · Automation · GitHub

0 likes · 9 min read

Python Programming Learning Circle

Dec 24, 2019 · Backend Development

How to Scrape Chinese Classic Novels with Python: A Step‑by‑Step Guide

This tutorial walks you through planning, extracting, and saving classic Chinese novel content from shicimingju.com using Python, regular expressions, and file storage, providing clear code examples and practical tips for successful web scraping.

Python · Web Scraping · file storage

0 likes · 5 min read

MaGe Linux Operations

Dec 9, 2019 · Backend Development

Master Python’s Requests Library: Essential HTTP Techniques for Web Scraping

This guide introduces Python’s Requests library, covering installation, GET and POST requests, handling headers, response codes, cookies, redirects, timeouts, and proxy settings, with practical code examples to help developers perform reliable HTTP operations and avoid common pitfalls in web scraping.

API · Headers · Python

0 likes · 6 min read

21CTO

Dec 3, 2019 · Information Security

When Is Web Scraping Legal? A Developer’s Guide to Chinese Cyber Laws

This article explains the legal boundaries of web crawling in China, covering recent cybersecurity regulations, what makes a crawler illegal or legal, common developer questions, and practical advice to avoid personal‑data violations and criminal liability.

Chinese law · Web Scraping · crawler ethics

0 likes · 10 min read

FunTester

Nov 14, 2019 · Backend Development

Web Scraping CBA Match Data with Java: Methodology and Full Code Example

This article explains how to scrape Chinese Basketball Association (CBA) match data from a portal website, analyzes the page structure, extracts table rows using regular expressions, converts them to CSV format, and provides a complete Java/Groovy code example for automated data collection.

CBA · CSV · Java

0 likes · 8 min read

Python Programming Learning Circle

Nov 10, 2019 · Fundamentals

7 Fun Python Projects You Can Build in Minutes

This article presents seven practical Python scripts, covering Zhihu image scraping, chatbots, poetry author detection, lottery number generation, auto‑drafted apologies, screen recording, and GIF creation, and shows how to quickly automate diverse tasks without reinventing the wheel.

AI · Automation · Code examples

0 likes · 9 min read

FunTester

Oct 22, 2019 · Backend Development

How to Scrape 7.2 Million Historical Weather Records with Groovy

This article explains how to use a Groovy script to crawl over 7 million historical weather entries for 3,200 cities spanning 2011‑2019, process the JSON responses, and store the cleaned data into a MySQL table, while sharing practical tips and code snippets.

Data Engineering · Groovy · Java

0 likes · 7 min read

Python Programming Learning Circle

Oct 19, 2019 · Backend Development

How to Bypass Anti‑Scraping Measures: User‑Agent, Cookies & Proxies

This guide explains practical techniques such as faking User‑Agent headers, rotating cookies, adding random delays, and using proxy pools to prevent IP bans while crawling large amounts of data from websites with anti‑scraping defenses.

User-Agent · Web Scraping · anti-scraping

0 likes · 4 min read

MaGe Linux Operations

Oct 8, 2019 · Backend Development

How to Build a Python Zhihu Crawler: Login, User Data, and Answer Likes

This guide walks through using Python's requests and BeautifulSoup libraries to simulate Zhihu login, extract user profiles, retrieve answer likers, download avatars, fetch all answers for a question, and store the collected data in a SQLite database.

Python · SQLite · Web Scraping

0 likes · 9 min read

MaGe Linux Operations

Oct 5, 2019 · Fundamentals

How to Scrape and Analyze Holiday Tourist Spot Data with Python

This tutorial walks you through using Python to collect tourism data from Qunar, extract key fields such as name, price, and rating, store the results in Excel with pandas, and visualize sales and popularity trends using pyecharts, including a simple recommendation algorithm.

Pyecharts · Python · Tourism

0 likes · 8 min read

政采云技术

Sep 29, 2019 · Frontend Development

Puppeteer: Automating Web Performance Analysis and Scraping

This article introduces Puppeteer, a Node.js library for controlling Chrome, and demonstrates how to use it to automate web performance analysis and scraping tasks.

Automation · JavaScript · Puppeteer

0 likes · 13 min read

Efficient Ops

Sep 29, 2019 · Backend Development

How to Scrape and Visualize 6,000+ Chinese Tourist Spots with Selenium and Python

This article demonstrates how to use Selenium and Python to crawl over 6,000 Chinese tourist attractions from Qunar, extract ratings, popularity and sales data, and visualize the results with pandas, seaborn, matplotlib, and pyecharts, revealing the most visited sites and regional travel trends during the 2019 National Day holiday.

Data Visualization · Pandas · Python

0 likes · 9 min read

21CTO

Sep 28, 2019 · Backend Development

Cracking Dazhong Dianping’s CSS Encryption: A Step‑by‑Step Web Scraping Guide

This article walks through the challenges of scraping Dazhong Dianping, explains how the site hides numeric data with custom CSS fonts, and provides a complete Python workflow—including HTTP requests, font extraction, glyph rendering, and OCR—to decode and retrieve the protected information.

CSS encryption · OCR · Python

0 likes · 13 min read

MaGe Linux Operations

Sep 19, 2019 · Fundamentals

Master XPath: Essential Node Selection, Predicates, Axes, and Functions

This guide explains core XPath concepts—including node selection shortcuts, predicate filters, wildcard usage, multiple path unions, axis navigation, and built‑in functions—providing clear examples for effective XML/HTML data extraction.

Selectors · Web Scraping · XML

0 likes · 6 min read

FunTester

Sep 17, 2019 · Backend Development

Building a Multithreaded Java Web Scraper to Harvest 100k Records

After uncovering an unprotected API that allowed unlimited resource access, the author created a rough Java program that uses a fixed-size thread pool and CountDownLatch to fetch 100 000 items in parallel, retrieving 10 000 records per thread via HTTP GET requests.

HTTP · Java · Web Scraping

0 likes · 6 min read

FunTester

Sep 12, 2019 · Backend Development

Scraping HTML Tables with Java Regex and Generating SQL Inserts

The article walks through a Java solution for extracting multilingual data from an HTML table using regular expressions, handling spacing and encoding issues, splitting fields, and constructing INSERT statements to populate a country_code database table.

Java · SQL · Web Scraping

0 likes · 6 min read

MaGe Linux Operations

Sep 8, 2019 · Backend Development

How to Crawl Taobao Product Data with Python: From Login to Excel Export

This tutorial walks you through logging into Taobao with Python requests, handling anti‑scraping measures, extracting product information via the PC search API, parsing JSON data, and saving the results to Excel, while also covering common pitfalls like sliders and proxy management.

Python · Taobao · Web Scraping

0 likes · 9 min read

MaGe Linux Operations

Jul 19, 2019 · Backend Development

How to Scrape High‑Resolution Images from ColorHub with Python

Learn a step‑by‑step Python solution to locate, download, and store high‑resolution, royalty‑free images from ColorHub by navigating its three‑tier page structure, generating request headers, parsing HTML with BeautifulSoup, and saving files locally, enabling offline PPT creation without copyright concerns.

Automation · Image Download · Python

0 likes · 5 min read

MaGe Linux Operations

Jul 2, 2019 · Backend Development

Master Web Scraping with BeautifulSoup: A Complete Python Guide

This tutorial introduces BeautifulSoup, a powerful Python library for parsing HTML and XML, covering installation, basic usage, tag selection, attribute extraction, navigation of parent and sibling nodes, method and CSS selectors, and best‑practice recommendations for efficient web data extraction.

Parsing · Python · Web Scraping

0 likes · 30 min read

Programmer DD

Jun 26, 2019 · Information Security

How to Build a GitHub Code Leak Detector with Python – Real‑World Security Monitoring

This tutorial walks you through creating a Python‑based GitHub monitoring tool that logs in, crawls code search results for sensitive keywords, extracts repository details, writes findings to CSV, and sends email alerts, providing a practical approach to detecting accidental source‑code leaks.

Automation · Web Scraping · email-alert

0 likes · 11 min read

Youzan Coder

Jun 5, 2019 · Backend Development

Building a Poster Rendering Service with Puppeteer

The article explains how to build a poster‑rendering service with Puppeteer, detailing its advantages over canvas, the Redis‑based caching and CDN workflow, optimization tricks for headless Chromium, and future plans to boost QPS and pre‑generate popular posters.

CDN · Canvas API · Puppeteer

0 likes · 9 min read

360 Tech Engineering

May 20, 2019 · Fundamentals

A Data‑Driven Guide to Finding a Partner: From Crawling Zhihu Answers to Ranking Candidates

This article walks through a complete data‑analysis workflow—scraping Zhihu dating‑preference answers, cleaning and filtering the data, deriving gender and activity metrics, designing a four‑step screening process, and finally ranking candidates with a custom like‑to‑comment index—to help a single programmer create a concise, high‑quality list of potential partners.

Metrics · Ranking · Web Scraping

0 likes · 9 min read

MaGe Linux Operations

May 6, 2019 · Big Data

How to Scrape Python Job Listings and Visualize Trends with pyecharts

This article walks through collecting Python job postings from Lagou by handling anti‑scraping measures, parsing POST requests, storing results in Excel, and then using pyecharts to create bar, map, and pie visualizations that reveal city distribution, salary ranges, and experience requirements.

Job market · Pyecharts · Python

0 likes · 13 min read

Tencent Cloud Developer

Mar 26, 2019 · Mobile Development

Building a WeChat Mini Program with Taro and Cloud Development: A Japanese Sentence Helper Case Study

The article explains how to create a WeChat Mini Program backend with Tencent Cloud development, use the React‑based Taro framework to build a Japanese sentence helper, consolidate multiple cloud functions via tcb-router, and scrape example sentences with superagent and cheerio, highlighting setup tips and known limitations.

React · Superagent · Taro

0 likes · 7 min read

MaGe Linux Operations

Mar 15, 2019 · Backend Development

How to Scrape Meituan Takeout App Comments with Python and MongoDB

This tutorial explains how to extract Meituan Takeout app comments by analyzing Ajax requests, constructing dynamic URLs, parsing JSON with regular expressions, and storing the results in text files and MongoDB using Python.

AJAX · Meituan · MongoDB

0 likes · 7 min read

Python Crawling & Data Mining

Mar 12, 2019 · Backend Development

How to Fix the "No module named win32api" Error in Scrapy on Windows

This guide explains why Scrapy on Windows raises the "No module named win32api" error, walks through installing the correct pywin32 package (or pypiwin32), shows how to obtain the proper wheel from an unofficial source, and provides extra tips for locating Scrapy spider names.

Python · Scrapy · Web Scraping

0 likes · 5 min read

Efficient Ops

Jan 21, 2019 · Big Data

Scraping and Visualizing China’s Tourist Spot Data: From Web Crawl to Insights

This article details a complete workflow for extracting nationwide tourist attraction data from Qunar, cleaning and enriching it with geographic coordinates, and performing multi‑level statistical analysis and visualizations—including sales rankings, popularity metrics, heatmaps, and word clouds—to reveal regional tourism patterns across China.

Data Visualization · Geocoding · Tourism Data

0 likes · 15 min read

MaGe Linux Operations

Jan 18, 2019 · Backend Development

How to Crawl and Analyze Bilibili Video Comments with Async Python

This tutorial demonstrates how to use asynchronous Python (aiohttp, motor, asyncio) to scrape all comments from a popular Bilibili video, store them in MongoDB, and generate a Chinese word‑cloud analysis to reveal why the video went viral.

Web Scraping · aiohttp · asyncio

0 likes · 9 min read

MaGe Linux Operations

Jan 14, 2019 · Backend Development

How to Build a Scrapy Spider to Crawl AutoHome Car Data in Python

This article walks through building a Python Scrapy spider to extract comprehensive car brand, series, and model data from Autohome, covering environment setup, project initialization, spider and item definitions, handling lazy-loaded pages, CSV output configuration, rate limiting, user‑agent rotation, and debugging tips.

Autohome · Car Data · Scrapy

0 likes · 10 min read

MaGe Linux Operations

Jan 2, 2019 · Fundamentals

Master XPath: Essential Node Selection, Predicates, and Functions for Web Scraping

This guide explains the core XPath syntax—including node selectors, predicates, wildcards, multiple paths, axes, and built‑in functions—providing clear examples so you can efficiently locate and filter XML/HTML elements when scraping web pages.

Selectors · Web Scraping · XML

0 likes · 6 min read

MaGe Linux Operations

Dec 16, 2018 · Big Data

What Happens When Most Language Learners Quit? A Data‑Driven Dive into Shanbay Users

Using Python’s Scrapy, pandas, and seaborn, the author scraped and cleaned public Shanbay user data, stored it in PostgreSQL, and analyzed registration and study habits to reveal that over 68% of users abandon word‑learning on day one, with only a tiny fraction persisting beyond 100 days.

Shanbay · Web Scraping · data analysis

0 likes · 9 min read
