How to Build a Python Weibo Red Envelope Scraper – Step‑by‑Step Guide

This article walks through a Python 2.7 script that logs into Weibo, fetches the red‑envelope list, ranks the envelopes with a simple weighting formula, and claims them automatically. It covers the required libraries, cookie handling, HTTP GET/POST helper functions, RSA encryption for login, and result logging.

21CTO

Background

During the Chinese New Year, the author, a Python beginner, decided to write a script to crawl Weibo red envelopes using Python 2.7.

0x01 Outline

The author sketches the workflow and imports the necessary libraries:

import re
import urllib
import urllib2
import cookielib
import rsa  # external library, install via pip

Additional variables are declared:

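import sys
reload(sys)  # Python 2 only: restores setdefaultencoding, which is removed at interpreter startup
sys.setdefaultencoding('utf-8')
luckyList = []  # list of red envelopes
lowest = 10     # minimum cash value to consider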

0x02 Weibo Login

Login requires cookie handling with cookielib.CookieJar() and an opener that processes cookies:

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
urllib2.install_opener(opener)

Two helper functions are defined for HTTP requests:

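def getData(url):
    """GET a URL through the cookie-aware opener; returns None on failure."""
    try:
        req = urllib2.Request(url)
        result = opener.open(req)
        # Re-encode UTF-8 as GBK, dropping unmappable characters (presumably for a GBK console)
        text = result.read().decode('utf-8').encode('gbk', 'ignore')
        return text
    except Exception as e:
        print u'Request failed, url: ' + url
        print e
        return None

def postData(url, data, header):
    """POST a dict of form fields with the given headers; returns None on failure."""
    try:
        data = urllib.urlencode(data)  # dict -> form-encoded body
        req = urllib2.Request(url, data, header)
        result = opener.open(req)
        return result.read()
    except Exception as e:
        print u'Request failed, url: ' + url
        print e
        return None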

The login function (code omitted for brevity) RSA‑encrypts the login credentials together with the server timestamp, using the public key supplied by the server, then sends the login request. The resulting cookies are stored in the cookie jar.
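Because the login code is omitted, the following is only a sketch of what the encryption step might look like. The field layout (Base64 of the URL‑quoted username, and a plaintext of server time, tab, nonce, newline, password) and the exponent 65537 are assumptions based on how Weibo's login is commonly described, not details taken from this article. The helper names are hypothetical, and the code is written to run on Python 3 as well as 2.7.

```python
import base64
import binascii

try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote        # Python 2.7


def encode_username(username):
    """Base64-encode the URL-quoted username (assumed login field format)."""
    return base64.b64encode(quote(username).encode('utf-8')).decode('ascii')


def build_plaintext(servertime, nonce, password):
    """Assumed plaintext layout: servertime, TAB, nonce, NEWLINE, password."""
    return u'{0}\t{1}\n{2}'.format(servertime, nonce, password).encode('utf-8')


def encrypt_password(pubkey_hex, servertime, nonce, password, exponent=65537):
    """RSA-encrypt the plaintext and return it hex-encoded.

    pubkey_hex is the modulus as a hex string, as the server is assumed
    to supply it before login.
    """
    import rsa  # third-party (pip install rsa); imported here so the helpers above work without it
    key = rsa.PublicKey(int(pubkey_hex, 16), exponent)
    cipher = rsa.encrypt(build_plaintext(servertime, nonce, password), key)
    return binascii.b2a_hex(cipher).decode('ascii')
```

The encoded username and encrypted password would then be sent as form fields in the login POST; the exact field names depend on the login endpoint and are not given here.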

0x03 Claim Red Envelope

After a successful login, the script sends a request to http://huodong.weibo.com/aj_hongbao/getlucky with the parameters ouid (the red‑envelope ID) and share. If the server returns {"code":303403}, a permission error, the fix is to send the same request headers that the browser sends with the original request (especially Referer).
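A minimal sketch of how such a request might be assembled. The endpoint and the ouid and share parameter names come from the text. The function name build_claim_request, the default share value of 0, the extra X-Requested-With header, and the Referer URL passed in by the caller are assumptions for illustration.

```python
CLAIM_URL = 'http://huodong.weibo.com/aj_hongbao/getlucky'


def build_claim_request(ouid, referer, share=0):
    """Return (url, data, headers) for one claim attempt."""
    data = {'ouid': ouid, 'share': share}
    headers = {
        'Referer': referer,                    # the header the text singles out
        'X-Requested-With': 'XMLHttpRequest',  # assumption: mimic the page's AJAX call
    }
    return CLAIM_URL, data, headers


# Hypothetical envelope ID and page URL, for illustration only
url, data, headers = build_claim_request(
    '1234567890', 'http://huodong.weibo.com/hongbao/1234567890')
```

The three values have the same shape as the arguments of postData(url, data, header) defined earlier, which form‑encodes the dict itself.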

The response JSON is parsed: a code of 100000 indicates success, and 90114 means the daily limit has been reached.
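The response handling could be sketched as follows. Only the three codes mentioned in this section are mapped; the function name interpret_claim and the wording of the messages are illustrative, and the code is normalized to a string because the article does not say whether it always arrives as a number.

```python
import json

# Codes named in the article; anything else is reported as unknown.
CODE_MEANINGS = {
    '100000': 'success',
    '90114': 'daily limit reached',
    '303403': 'no permission (check request headers)',
}


def interpret_claim(response_text):
    """Return (code, meaning) for a raw JSON response string."""
    try:
        payload = json.loads(response_text)
    except (ValueError, TypeError):  # malformed JSON, or None from a failed request
        return None, 'not valid JSON'
    if not isinstance(payload, dict):
        return None, 'unexpected JSON shape'
    code = str(payload.get('code'))  # normalize: may be an int or a string
    return code, CODE_MEANINGS.get(code, 'unknown code')
```

Returning a pair rather than printing keeps the caller free to stop the loop when the daily limit is reached, or to retry with different headers on a permission error.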

0x04 Crawl Red‑Envelope List

The script fetches the red‑envelope ranking page, extracts each item from the info_wrap div using regular expressions, and builds a list containing the envelope URL, cash value, gift value, and number of recipients.
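The extraction step might look like the sketch below. The article only names the info_wrap div, so the inner markup in SAMPLE_HTML (the cash, gift, and sent class names and the envelope URLs) is invented for illustration, and the pattern would need to be adapted to the real page.

```python
import re

# Hypothetical markup: only the info_wrap class name comes from the article.
SAMPLE_HTML = """
<div class="info_wrap">
  <a href="http://huodong.weibo.com/hongbao/1001">Envelope A</a>
  <span class="cash">500</span>
  <span class="gift">15</span>
  <span class="sent">35</span>
</div>
<div class="info_wrap">
  <a href="http://huodong.weibo.com/hongbao/1002">Envelope B</a>
  <span class="cash">100.5</span>
  <span class="gift">0</span>
  <span class="sent">4</span>
</div>
"""

ITEM_RE = re.compile(
    r'<div class="info_wrap">.*?'
    r'href="(?P<url>[^"]+)".*?'
    r'class="cash">(?P<cash>\d+(?:\.\d+)?)<.*?'
    r'class="gift">(?P<gift>\d+(?:\.\d+)?)<.*?'
    r'class="sent">(?P<recipients>\d+)<',
    re.S,  # let "." span line breaks inside one item
)


def parse_envelopes(html):
    """Return one dict per info_wrap item found in the page."""
    return [
        {
            'url': m.group('url'),
            'cash': float(m.group('cash')),
            'gift': float(m.group('gift')),
            'recipients': int(m.group('recipients')),
        }
        for m in ITEM_RE.finditer(html)
    ]
```

Regular expressions are brittle against markup changes. They match the approach the article describes, but an HTML parser would be a sturdier choice.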

A simple weighting algorithm is applied:

weight = cash / (recipients + gift_value)

The list is sorted by this weight in descending order.
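A small sketch of the weighting and sort, using the formula above. The dict keys are illustrative, and returning 0.0 when the denominator is zero is an assumption, since the article does not say how that case is handled. The float() conversion matters in Python 2.7, where dividing two integers truncates the result.

```python
def weight(cash, recipients, gift_value):
    """weight = cash / (recipients + gift_value), as given in the text."""
    denominator = recipients + gift_value
    if denominator == 0:
        return 0.0  # assumption: rank such envelopes last instead of dividing by zero
    return float(cash) / denominator


def rank(envelopes):
    """Return the envelopes sorted by weight, highest first."""
    return sorted(
        envelopes,
        key=lambda e: weight(e['cash'], e['recipients'], e['gift']),
        reverse=True,
    )


# Made-up sample values
sample = [
    {'name': 'a', 'cash': 500, 'recipients': 35, 'gift': 15},  # 500 / 50 = 10.0
    {'name': 'b', 'cash': 100, 'recipients': 4, 'gift': 1},    # 100 / 5  = 20.0
    {'name': 'c', 'cash': 50, 'recipients': 0, 'gift': 0},     # zero denominator -> 0.0
]
ranked_names = [e['name'] for e in rank(sample)]
```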

0x05 Determine Usability

For each envelope, the script checks whether a "抢红包" ("claim red envelope") button exists on its page and whether the highest recorded cash amount is acceptable. Envelopes that pass both checks are claimed.
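The two checks could be combined as in the sketch below. It assumes that "acceptable" means the highest recorded cash amount is at least the lowest threshold declared earlier, and that the page text is already decoded to Unicode. The function name is_usable and its parameters are hypothetical.

```python
# "抢红包", the claim button label, written with escapes to avoid source-encoding issues
BUTTON_TEXT = u'\u62a2\u7ea2\u5305'


def is_usable(page_text, highest_cash, lowest=10):
    """True if the claim button is present and the top payout meets the threshold."""
    has_button = BUTTON_TEXT in page_text
    return has_button and highest_cash >= lowest
```

Note that getData as defined earlier returns GBK‑encoded bytes, so its result would need to be decoded before being passed to a check like this one.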

0x06 Final Steps

The main start function ties everything together: it logs in, optionally loads a cached luckyList.txt, fetches the latest list, sorts it, and iterates over the envelopes to claim them. Results are logged to text files via a custom log function.
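The article does not show the log function, so the following is only one way it might be written: each call appends a timestamped line to a text file. The file name, timestamp format, and return value are assumptions.

```python
import io
import time


def log(message, path='log.txt'):
    """Append one timestamped line to a UTF-8 text file and return that line."""
    line = u'[{0}] {1}\n'.format(time.strftime('%Y-%m-%d %H:%M:%S'), message)
    # io.open behaves the same on Python 2.7 and 3 and handles the encoding
    with io.open(path, 'a', encoding='utf-8') as handle:
        handle.write(line)
    return line
```

Opening the file in append mode on every call is slower than keeping a handle open, but nothing is lost if the script is interrupted partway through the list.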

The script ends with a simple command‑line interface that prompts for Weibo username, password, a cash threshold, and whether to use the cached list.

0x07 Summary

The author notes that the crawler works but leaves plenty of room for improvement, such as batch login, a better value calculation, and code optimization.
