
What Do Python Jobs Really Pay? Inside a Data‑Driven Salary & Skill Analysis

This article crawls Lagou.com to collect 4,500 Python‑related job postings across ten roles, extracts salary, education, experience and skill requirements, visualizes the data with treemaps, rose charts, bar charts and word clouds, and provides detailed insights into each position’s market demands and compensation trends.

Python Crawling & Data Mining

This article presents a data‑driven analysis of Python‑related job postings on Lagou.com. It covers ten roles, including Python crawler, data analyst, backend developer, full‑stack developer, operations engineer, big‑data engineer, machine‑learning engineer, architect and senior developer. For each role it crawls 450 listings (4,500 entries in total) and extracts salary, education, work‑experience requirements and key skills.

Data Collection

The listings are loaded dynamically; the author inspected the network traffic and discovered that the job data is returned as JSON via a POST request to https://www.lagou.com/jobs/positionAjax.json. The script sends the request with appropriate headers (including a user‑agent and cookie), iterates over pages 1‑30, and stores the fields positionId, salary, education and workYear into a CSV file.

import time

import requests

req_url = 'https://www.lagou.com/jobs/positionAjax.json'

def file_do(list_info):
    # write list_info to CSV (creates header if file is empty)
    ...

def get_info(headers):
    for i in range(1, 31):  # pages 1-30
        # 'kd' is the search keyword ("Python爬虫" = "Python crawler"), 'pn' the page number
        data = {'first': 'true', 'kd': 'Python爬虫', 'pn': i}
        req_result = requests.post(req_url, data=data, headers=headers)
        req_info = req_result.json()['content']['positionResult']['result']
        # collect only this page's rows so earlier pages are not written twice
        list_info = []
        for job in req_info:
            list_one = [job['positionId'], job['salary'], job['education'], job['workYear']]
            list_info.append(list_one)
        file_do(list_info)
        time.sleep(1.5)  # pause between requests
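
The body of file_do is left elided above. As a rough illustration only, a helper of that kind might look like the sketch below. The function name append_rows, the file name lagou_jobs.csv and the column names are assumptions made for this example and are not taken from the original script.

```python
import csv
import os

# Column names mirror the four fields collected per posting (assumed order).
FIELDS = ["positionId", "salary", "education", "workYear"]

def append_rows(rows, path="lagou_jobs.csv"):
    """Append rows to a CSV file, writing the header only if the file is new or empty."""
    need_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline="" lets the csv module control line endings on every platform
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if need_header:
            writer.writerow(FIELDS)
        writer.writerows(rows)
```

Calling it once per page with only that page's rows keeps the file free of duplicates, since each call appends rather than rewrites.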

After gathering the basic information, the script reads the CSV, reconstructs each job’s detail URL using positionId, fetches the job description page, and extracts the duty and requirement sections, cleaning the text with regular expressions.

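def get_info():
    # position_urls, headers, get_response, write_file and write_file2 are defined elsewhere in the script
    for position_url in position_urls:
        # get_response is expected to return a parsed HTML tree that supports XPath
        response = get_response(position_url, headers=headers)
        # text of every paragraph inside the job-description block
        content = response.xpath('//*[@id="job_detail"]/dd[2]/div/p/text()')
        # clean and separate duty and requirement
        ...
        write_file(work_duty)
        write_file2(work_requirement)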
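
The URL reconstruction and the cleaning step are not shown in the snippet above. The sketch below is one possible way to do both, under stated assumptions: the detail-page URL pattern, the heading keywords used to tell duties from requirements, and the function names build_detail_url and split_description are all illustrative and should be checked against the live site and the real postings.

```python
import re

# Assumed URL pattern for a job detail page; verify against the live site.
DETAIL_URL = "https://www.lagou.com/jobs/{position_id}.html"

# Leading list markers such as "1、", "2.", "(3)" or a bullet character.
# A digit must be followed by punctuation, so "3年以上..." keeps its leading 3.
_BULLET = re.compile(r"^\s*(?:[((]?\d+[))、.．]|[-•·*])\s*")

# Assumed headings: "job duties" and "job requirements" in their common variants.
_DUTY_HEADING = re.compile(r"(岗位职责|工作职责|工作内容)")
_REQ_HEADING = re.compile(r"(任职要求|岗位要求|任职资格|职位要求)")

def build_detail_url(position_id):
    """Build the detail-page URL for one positionId."""
    return DETAIL_URL.format(position_id=position_id)

def split_description(lines):
    """Split description lines into (duties, requirements), stripping list markers."""
    duties, requirements = [], []
    target = duties  # lines before any heading are treated as duties
    for raw in lines:
        text = raw.strip()
        if not text:
            continue
        if _REQ_HEADING.search(text):
            target = requirements  # following lines are requirements
            continue
        if _DUTY_HEADING.search(text):
            target = duties
            continue
        target.append(_BULLET.sub("", text))
    return duties, requirements
```

Postings that use other headings, or none at all, would end up entirely in the duties list, so a real run would need to inspect a sample of pages and extend the keyword lists.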

Visualization

The cleaned data is visualized using pyecharts:

Treemap for education requirements.

Rose pie chart for salary distribution.

Bar chart for work‑experience distribution.

Word cloud for required skills.
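
A word cloud is driven by term frequencies. As a simplified sketch, the function below counts only Latin-script skill names (for example MySQL or Scrapy) in the requirement text; Chinese terms would need a word segmenter, which is left out here. The function name skill_frequencies and the once-per-line counting rule are assumptions for illustration, not the original script's method.

```python
import re
from collections import Counter

# Latin-script tokens such as "MySQL", "Scrapy" or "C++".
_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9+#.]*")

def skill_frequencies(requirement_lines, top_n=20):
    """Return the top_n (token, count) pairs, counting a token once per line."""
    counts = Counter()
    for line in requirement_lines:
        # lower-case for case-insensitive matching; drop a trailing full stop
        tokens = {tok.lower().rstrip(".") for tok in _TOKEN.findall(line)}
        counts.update(tokens)
    return counts.most_common(top_n)
```

The resulting pairs can be passed to whatever word-cloud component is used, after filtering out ordinary English words that are not skills.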

Sample code for the treemap:

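# pyecharts 0.5.x import style; later versions expose charts under pyecharts.charts
from pyecharts import TreeMap

# education: list of education labels read from the CSV, one per posting
education_table = {}
for x in set(education):  # count each distinct label once
    education_table[x] = education.count(x)
# build data list
# ...
# chart title "矩形树图" = "Treemap"; series name "学历要求" = "Education requirements"
TreeMap("矩形树图", width=1200, height=600).add("学历要求", data, is_label_show=True, label_pos='inside')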
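
The step that builds the data list is elided in the sample. One plausible version is sketched below, assuming the treemap accepts a list of dictionaries with name and value keys; that format and the function name treemap_data are assumptions to verify against the pyecharts version in use.

```python
from collections import Counter

def treemap_data(education):
    """Turn a list of education labels into [{'name': ..., 'value': ...}, ...]."""
    counts = Counter(education)  # one pass over the list
    # most_common() orders entries from the most to the least frequent label
    return [{"name": name, "value": count} for name, count in counts.most_common()]
```

For example, treemap_data(["本科", "本科", "硕士"]) yields one entry for 本科 (bachelor's) with value 2 and one for 硕士 (master's) with value 1.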

Key Findings per Role

Python Crawler – Mostly requires a bachelor’s degree, with salaries of 10k‑30k and 1‑5 years of experience. Skills focus on distributed systems, multithreading, Scrapy, algorithms and databases.

Python Data Analyst – Also mostly a bachelor’s degree, but with a higher proportion of master’s degrees. Salary 10k‑30k, experience 1‑5 years. Required skills include SAS, SPSS, Hadoop, Hive, Excel and statistics.

Python Backend – Bachelor’s degree, salary 10k‑30k, experience 3‑5 years. Must know Linux, MySQL, Redis, MongoDB, and frameworks such as Flask, Django, Tornado.

Python Full‑Stack – Bachelor’s degree, salary 10k‑30k, experience 3‑5 years. Needs testing, DevOps, development, data structures, algorithms, API design, virtualization and front‑end basics.

Python Operations Engineer – Mostly a bachelor’s or associate degree, salary 10k‑30k, experience 3‑5 years. Skills span SVN/Git, Linux, shell scripting, MySQL, Redis, Ansible and front‑end frameworks.

Python Senior Developer – Bachelor’s degree, salary around 20k, experience 3‑5 years. Expected to handle web back‑end, MySQL, MongoDB, Redis, Linux, CI/CD and GitHub.

Python Big‑Data Engineer – Bachelor’s degree (with many master’s degrees), salary 20k‑40k, experience 3‑5 years. Skills include Hadoop, Spark, Hive, HBase, MySQL, MongoDB, Redis, Flask, Celery and Nginx.

Python Machine‑Learning Engineer – Bachelor’s degree (with a large share of master’s degrees), salary 30k‑40k+, experience 3‑5 years. Required expertise: machine learning, data mining, algorithms, TensorFlow, Spark‑MLlib, Kafka/RabbitMQ.

Python Architect – Bachelor’s degree, salary 30k‑40k+, experience 5‑10 years. Must master Flask, Django, MySQL, Redis, MongoDB, Hadoop, Hive, Spark, ElasticSearch, Pandas, Kafka, etc.

Overall Conclusions

Across all Python roles, more than 90% of positions offer salaries above 10k, and about 60% exceed 20k. A bachelor’s degree is the baseline for most jobs, while senior or specialized roles (big‑data, ML, architect) see a higher proportion of master’s or even doctoral degrees. Experience requirements cluster around 3‑5 years, making the 24‑27 age range a prime window for attaining these positions. The analysis also highlights that “experience” and “familiarity” are the most frequently mentioned keywords in job descriptions, underscoring the industry’s emphasis on practical competence.
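
Salary shares like those quoted above have to be derived from the raw salary strings. The sketch below shows one way to do that, assuming the strings look like "10k-20k" and comparing the midpoint of each range with the threshold. The article does not say how ranges were compared with 10k or 20k, so the midpoint rule and the names parse_salary and share_above are assumptions.

```python
import re

# Matches ranges such as "10k-20k" or "25K-40K".
_RANGE = re.compile(r"(\d+)\s*[kK]\s*-\s*(\d+)\s*[kK]")

def parse_salary(text):
    """Return (low, high) in units of k, or None if the text is not a range."""
    m = _RANGE.search(text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

def share_above(salaries, threshold_k):
    """Fraction of parseable postings whose range midpoint exceeds threshold_k."""
    parsed = [p for p in map(parse_salary, salaries) if p is not None]
    if not parsed:
        return 0.0
    mids = [(low + high) / 2 for low, high in parsed]
    return sum(mid > threshold_k for mid in mids) / len(mids)
```

Using the lower or upper bound instead of the midpoint would shift the shares, so the rule chosen should be stated alongside any percentage reported.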

These insights can help job seekers understand salary expectations, required skill sets, and the education‑experience balance needed to secure a Python‑related position in today’s market.
