
What 13,966 Ops Job Listings Reveal About Salary, Skills, and Hot Cities?

This article analyzes 13,966 Chinese operations engineering job postings collected from 51job, detailing scraping methods, data cleaning steps, and visualizations that uncover top hiring industries, city demand, salary ranges, required education, company size distribution, and keyword trends for the ops market.

Python Crawling & Data Mining

The author collected 13,966 operations (运维) job postings from 51job, using XPath for web scraping, Pandas for data cleaning, and Pyecharts for visualization.

1. Web Scraping

The scraper extracts job name, company name, location, salary, release date, experience, education, company type, size, and industry using XPath expressions.

# `dom` is the parsed listing page (the .xpath() calls suggest an lxml element tree)
# 1. Job title
job_name = dom.xpath('//div[@class="dw_table"]/div[@class="el"]//p/span/a[@target="_blank"]/@title')
# 2. Company name
company_name = dom.xpath('//div[@class="dw_table"]/div[@class="el"]/span[@class="t2"]/a[@target="_blank"]/@title')
# ... (other fields omitted for brevity)
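
To make the extraction step concrete, here is a minimal sketch that runs the same kind of path expressions against a small, made-up HTML fragment. It is not the author's scraper: it swaps in the standard library's xml.etree.ElementTree (which supports only a subset of XPath, so the attribute is read with .get("title") instead of /@title), and the sample markup, SAMPLE_HTML, and extract_rows are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified listing markup; the real 51job page is larger and may differ.
SAMPLE_HTML = """
<html><body>
<div class="dw_table">
  <div class="el">
    <p class="t1"><span><a target="_blank" title="Linux运维工程师" href="#">Linux运维工程师</a></span></p>
    <span class="t2"><a target="_blank" title="Example Tech Co." href="#">Example Tech Co.</a></span>
  </div>
  <div class="el">
    <p class="t1"><span><a target="_blank" title="数据库运维" href="#">数据库运维</a></span></p>
    <span class="t2"><a target="_blank" title="Sample Data Ltd." href="#">Sample Data Ltd.</a></span>
  </div>
</div>
</body></html>
"""


def extract_rows(html_text):
    """Return one dict per listing row with the job title and company name."""
    root = ET.fromstring(html_text)
    rows = []
    # One <div class="el"> per posting inside the results table.
    for el in root.findall(".//div[@class='dw_table']/div[@class='el']"):
        job = el.find(".//p/span/a[@target='_blank']")
        company = el.find("./span[@class='t2']/a[@target='_blank']")
        rows.append({
            "job_name": job.get("title") if job is not None else None,
            "company_name": company.get("title") if company is not None else None,
        })
    return rows


rows = extract_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

Extracting field by field within each row keeps the values aligned: if one posting lacks a field, that row gets None instead of shifting every later value, which can happen when each field is collected into its own flat list.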

2. Data Cleaning

The data is loaded with pandas.read_csv and re-indexed, column names are assigned, and duplicate postings (same company, job title, and location) are dropped. Salary strings are then parsed into numeric monthly ranges, locations and company sizes are standardized, and education levels are extracted with regular expressions.
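
The location and education steps are described but not shown, so the following is only a sketch of how they might look. The input formats (a "city-district" location such as "上海-浦东新区" and a "|"-separated experience/education field), the list of degree terms, and the helper names city_of and extract_education are all assumptions, not the author's code.

```python
import re

# Assumed set of education terms that may appear in the combined
# experience/education field; the real data may use others.
EDUCATION_PATTERN = re.compile(r"初中|高中|中专|中技|大专|本科|硕士|博士")


def extract_education(text):
    """Return the first education term found in the field, or None."""
    match = EDUCATION_PATTERN.search(text)
    return match.group(0) if match else None


def city_of(location):
    """Keep only the city part of values such as '上海-浦东新区'."""
    return location.split("-")[0].strip()


print(extract_education("3-4年经验|本科|招2人"))
print(extract_education("5-7年经验|招1人"))
print(city_of("上海-浦东新区"))
print(city_of("北京"))
```

Returning None for postings that state no degree keeps them distinguishable from real categories when the education counts are tallied later.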

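# Read data
import re

import jieba  # imported in the original; not used in this excerpt
import numpy as np
import pandas as pd

df = pd.read_csv("only_yun_wei.csv", encoding="gbk", header=None)
# Set index and columns
df.index = range(len(df))
# Columns: job title, company, location, salary, post date,
# experience & education, company type, company size, industry, job description
df.columns = ["岗位名","公司名","工作地点","工资","发布日期","经验与学历","公司类型","公司规模","行业","工作描述"]
# Remove duplicates (same company, job title, and location)
df.drop_duplicates(subset=["公司名","岗位名","工作地点"], inplace=True)
# Parse salary
def get_money_max_min(x):
    """Return [min, max] salary in RMB per month, or [nan, nan] if unparseable."""
    try:
        nums = [float(i) for i in re.findall(r"[0-9]+\.?[0-9]*", x)]
        if x[-3] == "万":      # unit is 10,000 RMB
            z = [n * 10000 for n in nums]
        elif x[-3] == "千":    # unit is 1,000 RMB
            z = [n * 1000 for n in nums]
        else:                  # unknown unit: treat as missing
            return [np.nan, np.nan]
        if x[-1] == "年":      # yearly figure, convert to monthly
            z = [i / 12 for i in z]
        return z
    except (TypeError, IndexError):  # NaN or too-short values
        return [np.nan, np.nan]

salary = df["工资"].apply(get_money_max_min)
df["最低工资"] = salary.str[0]  # minimum salary
df["最高工资"] = salary.str[1]  # maximum salary
df["工资水平"] = df[["最低工资","最高工资"]].mean(axis=1)  # salary level (midpoint)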
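
As a worked example of the salary rule, the sketch below applies the same unit logic to a few sample strings: the third character from the end gives the unit (万 = 10,000, 千 = 1,000) and a trailing 年 means a yearly figure that is divided by 12. The function name monthly_range and the sample strings are made up for illustration; they do not come from the dataset.

```python
import re

UNIT_FACTORS = {"万": 10000, "千": 1000}


def monthly_range(text):
    """Convert a salary string into [min, max] RMB per month, or None."""
    if not isinstance(text, str) or len(text) < 3:
        return None
    factor = UNIT_FACTORS.get(text[-3])  # unit sits before "/月" or "/年"
    if factor is None:
        return None
    values = [float(n) * factor for n in re.findall(r"[0-9]+\.?[0-9]*", text)]
    if text[-1] == "年":  # yearly salary, convert to monthly
        values = [v / 12 for v in values]
    return values


for sample in ["1-1.5万/月", "6-8千/月", "12-24万/年", "面议"]:
    print(sample, monthly_range(sample))
```

Under these rules "1-1.5万/月" becomes [10000.0, 15000.0], so its salary level (the mean of the two bounds) is 12,500 RMB per month, and "12-24万/年" becomes [10000.0, 20000.0] after dividing by 12.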

3. Data Visualization

Several visualizations illustrate the findings:

Top 10 hiring industries (e.g., computer software, internet, telecom).

Top 10 cities by job count (Beijing, Shanghai, Guangzhou, Shenzhen).

Provincial distribution of positions, highlighting Guangdong, Jiangsu, Shanghai, and Beijing.

Company size demand, showing 50‑500 employees as the most sought‑after range.

Average monthly salaries for the top 10 positions, with DevOps, application ops, database ops, and Linux ops exceeding 10k RMB.

Education requirements, dominated by associate and bachelor degrees.

Word‑cloud of job‑posting keywords, emphasizing terms like "运维", "能力", "系统", "维护", "经验".
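
Most of the charts listed above reduce to two aggregations: counting the most frequent values of a column (industries, cities) and averaging salary per job title. The sketch below shows both with the standard library only. The records are hypothetical placeholders, not figures from the 13,966 postings, and top_n and mean_salary_by_job are illustrative names.

```python
from collections import Counter, defaultdict

# Hypothetical cleaned records: (job title, city, monthly salary midpoint in RMB).
records = [
    ("运维工程师", "上海", 9000.0),
    ("运维工程师", "北京", 11000.0),
    ("Linux运维", "深圳", 12500.0),
    ("Linux运维", "上海", 13500.0),
    ("数据库运维", "上海", 14000.0),
]


def top_n(values, n=10):
    """Return the n most frequent values as (value, count) pairs."""
    return Counter(values).most_common(n)


def mean_salary_by_job(rows):
    """Average the salary midpoint for each job title."""
    totals = defaultdict(lambda: [0.0, 0])  # job -> [salary sum, row count]
    for job, _city, salary in rows:
        totals[job][0] += salary
        totals[job][1] += 1
    return {job: total / count for job, (total, count) in totals.items()}


top_cities = top_n([city for _job, city, _salary in records], n=3)
avg_salary = mean_salary_by_job(records)
print(top_cities)
print(avg_salary)
```

With a pandas DataFrame the same results would typically come from value_counts().head(10) on a column and groupby(...).mean() on the salary column; the resulting labels and values are what a bar chart takes as its two axes.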

In summary, the analysis shows which industries, cities, and company sizes have the highest demand for operations engineers, the average salaries for key roles, the prevalent education requirements, and the most frequent keywords in job postings, providing valuable guidance for job seekers and recruiters in the ops field.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
