What Do Beijing’s 2020 Points‑Based Household Registration Data Reveal? A Shell‑Powered Analysis
The article examines the 2020 Beijing points‑based household registration dataset, showing how to download the CSV, use shell commands like awk, sort, uniq, and grep to extract score, company, surname, given‑name, and age distributions, and highlights the top companies and common surnames.
Background
On the day the annual Beijing points‑based household registration (积分落户) results were announced for 2020, the official website of the Beijing Human Resources and Social Security Bureau released a public CSV file containing 6,032 records.
Data source and format
The CSV includes fields for name, birth date, employer, and points score. Each row represents one applicant.
Score distribution
Most applicants scored between 97 and 102 points. Small differences of 0.01 points can separate dozens of people (e.g., 98.17 points has 39 applicants, 98.16 points has 21).
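The claim about the 97 to 102 range can be checked by grouping scores into whole-point bands. The following is a minimal sketch, not the author's method: the function name score_bands and the sample rows are hypothetical, and it assumes whitespace-separated columns with the score in column 5, as the article's awk '{print $5}' implies.

```shell
#!/usr/bin/env bash
# Count applicants per whole-point score band (97.xx -> 97, 98.xx -> 98, ...).
# Assumption: whitespace-separated columns, score in column 5.
score_bands() {
  awk '{ band[int($5)]++ }
       END { for (b in band) print b, band[b] }' "$@" | sort -n
}

# Made-up rows in the assumed layout: id name birth-date employer score
score_bands <<'EOF'
1 A 1980-01 X 97.50
2 B 1979-05 Y 97.25
3 C 1978-11 Z 98.17
4 D 1981-03 X 101.00
EOF
```

Against the real file this would be called as score_bands 10000.csv, printing one line per band with the band first and the applicant count second.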
➜ 积分落户2020数据分析 git:(master) ✗ awk '{print $5}' 10000.csv | sort | uniq -c | sort -nr -k 1 | head -n 10
98 97.50
84 97.25
80 97.33
73 97.17
72 97.21
67 98.50
66 98.00
61 97.46
57 98.46
54 97.13
➜ 积分落户2020数据分析 git:(master) ✗ awk '{print $5}' 10000.csv | sort | uniq -c | sort -nr -k 1 | grep 98.17
39 98.17
➜ 积分落户2020数据分析 git:(master) ✗ awk '{print $5}' 10000.csv | sort | uniq -c | sort -nr -k 1 | grep 98.16
21 98.16
Top 10 companies receiving household‑registration slots
The command below runs grep, cut, sort, uniq -c, and head -n 10 over a JSON file (jifenluohu.json) to list the employers with the most slots.
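The same count could presumably be taken from the whitespace-separated file used in the other sections. This is a sketch under stated assumptions: the function name top_employers and the sample rows are hypothetical, and it assumes the employer sits in column 4 and contains no spaces, which the article does not confirm.

```shell
#!/usr/bin/env bash
# Print the N most frequent employers as "count employer".
# Assumption: employer is column 4 and contains no whitespace.
top_employers() {
  local n=$1; shift
  awk '{ count[$4]++ }
       END { for (e in count) print count[e], e }' "$@" |
    LC_ALL=C sort -k1,1nr -k2,2 | head -n "$n"   # ties broken by name
}

# Made-up rows in the assumed layout: id name birth-date employer score
top_employers 2 <<'EOF'
1 A 1980-01 Acme 97.50
2 B 1979-05 Beta 97.25
3 C 1978-11 Acme 98.17
4 D 1981-03 Gamma 101.00
EOF
```

Counting inside awk avoids the padded output of uniq -c, and the explicit secondary sort key keeps the order of tied employers stable between runs.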
➜ 首批积分落户 > grep 'unit' jifenluohu.json| cut -f2 -d: | sort | uniq -c | sort -nr -k 1 | head -n 10
137 "北京华为数字技术有限公司"
73 "中央电视台"
57 "北京首钢建设集团有限公司"
55 "百度在线网络技术(北京)有限公司"
48 "联想(北京)有限公司"
40 "北京外企人力资源服务有限公司"
40 "中国民生银行股份有限公司"
39 "国际商业机器(中国)投资有限公司"
29 "中国国际技术智力合作有限公司"
27 "华为技术有限公司北京研究所"
The list shows Huawei leading again, followed by CCTV, while Baidu drops and Tencent rises.
Most common surnames among applicants
Extracting the first character of the name field and counting occurrences reveals that surnames like "张" and "王" dominate.
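One portability caveat: cut -c may count bytes rather than characters on some systems, which would split a multi-byte UTF-8 character. The sketch below sidesteps that by taking the first three bytes explicitly. It is a hedged illustration: surname_counts and the sample rows are hypothetical, the name is assumed to be in column 2, the file is assumed to be UTF-8, and every surname is assumed to be a single CJK character of three bytes, so compound surnames such as 欧阳 would be cut short.

```shell
#!/usr/bin/env bash
# Count single-character surnames as "count surname".
# Assumptions: name in column 2, UTF-8 input, one 3-byte CJK character
# per surname (compound surnames are not handled).
surname_counts() {
  awk '{ print $2 }' "$@" | cut -b 1-3 |
    awk '{ count[$0]++ }
         END { for (s in count) print count[s], s }' |
    LC_ALL=C sort -k1,1nr -k2,2
}

# Made-up rows in the assumed layout: id name birth-date employer score
surname_counts <<'EOF'
1 张伟 1980-01 X 97.50
2 张颖 1979-05 Y 97.25
3 王鹏 1978-11 Z 98.17
EOF
```

For the sample rows this prints 2 张 followed by 1 王.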
➜ 首批积分落户 > grep '"name":' jifenluohu.json| sed 's|"name": "||g' | sed 's| ||g' | cut -c 1 | sort | uniq -c | sort -nr -k 1 | head -n 10
541 张
531 王
462 李
376 刘
205 陈
193 杨
166 赵
132 孙
95 郭
95 徐
Most common full names
Counting full names (surname plus given name) shows that "王鹏" is the most frequent, with 9 occurrences.
➜ 积分落户2020数据分析 git:(master) ✗ awk '{print $2}' 10000.csv | sort | uniq -c | sort -nr -k 1 | head -n 10
9 王鹏
6 王伟
6 张颖
5 赵静
5 石磊
5 王琳
5 王燕
5 王涛
5 王勇
5 孙涛
Age distribution
By extracting the birth year and computing the age (2020 minus birth year), the age range spans from 32 to 59 years, with about 92% of applicants (5,539 of 6,032) between 38 and 47.
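The share of applicants inside an age window can be computed in a single awk pass. This is a minimal sketch rather than the author's command: the function name age_share and the sample rows are hypothetical, and it assumes the birth date is column 3 and starts with a four-digit year followed by "-", as the article's cut -f1 -d"-" suggests.

```shell
#!/usr/bin/env bash
# Report how many applicants fall inside an inclusive age window.
# Usage: age_share LOW HIGH [file...]
# Assumption: column 3 is a birth date of the form YYYY-..., age = 2020 - YYYY.
age_share() {
  local lo=$1 hi=$2; shift 2
  awk -v lo="$lo" -v hi="$hi" '
    { split($3, d, "-"); age = 2020 - d[1]; total++
      if (age >= lo && age <= hi) inside++ }
    END { if (total) printf "%d of %d (%.1f%%)\n", inside, total, 100 * inside / total }
  ' "$@"
}

# Made-up rows in the assumed layout: id name birth-date employer score
age_share 38 47 <<'EOF'
1 A 1980-01 X 97.50
2 B 1975-05 Y 97.25
3 C 1965-11 Z 98.17
4 D 1988-03 X 101.00
EOF
```

For the sample rows (ages 40, 45, 55, and 32) this prints 2 of 4 (50.0%). Against the real file the call would be age_share 38 47 10000.csv.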
# Using awk on the CSV
awk '{print $3}' 10000.csv | cut -f1 -d"-" | awk '{print 2020-$1}' | sort | uniq -c
1 32
3 35
30 36
83 37
290 38
468 39
644 40
741 41
808 42
751 43
636 44
507 45
365 46
329 47
108 48
107 49
85 50
27 51
6 52
10 53
9 54
8 55
6 56
5 57
3 58
2 59
Conclusion
The 2020 dataset confirms Huawei’s leading role in securing household‑registration slots, shows a narrow age window (mostly 38 to 47), and shows that the most common surnames among applicants match the most common surnames nationally. The analysis can be reproduced with simple shell one‑liners, and the full list of 6,032 records is available by replying “2020积分落户” to the original source.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
