
Analysis of Password Structures and Patterns in Web Penetration Testing

This article examines how manually assigned passwords in web services exhibit predictable structures—prefixes, keywords, separators, and suffixes—by analyzing millions of leaked Gmail passwords and other data, and categorizes the patterns to aid security assessments.

Architect

0x00 Research Scope

In web penetration testing, testers often encounter passwords that are built around a core keyword, for example on mail, VPN, and admin login services.

0x01 Actual Data Analysis

Gmail 5 million plaintext passwords

Personal penetration case studies

Top 2 000 U.S. names

Most enterprise web services do not allow self‑registration; passwords are assigned by a person or a system, and the assignment is usually not random. This is what motivates the research topic of “splitting passwords”.

When passwords are assigned, they often follow a structure based on a keyword plus additional characters such as company name, date, or simple sequences like "123".

0x02 Password Structure

Prefix

Keyword

Separator

Suffix

1) Keyword

Example from drops.wooyun.org:

URL (sub‑domain, domain): keyword=[drop, wooyun, wooyun.org]

Domain registration info (email, name, date): keyword=[xssshell, fangxiaodun, 20100506]

Website content (title, meta‑keywords, footer): keyword=[WooYun, zhishiku, Security, exploits, hacker, 0day, pentest]

Common password keywords: keyword=[admin, manage, pass, top 500 names]
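The URL-derived keywords in the first item can be sketched in a few lines. This is a minimal illustration, not the author's tool: the function name `keywords_from_host` is hypothetical, and it naively assumes the last label is the TLD and the last two labels form the registered domain.

```python
def keywords_from_host(host: str) -> list[str]:
    """Return candidate keywords from a hostname, most specific label first."""
    labels = [part for part in host.lower().strip(".").split(".") if part]
    if len(labels) < 2:
        return labels
    # Naive assumption: the last label is the TLD and the last two labels
    # form the registered domain (wrong for suffixes such as "co.uk").
    keywords = labels[:-1]                   # sub-domains and domain name
    keywords.append(".".join(labels[-2:]))   # registered domain
    # Drop the generic "www" label, which carries no identifying information.
    return [k for k in keywords if k != "www"]


print(keywords_from_host("drops.wooyun.org"))
```

Registration data and page content would be merged into the same list by hand or by separate collectors.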

A small script to collect webpage keywords (ineffective for Chinese sites) was used on www.megacorpone.com:

# Download the page
root@Md:wget www.megacorpone.com -O 1.html
# Split on spaces, keep purely alphanumeric tokens, de-duplicate
root@Md:cat 1.html | tr ' ' '\n' | grep -E '^[0-9A-Za-z]+$' | sort | uniq
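The same idea can be sketched in Python with only the standard library. This is a variant rather than a translation: it collects alphanumeric runs from the visible text and skips script and style bodies, whereas the shell pipeline splits the raw HTML on spaces. The names `page_keywords` and `min_len` are assumptions for illustration, and like the original it only handles ASCII words.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect visible text, skipping <script> and <style> bodies."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def page_keywords(html: str, min_len: int = 3) -> list[str]:
    """Return the sorted, de-duplicated ASCII words found in the page text."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[0-9A-Za-z]+", " ".join(parser.chunks))
    return sorted({w for w in words if len(w) >= min_len})
```

The HTML itself would come from a saved file such as the `1.html` downloaded above.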

Existing tools such as cewl can also be used.

2) Prefix and Suffix

From the Gmail 5 million password dataset, using the email name as the keyword:

11 561 passwords use the username as a keyword (excluding cases where username equals password).

791 of those follow a "prefix + keyword" pattern.

2 308 follow "prefix + keyword + separator + suffix".

8 462 follow "keyword + suffix".

The "prefix + keyword + separator + suffix" pattern is comparatively uncommon; a notable example is "oliver+keyword+@+gmail.com". Frequently used prefixes include "1" and "123"; common suffixes include "@gmail.com". Many enterprises use the pattern "firstname + separator + domain" as default passwords.
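A rough sketch of how such a tally could be produced follows. It is not the author's script: `classify` and `tally` are hypothetical names, matching is case-insensitive by assumption, and a separator is assumed to be a single character from the set "@", "_", "&" that directly follows the keyword and is itself followed by more characters.

```python
from collections import Counter
from typing import Optional

SEPARATORS = set("@_&")  # assumed separator characters


def classify(keyword: str, password: str) -> Optional[str]:
    """Label how `password` is built around `keyword`, or None if it is not."""
    kw, pw = keyword.lower(), password.lower()
    pos = pw.find(kw)
    if not kw or pos < 0 or pw == kw:
        return None  # keyword absent, or password is just the keyword
    prefix, rest = pw[:pos], pw[pos + len(kw):]
    parts = ["prefix"] if prefix else []
    parts.append("keyword")
    if rest:
        # A separator only counts when something follows it.
        if rest[0] in SEPARATORS and len(rest) > 1:
            parts.append("separator")
        parts.append("suffix")
    return " + ".join(parts)


def tally(pairs):
    """Count pattern labels over (keyword, password) pairs."""
    labels = (classify(k, p) for k, p in pairs)
    return Counter(label for label in labels if label)


# Made-up pairs for illustration only, not taken from the dataset.
sample = [("oliver", "1oliver"), ("oliver", "oliver123"),
          ("oliver", "123oliver@gmail.com"), ("oliver", "oliver")]
print(tally(sample))
```

Passwords with a prefix and a suffix but no separator get their own label ("prefix + keyword + suffix"), which the counts above do not list separately.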

Types of Prefixes/Suffixes

Continuity – sequential characters

Repetition – repeated characters

Regularity – rules such as keyboard layout or semantic meaning

Compliance – satisfy system requirements
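These four types can be approximated with simple string checks. The sketch below is an assumption-laden illustration: `affix_types` is a hypothetical name, "regularity" is reduced to runs along a single keyboard row (semantic meaning is not detected), and any single character is treated as compliance-driven.

```python
KEYBOARD_ROWS = ("1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm")


def affix_types(affix: str) -> set[str]:
    """Return the set of types that a prefix or suffix appears to match."""
    s = affix.lower()
    if not s:
        return set()
    if len(s) == 1:
        # A lone character is most often there to satisfy a password policy.
        return {"compliance"}
    types = set()
    if len(set(s)) == 1:
        types.add("repetition")          # e.g. "888"
    steps = {ord(b) - ord(a) for a, b in zip(s, s[1:])}
    if steps == {1} or steps == {-1}:
        types.add("continuity")          # e.g. "123", "abc", "cba"
    if any(s in row or s in row[::-1] for row in KEYBOARD_ROWS):
        types.add("regularity")          # e.g. "qwe", runs along one key row
    return types


for example in ("123", "888", "qwe", "!"):
    print(example, sorted(affix_types(example)))
```

An affix can match more than one type: "123" is both sequential and a run along the number row.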

Single‑character prefixes are usually compliance‑driven. Users often choose meaningful or rule‑based prefixes to aid memorability. The character set consists of 10 digits, 26 letters, and 32 symbols (68 characters total):

1234567890
qwertyuiopasdfghjklzxcvbnm
`~!@#$%^&*()-_=+[{]}\|;:'",<.>/?

Two rules explain which of these characters users prefer:

Keyboard rule – the "1" key is the most convenient to reach, so numeric prefixes often start with "1"; many symbols require holding Shift (e.g., left pinky for "_", left thumb for "!")

Meaning rule – "@" looks like "a" and is familiar from email addresses, which makes it easy to remember
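The 68-character count can be checked against Python's `string` constants. This is only a sanity check, and it assumes lowercase letters alone, as in the listing above.

```python
import string

# 10 digits + 26 lowercase letters + 32 ASCII punctuation symbols
CHARSET = string.digits + string.ascii_lowercase + string.punctuation

print(len(string.digits), len(string.ascii_lowercase), len(string.punctuation))
print(len(CHARSET))
```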

3) Separator

Separators may be absent or a single character:

@ – meaning rule

_ – compliance rule

& – keyboard rule

Prefixes and suffixes are not interchangeable: "woshi" (pinyin for "I am") works better as a prefix, while "qwe" or "888" work as suffixes. The prefix, the separator, and the suffix can each be empty.

4) Combination Methods

Prefix + Keyword (prefixes often compliance‑ or meaning‑driven)

Prefix + Keyword + Separator + Suffix (keywords often meaning‑driven)

Keyword + Suffix (all types)

Keyword + Separator + Suffix (suffixes often keyboard‑rule or continuity‑driven)
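For an authorized assessment, the four combination methods above translate directly into a candidate generator. The sketch below is a minimal illustration and not the author's dictionary program: `candidates` and its parameters are hypothetical, and the affix lists are supplied by the caller.

```python
from itertools import product


def candidates(keywords, prefixes, separators, suffixes):
    """Yield candidates for the four combination methods, without duplicates."""
    seen = set()
    for kw in keywords:
        combos = []
        combos += [(p, kw) for p in prefixes]                     # prefix + keyword
        combos += product(prefixes, [kw], separators, suffixes)   # prefix + keyword + separator + suffix
        combos += [(kw, s) for s in suffixes]                     # keyword + suffix
        combos += product([kw], separators, suffixes)             # keyword + separator + suffix
        for parts in combos:
            word = "".join(parts)
            if word not in seen:
                seen.add(word)
                yield word


for word in candidates(["wooyun"], ["1"], ["@"], ["123"]):
    print(word)
```

Because the list size is the product of the affix lists, short and well-chosen lists keep the output manageable.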

0x03 Summary

All non‑random passwords are created for memorability; the core of modern password cracking is targeting users who employ structured, rule‑based passwords rather than truly weak passwords. Humans are lazy, so they embed memorable patterns into their passwords.

Note: This article forms the theoretical basis of my social‑engineering dictionary program; where it conflicts with established scientific theory, the established theory takes precedence.

Source: WooYun Knowledge Base – Original article
