CeWL 5.0 Adds Proxy and Authentication Support for Custom Wordlist Generation
CeWL 5.0, the Ruby‑based custom wordlist generator, now supports proxy configuration and HTTP Basic/Digest authentication, enabling penetration testers to crawl credential‑protected internal sites and extract targeted vocabulary for password‑cracking tools such as John the Ripper.
CeWL (Custom Word List generator) is a Ruby application used by red‑team operators to harvest text from target websites and build tailored password dictionaries. Its primary value lies in collecting site‑specific terms—product names, personal names, technical jargon—that are unlikely to appear in generic wordlists but are often used in real passwords, and the output can be fed directly to cracking tools such as John the Ripper.
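The harvesting idea described above can be sketched in a few lines of Python. This is only an illustration of the general technique (strip markup, tokenize, filter, rank by frequency), not CeWL's actual Ruby implementation; the names build_wordlist and min_length are hypothetical, and the sketch deliberately ignores script and style bodies for simplicity.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def build_wordlist(html, min_length=3):
    """Return unique words of at least min_length letters, most frequent first."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    # [^\W\d_]+ matches runs of Unicode letters, so non-ASCII words survive.
    words = re.findall(r"[^\W\d_]+", text)
    counts = Counter(w for w in words if len(w) >= min_length)
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked]
```

Writing the returned list to a file, one word per line, gives the kind of site-specific dictionary that a cracking tool can consume.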
The most significant change in version 5.0 is proxy support. In internal‑network penetration scenarios many targets are reachable only through a proxy, which earlier versions could not handle. The new command‑line options let a single command specify the proxy host, port, and optional credentials:
cewl --proxy_host <proxy_address> --proxy_port <port> \
--proxy_username <user> --proxy_password <pass> URL
If the proxy requires authentication, the username and password parameters are accepted as shown.
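When CeWL is driven from a script, the proxy options can be assembled programmatically. The following is a minimal sketch that uses only the option names quoted in this article; the helper name cewl_proxy_command and the example host and URL are hypothetical, and it assumes a cewl executable is on the PATH when the command is actually run.

```python
def cewl_proxy_command(url, proxy_host, proxy_port,
                       proxy_username=None, proxy_password=None):
    """Build a CeWL argument list from the proxy options named in the article."""
    argv = ["cewl", "--proxy_host", proxy_host, "--proxy_port", str(proxy_port)]
    # Proxy credentials are optional; add them only when both are supplied.
    if proxy_username is not None and proxy_password is not None:
        argv += ["--proxy_username", proxy_username,
                 "--proxy_password", proxy_password]
    # The target URL goes last, as in the command shown in the article.
    argv.append(url)
    return argv
```

Passing the resulting list to subprocess.run(argv) rather than joining it into a shell string avoids quoting problems when a proxy password contains spaces or shell metacharacters.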
Version 5.0 also adds HTTP Basic and Digest authentication support, enabling crawling of sites that require login before content can be accessed:
cewl --auth_type basic --auth_user <user> --auth_pass <pass> URL
The supported authentication types are basic and digest, which means internal systems protected by these mechanisms can now be scraped and turned into wordlists.
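To make the basic type concrete, here is a small sketch of what any HTTP Basic client sends, following RFC 7617: the username and password are joined with a colon and Base64‑encoded into the Authorization header. The function name basic_auth_header is hypothetical and this is not CeWL's own code. Digest authentication is a challenge‑response scheme and is not shown here.

```python
import base64


def basic_auth_header(user, password):
    """Return the Authorization header value for HTTP Basic (RFC 7617)."""
    # Credentials are "user:password", UTF-8 encoded, then Base64 encoded.
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"
```

Because Base64 is an encoding and not encryption, Basic credentials are readable by anyone who can observe the traffic unless the connection uses TLS.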
Additional updates include a code refactor contributed by g0tmi1k, improved internationalisation for non‑ASCII sites, enhanced JavaScript content extraction to pull more terms from scripts, and several bug fixes, including fixes to regex matching and internal link handling.
In internal‑network penetration testing, teams often encounter HR, OA (office automation), or asset‑management systems that are not exposed to the Internet and sit behind login pages built with common frameworks. These systems have unique vocabularies but no ready‑made dictionaries. CeWL 5.0’s authentication support closes this gap: with valid credentials, the tool can harvest the system’s own terminology and turn it into a targeted wordlist for password attacks.
Download the tool from its GitHub repository: https://github.com/digininja/CeWL.
Black & White Path