Master wget: Powerful Command-Line Download Techniques for Linux
This guide explains how to use the Linux wget utility for reliable, automated file downloads. It covers basic syntax, the main option groups (background execution, recursive mirroring, bandwidth limiting, proxy handling, logging), and fifteen practical examples for common scenarios such as resuming interrupted transfers and downloading entire websites.
wget is a Linux command-line tool for downloading files over HTTP, HTTPS, and FTP. It supports proxies and can keep running in the background after the user logs out, which makes it more convenient than a graphical browser for large or unattended downloads.
Command Syntax
wget [options] [URL]
Key Features
Supports recursive downloading to recreate remote directory structures while respecting robots.txt.
Robust against unstable networks: automatically retries and resumes partially downloaded files.
Works well with narrow bandwidth or unreliable connections.
Option Categories
Startup Options
-V, --version – display the version and exit.
-h, --help – show help.
-b, --background – go to the background after startup.
-e, --execute=COMMAND – execute COMMAND as if it were a line in .wgetrc.
Logging Options
-o, --output-file=FILE – write the log to FILE.
-a, --append-output=FILE – append the log to FILE.
-d, --debug – print debug output.
-q, --quiet – suppress output.
-v, --verbose – verbose output (the default).
-nv, --non-verbose – less verbose, but not quiet (spelled --no-verbose in newer releases).
Input Options
-i, --input-file=FILE – read URLs from FILE.
-F, --force-html – treat the input file as HTML.
-B, --base=URL – prepend URL to relative links in the input file.
Download Options
--bind-address=ADDRESS – bind to ADDRESS on the local machine.
-t, --tries=NUMBER – maximum number of tries (0 = unlimited).
-O, --output-document=FILE – write the downloaded content to FILE.
-nc, --no-clobber – do not overwrite existing files.
-c, --continue – resume a partially downloaded file.
--limit-rate=RATE – limit the download speed.
-N, --timestamping – download only files newer than the local copy.
-S, --server-response – print the server response.
--spider – check the URL without downloading anything.
-T, --timeout=SECONDS – set the network timeout.
-w, --wait=SECONDS – wait SECONDS between retrievals.
--waitretry=SECONDS – wait up to SECONDS between retries of a failed download.
--random-wait – randomize the wait between retrievals (0 to 2×WAIT in older releases; 0.5 to 1.5×WAIT in current ones).
-Y, --proxy=on/off – enable or disable proxy use (newer releases: --no-proxy).
-Q, --quota=NUMBER – total download quota.
Directory Options
-nd, --no-directories – do not create directories.
-x, --force-directories – always create directories.
-nH, --no-host-directories – omit the host-named directory.
-P, --directory-prefix=PREFIX – save files under PREFIX.
--cut-dirs=NUMBER – ignore NUMBER leading remote directory components.
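How -nH and --cut-dirs combine is easier to see with concrete paths. The following is a minimal pure-shell sketch of the mapping, not wget's own code; the function name local_dir and the example host and path are assumptions made for illustration.

```shell
# local_dir HOST REMOTE_DIR NO_HOST CUT_DIRS
# Prints the directory (relative to the -P prefix) that a recursive
# download would use. NO_HOST is 1 when -nH is given, 0 otherwise.
# Hypothetical helper: it only imitates the documented behaviour.
local_dir() {
    host=$1
    path=${2#/}          # drop one leading slash
    path=${path%/}       # drop one trailing slash
    nohost=$3
    cut=$4
    while [ "$cut" -gt 0 ] && [ -n "$path" ]; do
        case $path in
            */*) path=${path#*/} ;;   # remove the first component
            *)   path= ;;             # the last component was removed
        esac
        cut=$((cut - 1))
    done
    if [ "$nohost" -eq 1 ]; then
        printf '%s\n' "${path:-.}"
    elif [ -n "$path" ]; then
        printf '%s/%s\n' "$host" "$path"
    else
        printf '%s\n' "$host"
    fi
}

local_dir ftp.xemacs.org /pub/xemacs/ 0 0   # no options
local_dir ftp.xemacs.org /pub/xemacs/ 1 0   # -nH
local_dir ftp.xemacs.org /pub/xemacs/ 1 1   # -nH --cut-dirs=1
local_dir ftp.xemacs.org /pub/xemacs/ 1 2   # -nH --cut-dirs=2
```

The four calls print ftp.xemacs.org/pub/xemacs, pub/xemacs, xemacs, and . in that order, which shows why -nH together with --cut-dirs is the usual way to flatten a deep remote tree.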
HTTP Options
--http-user=USER – set the HTTP username.
--http-passwd=PASS – set the HTTP password (newer releases: --http-password).
-C, --cache=on/off – allow or disallow server-side cached data (newer releases: --no-cache).
-E, --html-extension – append .html to saved HTML files that lack it (newer releases: --adjust-extension).
--ignore-length – ignore the Content-Length header.
--header=STRING – add a custom request header.
--proxy-user=USER – set the proxy username.
--proxy-passwd=PASS – set the proxy password (newer releases: --proxy-password).
--referer=URL – send a Referer header.
-s, --save-headers – save the HTTP headers to the output file.
-U, --user-agent=AGENT – set the User-Agent string.
--no-http-keep-alive – disable persistent (keep-alive) connections.
--cookies=off – disable cookies (newer releases: --no-cookies).
--load-cookies=FILE – load cookies from FILE.
--save-cookies=FILE – save cookies to FILE.
FTP Options
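-nr, --dont-remove-listing – keep the .listing files (newer releases: --no-remove-listing).
-g, --glob=on/off – enable or disable FTP filename globbing (newer releases: --no-glob).
--passive-ftp – use passive mode (default).
--active-ftp – use active mode (current GNU Wget releases document this as --no-passive-ftp).
--retr-symlinks – download the files that symbolic links point to.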
Recursive Download Options
-r, --recursive – enable recursion (use with care).
-l, --level=NUMBER – maximum recursion depth (0 or inf = unlimited).
--delete-after – delete each file locally after downloading it.
-k, --convert-links – convert links for local browsing.
-K, --backup-converted – back up the original file before converting it.
-m, --mirror – shortcut for -r -N -l inf -nr.
-p, --page-requisites – download all assets needed to display an HTML page.
-A, --accept=LIST – accept only the listed extensions.
-R, --reject=LIST – reject the listed extensions.
-D, --domains=LIST – limit recursion to the listed domains.
--exclude-domains=LIST – exclude the listed domains.
--follow-ftp – follow FTP links found in HTML.
--follow-tags=LIST – follow only the listed HTML tags.
-G, --ignore-tags=LIST – ignore the listed HTML tags.
-H, --span-hosts – follow links to other hosts.
-L, --relative – follow relative links only.
-I, --include-directories=LIST – include only the listed directories.
-X, --exclude-directories=LIST – exclude the listed directories.
-np, --no-parent – do not ascend to the parent directory.
Practical Examples
1. Download a single file
wget http://www.linuxidc.com/linuxidc.zip
Downloads the file to the current directory and shows progress while it runs.
2. Save with a different name
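wget -O wordpress.zip "http://www.linuxidc.com/download.aspx?id=1080"
Sets the output filename explicitly. Without -O, wget names the file after the last part of the URL, here download.aspx?id=1080. The URL is quoted so the shell does not interpret the ? character.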
3. Limit download speed
wget --limit-rate=300k http://www.linuxidc.com/linuxidc.zip
Restricts bandwidth usage so other network activity stays responsive.
4. Resume an interrupted download
wget -c http://www.linuxidc.com/linuxidc.zip
Continues a partially downloaded file instead of starting over.
5. Background download
wget -b http://www.linuxidc.com/linuxidc.zip
Runs wget in the background; progress is written to wget-log.
6. Spoof User‑Agent
wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" http://www.linuxidc.com/linuxidc.zip
Some sites block non-browser agents; this option makes wget identify itself as a browser.
7. Test URL with spider mode
wget --spider URL
Checks whether a remote file exists without downloading it.
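Because wget reports the result of a spider check through its exit status, the check is easy to script. A minimal sketch in POSIX sh, assuming two hypothetical helper names (url_exists and check_urls) and a plain text file with one URL per line:

```shell
# url_exists URL
# Returns wget's exit status: 0 when the server reports the resource exists.
url_exists() {
    # -q suppresses output; --spider checks the URL without saving anything.
    wget -q --spider "$1"
}

# check_urls FILE
# Prints each URL from FILE with an OK or FAIL label. FAIL covers any
# non-zero exit status, so it can also mean a network or DNS error.
check_urls() {
    while IFS= read -r url; do
        [ -n "$url" ] || continue      # skip blank lines
        if url_exists "$url"; then
            printf 'OK   %s\n' "$url"
        else
            printf 'FAIL %s\n' "$url"
        fi
    done < "$1"
}

# Example call (commented out so that sourcing this file contacts no server):
# check_urls filelist.txt
```

Running the check before a large batch download is a cheap way to catch stale links in the list.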
8. Increase retry attempts
wget --tries=40 URL
Raises the retry limit from the default of 20, which is useful for unstable connections or large files.
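Retries pair naturally with -c, so that each new attempt continues the partial file rather than starting again. A minimal sketch, assuming a hypothetical wrapper named fetch_resumable whose default values are chosen only for illustration:

```shell
# fetch_resumable URL [TRIES] [WAITRETRY_SECONDS]
# Hypothetical helper: resume a partial file and retry on failure.
fetch_resumable() {
    url=$1
    tries=${2:-10}        # illustrative default, not wget's own default
    waitretry=${3:-5}     # upper bound for the pause between retries
    # -c resumes a partial file; --tries and --waitretry control retrying.
    wget -c --tries="$tries" --waitretry="$waitretry" "$url"
}

# Example call (commented out so that sourcing this file downloads nothing):
# fetch_resumable http://www.linuxidc.com/linuxidc.zip 40 10
```

Resuming only works when the server supports partial transfers; otherwise wget has to fetch the file from the beginning.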
9. Download multiple URLs from a file
wget -i filelist.txt
Reads a list of URLs (one per line) from the file and downloads each of them.
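URL lists kept by hand tend to collect blank lines and notes. The sketch below filters such a list and feeds it to wget on standard input using "-i -". The '#' comment convention and the function names clean_url_list and download_list are assumptions of this example, not features of wget.

```shell
# clean_url_list FILE
# Prints the URLs from FILE, dropping blank lines and lines starting with '#'.
clean_url_list() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1"
}

# download_list FILE [DIR]
# Feeds the cleaned list to wget; DIR defaults to the current directory.
download_list() {
    # "-i -" makes wget read URLs from standard input; -P sets the target dir.
    clean_url_list "$1" | wget -i - -P "${2:-.}"
}

# Example call (commented out so that sourcing this file downloads nothing):
# download_list filelist.txt ./downloads
```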
10. Mirror an entire website
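wget --mirror -p --convert-links -P ./LOCAL URL
Creates a local copy of the site under ./LOCAL. --mirror turns on recursion and timestamping, -p fetches the assets each page needs, and --convert-links rewrites links so the copy can be browsed offline.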
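A mirror run can put noticeable load on the remote server, so it is common to add --no-parent, --wait, and --limit-rate from the option lists above. A minimal sketch, assuming a hypothetical wrapper named mirror_site with illustrative defaults of a 1-second wait and a 500k rate limit:

```shell
# mirror_site URL DEST [WAIT_SECONDS] [RATE]
# Hypothetical helper: mirror URL into DEST while pacing the requests.
mirror_site() {
    url=$1
    dest=$2
    wait_s=${3:-1}       # illustrative default pause between retrievals
    rate=${4:-500k}      # illustrative default bandwidth cap
    # --no-parent keeps the crawl below the starting directory.
    wget --mirror --page-requisites --convert-links --no-parent \
         --wait="$wait_s" --limit-rate="$rate" \
         --directory-prefix="$dest" "$url"
}

# Example call (commented out so that sourcing this file downloads nothing):
# mirror_site http://www.linuxidc.com/ ./LOCAL 2 300k
```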
11. Exclude certain file types
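wget -r --reject=gif URL
Downloads the site recursively but skips GIF images. The reject list only matters for recursive retrievals, so -r (or --mirror) is required.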
12. Log download output
wget -o download.log URL
Writes the messages wget would normally print to the console into download.log instead.
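For repeated runs, -a is often more practical than -o because it appends rather than overwriting the previous log. A minimal sketch, assuming a hypothetical helper named logged_fetch and a header format chosen for this example:

```shell
# logged_fetch LOGFILE URL...
# Hypothetical helper: append a timestamped header and wget's messages to LOGFILE.
logged_fetch() {
    log=$1
    shift
    printf '=== run started %s ===\n' "$(date)" >> "$log"
    # -nv keeps the log compact; -a appends to the log instead of replacing it.
    wget -nv -a "$log" "$@"
}

# Example call (commented out so that sourcing this file downloads nothing):
# logged_fetch download.log http://www.linuxidc.com/linuxidc.zip
```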
13. Limit total download size
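wget -Q5m -i filelist.txt
Stops fetching further files once 5 MiB have been transferred. The quota applies to recursive downloads and to URL lists read with -i; it never cuts off a single file named on the command line.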
14. Download only specific extensions
wget -r -A.pdf URL
Recursively fetches only PDF files.
15. FTP download with authentication
wget --ftp-user=USERNAME --ftp-password=PASSWORD ftp://example.com/file.zip
Downloads a file from an FTP server using the given credentials.
Building wget from Source
To compile and install wget manually:
# tar zxvf wget-1.9.1.tar.gz
# cd wget-1.9.1
# ./configure
# make
# make install
Source: CU技术社区
