
Master wget: Powerful Command-Line Download Techniques for Linux

This guide explains how to use the Linux wget utility for reliable, automated file downloads. It covers basic syntax, the main option groups (startup, logging, input, download, directory, HTTP, FTP, and recursive retrieval), and fifteen practical examples for common scenarios such as resuming interrupted transfers and mirroring entire websites.

21CTO

wget is a command-line tool for downloading files over HTTP, HTTPS, and FTP. It supports proxies and can keep running in the background after the user logs out, which makes it more convenient than a graphical browser for large or unattended downloads.

Command Syntax

wget [options] [URL]

Key Features

Supports recursive downloading to recreate remote directory structures while respecting robots.txt.

Robust against unstable networks: automatically retries and resumes partially downloaded files.

Works well with narrow bandwidth or unreliable connections.

Option Categories

Startup Options

-V, --version – display version and exit.
-h, --help – show help.
-b, --background – go to the background after startup.
-e, --execute=COMMAND – execute COMMAND as if it were a line in .wgetrc.

Logging Options

-o, --output-file=FILE – write log messages to FILE.
-a, --append-output=FILE – append log messages to FILE.
-d, --debug – print debug output.
-q, --quiet – suppress output.
-v, --verbose – verbose output (the default).
-nv, --no-verbose – less verbose, but not quiet (spelled --non-verbose in older releases).

Input Options

-i, --input-file=FILE – read URLs from FILE.
-F, --force-html – treat the input file as HTML.
-B, --base=URL – resolve relative links in the input file against URL.

Download Options

--bind-address=ADDRESS – bind to the local ADDRESS (host name or IP).
-t, --tries=NUMBER – maximum number of tries (0 = unlimited).
-O, --output-document=FILE – write the download to FILE.
-nc, --no-clobber – do not overwrite existing files.
-c, --continue – resume a partially downloaded file.
--limit-rate=RATE – limit the download rate to RATE.
-N, --timestamping – download only files newer than the local copy.
-S, --server-response – print the server response.
--spider – check that the URL exists without downloading it.
-T, --timeout=SECONDS – set the network timeout.
-w, --wait=SECONDS – wait SECONDS between retrievals.
--waitretry=SECONDS – wait up to SECONDS between retries of a failed download.
--random-wait – vary the wait between retrievals randomly (0.5 to 1.5 × wait in current releases).
-Y, --proxy=on/off – turn the proxy on or off.
-Q, --quota=NUMBER – set the total download quota.

Directory Options

-nd, --no-directories – do not create directories.
-x, --force-directories – force creation of directories.
-nH, --no-host-directories – do not create a host-name directory.
-P, --directory-prefix=PREFIX – save files under PREFIX.
--cut-dirs=NUMBER – ignore NUMBER leading remote directory components.
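How -nH and --cut-dirs interact is easier to see with concrete paths. The sketch below is only a model of the naming rule for recursive downloads, not part of wget: local_path is a hypothetical helper, the URL is a placeholder, and the model assumes the URL ends in a file name and ignores -P.

```shell
# Model of the path wget -r would save a file under (illustration only).
# Usage: local_path URL CUT_DIRS NO_HOST   (NO_HOST is "yes" when -nH is given)
local_path() {
    url=$1
    cut=${2:-0}
    no_host=${3:-no}
    rest=${url#*://}          # drop the scheme, e.g. "ftp://"
    host=${rest%%/*}          # text before the first slash
    path=${rest#*/}           # everything after the host
    file=${path##*/}          # last component is the file name
    dirs=${path%"$file"}      # leading directories, with a trailing slash
    # Drop CUT_DIRS leading directory components, as --cut-dirs does.
    while [ "$cut" -gt 0 ] && [ -n "$dirs" ]; do
        dirs=${dirs#*/}
        cut=$((cut - 1))
    done
    # -nH leaves out the host-name directory.
    if [ "$no_host" = yes ]; then
        printf '%s%s\n' "$dirs" "$file"
    else
        printf '%s/%s%s\n' "$host" "$dirs" "$file"
    fi
}
```

For example, local_path ftp://ftp.example.org/pub/tools/file.tar.gz 1 yes models wget -r -nH --cut-dirs=1 and prints tools/file.tar.gz.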

HTTP Options

--http-user=USER – set the HTTP username.
--http-passwd=PASS – set the HTTP password.
-C, --cache=on/off – allow or disallow server-cached data.
-E, --html-extension – save text/html documents with an .html extension.
--ignore-length – ignore the Content-Length header.
--header=STRING – add STRING to the request headers.
--proxy-user=USER – set the proxy username.
--proxy-passwd=PASS – set the proxy password.
--referer=URL – send a Referer header containing URL.
-s, --save-headers – save the HTTP headers to the file.
-U, --user-agent=AGENT – identify as AGENT in the User-Agent header.
--no-http-keep-alive – disable persistent (keep-alive) connections.
--cookies=off – do not use cookies.
--load-cookies=FILE – load cookies from FILE before the session.
--save-cookies=FILE – save cookies to FILE after the session.

FTP Options

-nr, --dont-remove-listing – keep the .listing files.
-g, --glob=on/off – turn file name globbing on or off.
--passive-ftp – use passive transfer mode (default).
--active-ftp – use active transfer mode.
--retr-symlinks – when recursing, retrieve the files that symlinks point to.

Recursive Download Options

-r, --recursive – enable recursive retrieval (use with care).
-l, --level=NUMBER – maximum recursion depth (0 or inf = unlimited).
--delete-after – delete files locally after downloading them.
-k, --convert-links – convert links for local browsing.
-K, --backup-converted – back up each file before converting it.
-m, --mirror – shortcut for -r -N -l inf -nr.
-p, --page-requisites – download all assets needed to display an HTML page.
-A, --accept=LIST – accept only the listed extensions.
-R, --reject=LIST – reject the listed extensions.
-D, --domains=LIST – limit retrieval to the listed domains.
--exclude-domains=LIST – exclude the listed domains.
--follow-ftp – follow FTP links found in HTML documents.
--follow-tags=LIST – follow only the listed HTML tags.
-G, --ignore-tags=LIST – ignore the listed HTML tags.
-H, --span-hosts – follow links to other hosts.
-L, --relative – follow only relative links.
-I, --include-directories=LIST – include only the listed directories.
-X, --exclude-directories=LIST – exclude the listed directories.
-np, --no-parent – do not ascend to the parent directory.

Practical Examples

1. Download a single file

wget http://www.linuxidc.com/linuxidc.zip

This downloads the file to the current directory while showing progress.

2. Save with a different name

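wget -O wordpress.zip "http://www.linuxidc.com/download.aspx?id=1080"

-O sets the output file name. Without it, wget names the file after the last part of the URL, here download.aspx?id=1080. The URL is quoted so that the shell does not treat the ? as a wildcard.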

3. Limit download speed

wget --limit-rate=300k http://www.linuxidc.com/linuxidc.zip

Restricts bandwidth usage so other network activities remain responsive.

4. Resume an interrupted download

wget -c http://www.linuxidc.com/linuxidc.zip

Continues a partially downloaded file instead of restarting.

5. Background download

wget -b http://www.linuxidc.com/linuxidc.zip

Runs wget in the background. Because no log file is given with -o, progress is written to a file named wget-log in the current directory.
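A background download is easier to follow when the log file is chosen explicitly. The sketch below wraps the command in a hypothetical helper, bg_fetch (the function name and its default log name are assumptions for illustration), using only the -b and -o options listed above.

```shell
# Start a background download and send wget's messages to a known log file.
# Usage: bg_fetch URL [LOGFILE]
# LOGFILE defaults to wget-log, the name wget itself uses with -b.
bg_fetch() {
    url=$1
    log=${2:-wget-log}
    wget -b -o "$log" "$url"
}
```

After calling bg_fetch http://www.linuxidc.com/linuxidc.zip, running tail -f wget-log in the same directory shows the progress as it is written.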

6. Spoof User‑Agent

wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" http://www.linuxidc.com/linuxidc.zip

Some sites block non‑browser agents; this option disguises wget as a browser.

7. Test URL with spider mode

wget --spider URL

Checks if a remote file exists without downloading it.
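In scripts, spider mode is most useful through its exit status. The following is a minimal sketch: url_exists and check_urls are hypothetical helper names, and a non-zero status is reported simply as a failure, since it can mean a missing file or a network problem.

```shell
# Succeed if wget can confirm that URL is reachable.
# --spider checks the URL without saving it; -q hides the output,
# so only the exit status is used (0 means the check succeeded).
url_exists() {
    wget --spider -q "$1"
}

# Print a one-line verdict for each URL given as an argument.
check_urls() {
    for url in "$@"; do
        if url_exists "$url"; then
            printf 'OK     %s\n' "$url"
        else
            printf 'FAILED %s\n' "$url"
        fi
    done
}
```

This can be used to validate bookmarks or a download list before a scheduled job runs, for example check_urls URL1 URL2.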

8. Increase retry attempts

wget --tries=40 URL

Useful for unstable connections or large files.
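On a flaky link, a higher retry count is usually combined with resuming and a pause between attempts. The sketch below is one possible combination of the options described earlier; fetch_resilient is a hypothetical helper name, and the values 10 and 30 are arbitrary assumptions to tune for the connection at hand.

```shell
# Download URL with settings suited to an unreliable connection.
#   --continue      resume a partial file instead of starting over
#   --tries=40      give up only after 40 attempts
#   --waitretry=10  pause between retries, backing off up to 10 seconds
#   --timeout=30    treat a connection that stalls for 30 seconds as failed
fetch_resilient() {
    wget --continue --tries=40 --waitretry=10 --timeout=30 "$1"
}
```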

9. Download multiple URLs from a file

wget -i filelist.txt

Reads a list of URLs (one per line) and downloads them.
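URL lists assembled by hand often contain blank lines and repeats. wget also accepts - as the input file, meaning standard input, so the list can be cleaned in a pipeline first. A small sketch, where fetch_list is a hypothetical helper name:

```shell
# Download every distinct URL listed in FILE.
# Blank lines are dropped and duplicates removed (sort -u) before the
# list is passed to wget on standard input with "-i -".
fetch_list() {
    grep -v '^[[:space:]]*$' "$1" | sort -u | wget -i -
}
```

Calling fetch_list filelist.txt then behaves like the command above, except that each URL is requested only once. Note that sort -u also changes the download order.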

10. Mirror an entire website

wget --mirror -p --convert-links -P ./LOCAL URL

Creates a local copy of the site with all necessary assets.
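A full mirror can put noticeable load on the remote server, so it is common to add a pause and a rate cap. The sketch below extends the command with options from the lists above; mirror_site is a hypothetical helper name, and the one-second wait and 300k limit are assumptions, not recommendations from the original article.

```shell
# Mirror URL into DEST (default ./LOCAL) without hammering the server.
#   --no-parent        stay below the starting directory
#   --wait=1           pause one second between retrievals
#   --limit-rate=300k  cap the download rate
mirror_site() {
    url=$1
    dest=${2:-./LOCAL}
    wget --mirror --page-requisites --convert-links \
         --no-parent --wait=1 --limit-rate=300k \
         -P "$dest" "$url"
}
```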

11. Exclude certain file types

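wget -r --reject=gif URL

Recursively downloads the site but skips GIF images. The accept and reject lists only take effect during recursive retrieval, hence the -r.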

12. Log download output

wget -o download.log URL

Writes wget's messages to download.log instead of the terminal.

13. Limit total download size

wget -Q5m -i filelist.txt

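Stops fetching further files from the list once 5 MB have been downloaded. The quota is checked between files, so it never cuts off a file that is already in progress, and it has no effect when downloading a single file.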

14. Download only specific extensions

wget -r -A.pdf URL

Recursively fetches only PDF files.

15. FTP download with authentication

wget --ftp-user=USERNAME --ftp-password=PASSWORD ftp://example.com/file.zip

Downloads a file from an FTP server using provided credentials.
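A password given on the command line is visible to other users in the process list while wget runs. wget can instead read credentials from ~/.netrc. The sketch below writes such an entry; add_netrc_entry is a hypothetical helper name, and the host, user name, and password are placeholders.

```shell
# Append login details for HOST to a netrc file readable only by its owner.
# Usage: add_netrc_entry HOST USER PASSWORD [FILE]   (FILE defaults to ~/.netrc)
add_netrc_entry() {
    host=$1
    user=$2
    pass=$3
    file=${4:-$HOME/.netrc}
    # Create the file with owner-only permissions before writing the secret.
    ( umask 077 && touch "$file" ) && chmod 600 "$file" || return 1
    printf 'machine %s login %s password %s\n' "$host" "$user" "$pass" >> "$file"
}
```

With the entry in place, wget ftp://example.com/file.zip should pick up the credentials without any --ftp-user or --ftp-password option.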

Building wget from Source

To compile and install wget manually:

# tar zxvf wget-1.9.1.tar.gz
# cd wget-1.9.1
# ./configure
# make
# make install
Source: CU技术社区 (CU Technical Community)