wget is a powerful, free command-line utility for downloading files from the web, available throughout the Unix world. It supports downloading via HTTP, HTTPS, and FTP, the most popular TCP/IP-based protocols for web browsing and file transfer on the internet.
One of the key features of wget is its ability to work in the background and complete downloads non-interactively. It can resume aborted downloads, which is a boon when dealing with large files or unstable network connections. wget can also mirror websites, allowing offline browsing and local site backups.
Before diving into the wget command’s usage, it’s essential to ensure it’s installed on your system. Most Linux distributions include wget by default. However, if it’s not present, you can install it using your distribution’s package manager. For Debian-based systems like Ubuntu, use:
sudo apt install wget
For Red Hat-based systems like CentOS, use:
sudo yum install wget
Download a Single File
The basic syntax for downloading a single file with wget is straightforward: simply type wget followed by the URL of the file you want to download. For example:
wget http://example.com/file.txt
Downloading Multiple Files
To download multiple files, you can use the -i option followed by a file containing a list of URLs, one URL per line. For instance:
wget -i urls.txt
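As a sketch, the list file is just plain text with one URL per line. Here the file is generated with a short loop (the URLs are placeholders):

```shell
# Build a URL list file, one URL per line (placeholder URLs).
for i in 1 2 3; do
  echo "http://example.com/file$i.txt"
done > urls.txt

cat urls.txt
# wget -i urls.txt   # fetch every URL in the list (requires network)
```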
Resume Interrupted Downloads
If a download is interrupted, you can resume it using the -c option. This is particularly useful for large files or unstable internet connections.
wget -c http://example.com/largefile.iso
Retry Failed Downloads
The --tries option allows you to specify the number of retries if a download fails. For example, to retry a download five times before giving up, you would use:
wget --tries=5 http://example.com/file.txt
Limit Download Speed
To prevent wget from consuming all your bandwidth, you can limit the download speed using the --limit-rate option. For example, to limit the download speed to 200 KB/s, use:
wget --limit-rate=200k http://example.com/file.txt
Download in Background
The -b option allows wget to run in the background, freeing up your terminal for other tasks. The download progress is logged to a file named wget-log in the current directory.
wget -b http://example.com/file.txt
Mirror a Website
wget can mirror an entire website for offline viewing using the -r, -p, and -k options together. The -r option enables recursion, -p downloads all files necessary to display the page properly, and -k converts links for offline viewing.
wget -r -p -k http://example.com
Spider a Website
The --spider option allows wget to behave like a web spider, checking for broken links without downloading anything. This is useful for web developers checking their sites for dead links.
wget --spider -r http://example.com
Download via Proxy
If you’re behind a proxy server, you can configure wget to use it by setting the http_proxy and https_proxy environment variables with your proxy details:
export http_proxy=http://proxyserver:port
export https_proxy=http://proxyserver:port
wget http://example.com
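Alternatively, proxy settings can be made persistent in your ~/.wgetrc file, so every wget invocation picks them up. A minimal sketch, with placeholder host and port values:

```
# ~/.wgetrc — persistent proxy configuration (placeholder values)
use_proxy = on
http_proxy = http://proxyserver:port/
https_proxy = http://proxyserver:port/
```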
Download with Authentication
For sites that require authentication, you can use the --http-user and --http-password options, or store your credentials in a .netrc file in your home directory.
wget --http-user=user --http-password=pass http://example.com
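A minimal ~/.netrc entry looks like this (host and credentials are placeholders); wget reads the file automatically, and it should be kept private (chmod 600):

```
machine example.com
login user
password pass
```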
Log Download Details
To debug downloads, use the -o option followed by a log file name. This writes detailed information about the download process to the specified file.
wget -o log.txt http://example.com
Download to Standard Output
The -O - option allows wget to write the downloaded data to standard output, which can then be piped to other commands. This is useful for processing the downloaded data on the fly.
wget -O - http://example.com | grep "keyword"
Scripting and Scheduling Downloads
wget’s command-line nature makes it ideal for scripting and scheduling downloads. You can use cron jobs to schedule downloads at specific times. Passing URLs via stdin using -i - allows wget to read URLs from a pipe, enabling complex download scripts. Logging and monitoring can be done using the -o option, as mentioned earlier.
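As a sketch, a nightly cron entry and a stdin pipe might look like this (the schedule, paths, and URLs are all placeholders):

```shell
# Crontab entry (placeholder path): fetch a URL list quietly every night at 02:00.
#   0 2 * * * wget -q -i /home/user/urls.txt -P /home/user/downloads -o /home/user/wget.log

# Feeding URLs to wget on stdin with -i - (placeholder URLs):
printf 'http://example.com/a.txt\nhttp://example.com/b.txt\n' > nightly-urls.txt
cat nightly-urls.txt
# cat nightly-urls.txt | wget -i -   # requires network
```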
wget is a versatile tool with a wide range of capabilities. It goes beyond simple file downloads, offering features like website mirroring, download speed limiting, and more. While similar tools like curl exist, wget stands out for its ease of use and powerful features. For further reading, the wget man page and GNU wget manual are excellent resources.
In conclusion, wget is an indispensable tool for any Linux user. Its power and flexibility make it a go-to solution for all kinds of download tasks. Whether you’re a system administrator needing to download system updates or a web developer wanting to mirror a website, wget has you covered.