Wget Command in Linux with Examples
wget is a powerful, free command-line utility for downloading files from the web. It supports downloads over HTTP, HTTPS, and FTP, the most widely used protocols for web browsing and file transfer on the internet.
One of the key features of wget is its ability to work in the background and complete downloads non-interactively. It can resume aborted downloads, which is a boon when dealing with large files or unstable network connections. wget can also mirror websites, allowing offline browsing and local site backups.
Installation
Before diving into the wget command’s usage, it’s essential to ensure it’s installed on your system. Most Linux distributions include wget by default. However, if it’s not present, you can install it using your distribution’s package manager. For Debian-based systems like Ubuntu, use the apt command:
sudo apt install wget
For Red Hat-based systems like CentOS, use the yum command:
sudo yum install wget
Downloading Files
Download a Single File
The basic syntax for downloading a single file with wget is straightforward. Simply type wget followed by the URL of the file you want to download. For example:
wget http://example.com/file.iso
Downloading Multiple Files
To download multiple files, you can use the -i option followed by a file containing a list of URLs. Each URL should be on a separate line. For instance:
wget -i urls.txt
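As an illustration, a urls.txt file might look like this (the URLs are hypothetical placeholders):
http://example.com/file1.iso
http://example.com/file2.zip
http://example.com/file3.tar.gz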
Resume Interrupted Downloads
If a download is interrupted, you can resume it using the -c option. This is particularly useful for large files or unstable internet connections.
wget -c http://example.com/largefile.iso
Retry Failed Downloads
The --tries option allows you to specify the number of retries if a download fails. For example, to retry a download five times before giving up, you would use:
wget --tries=5 http://example.com/file.txt
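On an unreliable connection, --tries pairs naturally with -c, so that each retry resumes the partial file instead of starting over. A sketch combining the two (the URL is a placeholder):
wget --tries=5 -c http://example.com/largefile.iso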
Limit Download Speed
To prevent wget from consuming all your bandwidth, you can limit the download speed using the --limit-rate option. For example, to limit the download speed to 200 KB/s, use:
wget --limit-rate=200k http://example.com/file.txt
Download in Background
The -b option allows wget to run in the background, freeing up your terminal for other tasks. The download progress is logged to a file named wget-log in the current directory.
wget -b http://example.com/file.txt
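You can watch the progress of a background download by tailing the log file:
tail -f wget-log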
Advanced Usage
Mirror a Website
Wget can mirror an entire website for offline viewing using the -r, -p, and -k options. The -r option enables recursion, -p downloads all files necessary to display each page properly, and -k converts links for offline viewing.
wget -r -p -k http://example.com
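When mirroring only a subsection of a site, it is often worth adding the -np (--no-parent) option so recursion never climbs above the starting directory. A sketch, assuming a hypothetical /docs/ path:
wget -r -p -k -np http://example.com/docs/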
Spider a Website
The --spider option allows wget to behave like a web spider, checking for broken links without downloading anything. This is useful for web developers checking their sites for broken links.
wget --spider -r http://example.com
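The same option works on a single URL, for example to verify that a file exists before committing to a large download:
wget --spider http://example.com/largefile.iso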
Download via Proxy
If you’re behind a proxy server, you can configure wget to use it by setting the http_proxy or https_proxy environment variables with your proxy details; wget picks these up automatically.
export http_proxy=http://proxyserver:port
wget http://example.com
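If the proxy itself requires authentication, the credentials can usually be embedded in the variable (hypothetical host and credentials shown):
export http_proxy=http://user:pass@proxyserver:8080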
Authentication
For sites that require authentication, you can use the --http-user and --http-password options, or store your credentials in a .netrc file in your home directory.
wget --http-user=user --http-password=pass http://example.com
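As a sketch, a minimal .netrc entry for the hypothetical host and credentials above would look like this (keep the file private, e.g. chmod 600 ~/.netrc):
machine example.com
login user
password pass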
Debugging Downloads
To debug downloads, use the -o option followed by a log file name. This writes detailed information about the download process to the specified file.
wget -o log.txt http://example.com
Pipelines
The -O - option tells wget to write the downloaded data to standard output, which can then be piped to other commands. This is useful for processing the downloaded data on the fly.
wget -O - http://example.com | grep "keyword"
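A common variant of this pattern (assuming a hypothetical tarball URL) is streaming an archive straight into tar so it is extracted without first being saved to disk; -q suppresses wget’s progress output so only the data reaches the pipe:
wget -qO- http://example.com/archive.tar.gz | tar xz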
Scripting and Scheduling Downloads
wget’s command-line nature makes it ideal for scripting and scheduling downloads. You can use cron jobs to schedule downloads at specific times. Passing URLs via stdin using -i - allows wget to read URLs from a pipe, enabling complex download scripts. Logging and monitoring can be done using the -o option, as mentioned earlier.
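Putting these together, here is a sketch of such a pipeline, assuming a hypothetical generate_urls.sh script that prints one URL per line:
./generate_urls.sh | wget -i - -o /home/user/nightly.log
A matching crontab entry to run it every night at 02:00 might be:
0 2 * * * /home/user/generate_urls.sh | wget -i - -o /home/user/nightly.log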
Conclusion
wget is a versatile tool with a wide range of capabilities. It goes beyond simple file downloads, offering features like website mirroring, download speed limiting, and more. While similar tools like curl exist, wget stands out for its ease of use and powerful features. For further reading, the wget man page and GNU wget manual are excellent resources.
wget is an indispensable tool for any Linux user. Its power and flexibility make it a go-to solution for all kinds of download tasks. Whether you’re a system administrator downloading system updates or a web developer mirroring a website, wget has you covered.