Linux provides different tools to download files over protocols like HTTP, HTTPS, and FTP. wget is the most popular tool for downloading files from a command-line interface, and it is available on Linux, BSD, Windows, and macOS. wget has a rich feature set; some highlights:
- Resuming interrupted downloads
- Downloading multiple files with a single command
- Proxy support
- Unattended downloads
wget Command Help
The help information for wget can be listed like below.
$ wget -h
The most common usage of the wget command is with no options or parameters at all, providing only the download URL.
$ wget http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Set Different File Name
By default, the downloaded file is named as given in the download URL; in the previous example the file is saved as wget-1.19.tar.gz because the URL provides that name. We can save the file under a different name with the -O option. Here we set the name to wget.tar.gz, which also removes the version part from the file name.
$ wget -O wget.tar.gz http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Download Multiple Files
Another useful feature of wget is the ability to download multiple files with a single command. We provide multiple URLs in one command, and the files are downloaded and named as in the URLs. No explicit option is required; we simply list the URLs one after another, separated by spaces. There is no limit on the number of URLs.
$ wget http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz http://ftp.gnu.org/gnu/wget/wget-1.18.tar.gz
Read Download URLs From File
In the previous example, we downloaded multiple files by specifying the URLs on the command line. This becomes difficult to manage when there are, say, 100 URLs to download. Another common situation is that the URLs are provided externally as a plain-text file, one URL per line. Typing all of these on the command line is a tedious and error-prone job. Fortunately, wget can read URLs from a file, line by line, when we pass the file name with the -i option. Here the URLs are stored in a plain-text file named downloads.txt, one per line, and we start the download like this:
$ wget -i downloads.txt
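A minimal sketch of the whole workflow, using the two GNU wget release URLs from the earlier example as sample content. To avoid hitting the network here, the actual wget call is left as a comment.

```shell
# Build the URL list file, one URL per line.
cat > downloads.txt <<'EOF'
http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
http://ftp.gnu.org/gnu/wget/wget-1.18.tar.gz
EOF

# Then fetch every URL listed in the file:
# wget -i downloads.txt

# Quick sanity check of the list we just wrote.
wc -l < downloads.txt
```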
Resume Uncompleted Download
Another great feature of the wget command is resuming downloads from where they left off. Especially with big files, a download may be interrupted at 98% completion, which is a nightmare. The -c option resumes the download instead of starting it from scratch.
$ wget -c http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
The already-downloaded part of the file is shown with + plus signs in the progress bar. The line starting with Length: reports the total size of the file and how much remains to be downloaded.
Do Not Overwrite Existing File
By default, wget does not overwrite an existing file. If a file with the same name as the download already exists, wget appends .1 to the end of the new file's name. This suffix is incremented (.2, .3, ...) if numbered copies already exist.
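The naming rule can be illustrated with a small shell sketch. This is our own re-implementation of the behavior for demonstration, not wget's actual code: try the original name, then name.1, name.2, and so on until a free name is found.

```shell
# Our own sketch of wget's clash-avoidance naming, not wget itself:
# append .1, .2, ... until the candidate name does not exist yet.
next_free_name() {
  name="$1"; candidate="$1"; n=1
  while [ -e "$candidate" ]; do
    candidate="$name.$n"
    n=$((n + 1))
  done
  echo "$candidate"
}

touch wget-1.19.tar.gz wget-1.19.tar.gz.1   # simulate two earlier downloads
next_free_name wget-1.19.tar.gz             # prints wget-1.19.tar.gz.2
```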
Download Files In Background
wget normally runs as an interactive foreground process during the download. In some situations we may want to send it to the background instead. With the -b option, wget runs in the background and writes its output to a log file (wget-log by default, in the current directory).
$ wget -b http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Restrict Download Speed
By default, the download speed of wget is unrestricted, so it consumes as much bandwidth as the remote site can serve. This behavior is not suitable for every situation; we may want to leave some bandwidth for other, more critical applications. We can use the --limit-rate option with a bandwidth value. In this example, we set the limit to 50 KB/s.
$ wget --limit-rate=50K http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
As we can see in the bottom-left corner of the output, the bandwidth is limited to 50 KB/s.
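The rate value accepts K and M suffixes for kilobytes and megabytes per second. The helper below is our own illustration (not part of wget) of how such a value maps to bytes per second:

```shell
# Convert a --limit-rate style value such as 50K or 2M into bytes
# per second. This is our own demonstration helper, not a wget option.
rate_to_bytes() {
  case "$1" in
    *K|*k) echo $(( ${1%?} * 1024 )) ;;
    *M|*m) echo $(( ${1%?} * 1024 * 1024 )) ;;
    *)     echo "$1" ;;
  esac
}

rate_to_bytes 50K   # the cap set by --limit-rate=50K, in bytes/s
rate_to_bytes 2M
```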
Specify FTP Username and Password For Login
Security is an important issue nowadays. FTP is not a secure protocol by itself, but servers usually implement at least basic access control with a user name and password. wget supports FTP authentication: we use --ftp-user to specify the user name and --ftp-password to specify the password.
$ wget --ftp-user=anonymous --ftp-password=mypass ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Specify HTTP Username and Password
As in the previous example, we can specify an HTTP user name and password, using --http-user for the user name and --http-password for the password.
$ wget --http-user=anonymous --http-password=mypass https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Change User Agent Information
When connecting to web servers, the HTTP protocol identifies the client browser through the User-Agent header. The same information is sent while downloading files with wget, and some web sites deny access to non-standard clients; by default, wget identifies itself with its own user agent rather than a standard browser's. The browser information can be changed with the --user-agent parameter. In this example, we identify ourselves as Mozilla Firefox.
$ wget --user-agent="Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0" https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Test Download URLs
Another useful feature of wget is testing a URL before downloading it. This gives some hints about the URL and file status. We use the --spider option to check whether a remote file exists and is downloadable.
$ wget --spider https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
If the file is reachable, the message Remote file exists. is printed on the last line, and we also get the size of the file without downloading it. Since wget exits with status 0 when the remote file exists, --spider is also handy in scripts.
Set Retry Count
On problematic networks or download servers, there may be download problems; the best-known one is that the remote file cannot be accessed for short periods. We can set a retry count so that wget attempts the download the specified number of times, using the --tries option with the retry count.
$ wget --tries=10 https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
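Conceptually, --tries behaves like the retry loop sketched below. This is our own simulation of the idea, not wget's implementation; simulate_download is a stand-in for the real transfer that fails twice and then succeeds, mimicking a flaky network.

```shell
max_tries=10
attempt=1

# Stand-in for the real download: fails on attempts 1 and 2,
# succeeds from attempt 3 onward.
simulate_download() { [ "$attempt" -ge 3 ]; }

while ! simulate_download; do
  if [ "$attempt" -ge "$max_tries" ]; then
    echo "giving up after $max_tries tries"
    break
  fi
  attempt=$((attempt + 1))
done

echo "finished after $attempt tries"
```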
Download Whole Web Site or Spider Website
There are tools that can download a whole site given just the homepage URL, and wget has this ability too. The command below spiders the given URL and downloads all of its pages and subpages, making the site accessible offline. We use --mirror to download the whole site and the -P parameter to set the download location. Here we download www.poftut.com and its subpages.
$ wget --mirror https://www.poftut.com -P poftut.com
Download Specific File Type
While downloading multiple files or mirroring a site, we may want to download only a specific file type or extension. This can be specified with -A and the extension or a part of the file name. In this example, we only want to download the .txt files.
$ wget -A '.txt' --mirror https://www.poftut.com
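The effect of an accept list can be sketched locally. The helper below is our own illustration of the filtering idea behind -A '.txt', not wget's code: names matching the suffix are kept, everything else is skipped.

```shell
# Our own sketch of accept-list filtering, not a wget feature:
# return success (0) only for names ending in .txt.
accepted() {
  case "$1" in
    *.txt) return 0 ;;  # matches the accept suffix: download it
    *)     return 1 ;;  # anything else is skipped
  esac
}

for f in index.html notes.txt logo.png; do
  if accepted "$f"; then echo "keep $f"; else echo "skip $f"; fi
done
```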
Do Not Download Specific File Types
While downloading multiple URLs or mirroring a site, there may be many files we do not want, or files that are too big to download. We use the --reject option with the extensions or file names to exclude.
$ wget --reject '.txt' --mirror https://www.poftut.com
Log To File
By default, logs created by wget are printed to standard output, which is generally the terminal we are working in. But when wget runs remotely, or as a batch or background process, we cannot read the logs directly, so writing them to a file is the best solution. We use the -o option with the log file name (note that lowercase -o sets the log file, while uppercase -O sets the output file name).
$ wget -o wget.log http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Set Download Size Limit
Another good option for limiting the total downloaded size is -Q, which sets a download quota. We set the quota to 5 MB in the example. This setting has no effect on a single-file download; it applies to recursive downloads and site mirroring.
$ wget -Q5m https://www.poftut.com
Show Version Information
Version information about the wget command can be printed with the --version option.
$ wget --version