Linux provides different tools to download files over different protocols such as HTTP, HTTPS, and FTP. wget is the most popular tool for downloading files from a command-line interface. Wget is supported on Linux, BSD, Windows, and macOS. Wget has a rich feature set; some of its features are listed below:
- Resume downloads
- Download multiple files with a single command
- Proxy support
- Unattended downloads
Curl is an alternative to wget; a separate curl tutorial covers it in detail.
wget Command Help
Help information for wget can be listed as shown below.
$ wget -h

Simple Download
The most common usage of the wget command is without any option or parameter other than the download URL. We simply give wget the URL to download.
$ wget http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

Set Different File Name
By default, the downloaded file is saved with the name that appears in the download URL. In the previous example, the file is named wget-1.19.tar.gz because the URL provides this name. We can save the file under a different name with the -O option followed by the desired name. Here we set the name to wget.tar.gz, which removes the version part from the file name.
$ wget -O wget.tar.gz http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

Download Multiple Files
Another useful feature of wget is the ability to download multiple files. We provide multiple URLs in a single command, and the files are downloaded and named as in the URLs. No explicit option is required; we just list the URLs one after another, separated by spaces. There is no limit on the number of URLs.
$ wget http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz http://ftp.gnu.org/gnu/wget/wget-1.18.tar.gz

Read Download URLs From File
In the previous example, we downloaded multiple files by specifying the URLs on the command line. This becomes difficult to manage if there are 100 URLs to download. Another common situation is that the URLs are provided externally in a plain text file, one per line. Typing these URLs on the command line is tedious and error-prone. Fortunately, wget can read URLs from a file, line by line, when the file name is given with the -i option. We will put the URLs in a plain text file named downloads.txt.
downloads.txt File Content
http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
http://ftp.gnu.org/gnu/wget/wget-1.18.tar.gz
Then we start the download by pointing wget at the file with the -i option.
$ wget -i downloads.txt
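If the URL list is produced by another command, wget can also read it from standard input by passing - as the file name; this is a small sketch, so confirm the behavior in your wget manual.
$ cat downloads.txt | wget -i -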

Resume Uncompleted Download
Another great feature of the wget command is resuming downloads from where they left off. Especially with big files, having the download interrupted at 98% completion is a nightmare. The -c option is provided to resume the download without starting it from scratch.
$ wget -c http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

As we can see from the screenshot, the already downloaded part of the file is shown with plus signs (+) in the download bar. The line starting with Length: also reports the remaining and already downloaded sizes, and the server response includes information such as Partial Content.
Do Not Overwrite Existing File
By default, wget does not overwrite an existing file. If a file with the same name as the downloaded file already exists, wget appends .1 to the end of the downloaded file name. This suffix is incremented (.2, .3, and so on) if the numbered file already exists as well.
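If we would rather have wget skip the download entirely when the file already exists, instead of creating numbered copies, the -nc (--no-clobber) option can be used; exact behavior can vary between wget versions, so treat this as a sketch.
$ wget -nc http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz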
Download Files In Background
Wget normally runs as an interactive, foreground process during the download. But in some situations, we may need to send wget to the background. With the -b option, the wget command is sent to the background.
$ wget -b http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
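When running in the background, wget writes its progress to a log file (named wget-log by default in current versions), so we can follow the download with tail; the exact file name is worth verifying on your system.
$ tail -f wget-log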

Restrict Download Speed
By default, the download speed of wget is unrestricted, so it consumes bandwidth according to the remote site's upload speed. This behavior is not suitable for every situation; we may want to avoid using the whole bandwidth and leave some for other critical applications. We can use the --limit-rate option with a bandwidth value. In this example, we set the bandwidth limit to 50KB/s.
$ wget --limit-rate=50K http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

As we can see in the lower left corner, the bandwidth is limited to 50KB/s.
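The --limit-rate value also accepts the k and m suffixes for kilobytes and megabytes per second, so a rough 1 MB/s cap would look like the following; check your wget manual for the exact units.
$ wget --limit-rate=1m http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz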
Specify FTP Username and Password For Login
Security is an important issue nowadays. FTP servers are not very secure, but they do implement basic measures such as a username and password. wget can authenticate to an FTP server with an FTP username and password. We use --ftp-user to specify the username and --ftp-password to specify the FTP password.
$ wget --ftp-user=anonymous --ftp-password=mypass ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

Specify HTTP Username and Password
As in the previous example, we can specify an HTTP username and password. We use --http-user to specify the HTTP username and --http-password for the HTTP password.
$ wget --http-user=anonymous --http-password=mypass https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
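Passing the password on the command line leaves it in the shell history; recent wget versions also provide --ask-password to prompt for it interactively instead, which is worth checking on your version.
$ wget --http-user=anonymous --ask-password https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz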

Change User Agent Information
While connecting web pages HTTP protocol provides information about the user browser. The same information is provided while downloading files with wget. Some web sites can deny access to the non-standard browser. Wget provides user agent different from standard browsers by default. This browser information can be provided with --user-agent
parameter with the name of the browser. In this example, we will provide the browser information as Mozilla Firefox
.
$ wget --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/39" https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

Test Download URLs
Another useful feature of the wget is testing URLs before downloading them. This will give some hints about the URL and file status. We will use --spider
options to check if a remote file exists and downloadable.
$ wget --spider https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

As we can see from the screenshot, the last line contains the message Remote file exists. We can also get the size of the file without downloading it.
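The --spider option can also be combined with -i to validate a whole list of URLs before starting the real download, reusing the downloads.txt file from the earlier example.
$ wget --spider -i downloads.txt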
Set Retry Count
On problematic networks or download servers, there may be download problems. The best-known problem is that the remote file cannot be reached for short periods. We can set a retry count so that wget attempts the download the specified number of times, using the --tries option with the retry count.
$ wget --tries=10 https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
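To be gentler with a flaky server, the retry count can be combined with a growing delay between retries; the --waitretry option found in common wget builds sets the maximum number of seconds to wait, so treat the exact values here as an example.
$ wget --tries=10 --waitretry=5 https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz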
Download Whole Web Site or Spider Website
There are different tools that can download a whole site when given just the homepage URL, and wget has this ability too. The command will spider and download all pages under the given URL and its subpages, making the site accessible offline. We use --mirror to download the whole site and the -P parameter to provide the download location. Here we want to download www.poftut.com and its subpages.
$ wget --mirror https://www.poftut.com -P poftut.com
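For a copy that browses well offline, --mirror is commonly combined with options that also fetch page resources and rewrite links to point at the local files; this is one common combination, not the only one.
$ wget --mirror --convert-links --page-requisites --no-parent -P poftut.com https://www.poftut.com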

Download Specific File Type
While downloading multiple files or mirroring a site, we may want to download only a specific file type or file extension. This can be specified with the -A option followed by an extension or part of the file name. In this example, we only want to download the .txt files.
$ wget -A '.txt' --mirror https://www.poftut.com
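The -A option also accepts a comma-separated list, so several file types can be accepted in a single run.
$ wget -A 'txt,pdf,jpg' --mirror https://www.poftut.com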

Do Not Download Specific File Types
While downloading multiple URLs or mirroring a site, there will be a lot of files we do not want to download or that are too big to download. We use the --reject option with the extensions or file names to skip.
$ wget --reject '.txt' --mirror https://www.poftut.com
Log To File
By default, the logs created by wget are printed to standard output, which is generally the command-line interface we are using. But when wget runs remotely, or as a batch or background process, we cannot see the logs directly, so writing the logs to a file is the best solution. We use the -o option with the log file name.
$ wget -o wget.log http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
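If wget is run repeatedly and the previous logs should be kept, the -a option appends to the log file instead of overwriting it.
$ wget -a wget.log http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz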

Set Download Size Limit
Another good option for limiting the total downloaded size is -Q. We set the download size limit to 5 MB in the example. This setting has no effect on a single-file download; it only applies to recursive downloads or site mirroring.
$ wget -Q5m https://www.poftut.com
Show Version Information
Version information about the wget command can be obtained with the --version option.
$ wget --version
