Linux provides several tools for downloading files over protocols such as HTTP, HTTPS, and FTP.
wget is one of the most popular tools for downloading files from the command line. It is available on Linux, BSD, Windows, and macOS. wget has a rich feature set, including:
- Resume downloads
- Multiple file downloads with a single command
- Proxy support
- Unattended download
Help information for wget can be listed as shown below.
$ wget -h
The most common way to use wget is to give it only the download URL, without any options.
$ wget http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Set Different File Name
By default, the downloaded file is saved under the name given in the URL. In the previous example the file is saved as wget-1.19.tar.gz. We can save it under a different name with the -O option followed by the new file name. Here we set the name to wget.tar.gz, which drops the version part from the file name.
$ wget -O wget.tar.gz http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Download Multiple Files
Another useful feature of wget is downloading multiple files with a single command. No extra option is required; we just list the URLs one after another, separated by spaces. Each file is saved under the name given in its URL. wget itself does not limit the number of URLs, although the operating system limits the total length of a command line.
$ wget http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz http://ftp.gnu.org/gnu/wget/wget-1.18.tar.gz
Read Download URLs From File
In the previous example we downloaded multiple files by specifying the URLs on the command line. This becomes difficult to manage if there are 100 URLs to download. The URLs may also be provided externally as a plain text file, one per line. Typing these URLs on the command line is tedious and error-prone.
Fortunately, wget can read URLs from a file, one per line, when we give it the file name with the -i option. We will put the URLs in a plain text file named downloads.txt, one per line, and then start the download.
$ wget -i downloads.txt
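As a minimal sketch of this workflow, the snippet below builds downloads.txt from the two release URLs used earlier and prints it for review. The loop, the base variable, and the version list are illustrative assumptions, not part of wget. The download line is left commented out so the list can be checked first.

```shell
# Build downloads.txt with one URL per line (the two release URLs used above).
base=http://ftp.gnu.org/gnu/wget
: > downloads.txt                       # start with an empty list
for version in 1.18 1.19; do
    printf '%s/wget-%s.tar.gz\n' "$base" "$version" >> downloads.txt
done

cat downloads.txt                       # review the list before downloading
# wget -i downloads.txt                 # uncomment to start the downloads
```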
Resume Uncompleted Download
Another great feature of wget is resuming an interrupted download from where it left off. With big files, a download that is interrupted at 98% completion is a nightmare. The -c option resumes the download instead of starting it from scratch.
$ wget -c http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
In the output, the already downloaded part of the file is shown as + characters in the progress bar. The line starting with Length: reports the total size of the file and how much of it remains to be downloaded.
Do Not Overwrite
By default wget does not overwrite an existing file. If a file with the same name already exists, wget appends .1 to the name of the newly downloaded file. If that name is taken as well, the number is incremented to .2, .3, and so on.
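The naming rule can be illustrated with a small shell function. This is only a sketch that mimics the rule described above; next_free_name is a hypothetical helper written for this example, not something wget provides.

```shell
# Print the name a new download of "$1" would get in the current directory
# under the rule above: the name itself if unused, otherwise name.1, name.2, ...
next_free_name() {
    name=$1
    if [ ! -e "$name" ]; then
        printf '%s\n' "$name"
        return 0
    fi
    n=1
    while [ -e "$name.$n" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "$name.$n"
}
```

For example, in a directory that already contains wget-1.19.tar.gz and wget-1.19.tar.gz.1, calling next_free_name wget-1.19.tar.gz prints wget-1.19.tar.gz.2.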
Download Files In Background
wget starts as an interactive process and stays in the foreground for the duration of the download. In some situations we may want to send it to the background instead. With the -b option, wget goes to the background right after startup and, unless a log file is given with -o, writes its output to a file named wget-log.
$ wget -b http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Restrict Download Speed
By default wget does not restrict its download speed, so it uses as much bandwidth as the remote site and the network allow. This is not suitable for every situation. We may want to leave some bandwidth for other critical applications. We can use the --limit-rate option with a bandwidth value. In this example we limit the rate to 50 KB/s.
$ wget --limit-rate=50K http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
The speed reported in the progress output stays at about 50KB/s.
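To get a feel for what such a limit means in practice, the expected download time can be estimated with shell arithmetic. The 4 MiB file size below is a hypothetical value chosen for illustration, and the sketch assumes that K in --limit-rate stands for 1024 bytes.

```shell
size_kib=4096      # hypothetical file size: 4 MiB expressed in KiB
rate_kib=50        # matches --limit-rate=50K

# Integer division rounded up, so a partial last second still counts.
seconds=$(( (size_kib + rate_kib - 1) / rate_kib ))
echo "about ${seconds}s at ${rate_kib} KiB/s"
```

With these numbers the estimate is 82 seconds. A real transfer will differ somewhat, because the limit is enforced as an average over time.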
Specify FTP Username and Password
Security is an important issue nowadays. FTP is not a very secure protocol, but FTP servers usually implement basic measures such as a user name and password. wget can authenticate to an FTP server with these credentials. We will use --ftp-user to specify the user name and --ftp-password to specify the password.
$ wget --ftp-user=anonymous --ftp-password=mypass ftp://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Specify HTTP Username and Password
As in the previous example, we can specify a user name and password for HTTP authentication. We will use --http-user to specify the user name and --http-password to specify the password.
$ wget --http-user=anonymous --http-password=mypass https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Change User Agent Information
When a client connects to a web server, the HTTP request carries a User-Agent header that identifies the client software. wget sends this header too, and by default it identifies itself as wget rather than as a standard browser. Some web sites deny access to non-standard clients. The user agent can be changed with the --user-agent option followed by the identification string. In this example we identify as Mozilla Firefox.
$ wget --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:22.214.171.124) Gecko/2008092416 Firefox/39" https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Test Download URLs
Another useful feature of wget is testing URLs before downloading them. We will use the --spider option to check whether a remote file exists and is downloadable, without actually downloading it.
$ wget --spider https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
The last line of the output contains the message Remote file exists. The output also shows the size of the file, so we can learn it without downloading.
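The same check can be scripted, because wget exits with status zero on success and non-zero on failure. The sketch below wraps it in a hypothetical helper named check_url; the -q option, which is not used elsewhere in this article, only silences wget's own output.

```shell
# Report whether a URL is reachable without downloading it.
# Relies only on wget's exit status: zero on success, non-zero on any failure.
check_url() {
    if wget -q --spider "$1"; then
        echo "OK $1"
    else
        echo "FAILED $1"
    fi
}

# Example call (needs network access):
# check_url https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
```

A non-zero status does not say why the check failed. The file may be missing, or the server may simply be unreachable.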
Set Retry Count
On unreliable networks or servers, downloads may fail. A common problem is that the remote file cannot be reached for short periods. We can set a retry count so that wget tries the download up to the specified number of times. The --tries option takes the retry count.
$ wget --tries=10 https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
Download Whole Web Site or Spider Website
There are different tools that download a whole site given only the homepage URL, and wget has this ability too. The following command crawls the given URL and its subpages and downloads all of them, which makes the site available offline. We will use --mirror to download the whole site and the -P option to set the directory where the files are saved. Here we want to download www.poftut.com and its subpages.
$ wget --mirror https://www.poftut.com -P poftut.com
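For comfortable offline browsing, the links inside the saved pages usually need to be rewritten to point at the local copies. A hedged sketch: mirror_site is a hypothetical wrapper that adds --convert-links and --page-requisites, two standard wget options that are not part of the command above.

```shell
# Mirror a site into a directory and prepare it for offline viewing.
#   $1: start URL, $2: destination directory
mirror_site() {
    url=$1
    dest=$2
    # --convert-links    rewrites links in saved pages to point at local files
    # --page-requisites  also fetches images and stylesheets the pages need
    wget --mirror --convert-links --page-requisites -P "$dest" "$url"
}

# Example call (needs network access and may download a lot of data):
# mirror_site https://www.poftut.com poftut.com
```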
Download Specific File Type
While downloading multiple files or mirroring a site, we may want to download only specific files or file extensions. This can be specified with the -A option followed by an extension or part of a file name. In this example we only want to download the .txt files.
$ wget -A '.txt' --mirror https://www.poftut.com
Do Not Download Specific File Types
While downloading multiple URLs or mirroring a site, there may be many files that we do not want or that are too big to download. We will use the --reject option with the extensions or file names to skip.
$ wget --reject '.txt' --mirror https://www.poftut.com
Log To File
By default wget prints its log messages to the terminal (standard error). When wget runs remotely, in a batch job, or in the background, we cannot see these messages directly, so writing them to a file is the best solution. We will use the -o option with the log file name.
$ wget -o wget.log http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
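In a batch job the log file is most useful when something goes wrong. The sketch below, built around a hypothetical helper named download_with_log, keeps the terminal quiet on success and prints the last few log lines on failure.

```shell
# Download a URL with all wget messages sent to a log file.
#   $1: URL, $2: log file name
download_with_log() {
    url=$1
    logfile=$2
    if wget -o "$logfile" "$url"; then
        echo "done: $url"
    else
        # On failure, show the end of the log, where the error usually appears.
        echo "failed: $url (last log lines follow)" >&2
        tail -n 5 "$logfile" >&2
        return 1
    fi
}

# Example call (needs network access):
# download_with_log http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz wget.log
```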
Set Download Size Limit
Another useful option is -Q, which limits the total size of the downloaded files. We will set the limit to 5 MB in the example. This setting has no effect on a single file download; it takes effect for recursive downloads and site mirroring, so we combine it with --mirror.
$ wget -Q5m --mirror https://www.poftut.com
Version information about the wget command can be printed with:
$ wget --version