Wget Command Tutorial – POFTUT

Linux provides several tools for downloading files over protocols such as HTTP, HTTPS, and FTP. wget is one of the most popular tools for downloading files from the command line. Wget is available for Linux, BSD, Windows, and macOS. It has a rich feature set, including:

  • Resuming downloads
  • Downloading multiple files with a single command
  • Proxy support
  • Unattended downloads

curl is an alternative to wget.

wget Command Help

The help information for wget can be listed with the -h option.

$ wget -h

Simple Download

The most common way to use wget is to provide only the download URL, without any options.

$ wget http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

Set Different File Name

By default, the downloaded file is saved under the name given in the URL. In the previous example, the file is saved as wget-1.19.tar.gz because that is the name in the URL. We can save the file under a different name with the -O option (upper case), followed by the new name. Here we save the file as wget.tar.gz, which drops the version from the file name.

$ wget -O wget.tar.gz http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

Download Multiple Files

Another useful feature of wget is the ability to download multiple files with a single command. No special option is required; we just list the URLs one after another, separated by spaces. Each file is saved under the name given in its URL. wget itself does not limit the number of URLs, although the shell's limit on command-line length still applies.

$ wget http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz http://ftp.gnu.org/gnu/wget/wget-1.18.tar.gz

Read Download URLs From File

In the previous example, we downloaded multiple files by listing the URLs on the command line. This becomes difficult to manage if there are 100 URLs to download. The URLs may also be supplied externally as plain text, one per line. Typing them into the command line is tedious and error-prone. Fortunately, wget can read URLs from a file, one per line, when we pass the file name with the -i option. Here we put the URLs in a plain text file named downloads.txt.

downloads.txt File Content

http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz 
http://ftp.gnu.org/gnu/wget/wget-1.18.tar.gz

Then we start the download by passing the file to wget. The output is clearly presented.

$ wget -i downloads.txt
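
If the URLs are not yet in a file, the list can be generated from the shell. The following is a minimal sketch; the helper name make_url_list is made up for this example and is not part of wget.

```shell
# make_url_list is a hypothetical helper, not a wget feature.
# It writes each URL argument on its own line, the format wget -i reads.
make_url_list() {
    list_file=$1
    shift
    printf '%s\n' "$@" > "$list_file"
}

# Usage (the wget line needs network access):
#   make_url_list downloads.txt \
#       http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz \
#       http://ftp.gnu.org/gnu/wget/wget-1.18.tar.gz
#   wget -i downloads.txt
```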

Resume Uncompleted Download

Another great feature of wget is resuming a download from where it stopped. With big files in particular, a download may be interrupted at 98% completion, which is a nightmare. The -c option resumes the download instead of starting it from scratch.

$ wget -c http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

In the output, the part of the file that was already downloaded is shown as + signs in the progress bar. The line starting with Length: reports the total length and the remaining length. The server's response is also reported as Partial Content.

Do Not Overwrite Existing File

By default, wget does not overwrite an existing file. If a file with the same name as the download already exists, wget appends .1 to the name of the new file. If that name already exists as well, the number is incremented (.2, .3, and so on).
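
The naming rule can be illustrated with a few lines of shell. This is only a sketch of the rule described above, not wget's own code, and the function name next_free_name is made up for the example.

```shell
# next_free_name is a hypothetical helper that imitates the naming rule:
# use the plain name if it is free, otherwise the first free NAME.N.
next_free_name() {
    name=$1
    if [ ! -e "$name" ]; then
        printf '%s\n' "$name"
        return
    fi
    n=1
    while [ -e "$name.$n" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "$name.$n"
}

# Usage:
#   next_free_name wget-1.19.tar.gz
```

If we would rather keep the existing file and skip the download, wget provides the -nc (--no-clobber) option.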

Download Files In Background

By default, wget runs in the foreground and stays there for the whole download. In some situations we may want to send wget to the background instead. With the -b option, wget goes to the background right after it starts.

$ wget -b http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
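
A background download prints nothing to the terminal, so we usually follow it through its log file. The sketch below wraps this in a small function; the name start_background_download and the log file name download.log are assumptions made for the example.

```shell
# start_background_download is a hypothetical wrapper.
# -b sends wget to the background; -o names the log file explicitly
# (without -o, a background wget writes its messages to wget-log).
start_background_download() {
    url=$1
    log_file=${2:-wget-log}
    wget -b -o "$log_file" "$url"
}

# Usage (needs network access):
#   start_background_download http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz download.log
#   tail -f download.log    # follow the progress; Ctrl+C stops only tail
```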

Restrict Download Speed

By default, wget does not restrict the download speed, so it consumes as much bandwidth as the remote site can deliver. This is not suitable for every situation. We may want to leave part of the bandwidth to other critical applications. We can use the --limit-rate option with a bandwidth value. In this example, we limit the bandwidth to 50 KB/s.

$ wget --limit-rate=50K http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

In the progress output, the download speed is limited to about 50 KB/s.

Specify FTP Username and Password For Login

FTP is not a very secure protocol, but FTP servers usually implement basic security steps such as a user name and password. wget can authenticate to an FTP server with a user name and password. We will use --ftp-user to specify the user name and --ftp-password to specify the password. The URL starts with ftp:// so that wget uses FTP; a URL without a scheme is treated as HTTP.

$ wget --ftp-user=anonymous --ftp-password=mypass ftp://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

Specify HTTP Username and Password

As in the previous example, we can specify a user name and password for HTTP. We will use --http-user for the HTTP user name and --http-password for the HTTP password.

$ wget --http-user=anonymous --http-password=mypass https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

Change User Agent Information

When a browser connects to a web site, the HTTP request carries information about the browser in the User-Agent header. wget sends this header too when it downloads files. By default, wget identifies itself with its own user agent, which differs from that of standard browsers, and some web sites deny access to non-standard browsers. The user agent can be set with the --user-agent option, followed by the browser identification string. In this example, we identify as Mozilla Firefox.

$ wget --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/39" https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

Test Download URLs

Another useful feature of wget is testing a URL before downloading it. This gives some hints about the status of the URL and the file. We will use the --spider option to check whether a remote file exists and is downloadable.

$ wget --spider https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz

The last line of the output reads Remote file exists. The output also shows the size of the file, so we learn it without downloading the file.
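
In a script we usually care about the result of the check rather than the printed messages. The sketch below relies on wget's exit status; the function name url_exists is an assumption made for this example.

```shell
# url_exists is a hypothetical helper built on --spider.
# -q silences the output, so only wget's exit status is used:
# 0 means the check succeeded, anything else means it failed.
url_exists() {
    wget --spider -q "$1"
}

# Usage (needs network access):
#   if url_exists https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz; then
#       echo "available"
#   else
#       echo "missing or unreachable"
#   fi
```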

Set Retry Count

On unreliable networks or download servers, downloads may fail. The most common problem is that the remote file cannot be reached for short periods. We can set a retry count so that wget tries the download up to the specified number of times. The --tries option takes the retry count.

$ wget --tries=10 https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
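
Retries work well together with resuming, so that a retry does not start the file from scratch. The following is a sketch only; the function name fetch_with_retries and the 5 second maximum wait are assumptions chosen for the example.

```shell
# fetch_with_retries is a hypothetical wrapper.
# --tries sets the retry count, --waitretry the maximum pause in seconds
# between retries, and -c resumes a partial file instead of restarting it.
fetch_with_retries() {
    url=$1
    tries=${2:-10}
    wget --tries="$tries" --waitretry=5 -c "$url"
}

# Usage (needs network access):
#   fetch_with_retries https://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz 10
```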

Download Whole Web Site or Spider Website

There are different tools that download a whole site from just its homepage URL, and wget has this ability too. The command below crawls the given URL and its subpages and downloads them, which makes the site available offline. We will use --mirror to download the whole site and the -P option to set the directory where the files are saved. Here we want to download www.poftut.com and its subpages.

$ wget --mirror https://www.poftut.com -P poftut.com
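
For comfortable offline browsing, --mirror is often combined with a few more options. The sketch below shows one common combination, not the only valid one; the function name mirror_for_offline is made up for the example.

```shell
# mirror_for_offline is a hypothetical wrapper.
# --convert-links   rewrites links so they point to the local copies
# --page-requisites also fetches images, CSS and other files a page needs
# --no-parent       never ascends above the starting directory
mirror_for_offline() {
    url=$1
    target_dir=$2
    wget --mirror --convert-links --page-requisites --no-parent \
         -P "$target_dir" "$url"
}

# Usage (needs network access):
#   mirror_for_offline https://www.poftut.com poftut.com
```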

Download Specific File Type

While downloading multiple files or mirroring a site, we may want to download only files with a specific name or extension. This can be specified with the -A option, followed by the extension or a part of the file name. In this example, we only want to download the .txt files.

$ wget -A '.txt' --mirror https://www.poftut.com
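
The -A option also takes several entries as a comma-separated list. The sketch below builds that list from separate arguments; the function name mirror_only is an assumption made for this example.

```shell
# mirror_only is a hypothetical wrapper.
# The first argument is the URL; the remaining ones are suffixes to accept.
mirror_only() {
    url=$1
    shift
    # Join the remaining arguments with commas: txt pdf -> txt,pdf
    accept_list=$(IFS=,; printf '%s' "$*")
    wget --mirror -A "$accept_list" "$url"
}

# Usage (needs network access):
#   mirror_only https://www.poftut.com txt pdf
```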

Do Not Download Specific File Types

While downloading multiple URLs or mirroring a site, there may be many files that we do not want or that are too big to download. We will use the --reject option with the extensions or file names to skip.

$ wget --reject '.txt' --mirror https://www.poftut.com

Log To File

By default, wget prints its log messages to standard error, which is normally the terminal we are using. But when wget runs remotely, in a batch job, or in the background, we cannot read those messages directly, so writing them to a file is the best solution. We will use the -o option (lower case) with the log file name.

$ wget -o wget.log http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
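
The -o option overwrites the log file on every run. To keep one log across several downloads, wget also has the -a option, which appends to the file. A minimal sketch follows; the function name download_logged is made up for the example.

```shell
# download_logged is a hypothetical wrapper.
# -a appends the messages to the log file instead of overwriting it.
download_logged() {
    url=$1
    log_file=${2:-wget.log}
    wget -a "$log_file" "$url"
}

# Usage (needs network access):
#   download_logged http://ftp.gnu.org/gnu/wget/wget-1.19.tar.gz
#   download_logged http://ftp.gnu.org/gnu/wget/wget-1.18.tar.gz
#   cat wget.log    # contains the messages of both downloads
```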

Set Download Size Limit

The -Q option sets a limit (quota) on the total size of the downloaded files. In this example, we set the quota to 5 MB. The quota has no effect on a single-file download; it takes effect for recursive downloads and when mirroring a site, so we combine it with --mirror here.

$ wget -Q5m --mirror https://www.poftut.com

Show Version Information

Version information about wget can be printed with the --version option.

$ wget --version

7 thoughts on “Wget Command Tutorial”

  1. first example has a severe typo –

    It documents “-o” (lower case “o”) as the option to name the downloaded file.

    This is incorrect. the correct option for that is “-O” (upper case “O”)

    Lower case “o” is used to redirect the reported info that wget normally sends to stdout.

    Fortunately he gets it correct in the example.

  2. Thank you for pointing out my simple mistake. I know of the alternative date format but wasn’t thinking that night.
