Essential GNU Wget Command Examples for Beginners

Mastering GNU Wget: A Comprehensive Tutorial

GNU Wget is a powerful tool for downloading files from the web. Its flexibility, its support for several protocols, and its robustness in handling interruptions make it a favorite among developers, sysadmins, and anyone who interacts with the internet through scripts. This tutorial guides you through the essential features of Wget, from basic commands to advanced use cases.

What is GNU Wget?

GNU Wget is a free utility designed for non-interactive downloading of files from the web. It supports HTTP, HTTPS, and FTP protocols, making it versatile for accessing a wide range of resources. Here are some of its key features:

Resilient Downloads: Wget can resume broken downloads and handle network interruptions gracefully.
Recursive Downloading: It’s capable of fetching entire websites or directory structures.
User-Agent Modification: You can change the User-Agent string, enabling you to bypass restrictions on certain servers.

Installing Wget

Linux

Most Linux distributions come with Wget pre-installed, but if you need to install it:

Debian/Ubuntu:
```
sudo apt install wget 
```
Fedora:
```
sudo dnf install wget 
```

Windows

On Windows, you can download a compiled version of Wget:

Visit the official GNU Wget website or a trusted source.
Download the appropriate binary.
Add Wget to your system PATH so you can run it from the command line.

macOS

For macOS, you can use Homebrew:

brew install wget
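
After installing on any platform, it can help to confirm that the binary is actually on your PATH. The following is a minimal sketch, assuming a POSIX-style shell; the helper name `check_wget` is hypothetical and not part of Wget:

```shell
# check_wget is a hypothetical helper name, not part of Wget itself.
check_wget() {
  if command -v wget >/dev/null 2>&1; then
    # Print only the first line of the version banner.
    wget --version | head -n 1
  else
    echo "wget not found on PATH"
    return 1
  fi
}

check_wget || echo "Install Wget with your package manager first."
```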

Basic Usage of Wget

The simplest way to use Wget is to specify a URL to download:

wget http://example.com/file.zip

Command Breakdown

wget: The command to run Wget.
http://example.com/file.zip: The URL of the file you wish to download.

By default, Wget saves the file in the current directory with its original filename.

Downloading Multiple Files

You can download multiple files by listing them in a text file and using the -i option:

wget -i file_list.txt
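
The list file holds one URL per line. As a sketch, the snippet below creates such a file and counts its entries before any download starts; the three example.com URLs are placeholders chosen for illustration:

```shell
# Create a URL list, one URL per line (placeholder URLs).
cat > file_list.txt <<'EOF'
http://example.com/file1.zip
http://example.com/file2.zip
http://example.com/docs/manual.pdf
EOF

# Count the non-empty lines to see how many URLs are queued.
echo "URLs queued: $(grep -c . file_list.txt)"

# Uncomment to start the real download:
# wget -i file_list.txt
```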

Commonly Used Options

Wget's behavior is controlled through command-line options. The most commonly used ones are described below.

1. Resume Downloads

To resume a partially completed download, use the -c option:

wget -c http://example.com/file.zip
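
On an unreliable connection, resuming is often combined with retry settings. The wrapper below is one possible sketch; the function name `resilient_get` and the specific values (5 tries, 10 seconds, 30 seconds) are assumptions to adjust for your network:

```shell
# resilient_get is a hypothetical wrapper name.
resilient_get() {
  # -c            resume a partially downloaded file
  # --tries=5     give up after five attempts
  # --waitretry   maximum pause in seconds between retries of a failed download
  # --timeout     network timeout in seconds
  wget -c --tries=5 --waitretry=10 --timeout=30 "$1"
}

# Usage: resilient_get http://example.com/file.zip
```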

2. Mirror a Website

To download an entire website, use the -m option (mirror):

wget -m http://example.com

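This command creates a local copy of the website. The -m option is shorthand for recursion with unlimited depth plus timestamping, so a repeated run re-downloads only the files that changed on the server. By default Wget stays on the starting host, so resources served from other domains are not fetched.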
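
If the goal is a copy you can browse offline, mirroring is commonly paired with two more options. This is a sketch rather than a recommended recipe, and `mirror_site` is a hypothetical wrapper name:

```shell
# mirror_site is a hypothetical wrapper name.
mirror_site() {
  # --mirror           recursion with timestamping and unlimited depth
  # --page-requisites  also fetch images, CSS, and other files needed to render pages
  # --convert-links    rewrite links so the local copy works offline
  wget --mirror --page-requisites --convert-links "$1"
}

# Usage: mirror_site http://example.com
```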

3. Set Output Filename

Use the -O option to save the downloaded file with a specified name:

wget -O new_name.zip http://example.com/file.zip
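
Passing `-` as the filename makes -O write to standard output, which lets you pipe a download into another command instead of saving it. A small sketch, with `peek_url` as a hypothetical helper that previews the first lines of a text resource:

```shell
# peek_url is a hypothetical helper: peek_url URL [LINES]
peek_url() {
  # -q silences progress output; -O - streams the body to stdout.
  wget -q -O - "$1" | head -n "${2:-5}"
}

# Usage: peek_url http://example.com/README.txt 10
```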

4. Download in the Background

To let Wget run in the background, use:

wget -b http://example.com/file.zip

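Wget detaches from the terminal, so you can keep working while the download proceeds. Unless you name a log file with -o, its output is written to a file called wget-log in the current directory.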

Advanced Features

Recursive Downloads

Wget’s recursive mode follows links from the starting page and downloads what it finds, up to a depth limit (5 levels by default).

wget -r -l 1 http://example.com

-r: Enables recursive downloading.
-l 1: Limits the recursion depth to 1.
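
When crawling a subdirectory of someone else's server, it is usually worth constraining the crawl and pacing the requests. The wrapper below is a sketch under those assumptions; `polite_crawl`, the depth of 2, and the one-second wait are illustrative choices:

```shell
# polite_crawl is a hypothetical wrapper name.
polite_crawl() {
  # --no-parent    never ascend above the starting directory
  # --wait=1       pause about one second between requests
  # --random-wait  vary that pause so requests are less regular
  wget -r -l 2 --no-parent --wait=1 --random-wait "$1"
}

# Usage: polite_crawl http://example.com/docs/
```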

Set User-Agent

Changing the User-Agent can be necessary for accessing certain sites:

wget --user-agent="Mozilla/5.0" http://example.com

Limit Download Speed

To limit the bandwidth used by Wget, you can use:

wget --limit-rate=200k http://example.com/file.zip

This restricts the download speed to 200 KB/s.
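
To get a feel for what a rate cap means in practice, you can estimate the minimum transfer time with shell arithmetic. This is a rough sketch that assumes 1 MB = 1024 KB and whole-number inputs; `estimate_seconds` is a hypothetical helper, and real downloads will take longer when the network is slower than the cap:

```shell
# estimate_seconds is a hypothetical helper: estimate_seconds SIZE_MB RATE_KB
estimate_seconds() {
  # Convert MB to KB, then divide by the rate cap in KB/s (integer division).
  echo $(( $1 * 1024 / $2 ))
}

estimate_seconds 100 200   # 100 MB at 200 KB/s: prints 512
```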

Handling Authentication

Basic Authentication

If you need to download files from a server that requires authentication, you can use:

wget --user=username --password=password http://example.com/file.zip
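
A password typed on the command line can end up in your shell history and be visible to other users in the process list. One alternative is to let Wget prompt for it with --ask-password. A minimal sketch, with `auth_get` as a hypothetical wrapper name:

```shell
# auth_get is a hypothetical wrapper: auth_get USERNAME URL
auth_get() {
  # --ask-password prompts interactively instead of taking the password as an argument.
  wget --user="$1" --ask-password "$2"
}

# Usage: auth_get username http://example.com/file.zip
```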

Cookies

Wget can also work with cookies to maintain a session:

wget --load-cookies cookies.txt http://example.com/file.zip

Make sure to export your browser cookies to a text file in the Netscape cookies.txt format before running this command.
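
Wget can also create the cookie file itself by submitting a login form. The sketch below assumes a site with a simple POST login; the login URL, the file `login_form.txt` (containing something like `user=NAME&password=SECRET`), and both function names are hypothetical, and real sites often need different field names or extra tokens:

```shell
# session_login and session_get are hypothetical helper names.
session_login() {
  # Post the form data from a file and store the resulting cookies,
  # including session cookies that would otherwise be discarded.
  wget --save-cookies cookies.txt --keep-session-cookies \
       --post-file=login_form.txt -O /dev/null "$1"
}

session_get() {
  # Reuse the stored cookies for a later request.
  wget --load-cookies cookies.txt "$1"
}

# Usage:
#   session_login http://example.com/login
#   session_get   http://example.com/file.zip
```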

Practical Examples

Example 1: Downloading All Images from a Webpage

wget -r -l 1 -A jpeg,jpg,bmp,gif,png http://example.com/gallery

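This command fetches the gallery page and downloads the files it links to or embeds whose names end in one of the listed extensions. The -A option takes a comma-separated accept list; files that do not match are not kept.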
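
By default a recursive download recreates the server's directory layout locally. If you would rather collect the images in a single folder, two more options help. A sketch, with `fetch_images` as a hypothetical wrapper and the target directory as a parameter:

```shell
# fetch_images is a hypothetical wrapper: fetch_images URL TARGET_DIR
fetch_images() {
  # -nd  do not recreate the remote directory hierarchy
  # -P   save everything under the given directory
  wget -r -l 1 -nd -P "$2" -A jpeg,jpg,bmp,gif,png "$1"
}

# Usage: fetch_images http://example.com/gallery images
```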

Example 2: Downloading Files with a Specific Pattern

```bash
wget -r -l 2 -A "*.pdf" http://example.com/documents
```