Mirroring a Website Using Wget


I use wget a lot to mirror websites. The command I use most often:

$ wget --mirror -p --convert-links -P ./YOUR-LOCAL-DIR WEBSITE-URL

--mirror : turn on options suitable for mirroring (recursion, time-stamping, infinite recursion depth).
-p : download all the files needed to properly display a given HTML page (images, stylesheets and so on).
--convert-links : after the download, convert the links in the documents for local viewing.
-P ./YOUR-LOCAL-DIR : save all the files and directories to the specified directory.
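
For repeated use, the mirroring command can be wrapped in a small shell function. This is only a sketch: mirror_site, the WGET override variable and the ./mirror default directory are names made up for illustration and are not part of wget.

```shell
#!/bin/sh
# Hypothetical helper; none of these names come from wget itself.
# WGET can be overridden (e.g. WGET=echo) to preview the arguments
# without downloading anything.
WGET="${WGET:-wget}"

mirror_site() {
    url="${1:-}"
    dest="${2:-./mirror}"   # default directory is an assumption of this sketch
    if [ -z "$url" ]; then
        echo "usage: mirror_site WEBSITE-URL [LOCAL-DIR]" >&2
        return 1
    fi
    # Same options as the command above:
    #   --mirror         recursion, time-stamping, infinite depth
    #   -p               page requisites (images, CSS, ...)
    #   --convert-links  rewrite links for local viewing
    #   -P               directory prefix for the saved files
    "$WGET" --mirror -p --convert-links -P "$dest" "$url"
}
```

Setting WGET=echo before calling mirror_site prints the arguments instead of starting a download, which is a quick way to check the command line first.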

Another option I often use is -np (or --no-parent).

$ wget --mirror -np --convert-links -P ./YOUR-LOCAL-DIR WEBSITE-URL

-np : do not ever ascend to the parent directory when retrieving recursively. It guarantees that only the files below a certain hierarchy will be downloaded. It's useful when you want to mirror only part of a site (not the entire site).
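
To make "below a certain hierarchy" concrete, here is a rough shell illustration of the rule. It is not how wget implements it; start_dir and is_below are hypothetical helper names, and the sketch assumes the URL has a path after the host name. It also shows why the trailing slash matters: for HTTP, wget treats http://host/docs/ as a directory, but treats the last component of http://host/docs as a file name.

```shell
#!/bin/sh
# Illustration only; this is not wget's actual implementation.

# Print the "starting directory" of a URL: everything up to and
# including the last slash.
start_dir() {
    printf '%s\n' "${1%/*}/"
}

# Succeed if the candidate URL ($2) lies at or below the starting
# directory of the start URL ($1).
is_below() {
    base=$(start_dir "$1")
    case "$2" in
        "$base"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

With a start URL of https://example.com/docs/guide/ the sketch accepts https://example.com/docs/guide/ch1.html and rejects https://example.com/docs/other.html. Drop the trailing slash and the starting directory becomes https://example.com/docs/, so other.html would be accepted as well.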

If you need to mirror a website through a network proxy, first set http_proxy to point to the proxy, then pass your username and password to wget, like so:

$ export http_proxy=http://proxyaddress:port/
$ wget --proxy-user=your_username --proxy-password=your_password --mirror -p --convert-links -P ./YOUR-LOCAL-DIR WEBSITE-URL
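
The proxy variable only reaches wget if it is part of wget's environment. One way to get that without a separate export is to write the assignment in front of the command itself, as in the sketch below. The names mirror_via_proxy, WGET and PROXY_* are hypothetical, the proxy address and credentials are placeholders, and setting https_proxy alongside http_proxy is my own addition for HTTPS sites.

```shell
#!/bin/sh
# Sketch only: the function and variable names are made up for this
# example, and the values below are placeholders to be replaced.
WGET="${WGET:-wget}"
PROXY_URL="${PROXY_URL:-http://proxyaddress:port/}"
PROXY_USER="${PROXY_USER:-your_username}"
PROXY_PASSWORD="${PROXY_PASSWORD:-your_password}"

mirror_via_proxy() {
    url="$1"
    dest="${2:-./mirror}"   # default directory is an assumption
    # Assignments written directly before a command are placed in that
    # command's environment, so wget sees them without an export.
    http_proxy="$PROXY_URL" https_proxy="$PROXY_URL" \
        "$WGET" --proxy-user="$PROXY_USER" --proxy-password="$PROXY_PASSWORD" \
        --mirror -p --convert-links -P "$dest" "$url"
}
```

Keep in mind that a password given on the command line can be seen by other users on the same machine while wget is running.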

The Ultimate Wget Download Guide With 15 Awesome Examples | via TheGeekStuff


