How to download a website from web.archive.org on Windows or Linux
Introduction
There is a lot of software for downloading websites so that you can browse them offline. This can be useful for keeping a piece of documentation or other interesting content. But if the site is offline and only available on web.archive.org, downloading it can be tricky. Some companies can help you download web.archive.org content, but their services are not free! Fortunately, a nice project named wayback-machine-downloader, written by hartator, was released on GitHub. It is available here: https://github.com/hartator/wayback-machine-downloader.
Process
Please note that wayback-machine-downloader runs natively only on Linux.
Windows
If you are on Linux, please go directly to the next section.
You need a Linux Bash emulator to run wayback-machine-downloader. Personally, I advise you to get Cygwin Portable. You can get it from this link.
- Install Cygwin Portable in C:\CygwinPortable.
- Launch CygwinPortable.exe as Administrator.
- Right-click the Cygwin icon and click Cygwin Setup.
- Install Ruby and Git.
- Right-click the Cygwin icon and click Open Bash (C:).
Linux
- If you are on Linux, download and install ruby, rubygems and git.
- Open a Bash shell.
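Whichever system you use, it can help to confirm that the tools are reachable from the Bash shell before going further. The following is only a minimal sketch: `check_tools` is a hypothetical helper name, not part of wayback-machine-downloader, and it assumes the commands are called `ruby`, `gem` and `git` on your system.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent tool and returns non-zero if any is absent.
check_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}

# The three tools this guide relies on (names assumed, adjust if yours differ).
if check_tools ruby gem git; then
    echo "all prerequisites found"
fi
```

If a tool is reported as missing, install it with Cygwin Setup on Windows or with your distribution's package manager on Linux, then run the check again.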
Download on web.archive.org
    git clone https://github.com/hartator/wayback-machine-downloader
    cd wayback-machine-downloader
    gem install wayback_machine_downloader
    cd bin
Then, to download http://www.exemple.org:
    ./wayback_machine_downloader http://www.exemple.org --concurrency 20
For example, with satanion.dk (it was a very good Counter-Strike website):
    ./wayback_machine_downloader http://www.satanion.dk --concurrency 20
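If you have several sites to recover, the same command can be wrapped in a small loop. This is only a sketch under a few assumptions: `fetch_site`, `DOWNLOADER`, `CONCURRENCY` and `DRY_RUN` are hypothetical names introduced here for illustration, and the script is meant to be run from the bin directory used above. By default it only prints the commands, so you can check them before downloading anything.

```shell
#!/usr/bin/env bash
# Sketch: download several archived sites in turn with the command from this guide.
# All variable and function names below are illustrative, not part of the tool.

DOWNLOADER="${DOWNLOADER:-./wayback_machine_downloader}"  # path used in this guide
CONCURRENCY="${CONCURRENCY:-20}"                          # same value as in this guide
DRY_RUN="${DRY_RUN:-1}"                                   # 1 = only print the command

fetch_site() {
    url="$1"
    if [ "$DRY_RUN" = "1" ]; then
        # Show what would be run without touching the network.
        echo "$DOWNLOADER $url --concurrency $CONCURRENCY"
    else
        "$DOWNLOADER" "$url" --concurrency "$CONCURRENCY"
    fi
}

# Replace this list with the sites you want to recover.
for site in http://www.satanion.dk http://www.exemple.org; do
    fetch_site "$site"
done
```

Run it with `DRY_RUN=0` once the printed commands look right. A lower `CONCURRENCY` value makes the download slower but is gentler on web.archive.org.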
And that's it!
Thanks to hartator for this very good tool!