Copying a website with HTTrack


Extreme World (November 2002)

Project name: extremeworld
Web address (URL): www.extremeworld.com
Estimated time: 1 hour (56k modem)
In the scan rules, add:
-*.rm (to exclude RealPlayer movies)

Read the page about Web Spider Traps;
then, in the Spider tab, select
"no robots.txt rules".
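
If you prefer the command-line version of HTTrack to the WinHTTrack interface, the same settings can be passed as options. The following is a minimal sketch driven from Python; the output folder is an assumption, and the exact option spellings (-O for the output path, -s0 / --robots=0 to ignore robots.txt) should be checked against httrack --help for your version.

    # Minimal sketch, assuming the command-line httrack is installed and on PATH.
    # The output folder below is hypothetical.
    import subprocess

    subprocess.run(
        [
            "httrack",
            "http://www.extremeworld.com/",
            "-O", "mirrors/extremeworld",  # where the mirror is written
            "-*.rm",                       # scan rule: exclude RealPlayer movies
            "-s0",                         # do not follow robots.txt rules
        ],
        check=True,
    )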

Problems:

  1. Missing images and videos because of robots.txt rules
  2. JavaScript image galleries

Other examples with similar difficulties: Alton Towers | Herberton | Martin Luther King 2004 | Martin Luther King 2002 | Recycling | Canobie

Solutions:

  1. If the option "no robots.txt rules" has not been selected, most of the images and video files will be missing at the end of the capture.
    Select "no robots.txt rules", then use "Continue interrupted download" and connect again to finish the capture.
  2. The image galleries only show the first photograph: a JavaScript routine controls the manual slideshow, so HTTrack never requests the other images.
    Visit the site online and load all the images by stepping through every gallery.
  3. The downloaded gallery images should then be in the browser cache (Temporary Internet Files).
    Copy all the images of all the slideshows (here the bungee jump gallery) into the capture folders, then delete the "[1]" suffix in the file names (a script for this step is sketched after this list).
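
Below is a minimal Python sketch of step 3, assuming the gallery images are .jpg files sitting in the browser cache; both paths are hypothetical and must be pointed at your own Temporary Internet Files folder and capture folder.

    # Minimal sketch: copy the cached gallery images into the mirror and strip
    # the "[1]" suffix that the browser cache appends to file names.
    # Both paths are hypothetical; adjust them to your system.
    import re
    import shutil
    from pathlib import Path

    cache_dir = Path(r"C:\Documents and Settings\user\Local Settings\Temporary Internet Files")
    mirror_dir = Path(r"C:\My Web Sites\extremeworld\www.extremeworld.com\images")

    mirror_dir.mkdir(parents=True, exist_ok=True)
    for cached in cache_dir.rglob("*.jpg"):  # the cache keeps files in subfolders
        clean_name = re.sub(r"\[\d+\]", "", cached.name)  # "bungee03[1].jpg" -> "bungee03.jpg"
        shutil.copy2(cached, mirror_dir / clean_name)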

Now, you can browse the mirror offline.
