A website copy with HTTrack


Tested with WinHTTrack Website Copier 3.33-beta-2 (+swf)

Gulliver's Theme Park Warrington July 2004

project name: gulliversfun
Web Addresses (URL):
www.gulliversfun.co.uk/warrington.htm
tick: Attempt to detect all links
amount of time: 20 minutes (56k modem)
in the scan rules, add:
+*.png +*.gif +*.jpg +*.css +*.js
-*.doc -*.pdf -*.zip
-*/miltonkeynes/*
-*/matlockbath/*
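
To see how these scan rules interact, here is a minimal Python sketch. It is not HTTrack's own code. It assumes that when several rules match a link, the last matching rule decides, and it approximates HTTrack's wildcards with Python's fnmatch module. The decide function and the sample paths in the usage note are made up for illustration.

```python
from fnmatch import fnmatchcase

# Scan rules from this project, in the order they were entered.
RULES = [
    "+*.png", "+*.gif", "+*.jpg", "+*.css", "+*.js",
    "-*.doc", "-*.pdf", "-*.zip",
    "-*/miltonkeynes/*",
    "-*/matlockbath/*",
]

def decide(url, rules=RULES):
    """Return True (include), False (exclude) or None (no rule matched).

    Assumption: the last matching rule wins. Matching is done
    case-insensitively, which is also an assumption of this sketch.
    """
    verdict = None
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url.lower(), pattern.lower()):
            verdict = (sign == "+")
    return verdict
```

Under these assumptions, decide("www.gulliversfun.co.uk/miltonkeynes/map.png") gives False: the image matches +*.png, but the later -*/miltonkeynes/* rule overrides it. A link that matches no rule gives None, meaning the other mirror settings decide.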

problems:

The navigation bar is written in Flash, and the pages it links to are missing from the mirror.

Other examples with similar difficulties: Cedarpoint | The Engine Room | Discovery Cove | Ratanga | Wild Waters Park | Rapids Water Park | Camelot Theme Park

solutions:

At the end of the capture, some links in the navigation bar at the bottom of the page point to missing files.
The Flash file causing the problem, war_menu.swf, is in the folder gulliversfun/www.gulliversfun.co.uk/anim.
A tool such as SWFRIP or Macromedia's swf2html.exe lets us find the missing addresses and files.
Copy the SWF file from the mirror (war_menu.swf) into a folder and launch the following command:
gulliver

Then open (or edit) the generated HTML file (war_menu.html).

These links were not parsed during the capture.
The browser message shown when clicking a broken link tells us that they belong in the folder www.gulliversfun.co.uk/warrington.
The missing files are times.htm, groups.htm, events.htm, contact.htm and shopin.htm.
We therefore add the following to the Web Addresses (URL) to mirror:

http://www.gulliversfun.co.uk/warrington/times.htm
http://www.gulliversfun.co.uk/warrington/groups.htm
http://www.gulliversfun.co.uk/warrington/events.htm
http://www.gulliversfun.co.uk/warrington/contact.htm
http://www.gulliversfun.co.uk/warrington/shopin.htm
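
For a menu with many entries, the addresses above can be built from the text of the generated HTML file instead of being typed by hand. The Python sketch below shows one way to do it. The function name missing_page_urls and the sample text are made up for illustration, and the real layout of war_menu.html may differ. The sketch only assumes that the page names appear somewhere in the text as plain *.htm names.

```python
import re

# Folder the broken links point to, as reported by the browser.
BASE = "http://www.gulliversfun.co.uk/warrington/"

def missing_page_urls(report_text, base=BASE):
    """Collect distinct *.htm names from the text and prefix them with base."""
    names = []
    for name in re.findall(r"[\w-]+\.htm\b", report_text):
        if name not in names:  # keep first occurrence, preserve order
            names.append(name)
    return [base + name for name in names]

# Hypothetical excerpt; the actual tool output is not reproduced here.
sample = 'getURL("times.htm") getURL("groups.htm") getURL("times.htm")'
for url in missing_page_urls(sample):
    print(url)
```

The printed lines can be pasted directly into the Web Addresses (URL) box.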

We restart the mirror, and all the links now work.
The site's many PDF, ZIP and DOC files can also be downloaded by changing the corresponding scan rules (-*.doc -*.pdf -*.zip).

To mirror the Milton Keynes and Matlock Bath sites as well, remove -*/miltonkeynes/* and -*/matlockbath/* from the scan rules and add

http://www.gulliversfun.co.uk/miltonkeynes.htm
http://www.gulliversfun.co.uk/miltonkeynes/times.htm
http://www.gulliversfun.co.uk/miltonkeynes/groups.htm
http://www.gulliversfun.co.uk/miltonkeynes/events.htm
http://www.gulliversfun.co.uk/miltonkeynes/contact.htm
http://www.gulliversfun.co.uk/miltonkeynes/shopin.htm
http://www.gulliversfun.co.uk/matlockbath.htm
http://www.gulliversfun.co.uk/matlockbath/times.htm
http://www.gulliversfun.co.uk/matlockbath/groups.htm
http://www.gulliversfun.co.uk/matlockbath/events.htm
http://www.gulliversfun.co.uk/matlockbath/contact.htm
http://www.gulliversfun.co.uk/matlockbath/shopin.htm

to the Web Addresses (URL).
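
Since every park uses the same five page names, both changes (the extra addresses and the trimmed scan rules) follow a fixed pattern. A short Python sketch of that pattern follows. The helper names park_urls and rules_without_exclusions are hypothetical, and nothing here talks to HTTrack itself. It only prepares the text to paste into the project options.

```python
ROOT = "http://www.gulliversfun.co.uk/"
PAGES = ["times.htm", "groups.htm", "events.htm", "contact.htm", "shopin.htm"]

def park_urls(park):
    """Start page of a park followed by the five pages its Flash menu links to."""
    return [f"{ROOT}{park}.htm"] + [f"{ROOT}{park}/{page}" for page in PAGES]

def rules_without_exclusions(rules, parks):
    """Drop the -*/<park>/* exclusions for the parks that should be mirrored."""
    drop = {f"-*/{park}/*" for park in parks}
    return [rule for rule in rules if rule not in drop]

parks = ("miltonkeynes", "matlockbath")
rules = ["-*.doc", "-*.pdf", "-*.zip", "-*/miltonkeynes/*", "-*/matlockbath/*"]

print(" ".join(rules_without_exclusions(rules, parks)))
for park in parks:
    print("\n".join(park_urls(park)))
```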
