Copying a website with HTTrack


Tested with WinHTTrack Website Copier version 3.30-RC-5 (+swf)

Kakadu National Park, June 2003

project name: Kakadu
Web (URL) addresses:
http://www.pbs.org/edens/kakadu/
http://members.ozemail.com.au/~eparker/kakadu/kakadu.html
http://www.atn.com.au/nt/north/kakadu.htm
http://www.ea.gov.au/parks/kakadu/
http://www.northernterritory.com/northernterritory/index.cfm?attributes.fuseaction=MainFrame&id=Kakadu
estimated time: 3 hours (over a 56k modem)
in the scan rules, add:
+*.png +*.gif +*.jpg +*.css +*.js
-*.hqx
-*.exe -*.zip -*.doc so as not to download executables, archives, and very big Word files
-*.pdf if you don't need them; there are many.
+*/kakadu.html to mirror all the external links about the park (especially www.wcmc.org.uk/protected_areas/data/wh/kakadu.html)
in the Limits tab, set the maximum size of non-HTML files to 100000 or 200000 bytes; the command-line sketch below combines these settings.
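
For reference, the same project can be expressed with the command-line version of HTTrack. This is only a sketch: the output directory /mirrors/Kakadu is a placeholder, and -m200000 is HTTrack's switch for the maximum size of a non-HTML file, matching the Limits setting above.

    httrack "http://www.pbs.org/edens/kakadu/" \
            "http://members.ozemail.com.au/~eparker/kakadu/kakadu.html" \
            "http://www.atn.com.au/nt/north/kakadu.htm" \
            "http://www.ea.gov.au/parks/kakadu/" \
            "http://www.northernterritory.com/northernterritory/index.cfm?attributes.fuseaction=MainFrame&id=Kakadu" \
            -O "/mirrors/Kakadu" \
            "+*.png" "+*.gif" "+*.jpg" "+*.css" "+*.js" \
            "-*.hqx" "-*.exe" "-*.zip" "-*.doc" "-*.pdf" \
            "+*/kakadu.html" \
            -m200000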

problem:

Huge site

Other examples with similar difficulties: Marian High | Areaparks

solution:

The limits listed above allow you to mirror the site session by session.
You can modify them to add big files. To avoid overloading the servers, set limits on the maximum transfer rate and the maximum number of connections per second, as in the sketch below.
You can also add Web addresses or scan rules if you find interesting internal or external links, but set a site size limit if you don't want to capture the whole Web.
You can browse any captured Web address, since WinHTTrack creates an index file for the project.
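
On the command line, this throttling maps onto real HTTrack switches; the figures below are examples, not recommendations:

    # update the existing mirror, limited to 25,000 bytes/s,
    # 2 connections per second, and about 100 MB downloaded overall
    httrack --update -O "/mirrors/Kakadu" -A25000 -%c2 -M100000000

When a session finishes, opening index.html at the top of the project folder gives a clickable list of every address captured so far.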