Copying a website with HTTrack
Tested with version WinHTTrack Website Copier 3.30-RC-5 (+swf)
Kakadu National Park June 2003
project name: KakaduWeb
Web (URL) addresses:
http://www.pbs.org/edens/kakadu/
http://members.ozemail.com.au/~eparker/kakadu/kakadu.html
http://www.atn.com.au/nt/north/kakadu.htm
http://www.ea.gov.au/parks/kakadu/
http://www.northernterritory.com/northernterritory/index.cfm?attributes.fuseaction=MainFrame&id=Kakadu
amount of time: 3 hours (56k modem)
in the scan rules, add:
+*.png +*.gif +*.jpg +*.css +*.js
-*.hqx
-*.exe -*.zip -*.doc so as not to download very big Word files
-*.pdf if you don't need them; there are many
+*/kakadu.html to mirror all the external links about the park (especially www.wcmc.org.uk/protected_areas/data/wh/kakadu.html)
in the Limits tab, set the maximum size of non-HTML files to 100000 or 200000 bytes.
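For reference, the same project can be sketched from the command line. This is only a sketch assuming the httrack 3.x CLI (-O sets the output project, -m the maximum size of non-HTML files); check `httrack --help` on your version before running it.

```shell
# Sketch only: command-line equivalent of the WinHTTrack settings above.
# Flags assume the httrack 3.x CLI; verify against `httrack --help`.
httrack \
  "http://www.pbs.org/edens/kakadu/" \
  "http://members.ozemail.com.au/~eparker/kakadu/kakadu.html" \
  "http://www.atn.com.au/nt/north/kakadu.htm" \
  "http://www.ea.gov.au/parks/kakadu/" \
  -O "KakaduWeb" \
  "+*.png" "+*.gif" "+*.jpg" "+*.css" "+*.js" \
  "-*.hqx" "-*.exe" "-*.zip" "-*.doc" "-*.pdf" \
  "+*/kakadu.html" \
  -m100000
```

Quoting each scan rule keeps the shell from expanding the wildcards.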
problem:
Huge site
Other examples with similar difficulties: Marian High | Areaparks
solution:
The limits listed above allow you to mirror the site session by session; you can relax them later to add big files. To avoid overloading the servers, set limits on the maximum transfer rate and the maximum number of connections per second.
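On the command line, those throttling limits map to flags along these lines (again a sketch for httrack 3.x, where -A caps the transfer rate in bytes per second and -%c the connections per second; confirm with `httrack --help`):

```shell
# Sketch: resume the mirror politely in a later session.
# --continue resumes an interrupted mirror in the existing project;
# -A caps bandwidth (bytes/s), -%c caps new connections per second.
httrack --continue -O "KakaduWeb" -A25000 -%c2
```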
You can also add Web addresses or scan rules if you find some interesting internal or external links, but set a site size limit if you don't want to end up capturing the whole Web.
You can browse any captured Web address, as WinHTTrack builds an index file for the project.