A website copy with HTTrack


Areaparks October 2002

project name: areaparks
Web (URL) address: www.areaparks.com
do not tick: DOS names or ISO9660 names
amount of time: 10 hours (56k modem)
in the scan rules, add:
+*[name].areaparks.com/*
To limit the size of the capture, also add these scan rules:
-ad.doubleclick.net/* -ad2.doubleclick.net/*
-aeraguides.webbanners.net/*
-*.exe -*.zip
-forums.*
-*/spyellow/* (this part of the site can be mirrored later)
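
For readers who prefer the command line to the wizard, the settings above can be sketched as a small Python script that only assembles and prints an httrack command. This is an illustration under assumptions, not the procedure used for the original capture: it assumes the command-line httrack program accepts the URL, the -O output option and the scan rules as trailing arguments; the output folder name "areaparks" is simply the project name reused; and the ad2.doubleclick.net rule is written with a trailing /* on the assumption that this was intended.

```python
import shlex

# Scan rules copied from the settings above.
# Assumption: the ad2.doubleclick.net rule is meant to end in "/*".
SCAN_RULES = [
    "+*[name].areaparks.com/*",
    "-ad.doubleclick.net/*",
    "-ad2.doubleclick.net/*",
    "-aeraguides.webbanners.net/*",
    "-*.exe",
    "-*.zip",
    "-forums.*",
    "-*/spyellow/*",
]


def build_httrack_command(url, output_dir, scan_rules):
    """Return the argument list for an httrack run (nothing is executed)."""
    return ["httrack", url, "-O", output_dir, *scan_rules]


# "areaparks" as output folder is a hypothetical choice (the project name).
command = build_httrack_command("http://www.areaparks.com/", "areaparks", SCAN_RULES)

# Print a shell-quoted version that can be reviewed before running it by hand.
print(shlex.join(command))
```

Printing the command instead of running it keeps the sketch harmless; the quoting also protects the * characters in the rules from the shell.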

problems:

The site carries many ads and is very large.

Other examples with similar difficulties: Marian High | Kakadu | Travel West

solutions:

At the end of the capture, use a search-and-replace tool (InfoRapid Search & Replace, for example) to find every file containing the string http:// and replace that string with #.
After this change, even when you are online, the mirror can no longer load missing pages or show ads.
When browsing the website offline, no external page is requested and no sponsor image is displayed, so you no longer have to click 4 or 5 times when changing pages.
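
If InfoRapid is not at hand, the same replacement can be sketched in Python. This is a substitute for the tool named above, not the method used for the original capture. The function name, the choice to touch only .html and .htm files, and the byte-level replacement (which avoids guessing each page's character encoding) are all assumptions of this sketch. Work on a backup copy of the mirror, since the files are rewritten in place.

```python
from pathlib import Path


def neutralize_absolute_links(mirror_dir, suffixes=(".html", ".htm")):
    """Replace every 'http://' with '#' in the HTML files under mirror_dir.

    Files are read and written as bytes so that no character encoding
    has to be guessed. Returns the number of files that were changed.
    """
    changed = 0
    for path in Path(mirror_dir).rglob("*"):
        # Skip directories and anything that is not an HTML page.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        original = path.read_bytes()
        patched = original.replace(b"http://", b"#")
        if patched != original:
            path.write_bytes(patched)
            changed += 1
    return changed


if __name__ == "__main__":
    # Hypothetical mirror folder; adjust to where the capture was saved.
    print(neutralize_absolute_links("areaparks"), "file(s) changed")
```

Running it a second time changes nothing, because no http:// string is left to replace.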

or

Use a search-and-replace tool (InfoRapid Search & Replace, for example) to find every file containing the string src="http:// and replace it with src="#http://. Then replace window.open with windowopen, or with any other name the JavaScript interpreter does not recognize.
When browsing, no ad is displayed and you no longer have to click 4 or 5 times when changing pages.
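
The effect of these two replacements on a page can be illustrated with a short Python sketch. The function name and the sample HTML are made up for the illustration; only the two search/replace pairs come from the text above.

```python
def disable_ads_and_popups(html):
    """Apply the two replacements described above to one page's HTML."""
    replacements = [
        # Remote images and scripts no longer resolve to a real address.
        ('src="http://', 'src="#http://'),
        # The pop-up call becomes a name the JavaScript interpreter
        # does not know, so no window is opened.
        ("window.open", "windowopen"),
    ]
    for old, new in replacements:
        html = html.replace(old, new)
    return html


# Made-up sample page fragment with one remote banner and one pop-up link.
sample = (
    '<img src="http://ad.doubleclick.net/banner.gif">'
    '<a href="#" onclick="window.open(\'popup.html\')">Map</a>'
)
print(disable_ads_and_popups(sample))
```

Unlike the first solution, ordinary href="http://..." links are left alone, so external links still work when you are online.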