Website Mirrors

This page links to a few examples of site copies that had problems. Once the browser identity has been set, the default options give good results with most mirrors; if your problem is not listed below, have a look at the HTTrack help page. These issues can be caused by a recent visit, by limitations of WinHTTrack (or other website copiers), by the design of the site, or by authors wanting to protect their work, and they can sometimes be fixed.
When a site is protected, you can try again later. Because search engines rely on robots to index pages, protections and limitations are often removed.

When a site can be captured, the solutions listed here work with WinHTTrack 3.22 (and newer versions), but they may also be usable with Webzip, Memoweb, WebCopier...
The examples should help you determine whether a website using PHP (Personal Home Page or Hypertext Preprocessor - .php), Perl (.pl), CGI (Common Gateway Interface - .cgi), ColdFusion (.cfm), Active Server Pages (.asp), Java (.class), JavaScript (.js), CSS (.css or .htc) and Flash (.swf and .dir) can be mirrored.
Java applets | Menus | Links | Javascript | Flash | Corrupted files | Missing files | Extension | Huge site | Ad banners | Browser | Web Spider Traps | PHP (asp, cfm)

As I couldn't find out how to use the Netscape, Konqueror or Mozilla cache, I used the Internet Explorer cache to get the MIME type or to fetch missing, corrupted or filtered files, since MSIE stores all files by default in the Temporary Internet Files folder.
Since 30/09/2007, however, the utility MozillaCacheView or the add-on CacheViewer can be used, as they let you read and extract the files in the Firefox or Mozilla cache.

Attention: Before replacing any string in the mirror, make a copy. This copy will let you update the mirror later or undo your changes. Do not modify the cache (folder hts-cache for WinHTTrack, file local.web for Memoweb, etc.).

Java applet

Examples: Adventure City | Herberton | Firstenergy
- When setting options, do not tick DOS names or ISO9660 names. If you do, .class files will be saved with a .clas extension and will not be executed. You can fix extension problems by renaming the files and using a utility such as InfoRapid Search and Replace.
- Use the Internet Explorer cache if the applet only shows a gray box.
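The renaming step can also be scripted. The following is only a minimal Python sketch, assuming the mirror is an ordinary folder on disk; the function name restore_class_extensions is made up for this example and is not part of WinHTTrack. It renames the files only, so any HTML page that refers to the .clas name still has to be corrected with a search and replace utility.

```python
from pathlib import Path


def restore_class_extensions(mirror_root):
    """Rename every *.clas file under mirror_root back to *.class.

    Returns the list of new paths. An existing .class file is never overwritten.
    """
    renamed = []
    # sorted() builds the full list first, so renaming does not disturb the walk.
    for path in sorted(Path(mirror_root).rglob("*.clas")):
        if not path.is_file():
            continue
        target = path.with_suffix(".class")
        if target.exists():
            continue  # keep whatever is already there
        path.rename(target)
        renamed.append(target)
    return renamed
```

Run it on a copy of the mirror, for example restore_class_extensions("C:/My Web Sites/copy-of-mirror") (a hypothetical path), and check the returned list before touching the HTML files.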


Menus

Examples: Cedarpoint | The Engine Room | Gulliver's Theme Park
- Sometimes pull-down or scrolling lists that call .php, .asp or .cfm files do not work although all the necessary files are present. A JavaScript function and a few modifications will disable the "Submit" or "Reset" button, but choosing an option should then open the corresponding file.
- For each "option value", find the name of the file given by WinHTTrack, replace the value with that file name, and use the function described here after naming or renaming the form.
* JavaScript, Flash, PHP, ASP and CFM often prevent the capture of a whole site.

Absolute links

Examples: Adventure Planet | Martin Luther King 2004
- With Active Server Pages (or .php or .cfm), some links stay absolute even if the corresponding html files are in the local site. First try to update the capture without aborting it. If that doesn't work, any find and replace utility can fix it: find all the files containing the string .asp" (or .php" or .cfm") and change the absolute link throughout the mirror wherever a file with the same name (name.asp, .php or .cfm) but with an html extension is on the hard disk.
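The find and replace step described above can be sketched in Python. This is only an illustration under several assumptions: mirror_root is the folder that corresponds to the root of the site, site_url (here the placeholder http://www.example.com) is the prefix of the absolute links, and links carrying a query string are left alone. The function name relink_dynamic_pages is invented for this example. Work on a copy of the mirror.

```python
import os
import re
from pathlib import Path

DYNAMIC_EXTENSIONS = ("asp", "php", "cfm")


def relink_dynamic_pages(mirror_root, site_url):
    """Turn absolute .asp/.php/.cfm links into relative .html links.

    A link is rewritten only if a file with the same name and an .html
    extension exists in the mirror. Returns the list of modified pages.
    """
    mirror_root = Path(mirror_root)
    base = site_url.rstrip("/") + "/"
    # Quoted absolute URL without query string, ending in a dynamic extension.
    pattern = re.compile(
        r'(["\'])' + re.escape(base) + r'([^"\'?#\s]+)\.(?:'
        + "|".join(DYNAMIC_EXTENSIONS) + r')\1',
        re.IGNORECASE,
    )
    modified = []
    for page in sorted(mirror_root.rglob("*.html")):
        # latin-1 maps every byte to one character, so the file round-trips
        # unchanged apart from the links that are rewritten.
        text = page.read_bytes().decode("latin-1")

        def replace(match, page=page):
            quote, remote_path = match.group(1), match.group(2)
            local = mirror_root / (remote_path + ".html")
            if not local.is_file():
                return match.group(0)  # no local copy: keep the absolute link
            link = os.path.relpath(local, start=page.parent)
            return quote + link.replace(os.sep, "/") + quote

        new_text = pattern.sub(replace, text)
        if new_text != text:
            page.write_bytes(new_text.encode("latin-1"))
            modified.append(page)
    return modified
```

Links whose local file is missing stay absolute, so the pages that still need a manual fix can be found afterwards by searching for the site address.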


Javascript

Examples: Alton Towers | Herberton | Extreme World | Martin Luther King 2004
- For pop-ups or computed file names you can modify functions or use Temporary Internet Files.
- When a capture fails because the website has not been visited recently, or fails on another PC, you can use the Internet Explorer cache.
- For menus, add or modify functions.
- For image galleries, use Temporary Internet Files.
- If external js (or css, htc) files are missing, add the file names to the Web Addresses to mirror. Convert absolute links into relative links (if your version of HTTrack did not do it).
* JavaScript, Flash, PHP, ASP and CFM often prevent the capture of a whole site.


Flash

Examples: Discovery Cove | Alton Towers | Ratanga | The Engine Room | Wild Waters Park | Rapids Water Park | Gulliver's Theme Park | Camelot Theme Park
From version 3.21, WinHTTrack explores swf files. Download the latest version.
If files are still missing, here are a few methods:

1. Visit the missing pages. Then use the Internet Explorer cache (Temporary Internet Files) to find the missing files, copy them into the local site and delete the number in square brackets ([1]) that Internet Explorer adds to the file name.

2. After downloading a swf file, note whether the links shown in the address bar of your browser are absolute or relative.

If they are absolute and the Flash file is protected or compressed, the capture will be incomplete unless you can save the file uncompressed with SWFRIP.
If the links are relative, copy the names of the URLs called and add them to the scan rules.
If the animation calls .asp, .php or .cfm files which exist in the mirror with an html extension,
- copy the file and rename it with an asp, php or cfm extension,
- or write an HTML file with an asp, php or cfm extension that redirects to the mirror html file.

3. Download the utility swf2html.exe from the Macromedia website.
It extracts some php and html links (use option -s "txt|js|php|any_extension" between outputfilename and inputfilename). For asp or cfm, use method 1 and/or 4 if the links are not extracted.
The html file produced lists the links (command line: swf2html.exe -o outputfilename.html inputfilename.swf).

If they are relative, add them to the scan rules or Web addresses (with full path name). When the extension is not html or htm, use method 4 after the capture.
If they are absolute, add them and modify the links in the Flash file with method 4.
When the links do not exist (for example Ratanga), use method 1 or 2.

You can also extract the links with SWFRIP (the list is in file actions.txt) and use the same method.

4. A daredevil can use a hex-editor.

If the links are relative, open the swf file and search for .htm, .html, .asp, .php and .cfm. Then modify the extensions if the files are in the capture, or note the file names to add them to the scan rules.
If they are absolute, replace them with relative links if there is enough room (replace unwanted characters with spaces). Alternatively, you can point them to an html file that will redirect to the captured link.
- Optimized and protected .swf files do not show the links. If saving them uncompressed with SWFRIP does not make it possible to modify them, copy the .htm files of the capture and rename them with the extension found in the Internet Explorer cache. A bit of luck, and...
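Before opening a hex editor, the search for .htm, .html, .asp, .php and .cfm strings described in method 4 can be tried with a short script. The following Python sketch relies on the SWF header layout: a 3-byte signature, one version byte and a 4-byte length, with the rest of the file zlib-compressed when the signature is CWS. The function names are invented for this example. It only lists candidate strings and changes nothing; protected or obfuscated files may reveal no links, and some results may carry stray leading characters, so review the list by hand.

```python
import re
import zlib

# Runs of printable ASCII (no spaces) ending in one of the extensions of interest.
LINK_PATTERN = re.compile(rb"[\x21-\x7e]+\.(?:html?|asp|php|cfm)\b", re.IGNORECASE)


def read_swf_body(path):
    """Return the SWF data after the 8-byte header, decompressed if needed."""
    with open(path, "rb") as handle:
        data = handle.read()
    signature = data[:3]
    if signature == b"FWS":  # uncompressed file
        return data[8:]
    if signature == b"CWS":  # zlib-compressed from byte 8 onwards
        return zlib.decompress(data[8:])
    raise ValueError("unsupported or non-SWF file: %r" % signature)


def find_links_in_swf(path):
    """List the distinct link-like strings found in the SWF file, sorted."""
    body = read_swf_body(path)
    return sorted({m.group(0).decode("ascii") for m in LINK_PATTERN.finditer(body)})
```

Relative names found this way can be added to the scan rules, and absolute ones tell you which addresses the animation will try to reach.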


Corrupted files

Example: Discovery Cove
- Download them with DLExpert, for example, then copy them into the right folder.

Missing files

Examples: Martin Luther King 2002 | Martin Luther King 2004 | Recycling | Extreme World | Herberton | Canobie
- Download them with DLExpert, for example, then copy them into the right folder.
- For image galleries, use Temporary Internet Files.
- Select "no robots.txt rules".
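When files are recovered from Temporary Internet Files, Internet Explorer has usually added a number in square brackets to their names (logo[1].gif instead of logo.gif). The Python sketch below shows one way to copy such files into the mirror under their original names. It assumes you already know which cached files you need and where they belong; the function names are made up for this example.

```python
import re
import shutil
from pathlib import Path

# A bracketed number placed just before the extension, or at the very end.
_COUNTER = re.compile(r"\[\d+\](?=(\.[^.\[\]]+)?$)")


def original_name(cached_name):
    """Remove the [n] counter added to a cached file name."""
    return _COUNTER.sub("", cached_name, count=1)


def copy_from_cache(cached_files, destination):
    """Copy cached files into destination under their original names.

    Files that already exist in destination are left untouched.
    Returns the list of files actually copied.
    """
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in map(Path, cached_files):
        target = destination / original_name(source.name)
        if target.exists():
            continue
        shutil.copy2(source, target)
        copied.append(target)
    return copied
```

The cache itself is only read, never modified, which keeps it available if a file has to be fetched again.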


Extension

Examples: Adventure City | Yorba
- Sometimes, WinHTTrack downloads a file and gives it a wrong extension. Associating file types and MIME identities may cause more problems. For a few files, visit the web pages and check the file types in the Internet Explorer cache.
- Sometimes, authors give wrong extensions.
When images are concerned, an image viewer such as IrfanView will check and rename those files. Then a utility such as InfoRapid Search and Replace will edit the HTML files referring to these images.
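To get a quick list of the images whose extension does not match their content, the first bytes of each file can be compared with the well-known signatures of the common formats. The Python sketch below is not what IrfanView does internally; it is a simple check written for this page, with invented function names, and it only reports mismatches so that the files and the HTML pages referring to them can be renamed together.

```python
from pathlib import Path

# Leading bytes ("magic numbers") of common image formats.
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
    (b"BM", ".bmp"),
)
EQUIVALENT = {".jpeg": ".jpg", ".jpe": ".jpg"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".gif", ".bmp"}


def detect_image_extension(path):
    """Return the extension suggested by the file content, or None if unknown."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    for signature, extension in SIGNATURES:
        if header.startswith(signature):
            return extension
    return None


def find_misnamed_images(mirror_root):
    """List (path, suggested_extension) for images with a wrong extension."""
    mismatches = []
    for path in sorted(Path(mirror_root).rglob("*")):
        current = path.suffix.lower()
        current = EQUIVALENT.get(current, current)
        # Only files that already claim to be images are checked.
        if not path.is_file() or current not in IMAGE_EXTENSIONS:
            continue
        actual = detect_image_extension(path)
        if actual is not None and actual != current:
            mismatches.append((path, actual))
    return mismatches
```

Only files that already carry an image extension are inspected, which avoids false alarms on text files that happen to start with the same bytes.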

Huge site

Examples: Marian High | Areaparks | Kakadu
- You can download in multiple sessions, then merge them all by hand.
- Use filters.
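Merging several sessions by hand mostly means copying the folders of each session into one tree without overwriting anything. The Python sketch below illustrates that idea under a few assumptions: every session was saved in its own folder with the same internal layout, the hts-cache folder is deliberately left out of the merged copy, and files that exist in two sessions with different content are reported instead of replaced. The function name merge_sessions is invented for this example.

```python
import shutil
from pathlib import Path


def merge_sessions(session_dirs, merged_dir):
    """Copy the files of several session folders into merged_dir.

    The first session that provides a file wins. Returns the relative paths
    of files that a later session holds with different content.
    """
    merged_dir = Path(merged_dir)
    conflicts = []
    for session in map(Path, session_dirs):
        for source in sorted(session.rglob("*")):
            if not source.is_file():
                continue
            relative = source.relative_to(session)
            if relative.parts[0] == "hts-cache":
                continue  # the cache stays with its own session
            target = merged_dir / relative
            if target.exists():
                if target.read_bytes() != source.read_bytes():
                    conflicts.append(relative)
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)
    return conflicts
```

The returned list shows which pages still have to be compared and merged manually, typically index pages that were captured in more than one session.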

Banners, ads

Examples: Travel West | Areaparks
- Try setting Option / Structure / no external pages.
- There are a few methods to get rid of ads and banners blocking page display or forcing you to click when you want to read another page.

With javascript