A website copy with HTTrack
Martin Luther King June 2002
project name: MLK
Web (URL) address: http://seattletimes.nwsource.com/mlk/
amount of time: 1 hour (56k modem)
To limit the size of the capture, add the following to the scan rules:
-*.aiff
problems:
Ads and sound files
Other examples with similar difficulties: Martin Luther King 2004 | Recycling | Extreme World | Herberton | Canobie
solutions:
Missing files
At the end of the capture, find the files which were not downloaded and which interest you.
For example, if http://seattletimes.nwsource.com/mlk/sound/promised_resample.wav is missing:
1. Create the folder sound if it does not exist in the capture.
2. Using the URL, download the file (with dlexpert, for example), then copy it into the folder sound.
3. Edit the file your capture folder/project name/seattletimes.nwsource.com/mlk/man/MLKsound.html and replace A HREF = "http://seattletimes.nwsource.com/mlk/sound/dream_resample.wav" with A HREF = "sound/dream_resample.wav".
4. Do the same for the other missing files.
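The link edit in step 3 can also be scripted when many files are missing. The following is only a minimal sketch in Python, not part of HTTrack: the function name localize_links and its parameters are assumptions made for this illustration. It rewrites every HREF that starts with the remote sound folder so that it points to the local folder sound instead.

```python
import re


def localize_links(html: str, remote_prefix: str, local_prefix: str) -> str:
    """Rewrite HREF values that start with remote_prefix to use local_prefix.

    Hypothetical helper for illustration. It matches HREF in any letter case
    and tolerates spaces around the equals sign, as in: A HREF = "http://..."
    """
    pattern = re.compile(r'(href\s*=\s*")' + re.escape(remote_prefix), re.IGNORECASE)
    # A function is used as the replacement so local_prefix is inserted literally.
    return pattern.sub(lambda match: match.group(1) + local_prefix, html)


page = '<A HREF = "http://seattletimes.nwsource.com/mlk/sound/dream_resample.wav">Dream</A>'
fixed = localize_links(page, "http://seattletimes.nwsource.com/mlk/sound/", "sound/")
print(fixed)
```

To apply it to the capture, read MLKsound.html into a string, pass it through the function, and write the result back. Keep a copy of the original file first.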
Several links, such as the photograph of MLK on the home page, remained absolute.
1. Check the presence of the file in the capture.
a. If it is not there, connect and visit the page.
b. Copy the image into the capture folder corresponding to its properties (right-click, then Properties).
2. Modify the link in the HTML page (your capture folder/project name/seattletimes.nwsource.com/mlk/index.html for the photograph of MLK).
3. Do the same for the other absolute links.
The capture should be all right now.
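To find the links that remained absolute without opening every page by hand, a short script can list them. This is a sketch using Python's standard html.parser module. The names AbsoluteLinkFinder and find_absolute_links are hypothetical, chosen for this example only.

```python
from html.parser import HTMLParser


class AbsoluteLinkFinder(HTMLParser):
    """Collect href and src attributes whose value is an absolute http(s) URL."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        for name, value in attrs:
            if name in ("href", "src") and value:
                if value.lower().startswith(("http://", "https://")):
                    self.links.append((tag, name, value))


def find_absolute_links(html: str):
    """Return a list of (tag, attribute, url) for each absolute link in html."""
    finder = AbsoluteLinkFinder()
    finder.feed(html)
    return finder.links
```

Run find_absolute_links on the text of index.html, for instance, and each entry it returns is a link to check and modify as in steps 1 to 3.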
Ads
Using a tool such as InfoRapid Search & Replace, find all the files containing the string src="http:// and replace it with src="#http://.
For this capture, the images of the site partners are in the folder ads.nwsource.com/Realmedia/ instead of seattletimes.nwsource.com/Realmedia/. As an alternative, you can replace src="http://seattletimes with src="../../ads, then add further ../ segments to src="../../ads in the sub-directories according to their depth. It is a less radical method, but it takes much more time.
When browsing, no ad is displayed and you don't have to click when changing a page.
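If no search-and-replace tool is at hand, the first (radical) replacement can be sketched in Python. This is an illustration under assumptions, not the method used for this capture: the function names are hypothetical, and only files ending in .html are processed. Files are handled as bytes so that their encoding and line endings are left untouched.

```python
from pathlib import Path


def disable_remote_src(data: bytes) -> bytes:
    """Replace src="http:// with src="#http:// so remote images no longer load."""
    return data.replace(b'src="http://', b'src="#http://')


def disable_ads_in_capture(root: str) -> int:
    """Apply the replacement to every .html file under root.

    Returns the number of files that were modified.
    """
    changed = 0
    for path in Path(root).rglob("*.html"):
        original = path.read_bytes()
        updated = disable_remote_src(original)
        if updated != original:
            path.write_bytes(updated)
            changed += 1
    return changed
```

Call disable_ads_in_capture with the path of the capture folder. The match is case-sensitive, like the literal string above, and running it a second time changes nothing because src="#http:// no longer contains the searched string.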
Quiz
The Quiz can be captured but not checked offline.
By visiting the site you can load the HTML page with the answers.
Save this page and link it if necessary.