I've just had the distinct pleasure of blocking a nitwit who was using SiteSucker to snarf a whole copy of my site.
Let me explain in bold, since this may be hard to grasp for the sort of people with limited mental resources who use this sort of tool: mirroring my site is a futile exercise, since you will get tens of thousands of pages consisting of every revision of the Wiki, CVS, etc., 90% of which are useless.
Besides, you are wasting my bandwidth and making my feeble server's load average shoot through the roof (it was up at 70.5 when I finally managed to log in).
That person's IP address is 220.127.116.11 (which is part of RCN's assigned IP block), and I have just sent e-mail to abuse(at)rcn.com concerning this. I know they also have proxy server logs, because the initial address I blocked had a reverse DNS record that named it as a proxy cache.
Right now, the whole of RCN's address space is blocked off as well. A bit excessive, I know, but utterly effective.
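For the curious, here is a minimal sketch of what "blocking a whole address space" boils down to, using Python's standard ipaddress module. It is not necessarily how the block is done on this server. The ranges are documentation placeholders (RFC 5737), not RCN's real allocations, and the names BLOCKED_NETWORKS and is_blocked are made up for illustration.

```python
import ipaddress

# Placeholder ranges taken from the RFC 5737 documentation blocks.
# These are NOT any provider's real allocations; substitute your own.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Not a parseable IP address; this sketch treats it as not blocked.
        return False
    return any(address in network for network in BLOCKED_NETWORKS)
```

A request handler could call is_blocked(remote_addr) and refuse to answer when it returns True. Blocking by network rather than by single address is the blunt part: everyone else in those ranges is shut out too.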
On the Internet today, masquerading as a dog is a futile exercise. But being an ass is a dead giveaway.