The Opaqueness Of Proxies

I just spent a fascinating half an hour running Ethereal on both ends of a connection to my site to try and figure out why NewsFire wasn't getting partial RSS content, and ended up comparing raw HTTP requests of all sorts.

If you've just tuned in, I coded RSS "diffs" based on the If-Modified-Since header just this weekend, and I went around testing it with all sorts of RSS readers. So far, only NewsFire had trouble with it, and, as it turned out, only from Netcabo.
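For anyone wondering what an RSS "diff" amounts to, here is a minimal sketch in Python. It is not the code actually running on this site: the `items_since` function and the `updated` key are made up for illustration, and it assumes each feed item carries a timezone-aware timestamp. The idea is simply to parse the If-Modified-Since header and only return items newer than that date, falling back to the full feed when the header is missing or unparseable.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def items_since(items, if_modified_since):
    """Return only the items updated after the If-Modified-Since date.

    `items` is assumed to be a list of dicts with a timezone-aware
    `updated` datetime (a hypothetical structure, for illustration only).
    """
    if not if_modified_since:
        return list(items)  # no header: send the full feed
    try:
        cutoff = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return list(items)  # unparseable header: send the full feed
    if cutoff.tzinfo is None:
        # Treat a date without zone information as UTC (an assumption)
        cutoff = cutoff.replace(tzinfo=timezone.utc)
    return [item for item in items if item["updated"] > cutoff]
```

Note that a header dated in the future (relative to the newest item) makes this return an empty list, so a client would see no new content at all.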

I swapped a couple of short e-mails with David Watanabe to confirm he used If-Modified-Since, tried logging some extra environment variables to figure out what was happening at the server side, and after a while I came to the conclusion that (unsurprisingly) some HTTP requests did not reach my server. And when they did, they sometimes (but not always) came from a different IP address than mine, with a set of Via: headers denoting the request was being performed on my behalf.
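The sort of server-side logging described above could look roughly like this, assuming a CGI-style environment where request headers show up as `HTTP_*` variables. The function names are hypothetical, and checking `HTTP_X_FORWARDED_FOR` is my own addition, since not every proxy bothers to announce itself with Via:.

```python
def proxy_hints(environ):
    """Collect the request variables worth logging when a proxy is suspected."""
    names = (
        "REMOTE_ADDR",             # address the request actually came from
        "HTTP_VIA",                # Via: header added by intermediaries
        "HTTP_X_FORWARDED_FOR",    # non-standard, but widely used by proxies
        "HTTP_IF_MODIFIED_SINCE",  # the header that was getting mangled
        "HTTP_USER_AGENT",
    )
    return {name: environ[name] for name in names if name in environ}


def looks_proxied(environ):
    """True if the request carries any header that suggests an intermediary."""
    return "HTTP_VIA" in environ or "HTTP_X_FORWARDED_FOR" in environ
```

In a CGI script this would be called with `os.environ`. A transparent proxy that adds no headers at all would of course slip past `looks_proxied`, which is why comparing `REMOTE_ADDR` against your own address is still worth doing by hand.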

And, guess what? The If-Modified-Since time was nothing like what I had sent - it was almost exactly one hour in the future. And it comes and goes of its own accord, although it seems to have stopped doing funky time travelling since I got NewsFire 0.23 on this Mac, which presumably messes up some internal hashing and makes it appear to be a "different" client to the damn proxy - that's as good a theory as any, since there are no differences in the HTTP requests other than the new User-Agent.

I'm too damn tired to go through the aggravation of filtering the packet dumps, saving them as text and doing a diff to show this "time warp" effect, but I'll make a note to do this with tethereal and a little script later - if I can sync traces at both ends and compare them automatically, it will be pretty damn useful for some tests at work too.
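As a rough idea of what that little script might look like, here is a sketch that takes two text dumps (one per end of the connection) and lists the If-Modified-Since values that differ. It assumes the dumps have already been saved as plain text containing the raw HTTP headers, and it naively pairs requests in order of appearance, which will go wrong as soon as a request never reaches the server. The function names are hypothetical.

```python
import re

# Match an If-Modified-Since header line, tolerating a trailing CR
IMS_RE = re.compile(
    r"^If-Modified-Since:[ \t]*(.+?)[ \t]*\r?$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_ims(dump_text):
    """Return every If-Modified-Since value found in a text dump, in order."""
    return IMS_RE.findall(dump_text)


def mismatches(client_dump, server_dump):
    """Pair header values in order and return the (sent, seen) pairs that differ."""
    pairs = zip(extract_ims(client_dump), extract_ims(server_dump))
    return [(sent, seen) for sent, seen in pairs if sent != seen]
```

A sturdier version would need some way to match requests across both traces (timestamps, or a marker in the URL) rather than trusting the ordering.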

I also won't bother calling up customer support for this - their standard tack is to:

  • Initially deny knowledge of any such thing as a transparent proxy.
  • Forward the call to "engineering" (lowercase 'e' written on purpose), who try to gauge your technical skills before changing from denial to saying it causes no problems whatsoever.
  • Blatantly deny it's activated at all until you prove to them it's there.
  • Say it will be fixed momentarily.

Oh, and in case you're not Portuguese - there is no other cable provider (for all practical purposes, these guys have a monopoly, and a regulator-stalled one at that), this one is owned by the incumbent telco, and they are a rampant commercial success.

They are not, however, a company I would recommend dealing with technically at this point. And I won't name the cache vendor - after all, they are helping an ISP break the Eleventh Commandment, and don't deserve the publicity.

I wouldn't put it past them for this to be their bug, though.