Thursday, April 10, 2008

Why there is room for doubt about Warman and the Cools post

Updated at the bottom.

In Sept. 2003, someone posted an ugly racist slur about Senator Anne Cools at the racist freedomsite.com. The poster's IP was recorded by the site's software and subsequently recovered and released by the site's owner, Marc Lemire: 66.185.84.204, which has the hostname wc09.mtnk.rnc.net.cable.rogers.com.

This very hostname reports important information that has been overlooked. First, the "wc" in wc09.mtnk abbreviates web-cache; this IP is web-cache no. 9 of a bank of caching proxy servers on Newkirk Road, Richmond Hill: mtnk presumably abbreviates Metro Newkirk. (Rogers has an office there.) Web-caching proxies are used by Internet companies such as Rogers to save bandwidth -- copies of web-pages are stored on Rogers' proxy, and only if a page is not stored there does traffic get forwarded to a website; when the traffic gets there, however, the IP that is logged is that of the proxy, not of the original requester.
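To make the reading of the hostname concrete, here is a minimal sketch that splits it into its parts. The meanings given to "wc" and "mtnk" are the interpretation proposed in this post, not anything Rogers has confirmed, and the function and table names are made up for illustration.

```python
import re

# Assumed label meanings, following the interpretation proposed in this post.
ROLE_NAMES = {"wc": "web-cache"}
SITE_NAMES = {"mtnk": "Newkirk (presumed)"}


def parse_proxy_hostname(hostname):
    """Split a hostname such as wc09.mtnk.rnc.net.cable.rogers.com into parts.

    Returns None if the first label is not letters followed by digits.
    """
    labels = hostname.lower().split(".")
    match = re.fullmatch(r"([a-z]+)(\d+)", labels[0])
    if match is None or len(labels) < 2:
        return None
    role, number = match.group(1), int(match.group(2))
    site = labels[1]
    return {
        "role": ROLE_NAMES.get(role, role),   # fall back to the raw label
        "number": number,
        "site": SITE_NAMES.get(site, site),
    }


info = parse_proxy_hostname("wc09.mtnk.rnc.net.cable.rogers.com")
print(info)
```

Labels that are not in the two lookup tables are returned unchanged, so the sketch does not pretend to know what they mean.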

66.185.84.204 is one of 42 such proxies in the Rogers stable, and they were (are?) arranged in three banks of fourteen at Wolfedale in Mississauga (abbreviated to wlfdle), York Mills in Toronto (ym), and Newkirk in Richmond Hill (mtnk) (see here). Rogers routed all its traffic through these servers -- or at least the traffic of all users who used the proxying function. (An old Rogers FAQ, quoted here, encourages its use, but also gives instructions on how to turn it off if it isn't helping.)


That 66.185.84.204 was a web-caching proxy explains a lot. It explains, for example, why it is so easy to find the IP being used in these months: dozens of different users of this IP can be identified for every month, including September, October, and November 2003, when the Cools post was made. For your own curiosity, take your own IP (it is listed in the widget to the right) and google it. Do you see any traces of your own surfing? I can't find any of mine. Why are hundreds of instances of 66.185.84.204 so easy to find? Because as a proxy it was through-putting hundreds or thousands or tens of thousands of times more traffic than an IP assigned to an individual subscriber.
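The effect can be shown with a toy model. This is only a sketch of how a caching proxy behaves in general, not a description of Rogers' actual software; the class names and the subscriber addresses are invented for illustration, and real proxies may add identifying headers that this model ignores.

```python
PROXY_IP = "66.185.84.204"  # the proxy address discussed in this post


class Website:
    """An origin site that logs the source IP of every request it receives."""

    def __init__(self):
        self.access_log = []

    def handle(self, source_ip, path):
        self.access_log.append((source_ip, path))
        return f"content of {path}"


class CachingProxy:
    """Toy web cache: serves stored pages, forwards the rest under its own IP."""

    def __init__(self, ip, website):
        self.ip = ip
        self.website = website
        self.cache = {}

    def get(self, subscriber_ip, path):
        # subscriber_ip is deliberately never passed on to the website.
        if path in self.cache:
            return self.cache[path]  # cache hit: the site never sees the visit
        page = self.website.handle(self.ip, path)  # the site logs the proxy's IP
        self.cache[path] = page
        return page

    def post(self, subscriber_ip, path):
        # A forum post cannot be answered from cache, so it always reaches the site.
        return self.website.handle(self.ip, path)


site = Website()
proxy = CachingProxy(PROXY_IP, site)

# Hypothetical subscriber addresses (private range, for illustration only).
subscribers = [f"10.0.0.{n}" for n in range(1, 6)]
for ip in subscribers:
    proxy.get(ip, "/index.html")
    proxy.post(ip, "/forum/new-post")

logged_ips = {ip for ip, _ in site.access_log}
print(len(site.access_log), logged_ips)
```

Five different subscribers produce six log entries at the website (one page fetch that missed the cache, plus five posts), and every one of them carries the proxy's address.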

This also explains why we find such a wide variety of individuals using this IP in the months around the Cools post. Some of them could read Vietnamese (here and here), Tamil (here), Korean (here), and Danish (here). And they seem to come from a wide variety of locations in Ontario: London (here and here), Waterloo (here), Aurora (here), and (as we know from the present controversy) Ottawa. Again, this is because it's a proxy -- and these are all areas served by Rogers and so any traffic from them might be routed through one of Rogers' proxies.

That 66.185.84.204 is a proxy is clear, as is the fact that there seem to be only 42 such proxies to serve all of Rogers' customers (they're listed here).

So, what does this mean about the Cools poster? One might be tempted to divide the number of Rogers customers by the 42 proxies to come up with a notional pool in which the Cools poster resided. It seems clear, however, that individual subscribers are not tied to a single proxy, or indeed to a single bank of proxies. I began this investigation after noting that IPs in the 66.185.84.xxx-range often shift quickly back and forth, a phenomenon that I tried to map out with the string-ball to the left (see here). These shifting IPs, however, belong to two of the banks of web-caching proxies (the wlfdle and mtnk), and the shifting of proxies is the necessary and natural result of Rogers' effort to achieve "load balancing", by which traffic to and from Rogers servers was rerouted to keep traffic moving efficiently (see here).

Although the proxies can change, they do not seem to be assigned randomly, either. The IPs associated with the prolific editor of numerous Thomas the Tank Engine articles in Wikipedia (here, here, and here) seem to suggest that a Rogers subscriber had a "home" proxy; that he might be shifted by Rogers to another proxy for load-balancing, only soon to return to his proxy "home".
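One way such a pattern could arise is sketched below. This is purely a guess at the kind of rule that would produce a "home" proxy with temporary shifts; nothing here is known about how Rogers actually balanced load, and the proxy names, the capacity figure, and the hashing rule are all invented for illustration.

```python
import zlib

# One bank of fourteen proxies; the names are placeholders, not Rogers' real list.
BANK = [f"proxy-{n:02d}" for n in range(1, 15)]
CAPACITY = 2  # hypothetical: how many requests a proxy takes before it is "busy"


def home_proxy(subscriber_id):
    """Pick a stable 'home' proxy index from a hash of the subscriber id."""
    return zlib.crc32(subscriber_id.encode()) % len(BANK)


def assign(subscriber_id, load):
    """Use the home proxy if it has room, otherwise the next one that does."""
    home = home_proxy(subscriber_id)
    for step in range(len(BANK)):
        candidate = (home + step) % len(BANK)
        if load[candidate] < CAPACITY:
            return candidate
    return home  # every proxy is busy; stay at home


load = [0] * len(BANK)
home = home_proxy("subscriber-a")

quiet = assign("subscriber-a", load)   # home proxy has room
load[home] = CAPACITY                  # home proxy becomes busy
busy = assign("subscriber-a", load)    # shifted to a neighbour
load[home] = 0                         # load drops again
after = assign("subscriber-a", load)   # back at the home proxy

print(BANK[quiet], BANK[busy], BANK[after])
```

Under this assumed rule a subscriber's posts would mostly show one address, with occasional brief appearances under a neighbouring one, which is at least consistent with the back-and-forth shifting described above.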

So, how many potential Rogers customers might have made that racist Cools post? Probably all of them. As we have seen (here), the proxies are not geographically limited; each seems to be able to serve all areas of the province.

Now, according to this, Rogers had 800,000 internet subscribers in March 2004. The same link states that 90% of cable subscribers are in Ontario, which implies a pool of about 700,000.
The Cools poster could be almost any one of them.
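The arithmetic behind that figure is short enough to write out. The sketch assumes, as the paragraph above does, that the Ontario share of cable subscribers also applies to internet subscribers; the variable names are mine.

```python
subscribers_total = 800_000  # Rogers internet subscribers, March 2004 (figure cited above)
ontario_share = 0.90         # share of cable subscribers in Ontario (same source)

ontario_pool = round(subscribers_total * ontario_share)
print(ontario_pool)  # 720000, rounded in this post to "about 700,000"
```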

Update. The best previous attempt to explain this matter from a technical perspective was that of Lance at Catprint in the Mash (his work was copied, pasted, and embraced at FreeDominion). He has now withdrawn his explanation in favour of mine here, and in the comments to this post.

Update 2. This post establishes that 66.185.84.204 was a widely used proxy, thereby removing the circumstantial case against Warman. For a discussion of evidence that shows that it is someone else, see Why Warman is probably innocent and Why Warman is probably innocent, part 2: Rogers Hi-Speed Internet.

Update 3. I have now acquired copies of actual logs that Klatt used in his testimony, and these prove that 90sAREover's computer was different from the one Warman used: see here.

Comments are open, but please see my Comment Policy. Comments that fail those standards will be rejected.

22 comments:

Mark Francis said...

Given how reliable your work is, Warman seems to be in a good position.

One also has to wonder about the veracity of the source of the information. There's no evidence that I have seen so far that can't be simply typed up.

Somena Woman said...

Hey Buckets.

Thanks for doing all this work. It seems a pity that people are working themselves up into a frothy frenzy over the news that Warman has sued, and very few cool heads are prevailing. I've taken a look at the writ that Ezra has posted on his blog in PDF format. I've made several posts in the last 48 hours about the technical aspects of law that I observe in this process.

I think that it is far from likely that the defendants will win in this case. I have not decided which side of the fence I am on... The water has been muddied so much.

Your diligent research is greatly appreciated (at least by me) for adding real facts to the situation instead of wild speculation.

buckets said...

Mark. There are some details in Klatt's testimony that seem to me to point towards the basic reliability of the information. He mentions, for example, a shift back and forth between 66.185.84.200 and 204. In light of what I've posted here over the past weeks, that is believable, and not something that he could make up, even if he were inclined to do so.

buckets said...

sw. nice to hear from you. keep well.

James Bow said...

Good work.

You know, if you're interested in joining the Blogging Alliance of Non-Partisan Canadians, you're most welcome to.

Dr.Dawg said...

I've linked you over at my place, thanks to MWW who pointed out your heroic labours. Fine work!

bigcitylib said...

Lance also thinks you've got it pegged:

http://www.catprint.ca/blog/blog/misc/meaculpa.html

lance said...

Buckets, excellent work.

I've made a retraction on my original post regarding my mistaken assumption of a DHCP assigned IP and made a new post linking to this site.

Good on you.

Cheers,
lance

Ottawa Watch said...

So is this why, years ago when I had Rogers, I could host a (trilobite) website on my own computer using my IP number?
Canada's defamation laws clearly favor the plaintiff. That's why the bloggers had to be so careful with their allegations. They weren't.

erasmus said...

There is room to doubt. Sure.

But there are a few things not in the analysis here. The fact is, none of these extra people behind this one IP has been proved to have posted on that particular message board, or on any blog, anything that resembles what 90sareover posted or where he posted it. Thus, I think it might be a bit trickier than saying it was a common IP.

Further, the meta details of browser, OS type, etc. are the same. And using "Mozilla 4" in 2003. This could get a bit tricky for a judge to sort out.

I am not suggesting anyone wrote it or not. But there is more evidence than merely that there was a large pool of people here.

The legal test is not certainty; it is balance of probabilities.

Mark Francis said...

The libel laws here need reforming, but with people seemingly abusing free speech like this, it's going to get harder to make the case for reforms.

*sigh*

Mark Francis said...

Erasmus,

That evidence is still too dilute to matter. That the IP is rarely found on FD proves nothing. The IP comes from a huge pool of possibilities: even after factoring in reductions for the other criteria, according to Lance's original math, 12.1% of system configs out there at that time match the one in question, leaving 84,700 possibilities out of the pool of 700,000.

As for comparing posts to determine author based upon style... good luck.

Clearly, with this information, under a balance of probabilities, Warman is in a good position. To nail him, you would have to trace so close to him that probability would state that it is less likely to be anyone else.

I know of one online libel case where a defendant claimed to not be the author. The IP was traced to a router. There were three people on the other side of the router who could have written it. That was a problem for the plaintiff, until enough evidence was found to show that the other two were less likely than the defendant. What was critical was that the IP was at least tied to a location. That is not the case here.

During the time in question, I was doing posts to FD using Rogers with the same system config as with the Cools post. I did not write the Cools post or anything like it, but there are others I did write which may show up under the same IP. If so, and given that Warman has supposedly admitted doing the Lucy posts, and given that we both deny writing the Cools post, please, prove which one of us did it. Before you answer, throw in the other possible 84,000 systems involved we know nothing about.

See the problem?

erasmus said...

All I am saying is that a judge will weigh other factors besides just a pool of people. I have no doubt that people from that IP address posted on FD. Sure, there may be a pool of 70,000. Only Rogers knows for sure.

But the question is not one of certainty, as I feel your arguments seem geared towards, but of a balance of probabilities, which is what a judge will look at. All it takes is a judge to say that it was more probable than not. Just given the surrounding facts, even including your data, I think a judge could go either way on it. That's all.

jaycurrie said...

You raise a good point, Buckets. As I pointed out in my post linking you, the Statement(s) of Defense are going to have to come to grips with what you are saying.

However, I am a bit confused about one matter: cache servers, as I understand it, store the pages users access "locally" so that the lag time can be reduced.

Thus, if I commonly go to SDA or the Lying Jackal Kinsella's pages a copy of those pages will be stored on the cache server.

What I don't quite get is how, if my understanding is correct, the cache status of the server matters one way or another in terms of Warman's alleged activities.

Assume for the moment that Warman had (as we know he had) accessed Stormfront before. The cache server saved the page. The alleged Warman computer was hot to post a little screed on Senator Cools and called the page. It got the page from the cache server and then posted the filth. Which was passed on from the cache server.

Nothing abnormal about the tech. And nothing in the least bit exculpatory as to the post.

But, as I say, I am not that tech savvy or network knowledgeable.

bigcitylib said...

Buckets, I know you've mentioned the OS/Browser combination thing occasionally, but how does that fit in?

Specifically, how would it be possible to gather from the record on Stormfront that the computer issuing requests THROUGH the proxy had x or y config?

Would this information not be about the configuration of the proxy?

buckets said...

Erasmus. Yes, the judge will look at other factors, including the number of other people who Rogers was sending through this IP. But is there any good reason other than the IP-link to believe this is Warman?

buckets said...

Jay. Thanks for your comment. I'll answer the technical side first on these cache servers. Say you were a Rogers customer. If your computer was set up to use the proxying function (and not all were), and you sent your browser to Kinsella's page, your request would go down the pipe to the server, and it would look to see if it had the page or parts of the page in its cache. If it was there, it'd send you the page. Kinsella would never know that you'd visited.

If the page was not in its cache, the proxy would send the request on to Kinsella, but now the headers are rewritten so that it looks like the request was coming from the server instead of from you. (If you're interested in a better explanation and a concrete example of how Rogers' proxies rewrite the header, see here).

buckets said...

Jay. Continuing. If Warman had posted that, the post might have been sent through 66.185.84.204. But if you read Klatt's testimony (which I've reproduced in this post), there were two IPs involved in the Cools post, 66.185.84.204 and 66.185.84.200. (This is a result of load balancing, which I discuss here.) And I've shown in multiple posts that these proxy IPs move around under people quite a bit (see here), something that the ball-of-string is an attempt to replicate. But if there are 10s of thousands of people using each proxy every day, and the proxy that forwards any specific post to a site is fluid

buckets said...

(hit post accidentally; picking up where I left off)
But if there are 10s of thousands of people using each proxy every day, and the proxy that forwards any specific post to a site is fluid, the culpatory value of someone having the same IP over a three month period is greatly diminished.

buckets said...

BigCity. I'll write up a post about this.

bigcitylib said...

Buckets,

I guess my point is, do we know those are the settings from the home machine (not, for example, the proxy). Although I imagine a proxy would NOT even be equipped with a browser and etc.

buckets said...

BCL. Yes, when the Rogers proxy rewrote the packet headers, the user-agent details (os, browser, etc.) were left unchanged.

Not all proxies do this. Compare the widget here with this proxy, which forwards user-agent details, and this one, which fakes them. (The latter two are free proxies, which are set up for cloaking traffic, not web caching.)