The hosts with very high hit rates appear to be HTTP proxy servers which are relaying requests from hosts behind them. Here's a dump of a packet from the host at the very top of the heavy hitters report:
There is, however, another point. To the right is a screen capture of the last line of the packet quoted above. It identifies the server software that is doing the proxying (see the red arrow): NetCache by Network Appliance (NetApp).
This is something that we see elsewhere. A Thai web directory, truehits.net, provides a variety of services for its customers, including daily traffic statistics that are available on the web. One reported statistic is the top proxy servers, and the same proxy server discussed by fourmilabs in January 2004 was listed by truehits as one of the top proxy servers to visit sunncity.com on April 19, 2003, December 23, 2004, and December 26, 2004.
The screen cap on the left is from April 19. It shows not only that the same server software is being used (see red arrow), but also that it is the exact same version (5.1.1R1D9) and, since it is set up to "Forward IP", that it has much the same configuration.
Now, according to the fourmilabs logs, the forwarding was being done for 24.153.59.84 (a Rogers IP). In the case of this truehits log, we can again identify the specific IP of the individual user, not merely the proxy. Note that the proxy had forwarded 63 hits (green arrow).
Among the top daily visitors (see right) is 24.102.79.243, which is marked here as NETBLK-RNS-EAST. (To judge from this, RNS abbreviates Rogers Network Services.) The number of hits is identical to the proxy's. (It is only to be expected, of course, that a small site in Thailand might see only one Rogers customer in a daily log.)
The important point for us is that Rogers was using this IP as a proxy for all of 2003 and 2004.
One final point. In most cases the proxy hides the individual IP, and the IP that appears in the logs is that of the proxy, not that of the individual Rogers customer. In the case of the fourmilabs logs, of course, Mr. Walker has fished out the individual IP for us. Normally, however, we don't see the individual's IP; what we do see are details about his browser and operating system (the "user agent"). Consider these two log entries for the same proxy (66.185.84.74) during the period when this proxy was in use:
- Tue 29-June-2004 08:07 - wc08.wlfdle.rnc.net.cable.rogers.com [=66.185.84.74, ed.] - "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" - "http://search.yahoo.com/"
- 66.185.84.74 - - [25/Sep/2004:09:59:00 +0200] "GET /downloads/hovtext/1.0/HovText.exe HTTP/1.1" 200 1136640 "http://hovklan.com/hovtext/index.php/?page=download&lang=en" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; FunWebProducts; SV1; .NET CLR 1.1.4322)"
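To make the comparison of user agents concrete, here is a minimal Python sketch that groups user-agent strings by the client address that appears in a log. It assumes the logs are in the common Apache "combined" format, as the second entry above appears to be. The function name `agents_by_host` and the regular expression are illustrative, not taken from any of the sites mentioned. The first sample line is the hovklan.com entry quoted above; the second is a hypothetical re-rendering of the first entry, in which the request, status, size and time zone are invented and only the address, date, referrer and user agent come from the original.

```python
import re
from collections import defaultdict

# Apache "combined" format (assumed):
# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def agents_by_host(lines):
    """Map each client host to the set of user-agent strings seen from it."""
    seen = defaultdict(set)
    for line in lines:
        match = COMBINED.match(line.strip())
        if match:  # silently skip lines that are not in combined format
            seen[match.group("host")].add(match.group("agent"))
    return dict(seen)

sample = [
    # Entry quoted in the text above.
    '66.185.84.74 - - [25/Sep/2004:09:59:00 +0200] '
    '"GET /downloads/hovtext/1.0/HovText.exe HTTP/1.1" 200 1136640 '
    '"http://hovklan.com/hovtext/index.php/?page=download&lang=en" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; FunWebProducts; '
    'SV1; .NET CLR 1.1.4322)"',
    # HYPOTHETICAL line: request, status, size and time zone are invented.
    '66.185.84.74 - - [29/Jun/2004:08:07:00 +0000] "GET / HTTP/1.1" 200 0 '
    '"http://search.yahoo.com/" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]

result = agents_by_host(sample)
for host, agents in result.items():
    print(host, len(agents), "distinct user agent(s)")
```

Several distinct user agents behind one address are consistent with a shared proxy, although a single person using more than one browser or machine would look the same, so the count is suggestive rather than conclusive.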