---------------
Christof Meerwald@blog.www


Mon Mar 09 20:59:14 2009 GMT: DNS Whitelisting for edge.cmeerw.net

Sun Mar 08 16:59:41 2009 GMT: Code-generation from Documents

Mon Mar 02 18:15:41 2009 GMT: PPP Packet Truncation by BT

While IPv6 initially appeared to work fine over my broadband connection, it turns out that I am affected by a bug in BT's network (which is being used by IDNet) that truncates small IPv6 packets, as described by Andrews & Arnold. :-(

Sun Mar 01 19:29:22 2009 GMT: Native IPv6

One nice feature of my new ISP IDNet is that they offer native IPv6 connectivity (in addition to IPv4) - I just had to enable it on my side (by adding "ipv6 ," to my pppd configuration) and it all appears to work fine. So, no more tunneling via SixXS to get IPv6.

Sat Feb 21 19:54:10 2009 GMT: Open Watcom 1.8

The Open Watcom project has released version 1.8 of its development suite today, and it is now available for download. This release includes a number of improvements in the C++ frontend, although there is still a long way to go before it catches up with the C++ standard (the full list of changes is available from the release notes).

Tue Feb 17 07:58:57 2009 GMT: msnbot turning evil

As an update to the previous entry, when I created the robots.txt file, I had hoped that msnbot would act accordingly. But what happened instead is really outrageous: the very second msnbot requested the robots.txt file, it simply changed its User-Agent header to no longer identify itself as msnbot, but continued requesting exactly the same pages at the same rate as before.

And yes, to be sure, I have checked DNS records and whois information for the offending IP addresses (65.55.51.34 and 65.55.51.37) to confirm that they really belong to Microsoft/MSN.
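
For anyone wanting to script the DNS part of such a check, here is a minimal sketch of a forward-confirmed reverse DNS lookup in Python. It is not what was actually run for this entry. The function name verify_crawler_ip and the ".search.msn.com" suffix are assumptions made for illustration; the suffix is only borrowed from the URL in the bot's User-Agent string. The whois check is not covered.

```python
import socket


def verify_crawler_ip(ip, allowed_suffixes,
                      reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check (hypothetical helper).

    1. Resolve the IP to a hostname (PTR lookup).
    2. Require the hostname to end in one of the allowed suffixes.
    3. Resolve that hostname back and require the original IP
       to be among its addresses.
    The two resolver callables are parameters so they can be
    replaced with fakes in tests.
    """
    try:
        hostname = reverse(ip)[0]
    except OSError:  # socket.herror / socket.gaierror derive from OSError
        return False

    hostname = hostname.lower().rstrip(".")
    if not hostname.endswith(tuple(allowed_suffixes)):
        return False

    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses


# Example call (performs real DNS lookups; the suffix is an assumption):
# verify_crawler_ip("65.55.51.37", [".search.msn.com"])
```

The forward lookup matters because anyone who controls the reverse zone for an address can make the PTR record claim an arbitrary hostname.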

So, this is how the User-Agent change looks in the Apache access_log:

65.55.51.37 - - [16/Feb/2009:15:28:38 -0800] "GET /index.php?title=Special:Recentchanges&hidebots=0&days=14&limit=100&feed=rss HTTP/1.1" 200 28455 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
65.55.51.37 - - [16/Feb/2009:15:30:10 -0800] "GET /robots.txt HTTP/1.1" 200 389 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
65.55.51.37 - - [16/Feb/2009:15:30:10 -0800] "GET /index.php?title=Special:Recentchanges&hideliu=0&hidebots=&feed=atom HTTP/1.1" 200 13171 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 1.1.4322; InfoPath.2; .NET CLR 3.5.21022; .NET CLR 3.0.30618;.NET CLR 3.5.30729;)"

65.55.51.34 - - [16/Feb/2009:15:28:25 -0800] "GET /index.php?title=Special:Recentchanges&hideanons=1&hideliu=1&hidemyself=1&feed=rss HTTP/1.1" 200 13174 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
65.55.51.34 - - [16/Feb/2009:15:28:27 -0800] "GET /robots.txt HTTP/1.1" 200 389 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
65.55.51.34 - - [16/Feb/2009:15:28:27 -0800] "GET /index.php?title=Special:Recentchanges&days=7&hidemyself=1&feed=atom HTTP/1.1" 200 13176 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 1.1.4322; InfoPath.2; .NET CLR 3.5.21022; .NET CLR 3.0.30618;.NET CLR 3.5.30729;)"
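
Spotting this pattern by eye gets tedious with 99 addresses, so here is a rough Python sketch that scans access_log lines in Apache's combined format and reports every client whose first request after fetching /robots.txt carries a different User-Agent. The names LOG_RE and find_agent_switches are made up for this example, and the two sample lines are copied from the log excerpt above.

```python
import re

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def find_agent_switches(lines):
    """Return (ip, old_agent, new_agent, path) tuples for each client
    whose first request after GET /robots.txt uses another User-Agent."""
    robots_agent = {}  # ip -> User-Agent used for the robots.txt request
    switches = []
    for line in lines:
        m = LOG_RE.match(line.strip())
        if not m:
            continue  # skip blank or unparseable lines
        ip, path, agent = m["ip"], m["path"], m["agent"]
        if path == "/robots.txt":
            robots_agent[ip] = agent
        elif ip in robots_agent:
            # Only the first request following robots.txt is compared.
            old = robots_agent.pop(ip)
            if agent != old:
                switches.append((ip, old, agent, path))
    return switches


SAMPLE = [
    '65.55.51.37 - - [16/Feb/2009:15:30:10 -0800] "GET /robots.txt HTTP/1.1" 200 389 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"',
    '65.55.51.37 - - [16/Feb/2009:15:30:10 -0800] "GET /index.php?title=Special:Recentchanges&hideliu=0&hidebots=&feed=atom HTTP/1.1" 200 13171 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 1.1.4322; InfoPath.2; .NET CLR 3.5.21022; .NET CLR 3.0.30618;.NET CLR 3.5.30729;)"',
]

if __name__ == "__main__":
    for ip, old, new, path in find_agent_switches(SAMPLE):
        print(f"{ip}: {old!r} -> {new!r} on {path}")
```

A switch reported by this sketch is only a hint: several clients behind one address would trigger it as well, so the timestamps and requested pages still need a look.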

Mon Feb 16 21:27:05 2009 GMT: msnbot considered harmful

Or should I say "msnbot's DDoS attack on the Open Watcom web server"? Peter Chapin noticed high server load on the Open Watcom server, so I took a look and found that msnbot was hitting the server hard. In fact, I counted hits from 99 different IP addresses associated with msnbot, and the bot appears to be extremely interested in every conceivable variation of the "recent changes" page on the wiki (which, of course, can be quite CPU intensive to generate).

As a first step, I have created a robots.txt file to tell msnbot to slow down (and prevent it from crawling the "recent changes" page). Hopefully, this will improve the situation in the next few hours, otherwise I will have to completely block msnbot from the server.
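
To illustrate what such rules mean for a well-behaved crawler, here is a small Python sketch using the standard urllib.robotparser module. The robots.txt content below is hypothetical (the real file is not reproduced in this entry), as are the 120-second delay and the example.org host name. Crawl-delay is a non-standard extension that a crawler may or may not honour.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: slow msnbot down and keep it away from the
# wiki's "recent changes" page; leave other crawlers unrestricted.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 120
Disallow: /index.php?title=Special:Recentchanges

User-agent: *
Disallow:
"""


def load_rules(text):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Placeholder host; only the path and query matter for rule matching.
RECENT = "http://www.example.org/index.php?title=Special:Recentchanges&feed=rss"
MAIN = "http://www.example.org/index.php?title=Main_Page"

if __name__ == "__main__":
    print("msnbot crawl delay:", rules.crawl_delay("msnbot"))
    print("msnbot may fetch recent changes:", rules.can_fetch("msnbot", RECENT))
    print("msnbot may fetch main page:", rules.can_fetch("msnbot", MAIN))
```

Note that the Disallow rule is a plain prefix match, so it only catches URLs where title=Special:Recentchanges is the first query parameter, as it is in the requests the bot was making.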

But one would really expect that a bot (or a network of bots) would be clever enough not to open lots of concurrent connections to a single server and automatically slow down a bit when the server takes a long time to respond to requests...
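
What I have in mind is something like the following Python sketch: at most one request in flight per host, and a pause before the next request that grows with the time the server took to answer. This is only an illustration under my own assumptions, not a description of how any real crawler works; the class name PoliteScheduler and the default values are invented for the example.

```python
class PoliteScheduler:
    """Per-host politeness bookkeeping for a crawler (illustrative only).

    Allows one request in flight per host and, after each response,
    waits load_factor times the observed response time, clamped
    between min_delay and max_delay seconds.
    """

    def __init__(self, min_delay=1.0, load_factor=10.0, max_delay=300.0):
        self.min_delay = min_delay
        self.load_factor = load_factor
        self.max_delay = max_delay
        self._next_allowed = {}  # host -> earliest time of next request
        self._in_flight = set()  # hosts with a request outstanding

    def may_fetch(self, host, now):
        """True if no request is outstanding and the pause has elapsed."""
        if host in self._in_flight:
            return False
        return now >= self._next_allowed.get(host, 0.0)

    def started(self, host):
        """Record that a request to host has been sent."""
        self._in_flight.add(host)

    def finished(self, host, now, response_time):
        """Record a completed request; return the pause that now applies."""
        self._in_flight.discard(host)
        delay = self.load_factor * response_time
        delay = min(self.max_delay, max(self.min_delay, delay))
        self._next_allowed[host] = now + delay
        return delay
```

With these defaults, a server that answers in 0.5 seconds is left alone for 5 seconds, while one that needs 60 seconds for a page gets the full 300-second break. The same scheduler would have to be shared by every machine crawling a given host, otherwise 99 polite bots still add up to one impolite crowd.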

Sat Feb 14 17:27:08 2009 GMT: Switched broadband connection to IDNet

Sun Feb 08 15:47:27 2009 GMT: Xref header filtering for newscache

Mon Feb 02 18:05:46 2009 GMT: Finally some snow around here

Sat Jan 31 10:23:16 2009 GMT: (Ab)Using OpenVPN

Sat Jan 31 10:05:03 2009 GMT: Twinkle 1.4 Bug

---------------

This Web page is licensed under the Creative Commons Attribution - NonCommercial - Share Alike License. Any use is subject to the Privacy Policy.

Revision: 1.14, cmeerw.org/blog/596.html
Last modified: Mon Sep 03 18:19:55 2018
Christof Meerwald <cmeerw@cmeerw.org>
XMPP: cmeerw@cmeerw.org