The Digg effect and what I could have done

Yesterday I received a lot of traffic, and "a lot" is still quite an understatement. By the time I realised where all this traffic was coming from it was already too late, and only a handful of cached pages withstood the enormous flow of visitors, at least for a while. I fell victim to the Digg effect, and sixteen thousand hits later I knew what I could have done to keep my blog alive.

In the evening my shared hosting provider sent me an email explaining that they thought a DoS attack was being directed at my page and that they had to move me to an isolated server. After a quick email exchange, in which I told them that it was in fact the Digg front page causing the DoS-like traffic, they put me back online.

There are a few things I could have done to prevent the site from going down. The best would probably be getting a root or managed server instead of a shared host, but that’s not a quick solution and, as I think this was an isolated case, not really necessary. The support team of my hosting provider gave me a great tip on how to get rid of all the incoming Digg traffic that was taking down my site: a short two-line addition to my .htaccess file.

RewriteCond %{HTTP_REFERER} ^http://(.+\.)?digg\.com/ [NC]
RewriteRule .* - [F]

These two lines will drop incoming traffic from Digg and show those visitors a 403 Forbidden response instead of forwarding them to my index.php, where a lot of load is produced by executing or accessing PHP files. Some people will assume the site is still down, some will just be confused about why they’re forbidden to access the site, some (knowing Digg) will shout censorship, and those who know what happened will simply enter the URL in their address bar and get in that way, as only traffic referred by Digg is blocked. Anyway, I thought there might be a more elegant way, which I didn’t use because I did not know what Wikipedia’s policy would say about it and, to be honest, I’m just too lazy to dig it up. It is possible to redirect incoming traffic from a specific source, in this case Digg, to a completely different URL, and one that would suit quite well is the Wikipedia entry about the Digg/Slashdot/etc. effect.

RewriteCond %{HTTP_REFERER} ^http://(.+\.)?digg\.com/ [NC]
RewriteRule .* http://en.wikipedia.org/wiki/Digg_effect [R=301,L]

Well, those two are what I could have done, but by the time I noticed what was happening most of the traffic had already arrived, and at that point there was nothing I could do.
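To get a feeling for which visitors the RewriteCond in both snippets actually catches, here is a small Python sketch that applies the same regular expression to a few made-up Referer values. The function name and the sample URLs are my own assumptions for illustration, and Python's re module only approximates Apache's regex engine, although for a pattern this simple the two should agree.

```python
import re

# Same pattern as in the RewriteCond lines; re.IGNORECASE stands in for [NC].
DIGG_REFERER = re.compile(r"^http://(.+\.)?digg\.com/", re.IGNORECASE)


def is_from_digg(referer: str) -> bool:
    """Return True if this Referer value would satisfy the RewriteCond."""
    return DIGG_REFERER.search(referer) is not None


# Hypothetical Referer values, not taken from real logs.
samples = [
    "http://digg.com/programming/some_story",  # plain digg.com
    "http://www.digg.com/",                    # subdomain
    "http://DIGG.com/news",                    # different case
    "https://digg.com/",                       # https is not covered by the pattern
    "http://notdigg.com/",                     # different domain
    "",                                        # URL typed into the address bar
]

for referer in samples:
    print(repr(referer), is_from_digg(referer))
```

The empty Referer in the last sample is the case of someone typing the URL by hand, which is why those visitors still get through. The pattern also only covers http:// referrers, so it would need adjusting if Digg links arrived over https.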

I think I was quite lucky that I had installed the WP Super Cache plugin for WordPress. It cached pages so that static HTML files were served instead of the PHP files being executed every single time, which reduced the load at least to some extent, but it clearly wasn’t enough, and the server struggled every time a new page had to be cached.

Is there anything else I could have done? I love the idea of a WordPress plug-in that puts those lines in my .htaccess whenever a set number of visitors from Digg or other social media sites try to access the page, though I don’t know if this kind of automation is even possible.
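The counting part of that idea is at least easy to sketch. The following is only an illustration in Python, not a WordPress plug-in (that would have to be PHP), and the class name, threshold and window size are all made up. It counts hits from one referrer in a sliding time window and reports when a chosen threshold is reached; what happens then, for example switching on the rewrite rules, is deliberately left out.

```python
from collections import deque


class ReferrerSurgeDetector:
    """Counts hits from a single referrer within a sliding time window."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._hits = deque()  # timestamps of recent hits, oldest first

    def record_hit(self, now: float) -> bool:
        """Record a hit at time `now` (in seconds).

        Returns True while the number of hits inside the window is at or
        above the threshold.
        """
        self._hits.append(now)
        # Forget hits that have fallen out of the window.
        while self._hits and self._hits[0] <= now - self.window_seconds:
            self._hits.popleft()
        return len(self._hits) >= self.threshold


# Hypothetical numbers: 3 hits within 60 seconds count as a surge.
detector = ReferrerSurgeDetector(threshold=3, window_seconds=60)
results = [detector.record_hit(t) for t in (0, 10, 20, 200)]
print(results)  # [False, False, True, False]
```

In practice the threshold would be far higher than three, and the timestamps would come from the request time rather than a fixed list.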

3 thoughts on “The Digg effect and what I could have done”

  1. Interesting post about the Digg effect. My advice is not to edit your .htaccess dynamically, because that means you have to have dodgy file permissions.

    Personally I’d redirect to the Google cache (if one has been generated), or strip everything from the page except content and adverts and redirect to that.

    WordPress generates a large number of server requests even when using wpcache, for example for JavaScript files and images.

    Adam @ zend.is-hacked.com

  2. @Adam: Thank you for your comment. Redirecting to the Google cache is a brilliant idea, even if, as you mentioned, it will only work if the cache has already been generated.

    I’m not sure about the .htaccess, however. Is there any risk if it is owned by the same user as WordPress (user in the Unix sense)? I can see that there is a problem with chmod 777, but chown shouldn’t make a difference, I think.

  3. I personally would use the following rewrite rules:

    <IfModule mod_rewrite.c>
     RewriteEngine On
     RewriteCond %{HTTP_USER_AGENT} !^CoralWebPrx
     RewriteCond %{QUERY_STRING} !(^|&)coral-no-serve$
     RewriteCond %{HTTP_COOKIE} heavyloaduser=true [OR]
     RewriteCond %{HTTP_REFERER} slashdot\.org [NC,OR]
     RewriteCond %{HTTP_REFERER} digg\.com [NC]
     RewriteRule ^(.*)$ http://%{HTTP_HOST}.nyud.net%{REQUEST_URI} [R,L,CO=heavyloaduser:true:%{HTTP_HOST}]
    </IfModule>

    This will redirect all traffic from Digg (and Slashdot) to coralcdn (http://www.coralcdn.org/). It is possible to keep the comments dynamic and uncached by adapting these rules slightly, adding in something like:

    RewriteCond %{REQUEST_URI}      !^/comment\.php

    I am unsure how WordPress handles comments, so this may or may not work.

    Hope this helps :)