<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>lukebaker.org &#187; Work</title>
	<atom:link href="http://lukebaker.org/archives/category/work/feed/" rel="self" type="application/rss+xml" />
	<link>http://lukebaker.org</link>
	<description>lukebaker.org</description>
	<lastBuildDate>Tue, 08 Mar 2011 22:39:20 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>How to stop an Ajax DDOS</title>
		<link>http://lukebaker.org/archives/2006/03/06/how-to-stop-an-ajax-ddos/</link>
		<comments>http://lukebaker.org/archives/2006/03/06/how-to-stop-an-ajax-ddos/#comments</comments>
		<pubDate>Tue, 07 Mar 2006 03:23:46 +0000</pubDate>
		<dc:creator>Luke</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Howto]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://lukebaker.org/archives/2006/03/06/how-to-stop-an-ajax-ddos/</guid>
		<description><![CDATA[The real title should be &#8220;How to stop a JavaScript or Flash distributed denial of service attack&#8221;, but that&#8217;s far too long for a title. As nifty client-side browser tricks are being propagated to the masses, they are also causing a few problems along the way. Say you&#8217;re writing a JavaScript or Flash enabled web [...]]]></description>
			<content:encoded><![CDATA[<p>The real title should be &#8220;How to stop a JavaScript or Flash distributed denial of service attack&#8221;, but that&#8217;s far too long for a title.  As nifty client-side browser tricks are being propagated to the masses, they are also causing a few problems along the way.</p>
<p>Say you&#8217;re writing a JavaScript or Flash enabled web application that has the browser send off HTTP requests occasionally.  What can you do when a bug in your application causes clients to send a constant stream of HTTP requests to your poor server?</p>
<p>I recently ran into this type of problem and had to figure out how to keep our servers from being overloaded by a number of clients that were sending them a constant stream of requests.  Lucky for us, the URLs being requested by these &#8220;rogue&#8221; clients were easily distinguished from &#8220;valid&#8221; requests.</p>
<p>The first step is finding some way to intercept the requests and treat just those requests differently.  In my case, these rogue requests were all being handled by a 404 page, which was a PHP script.  My first idea was to delay the response sent to these rogue clients by adding a sleep of a few minutes to this PHP page.  This worked as advertised, since the rogue clients waited until each request completed before sending another, but I was still looking for something a bit better.</p>
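<p>As a rough illustration of the delay idea, here is a minimal sketch in Python rather than the PHP I actually used.  The <code>ROGUE_PREFIX</code> value, the function names and the 180 second delay are all made up for the example; the only point is that rogue paths get a slow 404 while everything else is answered right away.</p>

```python
import time

# Hypothetical marker for the buggy clients' URLs; the real pattern
# depends on what the rogue requests look like in your logs.
ROGUE_PREFIX = "/old-widget/"


def is_rogue(path):
    """Return True when the path matches the known-bad URL pattern."""
    return path.startswith(ROGUE_PREFIX)


def handle_not_found(path, delay_seconds=180, sleep=time.sleep):
    """Answer a missing page, stalling first if the request is rogue.

    The sleep function is injectable so the handler can be exercised
    without actually waiting.
    """
    if is_rogue(path):
        sleep(delay_seconds)
    return 404, "Not Found"
```

<p>One thing to keep in mind with this approach: every stalled request still holds a connection open on the server for the length of the delay, so it only helps when the number of rogue clients is small.</p>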
<p>My second idea was to try altering the response for these requests to be a <a title="HTTP/1.1 Status Code Definitions" href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.2">&#8216;301 Moved Permanently&#8217;</a> response instead of a <a title="HTTP/1.1 Status Code Definitions" href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5">&#8216;404 Not Found&#8217;</a>.  My hope was that if I specified a location of, say, &#8216;http://localhost/&#8217;, all subsequent requests destined for these rogue URLs would instead be sent to the localhost, thereby no longer bothering our servers with these silly requests.  Unfortunately, this didn&#8217;t seem to work for me in my tests.</p>
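<p>For reference, the redirect attempt amounts to something like the following Python sketch.  The <code>ROGUE_PREFIX</code> value and the <code>respond</code> function are hypothetical names for the example, not the code I ran.</p>

```python
# Hypothetical marker for the buggy clients' URLs.
ROGUE_PREFIX = "/old-widget/"


def respond(path):
    """Return a (status, headers) pair for a page that does not exist."""
    if path.startswith(ROGUE_PREFIX):
        # Point rogue clients at their own machine.
        return 301, {"Location": "http://localhost/"}
    return 404, {}
```

<p>A client is allowed to remember a permanent redirect but is not required to, which may be why this made no difference in my tests.</p>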
<p>My final and satisfactory solution was to send some special cache headers in response to these rogue requests.  I added a few lines found in an <a href="http://www.php.net/header#61883">example in the ever-useful PHP documentation.</a>  These lines told the Flash script to cache the URL it had just requested for 30 days.  Subsequent requests for these URLs were then looked up in the local cache by the Flash script, stopping the steady stream of requests sent to our servers for these worthless URLs.  Success!</p>
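<p>The exact lines from the PHP documentation are not reproduced here, but the idea can be sketched in Python as below.  The <code>cache_headers</code> function is a made-up name, and the particular header values are an assumption about what a 30 day cache lifetime usually looks like.</p>

```python
from email.utils import formatdate

THIRTY_DAYS = 30 * 24 * 60 * 60  # 2592000 seconds


def cache_headers(now, max_age=THIRTY_DAYS):
    """Build headers asking the client to reuse this response for max_age seconds.

    now is a Unix timestamp, passed in so the result is reproducible.
    """
    return {
        "Cache-Control": "public, max-age=%d" % max_age,
        # HTTP dates are always expressed in GMT.
        "Expires": formatdate(now + max_age, usegmt=True),
    }
```

<p>In a real handler you would pass <code>time.time()</code> as <code>now</code> and attach the returned headers to the response for rogue URLs only.</p>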
]]></content:encoded>
			<wfw:commentRss>http://lukebaker.org/archives/2006/03/06/how-to-stop-an-ajax-ddos/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Slow Office Connection</title>
		<link>http://lukebaker.org/archives/2005/09/26/slow-office-connection/</link>
		<comments>http://lukebaker.org/archives/2005/09/26/slow-office-connection/#comments</comments>
		<pubDate>Mon, 26 Sep 2005 19:08:01 +0000</pubDate>
		<dc:creator>Luke</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://lukebaker.org/archives/2005/09/26/slow-office-connection/</guid>
		<description><![CDATA[Booo for slow office connections, or at least sharing the office connection with other people.]]></description>
			<content:encoded><![CDATA[<p>Booo for slow office connections, or at least sharing the office connection with other people.  <img src="/wordpress/wp-includes/js/tinymce/plugins/emotions/images/smiley-wink.gif" /></p>
<p><img width="596" height="32" src="/wordpress/wp-content/slowspeedcropped.png" /> </p>
]]></content:encoded>
			<wfw:commentRss>http://lukebaker.org/archives/2005/09/26/slow-office-connection/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Reading books</title>
		<link>http://lukebaker.org/archives/2005/04/15/reading-books/</link>
		<comments>http://lukebaker.org/archives/2005/04/15/reading-books/#comments</comments>
		<pubDate>Fri, 15 Apr 2005 12:56:41 +0000</pubDate>
		<dc:creator>Luke</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://lukebaker.org/archives/2005/04/15/reading-books/</guid>
		<description><![CDATA[Instead of having to buy Hackers and Painters, I can just read it online. At work everyone on the internet team got a subscription to Safari Tech Books Online. It is quite nice to have so many technical books at my disposal. I would have eaten this sort of thing up while in college. It&#8217;d [...]]]></description>
			<content:encoded><![CDATA[<p>Instead of having to buy <a href="http://paulgraham.com/hackpaint.html">Hackers and Painters</a>, I can just read it online.  At <a href="http://www.gospelcom.net/">work</a> everyone on the internet team got a subscription to <a href="http://search.safaribooksonline.com/">Safari Tech Books Online.</a>  It is quite nice to have so many technical books at my disposal.  I would have eaten this sort of thing up while in college.  It&#8217;d be sweet if Calvin provided a subscription to every CS student.  It&#8217;d be more worthwhile than those silly Microsoft / MSDN subscriptions they provide.  Right now I&#8217;m going through <a href="http://search.safaribooksonline.com/?view=book&#038;xmlid=0-596-00281-5">Learning Python</a> and <a href="http://paulgraham.com/hackpaint.html">Hackers and Painters</a>, though I just noticed that <a href="http://rlove.org/kernel_book/">Linux Kernel Development</a> was recently added to the library.  That was the one other book that I wanted to browse through.  Not so much because I&#8217;m interested in kernel development, but more so I can better understand how Linux interacts with the hardware.</p>
<p>The library is great in that they have a ton of books.  I think they have all or most of the O&#8217;Reilly books, which is key.  However, the drawback is that the site itself is pretty horrible, both from a user interface perspective and in terms of performance.  Every single action triggers a request to their super slow servers.  Several actions require a few too many clicks, each of which means waiting for their servers to respond.  They should allow users to control page navigation simply by using the keyboard.  They also fail to do the basics, like remembering what font setting I used last.  Needless to say, the site could use a major overhaul to bring it into the 21st century.</p>
]]></content:encoded>
			<wfw:commentRss>http://lukebaker.org/archives/2005/04/15/reading-books/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Nutch patch #1</title>
		<link>http://lukebaker.org/archives/2004/09/06/nutch-patch-1/</link>
		<comments>http://lukebaker.org/archives/2004/09/06/nutch-patch-1/#comments</comments>
		<pubDate>Mon, 06 Sep 2004 21:09:10 +0000</pubDate>
		<dc:creator>Luke</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Nutch]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://lukebaker.org/archives/2004/09/06/nutch-patch-1/</guid>
		<description><![CDATA[At work I was told to investigate other options for a search engine that would search just the sites that we host. While I was doing that I came across Nutch. It looked pretty sweet but not quite something that would fit our current needs. We needed a few more features. Currently at work we&#8217;re [...]]]></description>
			<content:encoded><![CDATA[<p>At <a href="http://www.gospelcom.net/">work</a> I was told to investigate other options for a search engine that would search just the sites that we host.  While I was doing that I came across <a href="http://www.nutch.org/">Nutch.</a>  It looked pretty sweet but not quite something that would fit our current needs.  We needed a few more features.  Currently at work we&#8217;re looking at a <a href="http://www.google.com/appliance/">Google Search Appliance.</a>  It costs a pretty penny, but would be nice because hopefully that would be something we could just <a href="http://www.ronco.com/products/rotisserie_std.di4?productID=1">&#8220;set it and forget it.&#8221;</a></p>
<p>Lately in my spare time, I&#8217;ve started trying to add the features to Nutch that would allow us to use it.  It&#8217;s fun.  I recently <a href="http://sourceforge.net/mailarchive/forum.php?thread_id=5515493&#038;forum_id=13068">submitted</a> my first <a href="http://lukebaker.org/upload/RegexUrlNormalizer.patch">patch</a> to the Nutch developers list.  Hopefully I did everything well enough to get it committed to CVS.  This patch allows users to specify Perl 5 regular expressions, which will get applied to all URLs that Nutch encounters.  It&#8217;s useful for stuff like stripping out session IDs in URLs.</p>
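<p>To give a feel for what such a normalizer does, here is a small Python sketch.  It is not the patch itself (Nutch is written in Java and the patch takes Perl 5 style expressions); the two rules and the <code>normalize</code> function are invented for the example.</p>

```python
import re

# Hypothetical rules: (pattern, replacement) pairs applied in order.
RULES = [
    # Java servlet session ID carried as a path parameter.
    (re.compile(r";jsessionid=[0-9A-Za-z]+", re.IGNORECASE), ""),
    # PHP session ID when it is the only query parameter.
    (re.compile(r"\?PHPSESSID=[0-9A-Za-z]+$"), ""),
]


def normalize(url, rules=RULES):
    """Apply every rule to the URL and return the rewritten result."""
    for pattern, replacement in rules:
        url = pattern.sub(replacement, url)
    return url
```

<p>With rules like these, two URLs that differ only in their session ID collapse to the same string, so a crawler can treat them as one page instead of fetching it twice.</p>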
<p>I&#8217;ve got a few more features that need to be added.  I found another drawback to the way the crawler for Nutch was written.  You can specify any number of threads to be running at the same time.  However, it currently won&#8217;t allow two different threads to download from the same IP simultaneously.  This is not good considering all of our websites look like a single IP to the crawler.  I&#8217;ll probably have to make some changes there.  Hopefully it&#8217;ll be relatively straightforward and easy.</p>
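<p>One way to picture the change I have in mind is a limit per IP that can be raised above one.  The Python sketch below is only an illustration of that idea, not Nutch code; the <code>PerHostLimiter</code> class and its methods are hypothetical.</p>

```python
import threading


class PerHostLimiter:
    """Cap the number of simultaneous downloads from any single IP."""

    def __init__(self, max_per_host=1):
        self._max = max_per_host
        self._guard = threading.Lock()
        self._slots = {}

    def _slot(self, ip):
        # Create the semaphore for an IP the first time it is seen.
        with self._guard:
            if ip not in self._slots:
                self._slots[ip] = threading.BoundedSemaphore(self._max)
            return self._slots[ip]

    def try_acquire(self, ip):
        """Return True if a download from this IP may start right now."""
        return self._slot(ip).acquire(blocking=False)

    def release(self, ip):
        """Mark one download from this IP as finished."""
        self._slot(ip).release()
```

<p>With <code>max_per_host=1</code> this behaves like the current one-thread-per-IP rule; a larger value would let several threads work through sites that all sit behind the same address.</p>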
<p><em>Cool use of Nutch: <a href="http://creativecommons.org/weblog/entry/4388">Creative Commons Search</a> (via: <a href="http://www.nutch.org/blog/2004_09_01_cutting_archive.html">Doug</a>)</em></p>
]]></content:encoded>
			<wfw:commentRss>http://lukebaker.org/archives/2004/09/06/nutch-patch-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>


