lukebaker.org

Archive for September, 2004

Popups!?!

without comments

Weird. I’m listening to Warren talk about colors. One of the fake sites he was showing was maternityfashions.com, so I went there to see what was there. It was just a domain for sale. The weird thing, however, was that I got popups from that site when I closed it, while browsing with Firefox 1.0PR on Linux. I believe those are the first and only popups I’ve ever had while browsing with Firefox.

Written by Luke

September 17th, 2004 at 4:23 pm

Posted in General

Firefox Setup

with 2 comments

After updating to Firefox 1.0PR, I re-tooled my Firefox startup script. The script accepts a URL. If no Firefox process is running, it starts one and loads that URL. If a Firefox process is already running, it opens the URL in a new tab in the existing Firefox window. Previously, I used the script every time I launched Firefox. That is no longer necessary; now it is only used when I click on links in other applications. This is probably only useful for those using Linux.
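A minimal sketch of that logic, written in Python purely for illustration (the actual script is not shown here). It assumes the process is named `firefox`, that `pgrep` is available, and that the browser accepts the `-remote "openURL(...)"` syntax that Mozilla-family browsers of that era used. The function names are hypothetical.

```python
import subprocess
from typing import List


def firefox_is_running() -> bool:
    # Assumption: the browser process is literally named "firefox".
    # pgrep exits with status 0 when at least one process matches.
    result = subprocess.run(["pgrep", "-x", "firefox"], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def build_command(url: str, running: bool) -> List[str]:
    if running:
        # Ask the already running instance to open the URL in a new tab.
        return ["firefox", "-remote", f"openURL({url},new-tab)"]
    # No instance yet: start one and load the URL directly.
    return ["firefox", url]


def open_url(url: str) -> None:
    # Launch without waiting, so the calling application is not blocked.
    subprocess.Popen(build_command(url, firefox_is_running()))
```

Pointing other applications at a wrapper that calls `open_url` with the clicked link would give the behavior described above.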

I also had to change my userChrome.css file so that I could change the width of my search-bar. Unfortunately, the directions I had used previously were out of date and didn’t quite work.

Written by Luke

September 15th, 2004 at 1:36 pm

Posted in General,mozilla.org

Nutch Short-term Goals

with 2 comments

  1. Ability to use regular expressions for URL substitutions.
  2. Allow users to search using url:Store/View/Product/1001
  3. Faster crawling of websites that look like one (1) IP address.
  4. Some sort of templating engine for creating search results pages. Maybe use Velocity?

Written by Luke

September 7th, 2004 at 7:18 pm

Posted in General,Nutch,Projects

Nutch patch #1

without comments

At work I was told to investigate other options for a search engine that would search just the sites that we host. While I was doing that, I came across Nutch. It looked pretty sweet, but it wasn’t quite something that would fit our current needs; we needed a few more features. Currently at work we’re looking at a Google Search Appliance. It costs a pretty penny, but it would be nice because hopefully we could just “set it and forget it.”

Lately, in my spare time, I’ve started trying to add the features to Nutch that would allow us to use it. It’s fun. I recently submitted my first patch to the Nutch developers list. Hopefully I did everything well enough to get it committed to CVS. This patch allows users to specify Perl 5 regular expressions, which get applied to all URLs that Nutch encounters. It’s useful for stuff like stripping out session IDs in URLs.
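To give a feel for what regex-based URL substitution looks like, here is a small sketch in Python rather than the patch itself. The rule list, the session parameter names (`jsessionid`, `PHPSESSID`, `sid`), and the `normalize` function are all hypothetical examples, not Nutch's configuration or API.

```python
import re

# Hypothetical substitution rules: (pattern, replacement) pairs applied in order.
RULES = [
    # Drop a ";jsessionid=..." path parameter.
    (re.compile(r";jsessionid=[0-9A-Za-z]+", re.IGNORECASE), ""),
    # Drop a session ID query parameter, keeping the "?" or "&" that preceded it.
    (re.compile(r"([?&])(?:PHPSESSID|sid)=[0-9A-Za-z]+&?", re.IGNORECASE), r"\1"),
    # Tidy up a "?" or "&" left dangling at the end of the URL.
    (re.compile(r"[?&]$"), ""),
]


def normalize(url: str) -> str:
    # Apply every rule to every URL, one after another.
    for pattern, replacement in RULES:
        url = pattern.sub(replacement, url)
    return url
```

With rules like these, two URLs that differ only in their session ID collapse to the same string, so a crawler would not fetch the same page twice under different names.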

I’ve got a few more features that need to be added. I also found a drawback in the way the Nutch crawler was written. You can specify any number of threads to run at the same time. However, it currently won’t allow two different threads to download from the same IP simultaneously. This is not good, considering that all of our websites look to the crawler like a single IP. I’ll probably have to make some changes there. Hopefully it’ll be relatively straightforward.
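One way to picture the change is a per-IP limit that can be raised above one. The sketch below is in Python and is not Nutch code; the `HostLimiter` class and its method names are made up for illustration. With `max_per_host=1` it mimics the restriction described above, and a larger value lets several threads share an IP.

```python
import threading


class HostLimiter:
    """Caps how many threads may fetch from the same IP at once (hypothetical helper)."""

    def __init__(self, max_per_host: int):
        self._max = max_per_host
        self._lock = threading.Lock()
        self._slots = {}  # maps IP address -> semaphore

    def _semaphore(self, host: str) -> threading.BoundedSemaphore:
        # Create the per-host semaphore lazily; the lock guards the dictionary.
        with self._lock:
            if host not in self._slots:
                self._slots[host] = threading.BoundedSemaphore(self._max)
            return self._slots[host]

    def try_acquire(self, host: str) -> bool:
        # Non-blocking: returns False when the host is already at its limit,
        # so the calling thread can pick a URL on a different host instead.
        return self._semaphore(host).acquire(blocking=False)

    def release(self, host: str) -> None:
        # Call once the download from this host has finished.
        self._semaphore(host).release()
```

A fetcher thread would call `try_acquire` before downloading and `release` afterwards, moving on to another host whenever `try_acquire` returns `False`.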

Cool use of Nutch: Creative Commons Search (via: Doug)

Written by Luke

September 6th, 2004 at 5:09 pm

Posted in General,Nutch,Work

Danger!

without comments

Frances. Looks like Miami has a chance of being spared again.

Update: Article about evacuations. Max Mayfield, the Hurricane Center Director who is quoted, used to be (still is?) my neighbor in Miami. Here’s another forecast picture. This thing looks pretty big, so even if it doesn’t hit Miami directly, it still might do some damage. Yikes!

Written by Luke

September 1st, 2004 at 9:36 am

Posted in Family,General,Miami