Last update: 14 Jun 2003 -- 10:40pm

31 Oct 2002

Happy Hallowe'en!

30 Oct 2002

Ready for Hallowe'en?

27 Oct 2002

Saw Resident Evil on TV last night. ZOOOMMMMMBIIIIEEES!!

I scrapped the indexing part of what I wrote before since the indexer was weak and made errors. I had it working and generating web pages and everything, but it just showed how iffy the indexer was.

A much better indexing program for my needs is SWISH-E. It is very flexible and lets me create whatever fields I want. It also lets me manipulate what is stored as the doc summary, which is also cool. I'm almost done with the filter pre-processor, which turns RFC822-format files into the XML-ish stuff that it indexes. (You put fields into meta tags. Nice approach.)
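For the curious, here is roughly what such a filter could look like. This is a minimal sketch in Python, not the pre-processor described above: the function name rfc822_to_meta_doc, the choice of headers in FIELDS, and the exact shape of the output are all assumptions made for illustration, and the indexer configuration needed to pick up the meta fields is not shown.

```python
from email import message_from_string
from html import escape

# Hypothetical choice of headers to expose as searchable fields.
FIELDS = ("from", "to", "subject", "date")


def rfc822_to_meta_doc(raw):
    """Turn one RFC822 message (a string) into an HTML-like document.

    Each selected header becomes a <meta> tag; the first text/plain
    part becomes the body. Other parts are ignored in this sketch.
    """
    msg = message_from_string(raw)
    lines = ["<html>", "<head>"]
    lines.append("<title>%s</title>" % escape(str(msg.get("subject", ""))))
    for name in FIELDS:
        value = msg.get(name)
        if value is not None:
            lines.append('<meta name="%s" content="%s">'
                         % (name, escape(str(value), quote=True)))
    lines.append("</head>")

    body = ""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            try:
                body = payload.decode(charset, errors="replace")
            except LookupError:
                # Unknown charset name: fall back to UTF-8.
                body = payload.decode("utf-8", errors="replace")
            break

    lines += ["<body>", "<pre>%s</pre>" % escape(body), "</body>", "</html>"]
    return "\n".join(lines)
```

Escaping the header values matters more than it looks: subjects full of angle brackets and quotes would otherwise break the tags the indexer is supposed to read.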

21 Oct 2002

How hard can making an email indexer be? Not very, as it turns out. Using Perl (with Mail::IMAPClient), I connected to my Exchange server via the IMAP server it runs. I sucked all my email out into individual files on my disk, keeping only the HTML and text portions of the emails and dropping everything else (attachments, for example). (38 lines of code total.) Then I used Namazu to index the mail files (which are all RFC822).
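The original was Perl with Mail::IMAPClient and isn't reproduced here. As an illustration of the same idea, here is a sketch in Python using the standard imaplib and email modules. The names text_parts and download_mailbox, the choice of headers written out, and the one-file-per-message naming are all assumptions, and the download half needs a real IMAP server to run against.

```python
import imaplib
import os
from email import message_from_bytes


def text_parts(raw):
    """Return (content_type, text) for each text/plain or text/html part.

    Attachments and every other kind of part are dropped.
    """
    msg = message_from_bytes(raw)
    kept = []
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype not in ("text/plain", "text/html"):
            continue
        if part.get_content_disposition() == "attachment":
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset name: fall back to UTF-8.
            text = payload.decode("utf-8", errors="replace")
        kept.append((ctype, text))
    return kept


def download_mailbox(host, user, password, out_dir, mailbox="INBOX"):
    """Save every message in one mailbox as a small text file in out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        conn.select(mailbox, readonly=True)
        typ, data = conn.search(None, "ALL")
        for num in data[0].split():
            typ, fetched = conn.fetch(num, "(RFC822)")
            raw = fetched[0][1]
            msg = message_from_bytes(raw)
            path = os.path.join(out_dir, "%s.txt" % num.decode("ascii"))
            with open(path, "w", encoding="utf-8") as out:
                # Keep a few headers so the indexer has fields to work with.
                for name in ("From", "To", "Subject", "Date"):
                    out.write("%s: %s\n" % (name, msg.get(name, "")))
                out.write("\n")
                for ctype, text in text_parts(raw):
                    out.write(text + "\n")
    finally:
        conn.logout()
```

Opening the mailbox read-only means the download doesn't mark anything as read on the server.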

I copied a couple of files into the Apache cgi-bin, and poof! Searchable email. Another 25 lines of Perl code lets me link the search results to the original email (instead of the downloaded copy).

Namazu has a few things lacking. I might go to Glimpse (or some other search engine), but I need to figure out how it handles fields.

17 Oct 2002

Up until now, I've used Foam Totem for things which I think readers (all three of you) will find interesting. A lot of people are using their weblogs more like diaries, and I'm going to start doing so as well. Don't worry! No deep, pretentious, inner-most-thoughts stuff is likely. I mean more stuff on what I'm interested in now, things I'm researching, and so on. It'll be geekier. And no one will care in the slightest about what I'm writing, but I'm writing it for myself rather than for anyone else.

Google for your Email

ZOË is an email indexer. It does full-text indexing and also cross-references everything to recipients, senders, dates, companies, and so on. There have been a number of reviews of it recently, calling it "Google for your Email."

Why do I care? My email mailbox (containing mail from 2001 and 2002) is 2 GB in size. Finding stuff in it is very difficult. You can run searches in Outlook, but they are slow. They are slow enough that you only go to them as a last resort.

Google is so good (and so fast) that I go there first to find just about anything. The number of bookmarks I keep is very small. In earlier web days, I used to have many, many of them so I could find info I needed. Now Google replaces most of them. I need the same thing for my email.

I installed and ran ZOË at home to try it out. Things I find interesting:

  • It runs as a local server which you access through your browser. I see this more and more. Foam Totem works in the same way. When I want to add something to it, I connect with my browser, click on a link and add a new entry. Very simple to use. Works great.
  • It runs in the background all the time. I'm not sure how much advantage it takes of this now. One could imagine it refining the index, finding correlations, cross-referencing with Google and so on in all the extra cycles the machine has. It's rather like the "Data Soup" we imagined for the Bermuda project way back at MapInfo. (Whose stock has tanked righteously, BTW.)
  • The UI isn't efficient when dealing with large numbers of items. The UI is based on Google's, which indexes free-form information. Email messages aren't free-form (they have from, to, subject, etc.) and could be listed more efficiently.
  • The UI needs to handle more items since the search engine can't handle boolean searches right now. Everything is OR'ed. For me, this is a showstopper. For anyone who is dumping in all their mail since the beginning of time it will be. There's just too much noise. Google's pages aren't very compact, but if you query well (using boolean and other complex searches), the first page is all you need anyway.
  • It's written in Java. Feh. It doesn't use textual files for templates or anything, so changes to the output HTML have to be done in the source code. So, there's no real way I'm going to be able to play with it.
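
To make the OR-versus-AND complaint above concrete, here is a toy inverted index in Python. It is only a sketch and says nothing about how ZOË actually stores or queries its index; build_index, search, and the sample messages are made up for illustration.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, terms, mode="and"):
    """Return the ids matching all terms (mode="and") or any term (mode="or")."""
    sets = [index.get(term.lower(), set()) for term in terms]
    if not sets:
        return set()
    if mode == "and":
        return set.intersection(*sets)
    return set.union(*sets)


# Hypothetical three-message mailbox.
docs = {
    1: "budget meeting Friday",
    2: "budget report attached",
    3: "lunch Friday?",
}
index = build_index(docs)
print(search(index, ["budget", "friday"], mode="or"))   # every message matches
print(search(index, ["budget", "friday"], mode="and"))  # only message 1 matches
```

With three messages the difference is cute. With 2 GB of mail, the OR result is most of the mailbox.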

Made me start thinking about how I would index email...

8 Oct 2002

Launch pics of Space Shuttle Atlantis (STS-112) from the booster. Unfortunately, there are no on-ship pics during separation.
