Wednesday, December 22, 2010

why I'm excited by UC's extensible text framework

This year, I've largely shifted from being a Microsoft platform-oriented programmer to one who works mostly with open source software. The game-changer for me, without a doubt, was my deepening use of Amazon Web Services generally, and EC2 specifically. EC2 gave me an easy-to-use development environment in which I could experiment with applications and with extending them.

One resource that I have built, on my own time, is an online Naval Reactors database. (I worked as an operator in the program for six years, qualifying on two reactor plants, and have continued reading about the program since leaving it.) I first built it on the AWS SimpleDB database and provided online access to it using ASP.NET. Once I had EC2 to work with, though, I came across a tool that I've really gotten excited about: the University of California's Extensible Text Framework (XTF). XTF enables an institution or an individual to create a digital repository. It serves Encoded Archival Description (EAD) XML quite nicely, though I don't use EAD in the Naval Reactors database. It also supports the discovery and presentation of other digital formats, including photographic images.

Here are some URLs that show XTF in action:

- A search that retrieves all database objects

- A search that shows hits-in-context based upon metadata contained in image files

(The database migration is still in progress, but these URLs show the basic functionality.)

What was required to make this work? First, I downloaded XTF and got it running on an EC2 server (I am using Fedora 8 for the OS). Then I went through the tutorials to become familiar with the customization options. Finally, I wrote handlers for some additional file formats (JPEG, PNG) that XTF didn't support as-is, and began building the database.
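To give a feel for what an image handler has to do, here is a minimal sketch in Python. It is not XTF's API and not the code I actually wrote; it only illustrates the idea of pulling embedded text metadata out of a PNG file and turning it into a small XML record that a search indexer could consume. The function names (`read_png_text_chunks`, `metadata_record`) and the `<metadata>`/`<field>` element names are made up for this example.

```python
import struct
import xml.etree.ElementTree as ET

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for the uncompressed tEXt chunks in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    fields = {}
    pos = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt data is "keyword NUL text", both Latin-1 encoded.
            keyword, _, text = body.partition(b"\x00")
            fields[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        pos += 8 + length + 4  # skip header, data, and CRC (CRC is not verified)
    return fields


def metadata_record(fields: dict) -> str:
    """Serialize the extracted fields as a small, hypothetical XML record."""
    root = ET.Element("metadata")
    for key in sorted(fields):
        child = ET.SubElement(root, "field", name=key)
        child.text = fields[key]
    return ET.tostring(root, encoding="unicode")
```

Calling `metadata_record(read_png_text_chunks(open("photo.png", "rb").read()))` on a PNG that carries text chunks (the file name here is just a placeholder) yields one `<field>` element per keyword. A real handler would also need to cover compressed `zTXt`/`iTXt` chunks and, for JPEG, EXIF or IPTC data.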

In short, XTF gives me out-of-the-box faceted search, an elegant search and presentation system, and a solution that's being extended by a growing community of users. I am hoping to present on this work at the mid-2011 Code4Lib Northwest meeting.
