Tuesday, June 7, 2011

customizing stop words in XTF

I've been immersing myself in the inner workings of the California Digital Library's XTF platform. I expect to make a number of changes to my XTF-based Naval Reactors History Database service in the next few months, in preparation for a fall LITA Forum presentation. The change described in this post is actually pretty trivial - adding a customized stop words list for an XTF instance - but it illustrates the kind of back-end customizations that are possible.

I decided to use the stop words list provided on the SEO Tools website.
To use the list in XTF, I copied the file to the xtf/conf/stopwords directory, replacing the stopwords.txt file included in the XTF release with the one I obtained from the SEO Tools site.
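
For anyone who would rather script the swap than copy the file by hand, here is a minimal sketch in Python. It is not part of XTF; the function name and the example paths are my own assumptions, and it simply keeps a backup of the release stopwords.txt so the change can be undone.

```python
import shutil
from pathlib import Path


def replace_stopwords(new_list: Path, target: Path) -> Path:
    """Copy new_list over target, keeping a .bak copy of the old file.

    Returns the backup path; the backup exists only if target was
    already present before the copy.
    """
    backup = target.with_name(target.name + ".bak")
    if target.exists():
        # Preserve the stop words file shipped with the release for rollback.
        shutil.copy2(target, backup)
    shutil.copy2(new_list, target)
    return backup


# Example call (both paths are hypothetical; adjust to your install):
# replace_stopwords(Path("seo-stopwords.txt"),
#                   Path("xtf/conf/stopwords/stopwords.txt"))
```

Restoring the original list is then just a matter of copying the .bak file back into place before rebuilding the index.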

I then stopped Apache Tomcat and rebuilt the XTF index. A clean build is recommended, as described in this XTF users group post. (I received the error described in that post before resorting to a clean build.) Once I restarted Tomcat, the new stop words list was in use.
