
Lucene and friends

As my current project draws to a close, I need to deliver document search capabilities.
SQL Server, despite how much I love it, requires the documents to reside *inside* the database for it to index them. That is ugly. So I ventured out and went with the much-hyped and totally cool Lucene from Apache. I am an Apache bitch, and why not, I love them and they love everyone.

So I ran some tests and everything works great and I can sleep at night.

Today, Saturday, after we release the first iteration of the live site and I am in post-stress bliss, I discover that Lucene does not index PDFs out of the box. Text is cool, PDF ain’t cool.

Rapid searches turn into slow searches, during which I find bizarro projects such as Docco or Multivalent – both of which are as hostile and unhelpful as they get. Do you REALLY expect documentation? Blech! RTF(non-existent)M!
I even stumble across over-the-top solutions such as Zilverline that provide you with a complete web application that will index the ass off your website. That’s very cool, but I need a component or library I can integrate into my code, and Lucene was so neat and fit that bill.

Finally, I come across PDFBox, which was pretty much it. It reads PDFs and converts them to text, and Lucene can now play with PDFs!

Now on to Word where Apache’s Jakarta POI is supposed to provide the goods for Excel, Word and even PowerPoint… am I being too optimistic?
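
The common thread is that Lucene only wants text, so every other format needs a converter in front of it. Here is a minimal sketch of that idea using nothing but the JDK. The class name `TextExtractors` and its `register` and `extract` methods are made up for illustration; they are not part of Lucene, PDFBox or POI. Only a plain-text extractor is wired in, and a PDFBox- or POI-backed one would be registered the same way under "pdf" or "doc".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: picks a text extractor by file extension.
public class TextExtractors {

    /** Turns one file into plain text that an indexer can consume. */
    public interface Extractor {
        String extract(Path file) throws IOException;
    }

    private final Map<String, Extractor> byExtension = new HashMap<>();

    /** Registers an extractor for an extension such as "txt" (case-insensitive). */
    public void register(String extension, Extractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Returns the extracted text, or empty if no extractor handles this file type. */
    public Optional<String> extract(Path file) throws IOException {
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String extension = dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
        Extractor extractor = byExtension.get(extension);
        if (extractor == null) {
            return Optional.empty();
        }
        return Optional.of(extractor.extract(file));
    }

    public static void main(String[] args) throws IOException {
        TextExtractors extractors = new TextExtractors();
        // Plain text needs no conversion at all.
        extractors.register("txt", f -> new String(Files.readAllBytes(f), StandardCharsets.UTF_8));
        // A PDF extractor would be registered under "pdf" here (left out: it needs a PDF library).

        for (String arg : args) {
            Path file = Paths.get(arg);
            Optional<String> text = extractors.extract(file);
            // The extracted text is what would be handed to the indexer.
            System.out.println(file + ": " + text.map(t -> t.length() + " chars").orElse("skipped"));
        }
    }
}
```

Files with no registered extractor come back empty, so the caller can skip or log them instead of feeding binary junk to the index.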


Changing CVS repository path in CVS

I am using Eclipse’s CVS module for the project I am currently working on and using the really nice cvsdude.org service for storage.

We upgraded our account to one that gave us our own cvsroot directory and hence had to change the repository path for our project in Eclipse. Eclipse 2.1 does not let you do that. You can update the user name, URL, and password, but not the path. Even removing the repository did not do the job.
What did work was:

  1. In Eclipse: Right-click the root folder for the project and choose ‘Disconnect…’ from the ‘Team’ submenu.
  2. Close Eclipse
  3. Start your favorite text editor (one that worked for the procedure below was the free and fabulous jEdit)
  4. Run a global search and replace in the editor to find all instances of the old repository path string and replace them with the new repository path. The search and replace should only be run in the directory of the project. For example, my old path was /cvs/stda and I replaced it with /cvs/newpath.
  5. Start Eclipse again
  6. Right-click the root folder of the project and select ‘Refresh’.
  7. Right-click the root folder of the project again and select ‘Share project’ from the ‘Team’ submenu.
  8. Eclipse should now tell you that the project was previously shared and that it will connect you with the repository. The repository path displayed should now be the correct one – which you updated.
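
If you would rather not run a blanket search and replace over the whole project, step 4 can be narrowed to CVS's own bookkeeping files. As far as I know, a CVS working copy stores the repository location in a `Root` file inside each `CVS` subdirectory, and some clients also write the full path into `CVS/Repository`. The sketch below assumes that layout. The class name `CvsRootRewriter` and its methods are made up for illustration, and it does the same plain substring replacement as the editor would, so back up the project first.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: rewrites the repository path in CVS metadata files only.
public class CvsRootRewriter {

    /**
     * Replaces oldPath with newPath in every CVS/Root and CVS/Repository file
     * under projectDir and returns how many files were actually changed.
     */
    public static int rewrite(Path projectDir, String oldPath, String newPath) throws IOException {
        List<Path> metadataFiles;
        try (Stream<Path> paths = Files.walk(projectDir)) {
            metadataFiles = paths
                    .filter(Files::isRegularFile)
                    .filter(CvsRootRewriter::isCvsMetadata)
                    .collect(Collectors.toList());
        }

        int changed = 0;
        for (Path file : metadataFiles) {
            // ISO-8859-1 maps every byte to one char, so the file round-trips unchanged.
            String before = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
            String after = before.replace(oldPath, newPath);
            if (!after.equals(before)) {
                Files.write(file, after.getBytes(StandardCharsets.ISO_8859_1));
                changed++;
            }
        }
        return changed;
    }

    /** True for files named Root or Repository that sit directly inside a directory named CVS. */
    public static boolean isCvsMetadata(Path file) {
        Path parent = file.getParent();
        if (parent == null || parent.getFileName() == null) {
            return false;
        }
        String name = file.getFileName().toString();
        return parent.getFileName().toString().equals("CVS")
                && (name.equals("Root") || name.equals("Repository"));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: java CvsRootRewriter <projectDir> <oldPath> <newPath>");
            return;
        }
        int changed = rewrite(Paths.get(args[0]), args[1], args[2]);
        System.out.println("Updated " + changed + " CVS metadata file(s)");
    }
}
```

With my paths that would be `java CvsRootRewriter <projectDir> /cvs/stda /cvs/newpath`, run with Eclipse closed, between steps 2 and 5. Source files that happen to mention the old path are left alone.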

Running the Apache James Mail Server as a Windows Service

To make the Apache James mail server into a service on your Windows machine, follow these instructions (shamelessly copied from here).

From a command prompt, change directory to the bin folder located under the James installation directory. From the bin folder, enter the following command:

Wrapper.exe -i ..\conf\wrapper.conf

This will install the James NT service. You can then control it like any other service from the Windows Services screen.

To remove the James service, use this command:

Wrapper.exe -r ..\conf\wrapper.conf
