Monday, September 24, 2012

Lucene - Updating index files

Apache Lucene(TM) is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.
Apache Lucene is an open source project available for free download. To learn more about Lucene, see the Apache Lucene website.

In this example we will see how to append to an existing Lucene index, that is, how to update the existing index files with new data.

IndexWriter iwriter = new IndexWriter(directory, analyzer, false, MaxFieldLength.UNLIMITED);

Note the boolean argument: true creates the index or overwrites the existing one; false appends to the existing index. To learn more about IndexWriter, see the API documentation.
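To make the effect of the flag concrete, here is a toy in-memory model in plain Java. It is only a sketch of the create-versus-append behaviour described above and does not use or reproduce Lucene's code; the class ToyIndexWriter, its methods, and the file names a.txt, b.txt and c.txt are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for an index: just a list of stored document names.
// Hypothetical illustration only; this is NOT Lucene's IndexWriter.
class ToyIndexWriter {
    private final List<String> store;

    // create == true  : start from an empty index, discarding old entries
    // create == false : keep whatever is already stored and append to it
    ToyIndexWriter(List<String> store, boolean create) {
        if (create) {
            store.clear();
        }
        this.store = store;
    }

    void addDocument(String doc) {
        store.add(doc);
    }

    public static void main(String[] args) {
        List<String> directory = new ArrayList<>();

        // First run: build the index from scratch.
        ToyIndexWriter first = new ToyIndexWriter(directory, true);
        first.addDocument("a.txt");
        first.addDocument("b.txt");

        // Append mode keeps the two earlier entries.
        new ToyIndexWriter(directory, false).addDocument("c.txt");
        System.out.println(directory); // [a.txt, b.txt, c.txt]

        // Create mode wipes the index before adding.
        new ToyIndexWriter(directory, true).addDocument("c.txt");
        System.out.println(directory); // [c.txt]
    }
}
```

The point to take away is that passing true on a second run would discard everything indexed earlier, which is why the update step in this post passes false.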

About the example: we will create a few text files, then use our Java program to create an index and search it. Then we will add a new file, update the index, and search again.

First we will create a few text files in a location, let's say "C:\TestLucene\files":


Javascript.txt
object
Var
function
random

SQL.txt
Select
Group by
Where
From
random

Java.txt
String
Object
ArrayList
Hashtable
Integer
Random
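If you would rather not create these files by hand, a small helper along the following lines could write them for you. This is a sketch and not part of the original example: the class name SampleFiles and its methods are made up for illustration, it needs Java 7 or later for java.nio.file, and the target directory is whatever you pass on the command line (it falls back to a relative "files" folder, which is an arbitrary choice).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that writes the three sample files listed above.
class SampleFiles {
    // File name -> lines, copied from the listings in this post.
    static Map<String, List<String>> contents() {
        Map<String, List<String>> files = new LinkedHashMap<>();
        files.put("Javascript.txt", Arrays.asList("object", "Var", "function", "random"));
        files.put("SQL.txt", Arrays.asList("Select", "Group by", "Where", "From", "random"));
        files.put("Java.txt", Arrays.asList("String", "Object", "ArrayList",
                "Hashtable", "Integer", "Random"));
        return files;
    }

    // Creates the directory if needed and writes one file per entry.
    static void writeTo(Path dir) throws IOException {
        Files.createDirectories(dir);
        for (Map.Entry<String, List<String>> e : contents().entrySet()) {
            Files.write(dir.resolve(e.getKey()), e.getValue(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the target directory as the first argument, e.g. C:\TestLucene\files
        Path dir = Paths.get(args.length > 0 ? args[0] : "files");
        writeTo(dir);
        System.out.println("Wrote sample files to " + dir.toAbsolutePath());
    }
}
```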

Now we will create a location for the Lucene index files, let's say "C:\TestLucene\index".

To run the example below, add lucene-core-3.0.2.jar to your classpath. Please see the self-explanatory Java code below.

Compile and run the above code:

C:\TestLucene>javac -cp lucene-core-3.0.2.jar;. LuceneExample.java

C:\TestLucene>java -cp lucene-core-3.0.2.jar;. LuceneExample
Creating indexes....
C:\TestLucene\files\Java.txt
C:\TestLucene\files\Javascript.txt
C:\TestLucene\files\SQL.txt
Searching.... 'Object'
Found in :: C:\TestLucene\files\Javascript.txt
Found in :: C:\TestLucene\files\Java.txt
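
Notice that the query 'Object' also matches Javascript.txt, which contains only the lower-case word "object". That is consistent with an analyzer that lower-cases tokens both when indexing and when parsing the query, as Lucene's StandardAnalyzer does; which analyzer the program uses is not shown here, so treat that as an assumption. The plain-Java sketch below mimics the effect with a hypothetical lowerCaseTokens helper. It is not Lucene code and makes no attempt to reproduce Lucene's result ordering.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical demo of case folding at index and query time.
class CaseFoldingDemo {
    // Splits on whitespace and lower-cases each token (a crude "analyzer").
    static List<String> lowerCaseTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Returns the names of files whose tokens contain the lower-cased term.
    static List<String> search(Map<String, String> files, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : files.entrySet()) {
            if (lowerCaseTokens(e.getValue()).contains(needle)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("Java.txt", "String Object ArrayList Hashtable Integer Random");
        files.put("Javascript.txt", "object Var function random");
        files.put("SQL.txt", "Select Group by Where From random");
        System.out.println(search(files, "Object")); // [Java.txt, Javascript.txt]
    }
}
```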

Now we will add a new file, let's say "C:\TestLucene\files\PHP.txt":

PHP.txt
Object
Random

Now we will create a method updateIndex() that updates the index files. Note that the IndexWriter constructor is passed the boolean false so that it appends to the existing index.

IndexWriter iwriter = new IndexWriter(directory, analyzer, false, MaxFieldLength.UNLIMITED);


Compile and run the above code:

C:\TestLucene>javac -cp lucene-core-3.0.2.jar;. LuceneExample.java

C:\TestLucene>java -cp lucene-core-3.0.2.jar;. LuceneExample
Updating indexes....
C:\TestLucene\files\PHP.txt
Searching.... 'Object'
Found in :: C:\TestLucene\files\PHP.txt
Found in :: C:\TestLucene\files\Javascript.txt
Found in :: C:\TestLucene\files\Java.txt