ONJava.com -- The Independent Source for Enterprise Java

Using Lucene to Search Java Source Code

Writing a JavaSourceCode Indexer

The next step is to create the index. The important classes used to build an index are IndexWriter, Analyzer, Document, and Field. For each source code file, a Lucene Document is created. The source file is parsed and the relevant syntactic elements are extracted: import declarations, the class name, the class it extends, the interfaces it implements, the methods it declares, the parameters of those methods, and the code of each method. Each kind of element is added to its own Field of the Document. The Document is then added to the index using the IndexWriter, which writes it to the index directory.
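
To see why each syntactic element goes into its own Field, consider a toy fielded index written in plain Java. This is only an illustrative sketch: ToyFieldIndex and its methods are hypothetical names, not Lucene classes, and its tokenization is far simpler than a Lucene Analyzer. It shows that a term indexed under one field name is found only by searches against that field.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy stand-in for a fielded index. None of these names are Lucene classes.
class ToyFieldIndex {
    // field name -> (lower-cased term -> ids of the documents containing it)
    private final Map<String, Map<String, Set<Integer>>> postings = new HashMap<>();
    private int nextDocId = 0;

    /** Starts a new document and returns its id. */
    int newDocument() {
        return nextDocId++;
    }

    /** Adds a value to a named field of a document, split on non-word characters. */
    void addField(int docId, String field, String value) {
        Map<String, Set<Integer>> terms =
                postings.computeIfAbsent(field, k -> new HashMap<>());
        for (String token : value.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                terms.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Returns the ids of the documents whose given field contains the term. */
    List<Integer> search(String field, String term) {
        Map<String, Set<Integer>> terms = postings.get(field);
        if (terms == null) {
            return new ArrayList<>();
        }
        Set<Integer> hits = terms.get(term.toLowerCase());
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        ToyFieldIndex index = new ToyFieldIndex();

        int first = index.newDocument();          // id 0
        index.addField(first, "class", "Stack");
        index.addField(first, "method", "push");

        int second = index.newDocument();         // id 1
        index.addField(second, "class", "BoundedStack");
        index.addField(second, "extends", "Stack");

        System.out.println(index.search("class", "Stack"));   // [0]
        System.out.println(index.search("extends", "Stack")); // [1]
        System.out.println(index.search("method", "Stack"));  // []
    }
}
```

Here Stack matches document 0 as a class name and document 1 as a superclass, a distinction that a single undifferentiated text field could not make.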

The following listing shows the JavaSourceCodeIndexer. It uses JavaParser, a helper class that relies on the Eclipse 3.0 ASTParser, to parse a Java file and extract its syntactic elements. I will not go into the details of JavaParser, as any parser could be used to extract the relevant source code elements. As each element is extracted from the source file, a Field is created for it and added to the Document.

import java.io.File;
import java.util.ArrayList;

import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import com.infosys.lucene.code.JavaParser.*;

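public class JavaSourceCodeIndexer {

    private static JavaParser parser = new JavaParser();

    // Names of the fields stored in the index
    private static final String IMPLEMENTS = "implements";
    private static final String IMPORT = "import";
    // The listing shows only the two constants above. The values below are
    // assumptions that follow the same lowercase pattern.
    private static final String CLASS = "class";
    private static final String EXTENDS = "extends";
    private static final String METHOD = "method";
    private static final String RETURN = "return";
    private static final String PARAMETER = "parameter";
    private static final String CODE = "code";
    private static final String FILENAME = "filename";

    public static void main(String[] args) throws Exception {
        File indexDir = new File("C:\\Lucene\\Java");
        File dataDir = new File("C:\\JavaSourceCode");
        // true = create a new index, replacing any existing one
        IndexWriter writer = new IndexWriter(indexDir,
                new JavaSourceCodeAnalyzer(), true);
        indexDirectory(writer, dataDir);
        // Close the writer so that the index is written out completely
        writer.close();
    }
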
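    public static void indexDirectory(IndexWriter writer,
                                      File dir) throws Exception {
        File[] files = dir.listFiles();
        // listFiles() returns null if dir is not a readable directory
        if (files == null) return;
        for (int i = 0; i < files.length; i++) {
            File f = files[i];
            // Create a Lucene Document
            Document doc = new Document();
            // Use JavaParser to parse the file. The call that hands f to
            // the parser, and addComments, are not shown in this listing.
            addImportDeclarations(doc, parser);
            addComments(doc, parser);
            // Extract class elements using the parser
            JClass cls = parser.getDeclaredClass();
            addClass(doc, cls);
            // Store the file name (stored but not indexed) for display in results
            doc.add(Field.UnIndexed(FILENAME, f.getName()));
            // Add the completed Document to the index
            writer.addDocument(doc);
        }
    }
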
    private static void addClass(Document doc, JClass cls) {
        // For each class, add a class name field
        doc.add(Field.Text(CLASS, cls.className));
        String superCls = cls.superClass;
        if (superCls != null) {
            // Add the class it extends as the extends field
            doc.add(Field.Text(EXTENDS, superCls));
        }
        // Add the interfaces it implements
        ArrayList interfaces = cls.interfaces;
        for (int i = 0; i < interfaces.size(); i++) {
            doc.add(Field.Text(IMPLEMENTS, (String) interfaces.get(i)));
        }
        // Add details of the methods declared
        addMethods(cls, doc);
        // Index inner classes into the same Document
        ArrayList innerCls = cls.innerClasses;
        for (int i = 0; i < innerCls.size(); i++) {
            addClass(doc, (JClass) innerCls.get(i));
        }
    }

    private static void addMethods(JClass cls, Document doc) {
        ArrayList methods = cls.methodDeclarations;
        for (int i = 0; i < methods.size(); i++) {
            JMethod method = (JMethod) methods.get(i);
            // Add the method name field
            doc.add(Field.Text(METHOD, method.methodName));
            // Add the return type field
            doc.add(Field.Text(RETURN, method.returnType));
            // Add one field per parameter type
            ArrayList params = method.parameters;
            for (int k = 0; k < params.size(); k++) {
                doc.add(Field.Text(PARAMETER, (String) params.get(k)));
            }
            String code = method.codeBlock;
            if (code != null) {
                // Index the method's code block without storing it
                doc.add(Field.UnStored(CODE, code));
            }
        }
    }

    private static void addImportDeclarations(Document doc,
                                              JavaParser parser) {
        ArrayList imports = parser.getImportDeclarations();
        if (imports == null) return;
        for (int i = 0; i < imports.size(); i++) {
            // Add each import declaration as a keyword (indexed as a single term)
            doc.add(Field.Keyword(IMPORT, (String) imports.get(i)));
        }
    }
}
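
The indexer above reads public fields from the parser's JClass and JMethod objects, which the article does not list. The sketch below guesses at their shape from those field accesses and mirrors the recursion in addClass. JClassSketch, JMethodSketch, FieldFlattener, and the field-name strings other than "implements" are assumptions, not the author's code. It illustrates one consequence of the design: elements of inner classes are flattened into the same Document as the enclosing class, so a hit on an inner class name leads back to the file that contains it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder shapes, inferred from the field accesses in the listing.
class JMethodSketch {
    String methodName;
    String returnType;
    List<String> parameters = new ArrayList<>();
    String codeBlock; // may be null; the listing checks for this
}

class JClassSketch {
    String className;
    String superClass; // null when the class has no extends clause
    List<String> interfaces = new ArrayList<>();
    List<JMethodSketch> methodDeclarations = new ArrayList<>();
    List<JClassSketch> innerClasses = new ArrayList<>();
}

class FieldFlattener {
    /** Collects "field:value" pairs for a class and, recursively, its inner classes. */
    static void flatten(JClassSketch cls, List<String> out) {
        out.add("class:" + cls.className);
        if (cls.superClass != null) {
            out.add("extends:" + cls.superClass);
        }
        for (String itf : cls.interfaces) {
            out.add("implements:" + itf);
        }
        for (JMethodSketch m : cls.methodDeclarations) {
            out.add("method:" + m.methodName);
            out.add("return:" + m.returnType);
            for (String p : m.parameters) {
                out.add("parameter:" + p);
            }
        }
        // Inner classes contribute to the same output list, just as
        // addClass adds them to the same Document.
        for (JClassSketch inner : cls.innerClasses) {
            flatten(inner, out);
        }
    }

    public static void main(String[] args) {
        JClassSketch outer = new JClassSketch();
        outer.className = "Outer";
        outer.superClass = "Base";
        outer.interfaces.add("Runnable");

        JMethodSketch help = new JMethodSketch();
        help.methodName = "help";
        help.returnType = "int";
        help.parameters.add("String");

        JClassSketch inner = new JClassSketch();
        inner.className = "Helper";
        inner.methodDeclarations.add(help);
        outer.innerClasses.add(inner);

        List<String> fields = new ArrayList<>();
        flatten(outer, fields);
        for (String field : fields) {
            System.out.println(field);
        }
    }
}
```

One trade-off of this flattening is that the index no longer records which class a given method belongs to; it records only that the file contains both.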
