Using NHibernate.Search with ActiveRecord
Added by David Moore, last edited by David Moore on Aug 07, 2008

What this guide will help you with:

  • Take one of your ActiveRecord models and mark up fields in it you want to index
  • Create an index of your model
  • Search the index and get the results

What this guide won't help you with:

  • Tuning the performance of the indexing and searching (or even taking them into consideration)
  • Automating the creation and synchronizing of the index
  • Integrating NHibernate.Search with ActiveRecord gracefully
  • Making coffee

Prerequisites:

  • Knowledge of Subversion, and of building Castle and other projects from source using tools such as NAnt
  • Your web application / project already set up using Castle, including 1 or more ActiveRecord models we can try indexing and searching
  • A grasp of NHibernate, ActiveRecord etc

ActiveRecord provides an excellent way for managing your database entities. When it comes to searching for and acquiring these entities in flexible ways, ActiveRecord can do the trick in many cases, using various criteria.

However, when it comes to complex or more intelligent searches, particularly full-text, then the demand arises for a component that can index and analyse the data you want to search on in powerful ways.

This is where Lucene comes in (a free open source Apache project):

"Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform."

Lucene has been ported to many languages and platforms, including a .NET implementation called Lucene.Net. It's used by many websites and applications, including Wikipedia (http://wiki.apache.org/lucene-java/PoweredBy).

The Hibernate.Search project integrates Lucene with Hibernate:

"Full text search engines like Apache Lucene™ are a very powerful technology to bring free text/efficient queries to applications. If suffers several mismatches when dealing with a object domain model (keeping the index up to date, mismatch between the index structure and the domain model, querying mismatch...) Hibernate Search indexes your domain model thanks to a few annotations, takes care of the database / index synchronization and brings you back regular managed objects from free text queries. Hibernate Search is using Apache Lucene under the cover"

This all sounds great, right?

The NHibernate.Search work was recently done, so now we want to get this working with ActiveRecord so we can manage, index and search our entities easily.

What you need

  • You need to be working from the Castle Project Trunk and be able to build it. This is because we need to be using NHibernate 2.x. You really should be using the trunk anyway, as it will have the latest bug fixes and newest features, and many people are using the trunk for production websites.

References and Documentation

The following links should help you as you work more with NHibernate.Search, i.e. getting familiar with Lucene and Lucene queries, how Lucene is integrated through Hibernate.Search, etc:

Lucene: http://lucene.apache.org/
Lucene.Net: http://incubator.apache.org/lucene.net/
Hibernate.Search documentation (For the Java version, but has a very similar API): http://www.hibernate.org/hib_docs/search/reference/en/html/
NHibernate documentation: http://www.hibernate.org/hib_docs/nhibernate/1.2/reference/en/html/

Getting and building NHibernate.Search

NHibernate.Search is located under /src/NHibernate.Search, but you will also need the /lib dir from the trunk for Lucene.Net.dll

  • Build using NAnt (or alternatively, add the project to your solution)

Update your project so that it references the NHibernate (from Castle trunk), NHibernate.Search and Lucene.Net (both from the NHibernate.Search build) assemblies.

Now we want to

  • Configure NHibernate.Search
  • Place attributes on one of our models to determine how it gets indexed
  • Create our index
  • Search our index

Configure NHibernate

Here's what I have:

<facility id="arintegration" type="Castle.Facilities.ActiveRecordIntegration.ActiveRecordFacility,Castle.Facilities.ActiveRecordIntegration" isWeb="true">
    <assemblies>
        <item>MyAssembly</item>
    </assemblies>
    <config>
        <add key="cache.use_query_cache"        value="false" />
        <add key="show_sql"                     value="true" />
        <add key="dialect"                      value="NHibernate.Dialect.MsSql2000Dialect" />
        <add key="connection.driver_class"      value="NHibernate.Driver.SqlClientDriver" />
        <add key="connection.connection_string" value="#{connectionString}" />
        <add key="connection.provider"          value="NHibernate.Connection.DriverConnectionProvider" />
        <add key="hibernate.search.default.directory_provider" value="NHibernate.Search.Storage.FSDirectoryProvider, NHibernate.Search" />
        <add key="hibernate.search.default.indexBase"          value="~/Index"/>
        <add key="hibernate.search.analyzer" value="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"/>
    </config>
</facility>

~/Index is where all our index files and folders will be created. This will resolve to the Index directory under our application root. Make sure the ASPNET process has modify permissions on this directory. I won't keep this Index directory there and I recommend you don't either for security reasons, but we'll use this for now.
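As a rough illustration of moving the index out of the web root, the indexBase entry in the config above could point at an absolute path instead. The D:\Index location is only an example, and this sketch assumes the directory provider accepts an absolute path for this key:

```xml
<!-- Assumption: an absolute path outside the web root; D:\Index is an example location only -->
<add key="hibernate.search.default.indexBase" value="D:\Index"/>
```

Whichever location you choose, the process running the application still needs modify permissions on it.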

Initialize Search

We must hook NHibernate.Search into NHibernate so that it can automatically maintain the search index whenever our searchable models are saved, updated or deleted. NHibernate.Search uses the new event listener features from NHibernate 2 to achieve this (previous versions used the Interceptor model).

The best time and place to do this is at application startup, when you're also initializing things like ActiveRecord or the Windsor Container. Here's a simple example for a web application, in the HttpApplication class (typically called GlobalApplication.cs):

using Castle.ActiveRecord;
using Castle.ActiveRecord.Framework;
using NHibernate.Event;
using NHibernate.Search.Event;

...

public void Application_OnStart()
{
    ActiveRecordStarter.Initialize();

    ISessionFactoryHolder holder = ActiveRecordMediator.GetSessionFactoryHolder();
    NHibernate.Cfg.Configuration configuration = holder.GetConfiguration(typeof(ActiveRecordBase));
    
    configuration.SetListener( ListenerType.PostUpdate, new FullTextIndexEventListener() );
    configuration.SetListener( ListenerType.PostInsert, new FullTextIndexEventListener() );
    configuration.SetListener( ListenerType.PostDelete, new FullTextIndexEventListener() );
}

As you can see, we initialized ActiveRecord first, then we obtained the NHibernate configuration and set up the event listeners to update the index whenever a model is updated, inserted or deleted.

Marking up our Model

Open up an ActiveRecord entity:

using Castle.ActiveRecord;
using NHibernate.Search.Attributes;
using FieldAttribute=NHibernate.Search.Attributes.FieldAttribute; // To prevent clash with Castle.ActiveRecord.FieldAttribute

...

[ActiveRecord]
[Indexed] // This tells NHibernate.Search this entity should be indexed
public class MyEntity // Note, it's not required to inherit from ActiveRecordBase or its derivatives, e.g. if you're using Repositories
{
    // Backing fields for the properties below
    private int entityId;
    private string title;

    /// <summary>
    /// The unique ID
    /// </summary>
    [PrimaryKey]
    [DocumentId] // So Lucene knows this is our primary key.
    public int EntityId
    {
        get { return entityId; }
        set { entityId = value; }
    }

    /// <summary>
    /// The Title
    /// </summary>
    [Property]
    [Field(Index.Tokenized)] // Let's index our title, and tokenize it so we can do keyword searches
    public string Title
    {
        get { return title; }
        set { title = value; }
    }
}

OK that's all for now to keep it very simple! You can consult the Hibernate.Search documentation for more on this.

Create our Index

OK now we should be able to create an initial Index of all our entities which we can later search on. This is simply our starting point, and when any entities are saved, updated or deleted, our index should automatically be updated.

Here is a generic method you can use to create an index for an ActiveRecord entity (warning: this will delete any existing files or directories in the index directory first!). Also there is a method for optimizing the index, which you can do after creation or at certain times for maintenance reasons.

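using System;
using System.IO;
using Castle.ActiveRecord;
using Castle.ActiveRecord.Framework;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Store;
using NHibernate;
using NHibernate.Search;

...

// Example usage:
IndexHelper.CreateIndex<MyEntity>("D:\\Index");
IndexHelper.OptimizeIndex<MyEntity>();

...

/// <summary>
/// Creates an index for all entities of a specified type
/// </summary>
/// <param name="rootIndexDirectory">The root directory where indexes will be stored under.
/// This must be a physical path; map a virtual path such as ~/Index before calling.</param>
public static void CreateIndex<T>(string rootIndexDirectory) where T : class
{
    Type type = typeof(T);

    // Each indexed type gets its own subdirectory under the root, e.g. D:\Index\MyEntity
    var info = new DirectoryInfo( Path.Combine( rootIndexDirectory, type.Name ) );

    // Recursively delete the index and files in there
    if( info.Exists ) info.Delete( true );

    // Now recreate the index (the same directory we just deleted)
    FSDirectory dir = FSDirectory.GetDirectory( info.FullName, true );

    try
    {
        // Opening a writer with create = true writes a fresh, empty index
        var writer = new IndexWriter( dir, new StandardAnalyzer(), true );
        writer.Close();
    }
    finally
    {
        if( dir != null) dir.Close();
    }

    ISessionFactoryHolder holder = ActiveRecordMediator.GetSessionFactoryHolder();
    ISession session = holder.CreateSession( type );
    try
    {
        IFullTextSession fullTextSession = NHibernate.Search.Search.CreateFullTextSession( session );

        // Load every entity of this type and add it to the index
        foreach (T instance in ActiveRecordBase<T>.FindAll())
        {
            fullTextSession.Index(instance);
        }
    }
    finally
    {
        // Hand the session back to ActiveRecord so it is not leaked
        holder.ReleaseSession( session );
    }
}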

/// <summary>
/// Optimizes the index for the specified type
/// </summary>
public static void OptimizeIndex<T>()
{
    var nhibernateConfiguration =
        ActiveRecordMediator.GetSessionFactoryHolder().GetConfiguration(typeof(ActiveRecordBase));

    using(var workspace = new Workspace( SearchFactoryImpl.GetSearchFactory( nhibernateConfiguration )))
    {
        IndexWriter writer = workspace.GetIndexWriter( typeof(T) );
        writer.Optimize();
    }
}

The CreateIndex method should create an Index/MyEntity folder containing .cfs files, a segments file, and possibly a deletable file. Make sure ASPNET, or whatever process you're running under, has access to the Index folder so that it can create, modify and delete files and folders.

Verifying our index

You can use the Luke Lucene toolbox to open up your Index directory or directories, see what's in the index and run some test queries. This is an excellent way of verifying that the index is correct and has the expected content, and of finding out why a query isn't working. http://www.getopt.org/luke/

The beauty of Lucene and all its incarnations is that the index format is standardized, so you can create a Lucene index with the Java version, read it with the PHP port of Lucene, and modify it with the Lucene.Net version. That is why Luke works (and presumably any other Lucene tools you can find).

You should make sure you can get data into your index and search it easily using Luke before progressing. An example search for our Index/MyEntity folder would be "Title:Test", which would find any entities with the word "Test" in their titles. Remember that the property names are case sensitive.

Searching our index

By now you should have verified that your index files have been created and are working as expected.

Time to run a search!

I added 2 new methods to my entity: one for running a simple search, and a utility method for parsing the search terms into a Lucene Query we can use.

Sorry, but I'm not entirely sure of the required namespaces as I'm cutting and pasting code; hopefully you have ReSharper installed.

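using System;
using System.Collections;
using System.Text;
using Castle.ActiveRecord;
using Castle.ActiveRecord.Framework;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using NHibernate;
using NHibernate.Search;
using Query=Lucene.Net.Search.Query; // The Lucene query type, named explicitly to avoid ambiguity

...

public static IList SearchSimple(string searchString)
{
    ISessionFactoryHolder holder = ActiveRecordMediator.GetSessionFactoryHolder();
    ISession session = holder.CreateSession(typeof(MyEntity));

    try
    {
        // Create a Full Text session
        IFullTextSession fullTextSession = NHibernate.Search.Search.CreateFullTextSession(session);

        // Build our Lucene query
        Query luceneQuery = ParseLuceneQuery( searchString );

        // Transform the Lucene query to an NHibernate query,
        // and limit the result set types to MyEntity
        IQuery query = fullTextSession.CreateFullTextQuery( luceneQuery, typeof(MyEntity) );

        // List our results
        return query.List();
    }
    finally
    {
        // Hand the session back to ActiveRecord so it is not leaked
        holder.ReleaseSession(session);
    }
}

public static Query ParseLuceneQuery(string searchTerms)
{
    StringBuilder queryString = new StringBuilder();

    // Split the search string into keywords
    string[] words = searchTerms.Split(" ".ToCharArray());

    foreach (string keyword in words)
    {
        if (!String.IsNullOrEmpty(keyword))
        {
            queryString.AppendFormat(" Title:{0}", keyword);
        }
    }

    // The first argument is the default field for terms without a field prefix, not the query text
    QueryParser parser = new QueryParser("Title", new StandardAnalyzer());

    return parser.Parse(queryString.ToString());
}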

Say for example we enter in a search string as "Testing Search". This will create a Lucene query of "Title:Testing Title:Search" which should try to match any MyEntity that has Testing or Search in its title.
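To make the string handling concrete without involving Lucene at all, here is a minimal, self-contained sketch of just the query-text step. TitleQueryText and Build are hypothetical names used for illustration; they are not part of NHibernate.Search or Lucene.Net.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only: builds the Lucene query text
// for the Title field from a space-separated search string.
public static class TitleQueryText
{
    public static string Build(string searchTerms)
    {
        var clauses = new List<string>();

        // Split on spaces and drop the empty entries produced by repeated spaces
        string[] keywords = searchTerms.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string keyword in keywords)
        {
            // One "Title:<keyword>" clause per keyword
            clauses.Add("Title:" + keyword);
        }

        // Clauses are separated by a single space
        return string.Join(" ", clauses.ToArray());
    }
}

// TitleQueryText.Build("Testing Search") returns "Title:Testing Title:Search"
```

Note that this only builds the text. Keywords containing Lucene query syntax characters (such as : or *) are passed through unchanged, so real user input would need escaping or validation before being parsed.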

The matched search results (if any) will be returned in a list of MyEntity objects.

LuceneQueryExpression

You can also use Lucene searches with NHibernate expressions using LuceneQueryExpression, e.g.

// MyBuildQuery is a method you would have to create to parse the keywords into a Lucene query, similar to the ParseLuceneQuery method above
IList<MyEntity> entities = ActiveRecordMediator<MyEntity>.FindAll( new LuceneQueryExpression( MyBuildQuery("my keywords here") )  );
