Doxolve Indexing: Creating a custom filter - Part 1 (Design)

One of the core Windows services we developed for Doxolve is the indexing service. This service retrieves text (among other things) from the files added to your Doxolve store(s).

The service primarily attempts to index files using Doxolve-specific filters, and only falls back (as a failover) to the standard IFilters used by Windows Search. We've also added some "infrastructure" that lets developers add their own custom filters.

In this post we're going to have a look at some of the thought processes going into that "infrastructure". Please note that the Doxolve SDK is still in its infancy, but this should give you some idea of what to expect if you're planning to do custom development on the Doxolve platform.

The initial design for Doxolve filters looked something like this:

using UIT.Doxolve.Types;
using System.ServiceProcess;

public interface IFilter
{
    String GetText();
    void SetContent(Int32 fileId, byte[] content);
    Fields GetTemplate();
    Items GetProperties();
    Item GetProperty(String id);
    void SetProperty(String id, String value);
    void OnStart(ServiceBase service);
    void OnStop(ServiceBase service);
} 

The developer needs to implement this interface in order to make their custom filter consumable by the indexing service.

Unfortunately I don't feel that this is a very practical design, seeing that the majority of the methods required by the interface are optional (everything from GetTemplate through OnStop), not to mention the impact this design might have on future filters: a single ever-growing interface to rule them all, breaking all prior custom filters every time a method is added.
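To make the problem concrete, here's a rough sketch of what a plain-text filter would look like against the single interface. Fields, Items, Item and ServiceBase are empty stand-ins declared only so the sketch compiles on its own, and PlainTextFilter is a made-up name, not something from the SDK.

```csharp
using System;
using System.Text;

// Empty stand-ins for the UIT.Doxolve.Types and ServiceBase types (assumptions).
public class Fields { }
public class Items { }
public class Item { }
public class ServiceBase { }

public interface IFilter
{
    String GetText();
    void SetContent(Int32 fileId, byte[] content);
    Fields GetTemplate();
    Items GetProperties();
    Item GetProperty(String id);
    void SetProperty(String id, String value);
    void OnStart(ServiceBase service);
    void OnStop(ServiceBase service);
}

// Hypothetical filter that only cares about text.
public class PlainTextFilter : IFilter
{
    private byte[] _content = new byte[0];

    // The two methods the filter actually needs.
    public String GetText() { return Encoding.UTF8.GetString(_content); }
    public void SetContent(Int32 fileId, byte[] content) { _content = content ?? new byte[0]; }

    // Six stubs it has no use for, but must still provide to compile.
    public Fields GetTemplate() { return null; }
    public Items GetProperties() { return null; }
    public Item GetProperty(String id) { return null; }
    public void SetProperty(String id, String value) { }
    public void OnStart(ServiceBase service) { }
    public void OnStop(ServiceBase service) { }
}
```

Every new method added to IFilter would mean one more stub in every filter like this one.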

So we're going to decouple the optional methods from the mandatory methods.

GetText and SetContent are the mandatory methods, so we're going to place them in a base class.

public abstract class Filter
{
    public abstract String GetText();
    public abstract void SetContent(Int32 fileId, byte[] content); 
}

Observe the arguments of the SetContent method. fileId is a reference to the database entry of the doxolved file. The content argument is a potential problem from a memory consumption perspective, because the whole file has to be loaded into a byte array before the filter sees it. It would make more sense to pass a stream object to this method, or better yet a UIT.Doxolve.FileStream object (which contains the fileId plus a few other useful pieces of information):

public abstract class Filter
{
    public abstract String GetText();
    public abstract void SetContent(UIT.Doxolve.FileStream fileStream); 
}
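As a rough illustration of the memory argument, the sketch below decodes text from a stream in small chunks instead of receiving the whole file as a byte array. It uses System.IO.Stream as a stand-in for UIT.Doxolve.FileStream (whose members aren't covered in this post), and ChunkedTextFilter is a made-up name, so treat it as a sketch under those assumptions only.

```csharp
using System;
using System.IO;
using System.Text;

// Stand-in for the Filter base class; System.IO.Stream replaces
// UIT.Doxolve.FileStream here (assumption).
public abstract class Filter
{
    public abstract String GetText();
    public abstract void SetContent(Stream fileStream);
}

// Hypothetical filter that reads UTF-8 text in fixed-size chunks.
public class ChunkedTextFilter : Filter
{
    private readonly StringBuilder _text = new StringBuilder();

    public override String GetText() { return _text.ToString(); }

    public override void SetContent(Stream fileStream)
    {
        _text.Length = 0; // discard text from any previous file

        // The reader is deliberately not disposed, so the caller's stream stays open.
        StreamReader reader = new StreamReader(fileStream, Encoding.UTF8);
        char[] buffer = new char[4096];
        int read;
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Only one 4096-character buffer of raw input is held at a time.
            _text.Append(buffer, 0, read);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Filter filter = new ChunkedTextFilter();
        using (Stream stream = new MemoryStream(Encoding.UTF8.GetBytes("hello doxolve")))
        {
            filter.SetContent(stream);
        }
        Console.WriteLine(filter.GetText());
    }
}
```

The extracted text still ends up in memory, but the raw file content no longer has to be buffered in full before the filter can start working.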

GetTemplate, GetProperties, GetProperty and SetProperty are the methods needed to populate searchable file properties in Doxolve. Let's separate them into an interface called IFilePropertyReader.

public interface IFilePropertyReader
{
    Fields GetTemplate();
    Items GetProperties();
    Item GetProperty(String id);
    void SetProperty(String id, String value);
} 

OnStart and OnStop are methods that will be called when certain events are triggered in the indexing service. Let's separate these into an interface called IServiceHandler.

public interface IServiceHandler
{
    void OnStart(ServiceBase service);
    void OnStop(ServiceBase service);
} 

All of this makes implementing a custom filter a lot cleaner. If we need to create a basic filter, we do something like this:

public class MyFilter : Filter
{
    public override String GetText()
    {
        return String.Empty;
    }

    public override void SetContent(UIT.Doxolve.FileStream fileStream)
    {
        // Do Something
    }
}

If we're going to interact with the service, we simply need to implement the IServiceHandler interface as well. (If one of our custom filters hits a critical error, this gives us the ability to stop the service from within the filter.)

public class MyFilter : Filter, IServiceHandler
{
    public override String GetText()
    {
        return String.Empty;
    }

    public override void SetContent(UIT.Doxolve.FileStream fileStream) {}
    public void OnStart(ServiceBase service) {}
    public void OnStop(ServiceBase service) {}
}

If the filter is going to return file properties (an MP3, for example, has genre, duration in seconds, artist and so on), we implement the IFilePropertyReader interface:

public class MyFilter : Filter, IFilePropertyReader
{
    public override String GetText()
    {
        return String.Empty;
    }

    public override void SetContent(UIT.Doxolve.FileStream fileStream) {}

    // Placeholder bodies: non-void methods must return a value to compile.
    public Fields GetTemplate() { return null; }
    public Items GetProperties() { return null; }
    public Item GetProperty(String id) { return null; }
    public void SetProperty(String id, String value) {}
}
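On the consuming side, the split means a host can simply ask each filter which optional interfaces it supports. The sketch below is one way that could look. FilterHost, QuietFilter and ChattyFilter are made-up names, ServiceBase is an empty stand-in, and the Filter class is trimmed to GetText, so this is not how the actual indexing service is written.

```csharp
using System;
using System.Collections.Generic;

// Empty stand-in for System.ServiceProcess.ServiceBase (assumption).
public class ServiceBase { }

// Trimmed stand-in for the Filter base class.
public abstract class Filter
{
    public abstract String GetText();
}

public interface IServiceHandler
{
    void OnStart(ServiceBase service);
    void OnStop(ServiceBase service);
}

// A filter that ignores service events.
public class QuietFilter : Filter
{
    public override String GetText() { return "quiet"; }
}

// A filter that opts in to service events and records them.
public class ChattyFilter : Filter, IServiceHandler
{
    public readonly List<String> Events = new List<String>();

    public override String GetText() { return "chatty"; }
    public void OnStart(ServiceBase service) { Events.Add("start"); }
    public void OnStop(ServiceBase service) { Events.Add("stop"); }
}

// Hypothetical host, not the actual Doxolve indexing service.
public class FilterHost
{
    private readonly ServiceBase _service = new ServiceBase();
    private readonly List<Filter> _filters = new List<Filter>();

    public void Register(Filter filter) { _filters.Add(filter); }

    // Notifies only the filters that implement IServiceHandler
    // and returns how many were notified.
    public int Start()
    {
        int notified = 0;
        foreach (Filter filter in _filters)
        {
            IServiceHandler handler = filter as IServiceHandler;
            if (handler != null)
            {
                handler.OnStart(_service);
                notified++;
            }
        }
        return notified;
    }
}
```

The same kind of probe would work for IFilePropertyReader, and a filter that implements neither interface never has to know they exist.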

In the next part of this post we're going to have a look at how to create a simple filter that reads text files.

10 Aug 2011, 19:35 PM

