org.apache.cocoon.components.search
Interface LuceneXMLIndexer

All Superinterfaces:
org.apache.avalon.framework.component.Component
All Known Implementing Classes:
SimpleLuceneXMLIndexerImpl

public interface LuceneXMLIndexer
extends org.apache.avalon.framework.component.Component

An Avalon behavioural component interface for generating Lucene documents from XML content.

The well-known fields of a Lucene document are defined as *_FIELD constants.

You can access the generated Lucene documents via allDocuments() or iterator().

You trigger the generation of Lucene documents via build().
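
As a rough illustration of how a caller might use build() and the *_FIELD constants, here is a minimal sketch. It does not use the real Cocoon or Lucene classes: XmlIndexerSketch, FixedTextIndexer and IndexerUsage are hypothetical names, a Map<String, String> stands in for a Lucene document, and a plain Exception stands in for ProcessingException. The toy implementation indexes a fixed string instead of fetching the URL.

```java
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for LuceneXMLIndexer. A Map<String, String> replaces
// the Lucene Document type so the sketch runs without Lucene or Cocoon.
interface XmlIndexerSketch {
    // Field names as documented: body, url, uid.
    String BODY_FIELD = "body";
    String URL_FIELD = "url";
    String UID_FIELD = "uid";

    // The real build() throws ProcessingException; Exception stands in for it.
    List<Map<String, String>> build(URL url) throws Exception;
}

// Toy implementation: indexes a fixed string instead of fetching the URL.
class FixedTextIndexer implements XmlIndexerSketch {
    private final String text;

    FixedTextIndexer(String text) {
        this.text = text;
    }

    @Override
    public List<Map<String, String>> build(URL url) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put(BODY_FIELD, text);
        doc.put(URL_FIELD, url.toString());
        // Assumption: the URL alone is used as the unique key here.
        doc.put(UID_FIELD, url.toString());
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc);
        return docs;
    }
}

class IndexerUsage {
    // Calls build() and reads the well-known fields of each returned document.
    static List<String> summarize(XmlIndexerSketch indexer, String address) {
        List<String> lines = new ArrayList<>();
        try {
            URL url = URI.create(address).toURL();
            for (Map<String, String> doc : indexer.build(url)) {
                lines.add(doc.get(XmlIndexerSketch.URL_FIELD)
                        + " -> " + doc.get(XmlIndexerSketch.BODY_FIELD));
            }
        } catch (Exception e) {
            lines.add("indexing failed: " + e.getMessage());
        }
        return lines;
    }
}
```

The sketch reads the documents from the List returned by build(), which is the only access path the method signature itself guarantees.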

Version:
CVS $Id: LuceneXMLIndexer.java,v 1.5 2002/02/22 07:00:12 cziegeler Exp $
Author:
Bernhard Huber

Field Summary
static java.lang.String BODY_FIELD
          A Lucene document field name, containing the XML content text of all XML elements.
static java.lang.String ROLE
          The ROLE name of this Avalon component.
static java.lang.String UID_FIELD
          A Lucene document field name, containing a unique key of the indexed document.
static java.lang.String URL_FIELD
          A Lucene document field name, containing the URI/URL of the indexed document.
 
Method Summary
 java.util.List build(java.net.URL url)
          Builds Lucene documents from a URL.
 

Field Detail

ROLE

public static final java.lang.String ROLE
The ROLE name of this Avalon component.

Its value is the FQN of this interface, i.e. org.apache.cocoon.components.search.LuceneXMLIndexer.
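
To illustrate why a role name is useful, the following sketch looks a component up by its ROLE string. It is not the Avalon API: IndexerRoleSketch and RoleRegistry are hypothetical stand-ins, and the registry is a plain map that only mimics the idea of looking up a component by role.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the real interface; only the ROLE pattern matters here.
interface IndexerRoleSketch {
    // The documented value of ROLE: the fully qualified name of LuceneXMLIndexer.
    String ROLE = "org.apache.cocoon.components.search.LuceneXMLIndexer";
}

// Hypothetical, greatly simplified registry; it is not an Avalon component manager.
class RoleRegistry {
    private final Map<String, Object> components = new HashMap<>();

    // Associates a component instance with a role name.
    void register(String role, Object component) {
        components.put(role, component);
    }

    // Returns the component registered under the role, or fails loudly.
    Object lookup(String role) {
        Object component = components.get(role);
        if (component == null) {
            throw new IllegalStateException("No component for role " + role);
        }
        return component;
    }
}
```

Using the interface's fully qualified name as the key keeps role names unique across packages without any extra naming scheme.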


BODY_FIELD

public static final java.lang.String BODY_FIELD
A Lucene document field name, containing the XML content text of all XML elements.

A concrete implementation of this interface SHOULD provide a field named body.

A concrete implementation MAY provide additional Lucene document fields.


URL_FIELD

public static final java.lang.String URL_FIELD
A Lucene document field name, containing the URI/URL of the indexed document.

A concrete implementation of this interface SHOULD provide a field named url.


UID_FIELD

public static final java.lang.String UID_FIELD
A Lucene document field name, containing a unique key of the indexed document.

This document field is used internally to track document changes and updates.

A concrete implementation of this interface SHOULD provide a field named uid.
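
One way a unique key can support change tracking is sketched below. The uid format is an assumption, not something this interface specifies: here a uid combines the document URL with its last-modified time, so a changed document gets a new uid. UidSketch and its methods are hypothetical names.

```java
import java.util.Set;
import java.util.TreeSet;

class UidSketch {
    // Assumption: a uid combines the document URL and its last-modified time,
    // so any change to the source document yields a different uid.
    static String uid(String url, long lastModifiedMillis) {
        return url + "|" + lastModifiedMillis;
    }

    // uids present in the index but not in the current set: stale entries to delete.
    static Set<String> stale(Set<String> indexed, Set<String> current) {
        Set<String> result = new TreeSet<>(indexed);
        result.removeAll(current);
        return result;
    }

    // uids in the current set but not in the index: new or changed documents to index.
    static Set<String> pending(Set<String> indexed, Set<String> current) {
        Set<String> result = new TreeSet<>(current);
        result.removeAll(indexed);
        return result;
    }
}
```

With this scheme an update shows up as one stale uid (the old version) plus one pending uid (the new version), while unchanged documents appear in neither set.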

Method Detail

build

public java.util.List build(java.net.URL url)
                     throws ProcessingException
Builds Lucene documents from a URL.

This method reads the content of the URL and generates one or more Lucene documents. The generated Lucene documents can be fetched using the methods allDocuments() and iterator().

Parameters:
url - the URL whose content is indexed.
Throws:
ProcessingException - Description of Exception


Copyright © 1999-2002 Apache Software Foundation. All Rights Reserved.