JavaHelp™ 1.0 - Content Search


Copyright 1998-1999 Sun Microsystems


Search API

JavaHelp provides full-text searching of help topics. Developing full-text searching raised interesting questions in both the implementation and the specification: for example, whether the search database is created before or during queries, and how the format of the search database is specified.

The search API, javax.help.search.*, can be used to create and query the search database. The default NavigatorView, SearchView, knows how to interact with any subclass of SearchEngine. Similarly, the search database can be created through the IndexBuilder class.

One benefit of the javax.help.search API is that it enables the use of search engines that require moderately complex database formats, without the difficult and constraining task of specifying these formats in full. One such search engine is provided in Sun's reference implementation.

The intention of the javax.help.search package is to provide insulation between clients and customers of a full-text search database in the context of the javax.help package. It is important to emphasize that although the javax.help.search API is intended to be of general applicability, it is not intended to replace more powerful query mechanisms.

Search Database Creation

Search databases are created through instances of IndexBuilder. How each file is parsed depends on its MIME type; this is captured in the notion of an IndexerKit. An indexer kit provides a parse() method that knows how to parse a specific MIME type and calls back into the IndexBuilder instance to capture the information from that source.
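The callback relationship described above can be sketched in plain Java. This is only an illustration of the pattern: TokenSink, Kit, and PlainTextKit are hypothetical stand-ins invented for this example, not the javax.help.search classes, and the real IndexBuilder and IndexerKit signatures are not reproduced here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class IndexerKitSketch {
    /** Hypothetical stand-in for the builder side: receives what a kit finds. */
    interface TokenSink {
        void storeToken(String token, int position);
    }

    /** Hypothetical stand-in for an indexer kit: one parser per MIME type. */
    interface Kit {
        void parse(Reader in, TokenSink sink) throws IOException;
    }

    /** Plain-text kit: reports each lower-cased word with its character offset. */
    static final class PlainTextKit implements Kit {
        public void parse(Reader in, TokenSink sink) throws IOException {
            StringBuilder word = new StringBuilder();
            int pos = 0, start = 0, c;
            while ((c = in.read()) != -1) {
                if (Character.isLetterOrDigit(c)) {
                    if (word.length() == 0) start = pos;   // a new word begins here
                    word.append((char) c);
                } else if (word.length() > 0) {
                    sink.storeToken(word.toString().toLowerCase(Locale.ROOT), start);
                    word.setLength(0);
                }
                pos++;
            }
            if (word.length() > 0) {                        // flush the last word
                sink.storeToken(word.toString().toLowerCase(Locale.ROOT), start);
            }
        }
    }

    /** Picks the kit registered for the MIME type and collects its callbacks. */
    static List<String> index(String mimeType, String content, Map<String, Kit> kits) {
        Kit kit = kits.get(mimeType);
        if (kit == null) throw new IllegalArgumentException("no kit for " + mimeType);
        List<String> captured = new ArrayList<>();
        try {
            kit.parse(new StringReader(content),
                      (token, position) -> captured.add(token + "@" + position));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return captured;
    }

    public static void main(String[] args) {
        Map<String, Kit> kits = new HashMap<>();
        kits.put("text/plain", new PlainTextKit());
        System.out.println(index("text/plain", "Search the help", kits));
        // prints [search@0, the@7, help@11]
    }
}
```

Keeping the parser behind a per-MIME-type interface means the builder never needs to know how a given content type is tokenized; supporting another type only requires registering another kit.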

When capturing search information there are a number of parameters that you can configure using a ConfigFile:

Stopwords

You can direct the JavaHelp system full-text search indexer to exclude certain words from the database index--these words are called stopwords. The default stopwords are:
a       all     am      an      and     any     are     as
at      be      but     by      can     could   did     do
does    etc     for     from    goes    got     had     has
have    he      her     him     his     how     if      in
is      it      let     me      more    much    must    my
nor     not     now     of      off     on      or      our
own     see     set     shall   she     should  so      some
than    that    the     them    then    there   these   this
those   though  to      too     us      was     way     we
what    when    where   which   who     why     will    would
yes     yet     you
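To make the effect of stopwords concrete, here is a minimal sketch of filtering them out before words reach an index. It is not the JavaHelp indexer's code: the class and method names are hypothetical, the tokenization rule is an assumption, and only a handful of the default stopwords listed above are included.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopwordSketch {
    // A few of the default stopwords; a full list would be loaded the same way.
    static final Set<String> STOPWORDS = new HashSet<>(
            Arrays.asList("a", "an", "and", "the", "of", "to", "is", "in"));

    /** Returns the lower-cased words of text that are not stopwords. */
    static List<String> indexableWords(String text) {
        List<String> kept = new ArrayList<>();
        // Assumed tokenization: split on anything that is not a letter or digit.
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!word.isEmpty() && !STOPWORDS.contains(word)) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(indexableWords("The format of the search database is specified"));
        // prints [format, search, database, specified]
    }
}
```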

ConfigFile Directives

A config file may contain the following directives:

Directive                      Description
IndexRemove path               Remove path when it is a prefix of the given file names.
IndexPrepend path              Prepend path to the names of the given files.
File filename                  Request that filename be processed.
StopWords word, word, word...  Set the stopwords to the ones indicated.
StopWordsFile filename         Read the stopwords from filename, which must contain a list of stopwords, one stopword per line.
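The interaction of IndexRemove, IndexPrepend, and File can be illustrated with a small sketch. It assumes each directive occupies one line, with the directive name followed by whitespace and its argument, and that the remove and prepend settings apply to every File entry regardless of order; the class, the method names, and the paths are hypothetical, and this is not the ConfigFile implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ConfigSketch {
    /** Strips the IndexRemove prefix (if present) and adds the IndexPrepend path. */
    public static String indexedName(String file, String remove, String prepend) {
        String name = file;
        if (remove != null && name.startsWith(remove)) {
            name = name.substring(remove.length());
        }
        return prepend == null ? name : prepend + name;
    }

    /** Reads directive lines and returns the name each File would be recorded under. */
    public static List<String> indexedNames(List<String> configLines) {
        String remove = null;
        String prepend = null;
        List<String> files = new ArrayList<>();
        for (String line : configLines) {
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length < 2) continue;          // skip blank or malformed lines
            switch (parts[0]) {
                case "IndexRemove":  remove = parts[1];   break;
                case "IndexPrepend": prepend = parts[1];  break;
                case "File":         files.add(parts[1]); break;
                default:             break;  // StopWords directives are ignored in this sketch
            }
        }
        List<String> names = new ArrayList<>();
        for (String file : files) {
            names.add(indexedName(file, remove, prepend));
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> config = Arrays.asList(
                "IndexRemove /build/help/",
                "IndexPrepend docs/",
                "File /build/help/intro.html",
                "File /build/help/api/search.html");
        System.out.println(indexedNames(config));
        // prints [docs/intro.html, docs/api/search.html]
    }
}
```

Rewriting names this way lets the files be indexed from a build location while the database records the paths under which they will be found at run time.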

Search Database Use

The javax.help.search package is used in JavaHelp 1.0 by SearchView. This view expects an engine property that specifies the name of the subclass of javax.help.search.SearchEngine to use when making queries. The default value of this property is com.sun.java.help.search.SearchEngine.
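Selecting an engine by class name could work roughly as in the following sketch, which loads a class named by a property value and checks that it extends the expected base type. Engine and DefaultEngine are hypothetical stand-ins defined here for illustration; the sketch does not show how SearchView actually instantiates a javax.help.search.SearchEngine subclass, nor the constructor arguments such a subclass takes.

```java
class EngineLoadingSketch {
    /** Hypothetical stand-in for the abstract engine type a view programs against. */
    abstract static class Engine {
        abstract String describe();
    }

    /** Hypothetical stand-in for a concrete engine supplied by an implementation. */
    static class DefaultEngine extends Engine {
        String describe() {
            return "default engine";
        }
    }

    /** Instantiates the engine named by the property value, rejecting unrelated classes. */
    static Engine load(String engineClassName) {
        try {
            Class<? extends Engine> type =
                    Class.forName(engineClassName).asSubclass(Engine.class);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("cannot use engine " + engineClassName, e);
        }
    }

    public static void main(String[] args) {
        // In this sketch the "engine property" is simply a class name string.
        String engineProperty = DefaultEngine.class.getName();
        System.out.println(load(engineProperty).describe());
        // prints default engine
    }
}
```

Because the view only depends on the base type, a different engine can be substituted by changing the property value, without changing the view.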

The steps involved in using the search engine from a SearchView are:

More details may be added in the next iteration of the specification.


Send your comments to javahelp-comments@eng.sun.com
Last modified: Mon Apr 12 16:46:00 MDT 1999