ODEUM(3) Quick Database Manager ODEUM(3)
NAME
Odeum - the inverted API of QDBM
SYNOPSIS
#include <depot.h>
#include <cabin.h>
#include <odeum.h>
#include <stdlib.h>
typedef struct { int id; int score; } ODPAIR;
ODEUM *odopen(const char *name, int omode);
int odclose(ODEUM *odeum);
int odput(ODEUM *odeum, const ODDOC *doc, int wmax, int over);
int odout(ODEUM *odeum, const char *uri);
int odoutbyid(ODEUM *odeum, int id);
ODDOC *odget(ODEUM *odeum, const char *uri);
ODDOC *odgetbyid(ODEUM *odeum, int id);
int odgetidbyuri(ODEUM *odeum, const char *uri);
int odcheck(ODEUM *odeum, int id);
ODPAIR *odsearch(ODEUM *odeum, const char *word, int max, int *np);
int odsearchdnum(ODEUM *odeum, const char *word);
int oditerinit(ODEUM *odeum);
ODDOC *oditernext(ODEUM *odeum);
int odsync(ODEUM *odeum);
int odoptimize(ODEUM *odeum);
char *odname(ODEUM *odeum);
double odfsiz(ODEUM *odeum);
int odbnum(ODEUM *odeum);
int odbusenum(ODEUM *odeum);
int oddnum(ODEUM *odeum);
int odwnum(ODEUM *odeum);
int odwritable(ODEUM *odeum);
int odfatalerror(ODEUM *odeum);
int odinode(ODEUM *odeum);
time_t odmtime(ODEUM *odeum);
int odmerge(const char *name, const CBLIST *elemnames);
int odremove(const char *name);
ODDOC *oddocopen(const char *uri);
void oddocclose(ODDOC *doc);
void oddocaddattr(ODDOC *doc, const char *name, const char *value);
void oddocaddword(ODDOC *doc, const char *normal, const char *asis);
int oddocid(const ODDOC *doc);
const char *oddocuri(const ODDOC *doc);
const char *oddocgetattr(const ODDOC *doc, const char *name);
const CBLIST *oddocnwords(const ODDOC *doc);
const CBLIST *oddocawords(const ODDOC *doc);
CBMAP *oddocscores(const ODDOC *doc, int max, ODEUM *odeum);
CBLIST *odbreaktext(const char *text);
char *odnormalizeword(const char *asis);
ODPAIR *odpairsand(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum,
int *np);
ODPAIR *odpairsor(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum,
int *np);
ODPAIR *odpairsnotand(ODPAIR *apairs, int anum, ODPAIR *bpairs, int
bnum, int *np);
void odpairssort(ODPAIR *pairs, int pnum);
double odlogarithm(double x);
double odvectorcosine(const int *avec, const int *bvec, int vnum);
void odsettuning(int ibnum, int idnum, int cbnum, int csiz);
void odanalyzetext(ODEUM *odeum, const char *text, CBLIST *awords,
CBLIST *nwords);
void odsetcharclass(ODEUM *odeum, const char *spacechars, const char
*delimchars, const char *gluechars);
ODPAIR *odquery(ODEUM *odeum, const char *query, int *np, CBLIST
*errors);
DESCRIPTION
Odeum is the API which handles an inverted index. An inverted index is
a data structure that maps each word extracted from a population of
documents to the list of documents that include it. An inverted index
makes it easy to build a full-text search system.
Odeum provides an abstract data structure which consists of words and
attributes of a document. It is used when an application stores a doc‐
ument into a database and when an application retrieves some documents
from a database.
Odeum does not provide methods to extract the text from the original
data of a document. It should be implemented by applications.
Although Odeum provides utilities to extract words from a text, they
are oriented to languages whose words are separated with space
characters, such as English. If an application handles languages which
need morphological analysis or N-gram analysis, such as Japanese, or if
an application performs more specialized analysis of natural languages,
such as stemming, its own analyzing method can be adopted. The result
of a search is expressed as an array whose elements are structures
composed of the ID number of a document and its score. In order to
search with two or more words, Odeum provides utilities for set
operations.
Odeum is implemented on top of Curia, Cabin, and Villa. Odeum creates
a database with a directory name. Some databases of Curia and Villa
are placed in the specified directory. For example, `casket/docs',
`casket/index', and `casket/rdocs' are created when the database
directory is named `casket'. `docs' is a database directory of
Curia. The key of each record is the ID number of a document, and the
value is such attributes as URI. `index' is a database directory of
Curia. The key of each record is the normalized form of a word, and
the value is an array whose element is a pair of the ID number of a
document including the word and its score. `rdocs' is a database file
of Villa. The key of each record is the URI of a document, and the
value is its ID number.
In order to use Odeum, you should include `depot.h', `cabin.h',
`odeum.h' and `stdlib.h' in the source files. Usually, the following
description will be near the beginning of a source file.
#include <depot.h>
#include <cabin.h>
#include <odeum.h>
#include <stdlib.h>
A pointer to `ODEUM' is used as a database handle. A database handle
is opened with the function `odopen' and closed with `odclose'. You
should not refer directly to any member of the handle. If a fatal
error occurs in a database, any access method via the handle except
`odclose' will not work and will return an error status. Although a
process is allowed to use multiple database handles at the same time,
it should not open more than one handle to the same database.
A pointer to `ODDOC' is used as a document handle. A document handle
is opened with the function `oddocopen' and closed with `oddocclose'.
You should not refer directly to any member of the handle. A document
consists of attributes and words. Each word is expressed as a pair of
a normalized form and an appearance form.
Odeum also assigns the error code to the external variable `dpecode'.
The function `dperrmsg' is used in order to get the message of the
error code.
Structures of `ODPAIR' type are used in order to handle search
results.
typedef struct { int id; int score; } ODPAIR;
`id' specifies the ID number of a document. `score' specifies
the score calculated from the number of searching words in the
document.
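As a rough illustration of how a result element pairs a document ID with a score, the following standalone sketch does not call Odeum at all. `Pair' mirrors the layout of `ODPAIR', while `ToyDoc' and `toy_search' are hypothetical names. Using the raw occurrence count as the score is a simplifying assumption; how Odeum actually weights scores is not specified here.

```c
#include <assert.h>
#include <string.h>

/* Mirrors the layout of ODPAIR; defined locally so the sketch stands alone. */
typedef struct { int id; int score; } Pair;

/* Hypothetical in-memory document: an ID and its normalized words. */
typedef struct { int id; const char *const *words; int wnum; } ToyDoc;

/* Write one pair per document that contains `word'.  The score is the
   number of occurrences (an assumption made for illustration only).
   `out' must have room for `dnum' elements.  Returns the pair count. */
int toy_search(const ToyDoc *docs, int dnum, const char *word, Pair *out){
  int i, j, n = 0;
  assert(docs != NULL && word != NULL && out != NULL);
  for(i = 0; i < dnum; i++){
    int hits = 0;
    for(j = 0; j < docs[i].wnum; j++){
      if(strcmp(docs[i].words[j], word) == 0) hits++;
    }
    if(hits > 0){
      out[n].id = docs[i].id;
      out[n].score = hits;
      n++;
    }
  }
  return n;
}
```

A document that does not contain the word produces no pair, so an empty result simply has zero elements.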
The function `odopen' is used in order to get a database handle.
ODEUM *odopen(const char *name, int omode);
`name' specifies the name of a database directory. `omode'
specifies the connection mode: `OD_OWRITER' as a writer,
`OD_OREADER' as a reader. If the mode is `OD_OWRITER', the
following may be added by bitwise or: `OD_OCREAT', which means it
creates a new database if one does not exist, `OD_OTRUNC', which
means it creates a new database regardless of whether one exists.
Both `OD_OREADER' and `OD_OWRITER' can be combined by bitwise or
with `OD_ONOLCK', which means it opens a database directory without
file locking, or `OD_OLCKNB', which means locking is performed
without blocking. The return value is the database handle or
`NULL' if it is not successful. While connecting as a writer,
an exclusive lock is applied to the database directory. While
connecting as a reader, a shared lock is applied to the database
directory. The thread blocks until the lock is acquired. If
`OD_ONOLCK' is used, the application is responsible for exclusion
control.
The function `odclose' is used in order to close a database handle.
int odclose(ODEUM *odeum);
`odeum' specifies a database handle. If successful, the return
value is true, else, it is false. Because the region of a
closed handle is released, it becomes impossible to use the
handle. Updates to a database are assured to be written when the
handle is closed. If a writer opens a database but does not close
it appropriately, the database will be broken.
The function `odput' is used in order to store a document.
int odput(ODEUM *odeum, const ODDOC *doc, int wmax, int over);
`odeum' specifies a database handle connected as a writer.
`doc' specifies a document handle. `wmax' specifies the max
number of words to be stored in the document database. If it is
negative, the number is unlimited. `over' specifies whether the
data of the duplicated document is overwritten or not. If it is
false and the URI of the document is duplicated, the function
returns as an error. If successful, the return value is true,
else, it is false.
The function `odout' is used in order to delete a document specified by
a URI.
int odout(ODEUM *odeum, const char *uri);
`odeum' specifies a database handle connected as a writer.
`uri' specifies the string of the URI of a document. If suc‐
cessful, the return value is true, else, it is false. False is
returned when no document corresponds to the specified URI.
The function `odoutbyid' is used in order to delete a document speci‐
fied by an ID number.
int odoutbyid(ODEUM *odeum, int id);
`odeum' specifies a database handle connected as a writer. `id'
specifies the ID number of a document. If successful, the
return value is true, else, it is false. False is returned when
no document corresponds to the specified ID number.
The function `odget' is used in order to retrieve a document specified
by a URI.
ODDOC *odget(ODEUM *odeum, const char *uri);
`odeum' specifies a database handle. `uri' specifies the string
of the URI of a document. If successful, the return value is
the handle of the corresponding document, else, it is `NULL'.
`NULL' is returned when no document corresponds to the specified
URI. Because the handle of the return value is opened with the
function `oddocopen', it should be closed with the function
`oddocclose'.
The function `odgetbyid' is used in order to retrieve a document by an
ID number.
ODDOC *odgetbyid(ODEUM *odeum, int id);
`odeum' specifies a database handle. `id' specifies the ID num‐
ber of a document. If successful, the return value is the han‐
dle of the corresponding document, else, it is `NULL'. `NULL'
is returned when no document corresponds to the specified ID
number. Because the handle of the return value is opened with
the function `oddocopen', it should be closed with the function
`oddocclose'.
The function `odgetidbyuri' is used in order to retrieve the ID of the
document specified by a URI.
int odgetidbyuri(ODEUM *odeum, const char *uri);
`odeum' specifies a database handle. `uri' specifies the string
of the URI of a document. If successful, the return value is the
ID number of the document, else, it is -1. -1 is returned when
no document corresponds to the specified URI.
The function `odcheck' is used in order to check whether the document
specified by an ID number exists.
int odcheck(ODEUM *odeum, int id);
`odeum' specifies a database handle. `id' specifies the ID num‐
ber of a document. The return value is true if the document
exists, else, it is false.
The function `odsearch' is used in order to search the inverted index
for documents including a particular word.
ODPAIR *odsearch(ODEUM *odeum, const char *word, int max, int *np);
`odeum' specifies a database handle. `word' specifies a searching
word. `max' specifies the max number of documents to be
retrieved. `np' specifies the pointer to a variable to which the
number of the elements of the return value is assigned. If
successful, the return value is the pointer to an array, else, it
is `NULL'. Each element of the array is a pair of the ID number
and the score of a document, and the elements are sorted in
descending order of their scores. Even if no document corresponds
to the specified word, it is not an error but returns a dummy
array. Because the region of the return value is allocated with
the `malloc' call, it should be released with the `free' call if
it is no longer in use. Note that each element of the array of
the return value can be data of a deleted document.
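Because a result array can still mention deleted documents, an application may want to drop such elements before showing results. The following standalone sketch compacts an array in place. `Pair' mirrors `ODPAIR', and `toy_drop_deleted' and `toy_in_list' are hypothetical names. With Odeum, the predicate could plausibly wrap `odcheck' with the database handle passed as `ctx'; here it consults a plain list of live IDs so that the sketch needs no database.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the layout of ODPAIR; defined locally so the sketch stands alone. */
typedef struct { int id; int score; } Pair;

/* Compact `pairs' in place, keeping only elements whose ID passes `exists'.
   The relative order is preserved.  Returns the new element count. */
int toy_drop_deleted(Pair *pairs, int pnum,
                     int (*exists)(void *ctx, int id), void *ctx){
  int i, n = 0;
  assert(pnum >= 0 && exists != NULL);
  for(i = 0; i < pnum; i++){
    if(exists(ctx, pairs[i].id)) pairs[n++] = pairs[i];
  }
  return n;
}

/* Demo predicate: `ctx' points to an int array of live IDs ended by -1. */
int toy_in_list(void *ctx, int id){
  const int *live = ctx;
  for(; *live != -1; live++){
    if(*live == id) return 1;
  }
  return 0;
}
```

Since the order is preserved, an array sorted by score stays sorted after filtering.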
The function `odsearchdnum' is used in order to get the number of
documents including a word.
int odsearchdnum(ODEUM *odeum, const char *word);
`odeum' specifies a database handle. `word' specifies a search‐
ing word. If successful, the return value is the number of doc‐
uments including the word, else, it is -1. Because this func‐
tion does not read the entity of the inverted index, it is
faster than `odsearch'.
The function `oditerinit' is used in order to initialize the iterator
of a database handle.
int oditerinit(ODEUM *odeum);
`odeum' specifies a database handle. If successful, the return
value is true, else, it is false. The iterator is used in order
to access every document stored in a database.
The function `oditernext' is used in order to get the next document of
the iterator.
ODDOC *oditernext(ODEUM *odeum);
`odeum' specifies a database handle. If successful, the return
value is the handle of the next document, else, it is `NULL'.
`NULL' is returned when no document remains to be retrieved from
the iterator. It is possible to access every document by calling
this function repeatedly. However, that is not assured if the
database is updated during the iteration. Besides, the order of
this traversal access method is arbitrary, so it is not assured
that the order of storing matches the one of the traversal
access. Because the handle of the return value is opened with the
function `oddocopen', it should be closed with the function
`oddocclose'.
The function `odsync' is used in order to synchronize updated contents
with the files and the devices.
int odsync(ODEUM *odeum);
`odeum' specifies a database handle connected as a writer. If
successful, the return value is true, else, it is false. This
function is useful when another process uses the connected data‐
base directory.
The function `odoptimize' is used in order to optimize a database.
int odoptimize(ODEUM *odeum);
`odeum' specifies a database handle connected as a writer. If
successful, the return value is true, else, it is false. Ele‐
ments of the deleted documents in the inverted index are purged.
The function `odname' is used in order to get the name of a database.
char *odname(ODEUM *odeum);
`odeum' specifies a database handle. If successful, the return
value is the pointer to the region of the name of the database,
else, it is `NULL'. Because the region of the return value is
allocated with the `malloc' call, it should be released with the
`free' call if it is no longer in use.
The function `odfsiz' is used in order to get the total size of data‐
base files.
double odfsiz(ODEUM *odeum);
`odeum' specifies a database handle. If successful, the return
value is the total size of the database files, else, it is -1.0.
The function `odbnum' is used in order to get the total number of the
elements of the bucket arrays in the inverted index.
int odbnum(ODEUM *odeum);
`odeum' specifies a database handle. If successful, the return
value is the total number of the elements of the bucket arrays,
else, it is -1.
The function `odbusenum' is used in order to get the total number of
the used elements of the bucket arrays in the inverted index.
int odbusenum(ODEUM *odeum);
`odeum' specifies a database handle. If successful, the return
value is the total number of the used elements of the bucket
arrays, else, it is -1.
The function `oddnum' is used in order to get the number of the docu‐
ments stored in a database.
int oddnum(ODEUM *odeum);
`odeum' specifies a database handle. If successful, the return
value is the number of the documents stored in the database,
else, it is -1.
The function `odwnum' is used in order to get the number of the words
stored in a database.
int odwnum(ODEUM *odeum);
`odeum' specifies a database handle. If successful, the return
value is the number of the words stored in the database, else,
it is -1. Because of the I/O buffer, the return value may be
less than the actual number.
The function `odwritable' is used in order to check whether a database
handle is a writer or not.
int odwritable(ODEUM *odeum);
`odeum' specifies a database handle. The return value is true
if the handle is a writer, false if not.
The function `odfatalerror' is used in order to check whether a data‐
base has a fatal error or not.
int odfatalerror(ODEUM *odeum);
`odeum' specifies a database handle. The return value is true
if the database has a fatal error, false if not.
The function `odinode' is used in order to get the inode number of a
database directory.
int odinode(ODEUM *odeum);
`odeum' specifies a database handle. The return value is the
inode number of the database directory.
The function `odmtime' is used in order to get the last modified time
of a database.
time_t odmtime(ODEUM *odeum);
`odeum' specifies a database handle. The return value is the
last modified time of the database.
The function `odmerge' is used in order to merge multiple database
directories.
int odmerge(const char *name, const CBLIST *elemnames);
`name' specifies the name of a database directory to create.
`elemnames' specifies a list of names of element databases. If
successful, the return value is true, else, it is false. If two
or more documents which have the same URI come in, the first one
is adopted and the others are ignored.
The function `odremove' is used in order to remove a database direc‐
tory.
int odremove(const char *name);
`name' specifies the name of a database directory. If successful,
the return value is true, else, it is false. A database directory
can contain databases of other APIs of QDBM; they are also removed
by this function.
The function `oddocopen' is used in order to get a document handle.
ODDOC *oddocopen(const char *uri);
`uri' specifies the URI of a document. The return value is a
document handle. The ID number of a new document is not
defined. It is defined when the document is stored in a data‐
base.
The function `oddocclose' is used in order to close a document handle.
void oddocclose(ODDOC *doc);
`doc' specifies a document handle. Because the region of a
closed handle is released, it becomes impossible to use the han‐
dle.
The function `oddocaddattr' is used in order to add an attribute to a
document.
void oddocaddattr(ODDOC *doc, const char *name, const char *value);
`doc' specifies a document handle. `name' specifies the string
of the name of an attribute. `value' specifies the string of
the value of the attribute.
The function `oddocaddword' is used in order to add a word to a docu‐
ment.
void oddocaddword(ODDOC *doc, const char *normal, const char *asis);
`doc' specifies a document handle. `normal' specifies the
string of the normalized form of a word. Normalized forms are
treated as keys of the inverted index. If the normalized form
of a word is an empty string, the word is not reflected in the
inverted index. `asis' specifies the string of the appearance
form of the word. Appearance forms are used after the document
is retrieved by an application.
The function `oddocid' is used in order to get the ID number of a docu‐
ment.
int oddocid(const ODDOC *doc);
`doc' specifies a document handle. The return value is the ID
number of a document.
The function `oddocuri' is used in order to get the URI of a document.
const char *oddocuri(const ODDOC *doc);
`doc' specifies a document handle. The return value is the
string of the URI of a document.
The function `oddocgetattr' is used in order to get the value of an
attribute of a document.
const char *oddocgetattr(const ODDOC *doc, const char *name);
`doc' specifies a document handle. `name' specifies the string
of the name of an attribute. The return value is the string of
the value of the attribute, or `NULL' if no attribute corre‐
sponds.
The function `oddocnwords' is used in order to get the list handle that
contains words in normalized form of a document.
const CBLIST *oddocnwords(const ODDOC *doc);
`doc' specifies a document handle. The return value is the list
handle that contains words in normalized form.
The function `oddocawords' is used in order to get the list handle that
contains words in appearance form of a document.
const CBLIST *oddocawords(const ODDOC *doc);
`doc' specifies a document handle. The return value is the list
handle that contains words in appearance form.
The function `oddocscores' is used in order to get the map handle that
contains keywords in normalized form and their scores.
CBMAP *oddocscores(const ODDOC *doc, int max, ODEUM *odeum);
`doc' specifies a document handle. `max' specifies the max number
of keywords to get. `odeum' specifies a database handle
with which the IDF for weighting is calculated. If it is `NULL',
it is not used. The return value is the map handle that contains
keywords and their scores. Scores are expressed as decimal
strings. Because the handle of the return value is opened with
the function `cbmapopen', it should be closed with the function
`cbmapclose' if it is no longer in use.
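Since the scores come back as decimal strings, an application usually converts them before comparing or summing. The following is a minimal standalone sketch of such a conversion using only the C standard library. `toy_score' is a hypothetical helper, not part of Odeum, and the fallback value is a design choice of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Convert a decimal string to an int.  Returns `fallback' if the text is
   not a complete decimal integer or does not fit in an int. */
int toy_score(const char *text, int fallback){
  char *end;
  long v;
  assert(text != NULL);
  errno = 0;
  v = strtol(text, &end, 10);
  /* Reject empty input, trailing garbage, and out-of-range values. */
  if(end == text || *end != '\0' || errno == ERANGE) return fallback;
  if(v < INT_MIN || v > INT_MAX) return fallback;
  return (int)v;
}
```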
The function `odbreaktext' is used in order to break a text into words
in appearance form.
CBLIST *odbreaktext(const char *text);
`text' specifies the string of a text. The return value is the
list handle that contains words in appearance form. Words are
separated with space characters and such delimiters as period,
comma and so on. Because the handle of the return value is opened
with the function `cblistopen', it should be closed with the
function `cblistclose' if it is no longer in use.
The function `odnormalizeword' is used in order to make the normalized
form of a word.
char *odnormalizeword(const char *asis);
`asis' specifies the string of the appearance form of a word.
The return value is the string of the normalized form of the
word. ASCII alphabetic characters are converted to lower case.
Words composed of only delimiters are treated as empty strings.
Because the region of the return value is allocated with the
`malloc' call, it should be released with the `free' call if it
is no longer in use.
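The two normalization rules above can be sketched without the library as follows. `toy_normalize' is a hypothetical stand-in, and the delimiter set `TOY_DELIMS' is an assumption chosen for this sketch; the set Odeum really uses is not listed here. Like the real function, the sketch returns a region that the caller releases with `free'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed delimiter set for this sketch only. */
#define TOY_DELIMS ".,:;!?\"'()[]{}-"

/* Return a malloc'ed normalized copy of `asis', or NULL if allocation
   fails.  The caller must free the result. */
char *toy_normalize(const char *asis){
  size_t i, len;
  char *norm;
  assert(asis != NULL);
  len = strlen(asis);
  /* A word made only of delimiters normalizes to the empty string. */
  if(strspn(asis, TOY_DELIMS) == len) len = 0;
  norm = malloc(len + 1);
  if(norm == NULL) return NULL;
  for(i = 0; i < len; i++){
    char c = asis[i];
    /* Fold ASCII upper-case letters only; other bytes pass through. */
    norm[i] = (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
  }
  norm[len] = '\0';
  return norm;
}
```

An empty result is the signal that a word should not be reflected in the inverted index, as described for `oddocaddword'.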
The function `odpairsand' is used in order to get the common elements
of two sets of documents.
ODPAIR *odpairsand(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum,
int *np);
`apairs' specifies the pointer to the former document array.
`anum' specifies the number of the elements of the former docu‐
ment array. `bpairs' specifies the pointer to the latter docu‐
ment array. `bnum' specifies the number of the elements of the
latter document array. `np' specifies the pointer to a variable
to which the number of the elements of the return value is
assigned. The return value is the pointer to a new document
array whose elements commonly belong to the specified two sets.
Elements of the array are sorted in descending order of their
scores. Because the region of the return value is allocated
with the `malloc' call, it should be released with the `free'
call if it is no longer in use.
The function `odpairsor' is used in order to get the union of two sets
of documents.
ODPAIR *odpairsor(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum,
int *np);
`apairs' specifies the pointer to the former document array.
`anum' specifies the number of the elements of the former docu‐
ment array. `bpairs' specifies the pointer to the latter docu‐
ment array. `bnum' specifies the number of the elements of the
latter document array. `np' specifies the pointer to a variable
to which the number of the elements of the return value is
assigned. The return value is the pointer to a new document
array whose elements belong to both or either of the specified
two sets. Elements of the array are sorted in descending order
of their scores. Because the region of the return value is
allocated with the `malloc' call, it should be released with the
`free' call if it is no longer in use.
The function `odpairsnotand' is used in order to get the difference set
of documents.
ODPAIR *odpairsnotand(ODPAIR *apairs, int anum, ODPAIR *bpairs, int
bnum, int *np);
`apairs' specifies the pointer to the former document array.
`anum' specifies the number of the elements of the former
document array. `bpairs' specifies the pointer to the latter
document array. `bnum' specifies the number of the elements of
the latter document array. `np' specifies
the pointer to a variable to which the number of the elements of
the return value is assigned. The return value is the pointer
to a new document array whose elements belong to the former set
but not to the latter set. Elements of the array are sorted in
descending order of their scores. Because the region of the
return value is allocated with the `malloc' call, it should be
released with the `free' call if it is no longer in use.
The function `odpairssort' is used in order to sort a set of documents
in descending order of scores.
void odpairssort(ODPAIR *pairs, int pnum);
`pairs' specifies the pointer to a document array. `pnum' spec‐
ifies the number of the elements of the document array.
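The three set operations and the sort described above can be sketched in one standalone block. None of this is Odeum's implementation: `Pair' mirrors `ODPAIR', and `toy_sort' and `toy_combine' are hypothetical names. Summing the scores of a document found in both sets is an assumption of this sketch, and the quadratic lookup is kept only for brevity. Each input is assumed to hold a given ID at most once.

```c
#include <assert.h>
#include <stdlib.h>

/* Mirrors the layout of ODPAIR; defined locally so the sketch stands alone. */
typedef struct { int id; int score; } Pair;

/* qsort comparator: higher scores first. */
static int pair_desc(const void *a, const void *b){
  int sa = ((const Pair *)a)->score, sb = ((const Pair *)b)->score;
  return (sa < sb) - (sa > sb);
}

/* Sort in descending order of scores. */
void toy_sort(Pair *pairs, int pnum){
  assert(pnum >= 0);
  if(pnum > 0) qsort(pairs, (size_t)pnum, sizeof(*pairs), pair_desc);
}

/* Return the index of `id' in `pairs', or -1 if it is absent. */
static int find_id(const Pair *pairs, int pnum, int id){
  int i;
  for(i = 0; i < pnum; i++){
    if(pairs[i].id == id) return i;
  }
  return -1;
}

/* mode '&': documents in both sets; '|': documents in either set;
   '!': documents of `a' that are absent from `b'.
   Returns a malloc'ed array sorted by descending score and stores its
   length in `*np'.  Returns NULL if allocation fails. */
Pair *toy_combine(char mode, const Pair *a, int anum,
                  const Pair *b, int bnum, int *np){
  int i, j, n = 0;
  Pair *res;
  assert(anum >= 0 && bnum >= 0 && np != NULL);
  *np = 0;
  res = malloc(sizeof(*res) * (size_t)(anum + bnum + 1));
  if(res == NULL) return NULL;
  for(i = 0; i < anum; i++){
    int in_b;
    j = find_id(b, bnum, a[i].id);
    in_b = (j >= 0);
    if(mode == '&' && !in_b) continue;
    if(mode == '!' && in_b) continue;
    res[n] = a[i];
    if(in_b) res[n].score += b[j].score;   /* assumed score merge */
    n++;
  }
  if(mode == '|'){
    /* Append documents that appear only in the latter set. */
    for(j = 0; j < bnum; j++){
      if(find_id(a, anum, b[j].id) < 0) res[n++] = b[j];
    }
  }
  toy_sort(res, n);
  *np = n;
  return res;
}
```

As with the real functions, the caller releases the returned region with `free'.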
The function `odlogarithm' is used in order to get the natural loga‐
rithm of a number.
double odlogarithm(double x);
`x' specifies a number. The return value is the natural loga‐
rithm of the number. If the number is equal to or less than
1.0, the return value is 0.0. This function is useful when an
application calculates the IDF of search results.
The function `odvectorcosine' is used in order to get the cosine of the
angle of two vectors.
double odvectorcosine(const int *avec, const int *bvec, int vnum);
`avec' specifies the pointer to one array of numbers. `bvec'
specifies the pointer to the other array of numbers. `vnum'
specifies the number of elements of each array. The return
value is the cosine of the angle of two vectors. This function
is useful when an application calculates similarity of docu‐
ments.
The function `odsettuning' is used in order to set the global tuning
parameters.
void odsettuning(int ibnum, int idnum, int cbnum, int csiz);
`ibnum' specifies the number of buckets for inverted indexes.
`idnum' specifies the division number of inverted index.
`cbnum' specifies the number of buckets for dirty buffers.
`csiz' specifies the maximum bytes to use memory for dirty buf‐
fers. The default setting is equivalent to `odsettuning(32749,
7, 262139, 8388608)'. This function should be called before
opening a handle.
The function `odanalyzetext' is used in order to break a text into
words and store appearance forms and normalized forms into lists.
void odanalyzetext(ODEUM *odeum, const char *text, CBLIST *awords,
CBLIST *nwords);
`odeum' specifies a database handle. `text' specifies the
string of a text. `awords' specifies a list handle into which
appearance forms are stored. `nwords' specifies a list handle into
which normalized forms are stored. If it is `NULL', it is ignored.
Words are separated with space characters and such delimiters as
period, comma and so on.
The function `odsetcharclass' is used in order to set the classes of
characters used by `odanalyzetext'.
void odsetcharclass(ODEUM *odeum, const char *spacechars, const char
*delimchars, const char *gluechars);
`odeum' specifies a database handle. `spacechars' specifies a
string that contains space characters. `delimchars' specifies a
string that contains delimiter characters. `gluechars' specifies a
string that contains glue characters.
The function `odquery' is used in order to query a database using a
small boolean query language.
ODPAIR *odquery(ODEUM *odeum, const char *query, int *np, CBLIST
*errors);
`odeum' specifies a database handle. `query' specifies the text
of the query. `np' specifies the pointer to a variable to which
the number of the elements of the return value is assigned.
`errors' specifies a list handle into which error messages are
stored. If it is `NULL', it is ignored. If successful, the
return value is the pointer to an array, else, it is `NULL'.
Each element of the array is a pair of the ID number and the
score of a document, and the elements are sorted in descending
order of their scores. Even if no document corresponds to the
specified condition, it is not an error but returns a dummy array.
Because the
region of the return value is allocated with the `malloc' call,
it should be released with the `free' call if it is no longer in
use. Note that each element of the array of the return value
can be data of a deleted document.
If QDBM was built with POSIX threads enabled, the global variable
`dpecode' is treated as thread specific data, and functions of Odeum
are reentrant. In that case, they are thread-safe as long as a handle
is not accessed by multiple threads at the same time, on the assumption
that `errno', `malloc', and so on are thread-safe.
If QDBM was built with ZLIB enabled, records in the database for
document attributes are compressed. In that case, the size of the
database is reduced to 30% or less. Thus, you should enable ZLIB if you
use Odeum. A database of Odeum created without ZLIB enabled is not
available in an environment with ZLIB enabled, and vice versa. If ZLIB
was not enabled but LZO was, LZO is used instead.
The query language of the function `odquery' is a basic language fol‐
lowing this grammar:
expr ::= subexpr ( op subexpr )*
subexpr ::= WORD
subexpr ::= LPAREN expr RPAREN
Operators are "&" (AND), "|" (OR), and "!" (NOTAND). You can use
parentheses to group sub-expressions together in order to change the
order of operations. The given query is broken up using the function
`odanalyzetext', so if you want to specify different text breaking
rules, then make sure that you at least set "&", "|", "!", "(", and ")"
to be delimiter characters. Consecutive words are treated as having an
implicit "&" operator between them, so "zed shaw" is actually "zed &
shaw".
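To make the grammar concrete, the following standalone sketch evaluates a query over a toy in-memory index in which each word maps to a bitmask of document numbers. It is not the implementation of `odquery': `Posting', `Parser', and `toy_query' are hypothetical names, tokenizing is done by hand instead of with `odanalyzetext', and scores are ignored. Because the grammar gives the operators no precedence levels, the sketch applies them strictly left to right, which is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Toy index entry: bit N of `docs' stands for document number N. */
typedef struct { const char *word; unsigned docs; } Posting;

typedef struct { const Posting *index; int inum; const char *pos; } Parser;

static unsigned parse_expr(Parser *p);

static void skip_spaces(Parser *p){
  while(isspace((unsigned char)*p->pos)) p->pos++;
}

/* Return the document set of the word of length `len', or 0 if unknown. */
static unsigned lookup(const Parser *p, const char *word, size_t len){
  int i;
  for(i = 0; i < p->inum; i++){
    const char *w = p->index[i].word;
    if(strlen(w) == len && memcmp(w, word, len) == 0) return p->index[i].docs;
  }
  return 0;
}

/* subexpr ::= WORD | LPAREN expr RPAREN */
static unsigned parse_subexpr(Parser *p){
  const char *start;
  unsigned set;
  skip_spaces(p);
  if(*p->pos == '('){
    p->pos++;
    set = parse_expr(p);
    skip_spaces(p);
    if(*p->pos == ')') p->pos++;
    return set;
  }
  start = p->pos;
  while(*p->pos != '\0' && !isspace((unsigned char)*p->pos) &&
        strchr("&|!()", *p->pos) == NULL) p->pos++;
  return lookup(p, start, (size_t)(p->pos - start));
}

/* expr ::= subexpr ( op subexpr )*, applied left to right */
static unsigned parse_expr(Parser *p){
  unsigned set = parse_subexpr(p);
  for(;;){
    char op = '&';   /* consecutive words get an implicit AND */
    unsigned rhs;
    skip_spaces(p);
    if(*p->pos == '\0' || *p->pos == ')') return set;
    if(*p->pos == '&' || *p->pos == '|' || *p->pos == '!') op = *p->pos++;
    rhs = parse_subexpr(p);
    if(op == '&') set &= rhs;
    else if(op == '|') set |= rhs;
    else set &= ~rhs;   /* NOTAND: keep left-hand documents absent on the right */
  }
}

/* Evaluate `query' and return the bitmask of matching documents. */
unsigned toy_query(const Posting *index, int inum, const char *query){
  Parser p;
  assert(index != NULL && inum >= 0 && query != NULL);
  p.index = index;
  p.inum = inum;
  p.pos = query;
  return parse_expr(&p);
}
```

With an index in which "zed" is in documents 0 to 2 and "shaw" in documents 1 to 3, both "zed shaw" and "zed & shaw" select documents 1 and 2.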
The encoding of the query text should be the same as the encoding of
the target documents. Moreover, each space character, delimiter
character, and glue character should be a single byte.
SEE ALSO
qdbm(3), depot(3), curia(3), relic(3), hovel(3), cabin(3), villa(3),
ndbm(3), gdbm(3)
Man Page 2004-04-22 ODEUM(3)