WordDict(3)
NAME
WordDict - manage and use an inverted index dictionary.
SYNOPSIS
#include <mifluz.h>
WordList* words = ...;
WordDict* dict = words->Dict();
DESCRIPTION
WordDict maps strings to unique identifiers and frequency in the
inverted index. Whenever a new word is found, the WordDict class can be
asked to assign it a serial number. When doing so, an entry is created
in the dictionary with a frequency of zero. The application may then
increment or decrement the frequency to reflect the inverted index
content.
The serial numbers range from 1 to 2^32 inclusive.
A WordDict object is automatically created by the WordList object and
should not be created directly by the application.
METHODS
WordDict()
Private constructor.
int Initialize(WordList* words)
Bind the object to a WordList inverted index. Return OK on success,
NOTOK otherwise.
int Open()
Open the underlying Berkeley DB sub-database. The enclosing file
is given by the words data member. Return OK on success, NOTOK
otherwise.
int Remove()
Destroy the underlying Berkeley DB sub-database. Return OK on
success, NOTOK otherwise.
int Close()
Close the underlying Berkeley DB sub-database. Return OK on
success, NOTOK otherwise.
int Serial(const String& word, unsigned int& serial)
If the word argument exists in the dictionary, return its
serial number in the serial argument. If it does not already
exist, assign it a serial number, create an entry with a
frequency of zero and return the new serial in the serial argument.
Return OK on success, NOTOK otherwise.
int SerialExists(const String& word, unsigned int& serial)
If the word argument exists in the dictionary, return its
serial number in the serial argument. If it does not exist, set
the serial argument to WORD_DICT_SERIAL_INVALID. Return OK on
success, NOTOK otherwise.
int SerialRef(const String& word, unsigned int& serial)
Shorthand for Serial() followed by Ref(). Return OK on
success, NOTOK otherwise.
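The difference between Serial(), SerialExists() and SerialRef() can be illustrated with a small in-memory model. This is only a sketch: ToyDict, ToyEntry and TOY_SERIAL_INVALID are hypothetical names, the sentinel value 0 is an assumption (the actual value of WORD_DICT_SERIAL_INVALID is not stated here), and the model returns the serial directly instead of using an output argument and an OK/NOTOK status.

```cpp
#include <map>
#include <string>

// Toy in-memory stand-in for the dictionary; NOT the mifluz implementation.
struct ToyEntry {
    unsigned int serial;     // unique identifier of the word
    unsigned int frequency;  // number of references to the word
};

// Assumed sentinel; the real WORD_DICT_SERIAL_INVALID may differ.
const unsigned int TOY_SERIAL_INVALID = 0;

class ToyDict {
public:
    // Look the word up; if absent, assign a fresh serial with frequency zero.
    unsigned int Serial(const std::string& word) {
        std::map<std::string, ToyEntry>::iterator it = entries_.find(word);
        if (it != entries_.end()) return it->second.serial;
        ToyEntry e = { next_serial_++, 0 };
        entries_[word] = e;
        return e.serial;
    }

    // Look the word up without ever creating an entry.
    unsigned int SerialExists(const std::string& word) const {
        std::map<std::string, ToyEntry>::const_iterator it = entries_.find(word);
        return it == entries_.end() ? TOY_SERIAL_INVALID : it->second.serial;
    }

    // Serial() followed by a frequency increment of one.
    unsigned int SerialRef(const std::string& word) {
        unsigned int serial = Serial(word);
        entries_[word].frequency += 1;
        return serial;
    }

    // Frequency of the word, or zero if it is unknown to the model.
    unsigned int Noccurrence(const std::string& word) const {
        std::map<std::string, ToyEntry>::const_iterator it = entries_.find(word);
        return it == entries_.end() ? 0 : it->second.frequency;
    }

private:
    std::map<std::string, ToyEntry> entries_;
    unsigned int next_serial_ = 1;  // serial numbers start at 1
};
```

In this model only Serial() and SerialRef() have side effects: a word that was merely probed with SerialExists() never gets an entry.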
int Noccurrence(const String& word, unsigned int& noccurrence) const
Return the frequency of the word argument in the noccurrence
argument. Return OK on success, NOTOK otherwise.
int Normalize(String& word) const
Shorthand for words->GetContext()->GetType()->Normalize(word).
Return OK on success, NOTOK otherwise.
int Ref(const String& word)
Shorthand for Incr(word, 1).
int Incr(const String& word, unsigned int incr)
Add incr to the frequency of the word. Return OK on success,
NOTOK otherwise.
int Unref(const String& word)
Shorthand for Decr(word, 1).
int Decr(const String& word, unsigned int decr)
Subtract decr from the frequency of the word. If the frequency
becomes lower than or equal to zero, remove the entry from the
dictionary and lose the association between the word and its
serial number. Return OK on success, NOTOK otherwise.
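The reference-counting behaviour of Incr() and Decr() can be sketched as follows. This is an illustrative model over a std::map, not the mifluz code: ToyFrequencies, ToyIncr and ToyDecr are hypothetical names, and it is an assumption of the sketch that incrementing an unknown word creates its entry and that decrementing an unknown word is reported as a failure.

```cpp
#include <map>
#include <string>

// Toy frequency table keyed by word; NOT the mifluz implementation.
typedef std::map<std::string, unsigned int> ToyFrequencies;

// Add incr to the frequency of word (the sketch creates the entry if needed).
void ToyIncr(ToyFrequencies& freq, const std::string& word, unsigned int incr) {
    freq[word] += incr;
}

// Subtract decr from the frequency of word and erase the entry when the
// result would be zero or lower. Returns false if the word is unknown.
bool ToyDecr(ToyFrequencies& freq, const std::string& word, unsigned int decr) {
    ToyFrequencies::iterator it = freq.find(word);
    if (it == freq.end()) return false;
    // Compare before subtracting: an unsigned counter cannot go negative,
    // it would wrap around to a very large value instead.
    if (it->second <= decr) {
        freq.erase(it);
    } else {
        it->second -= decr;
    }
    return true;
}
```

Because the entry disappears once its count reaches zero, a word that is later seen again would be treated as new, which matches the note above that the word loses its serial number.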
int Put(const String& word, unsigned int noccurrence)
Set the frequency of word to the value of the noccurrence
argument.
int Exists(const String& word) const
Return true if word exists in the dictionary, false otherwise.
WordList* Words() const
Return a pointer to the associated WordList object.
WordDictCursor* Cursor() const
Return a cursor to sequentially walk the dictionary using the
Next method.
int Next(WordDictCursor* cursor, String& word, WordDictRecord& record)
Return the next entry in the dictionary. The cursor argument
must have been created using the Cursor method. The word is
returned in the word argument and the record is returned in the
record argument. On success the function returns 0; at the end
of the dictionary it returns DB_NOTFOUND. The cursor argument
is deallocated when the function hits the end of the dictionary
or an error occurs.
WordDictCursor* CursorPrefix(const String& prefix) const
Return a cursor to sequentially walk the entries of the
dictionary that start with the prefix argument, using the NextPrefix
method.
int NextPrefix(WordDictCursor* cursor, String& word, WordDictRecord&
record)
Return the next entry from the dictionary that matches the prefix.
The cursor argument must have been created using the CursorPrefix
method. The word is returned in the word argument and the record is
returned in the record argument. The word is guaranteed to start
with the prefix specified to the CursorPrefix method. On success the
function returns 0; at the end of the dictionary it returns
DB_NOTFOUND. The cursor argument is deallocated when the
function hits the end of the dictionary or an error occurs.
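A prefix walk of the kind described for CursorPrefix() and NextPrefix() can be modelled on any container that keeps its keys sorted. The sketch below uses std::map as a stand-in. ToyIndex and ToyWalkPrefix are hypothetical names, and the assumption that the underlying store iterates keys in sorted byte order is made for the sake of the illustration only.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy index: word -> frequency, kept sorted by key. NOT the mifluz storage.
typedef std::map<std::string, unsigned int> ToyIndex;

// Collect, in key order, every word that starts with prefix.
std::vector<std::string> ToyWalkPrefix(const ToyIndex& index,
                                       const std::string& prefix) {
    std::vector<std::string> out;
    // lower_bound() plays the role of creating the prefix cursor:
    // it positions on the first key that is >= prefix.
    for (ToyIndex::const_iterator it = index.lower_bound(prefix);
         it != index.end(); ++it) {
        // Matching keys are contiguous, so stop at the first key
        // that no longer starts with prefix.
        if (it->first.compare(0, prefix.size(), prefix) != 0) break;
        out.push_back(it->first);
    }
    return out;
}
```

With keys "ca", "car", "card", "care", "cat" and "dog", a walk on the prefix "car" yields "car", "card" and "care" and stops at "cat" without scanning the rest of the index.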
int Write(FILE* f)
Dump the complete dictionary to the stream f. The format of
the dictionary is word serial frequency, one entry per line.
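As an illustration of the dump layout, the following sketch formats entries as word, serial and frequency on one line. It is not the mifluz Write() implementation: ToyRecord, ToyFormatEntry and ToyWrite are hypothetical names, and the single-space separator is an assumption, since the exact delimiter is not specified above.

```cpp
#include <cstdio>
#include <map>
#include <string>

// Toy record holding the two numbers stored for each word.
struct ToyRecord {
    unsigned int serial;
    unsigned int frequency;
};

// Format one entry as "word serial frequency\n".
// The single-space separator is an assumption of this sketch.
std::string ToyFormatEntry(const std::string& word, const ToyRecord& rec) {
    char numbers[64];
    std::snprintf(numbers, sizeof numbers, " %u %u\n", rec.serial, rec.frequency);
    return word + numbers;
}

// Write every entry to the stream f, one per line.
// Returns false as soon as an output error is reported.
bool ToyWrite(const std::map<std::string, ToyRecord>& dict, std::FILE* f) {
    std::map<std::string, ToyRecord>::const_iterator it;
    for (it = dict.begin(); it != dict.end(); ++it) {
        if (std::fputs(ToyFormatEntry(it->first, it->second).c_str(), f) == EOF)
            return false;
    }
    return true;
}
```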
AUTHORS
Loic Dachary loic@gnu.org
The Ht://Dig group http://dev.htdig.org/
SEE ALSO
htdb_dump(1), htdb_stat(1), htdb_load(1), mifluzdump(1), mifluzload(1),
mifluzsearch(1), mifluzdict(1), WordContext(3), WordList(3),
WordListOne(3), WordKey(3), WordKeyInfo(3), WordType(3), WordDBInfo(3),
WordRecordInfo(3), WordRecord(3), WordReference(3), WordCursor(3),
WordCursorOne(3), WordMonitor(3), Configuration(3), mifluz(3)