Token List Indexes

Token List Indexes are designed for fast full-text searching. There are two types of token list index: the ASCII token list (8-bit characters) and the wide-character token list (16-bit characters).

The token list structure resembles the Spatial Index in that a search can return more than one matching result, so a temporary index is created before any reads from the database take place. Structurally, the Token List is very similar to a variable B+-Tree index, with buckets that store the many matching ObjectIDs for each token.

The Token List index first breaks the defined index fields into the separate tokens they contain. For example, the string "the lazy brown fox" would be broken into four tokens: the; lazy; brown; fox. Each of these tokens is then indexed and the ObjectID added to the matching bucket. Once the temporary index is created, a search only has to check each unique token, not the full text of each field in each object.
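
As a rough illustration of the tokenising step described above, the sketch below builds an in-memory map from each token to a bucket of ObjectIDs. It is not ObjectDatabase++ code: the function build_token_index and the sample objects are hypothetical, a Python dict stands in for the B+-Tree, and splitting on whitespace with lower-casing is an assumption about how tokens are delimited.

```python
from collections import defaultdict


def build_token_index(objects):
    """Map each token to the set of ObjectIDs whose text contains it.

    `objects` is a hypothetical {object_id: text} mapping standing in for
    the indexed fields of the stored objects.
    """
    index = defaultdict(set)  # token -> bucket of ObjectIDs
    for object_id, text in objects.items():
        # Assumption: tokens are separated by whitespace and compared
        # case-insensitively.
        for token in text.lower().split():
            index[token].add(object_id)
    return index


objects = {
    1: "the lazy brown fox",
    2: "the quick brown dog",
}
index = build_token_index(objects)
print(sorted(index["brown"]))  # ObjectIDs in the bucket for "brown"
```

Because each unique token appears only once as a key, a lookup touches one bucket per token rather than the text of every object.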

Token List searches provide several mechanisms. First, there are the Boolean logical operators: and; or; xor; not; and parentheses () for grouping. Second, there are token operators: "A" or A if a token is to contain a string; " A" (leading space) if a token is to start with a string; "A " (trailing space) if a token is to end with a string; and " A " (leading and trailing space) if a token is to match a given string exactly. Where more advanced search logic is required, you must implement it yourself. For example, for a search "A near B", you would search "A and B" and then read each result to check that A is in fact near B. The Token List index does not provide this kind of search, and the implementation would be the same whether it was done inside ObjectDatabase++ or outside it.
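
The token operators and the "A near B" workaround can be sketched as follows. This is a minimal illustration under stated assumptions, not the ObjectDatabase++ API: parse_pattern, search and near are hypothetical helpers, the index is a plain dict of token to ObjectID set, and "near" is assumed to mean "within a given number of token positions".

```python
def parse_pattern(pattern):
    """Turn a token-operator pattern into a predicate on a single token.

    Following the text: a leading space means "starts with", a trailing
    space means "ends with", both mean "exact match", neither means
    "contains".
    """
    anchored_start = pattern.startswith(" ")
    anchored_end = pattern.endswith(" ")
    needle = pattern.strip()
    if anchored_start and anchored_end:
        return lambda token: token == needle
    if anchored_start:
        return lambda token: token.startswith(needle)
    if anchored_end:
        return lambda token: token.endswith(needle)
    return lambda token: needle in token


def search(index, pattern):
    """Union the buckets of every unique token that satisfies the pattern."""
    matches = parse_pattern(pattern)
    result = set()
    for token, bucket in index.items():
        if matches(token):
            result |= bucket
    return result


def near(text, a, b, max_distance):
    """Post-filter: True if tokens a and b occur within max_distance positions."""
    tokens = text.lower().split()
    positions_a = [i for i, t in enumerate(tokens) if t == a]
    positions_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= max_distance for i in positions_a for j in positions_b)


# Hypothetical sample data and a dict-based stand-in for the token index.
objects = {
    1: "the lazy brown fox",
    2: "the brown dog chased a very lazy cat",
}
index = {}
for object_id, text in objects.items():
    for token in text.lower().split():
        index.setdefault(token, set()).add(object_id)

# "lazy and brown": the index alone narrows the candidates.
candidates = search(index, " lazy ") & search(index, " brown ")
# "lazy near brown": each candidate must then be read and cross-checked.
near_hits = [oid for oid in sorted(candidates) if near(objects[oid], "lazy", "brown", 1)]
print(near_hits)
```

In this sketch the Boolean operators map onto set operations on the ObjectID buckets: and is intersection (&), or is union (|), xor is symmetric difference (^), and not is the difference from the set of all ObjectIDs.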