Chapter 21 SOIF API
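This chapter describes the functions and objects defined in the soif.h header file. It contains the following sections: Functions and Objects, SOIF Structure, Attribute-Value Pair Routines, Multi-valued Attribute Routines, Stream Routines for Parsing and Printing SOIFs, and Memory Buffer Management.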
Functions and Objects
Table 21–1 provides an alphabetized version of the functions and objects for your reference.
Table 21–1 Alphabetized Functions and Objects Defined in the soif.h File
| SOIF function or object | Category |
| append, increase, reset, SOIFBuffer_Create, SOIFBuffer_Free | Memory Buffer Management |
| SOIF_Apply, SOIF_Create, SOIF_Find, SOIF_Findval, SOIF_Free, SOIF_AttributeCompare, SOIF_GetAttributeSize, SOIF_GetTotalSize, SOIF_GetValueCount, SOIF_GetValueSize, SOIF_InsertAVP, SOIF_Merge, SOIF_Remove | SOIF Structure |
| SOIF_AttributeCompare, SOIF_InsertStr, SOIF_Rename, SOIF_Replace, SOIF_ReplaceMV, SOIF_ReplaceStr, SOIF_SqueezeMV, SOIFAVPair_Create, SOIFAVPair_Free | Attribute-Value Pair Routines |
| SOIF_AttributeCompareMV, SOIF_Contains, SOIF_DeleteMV, SOIF_FindvalMV, SOIF_Insert, SOIF_InsertMV, SOIF_IsMVAttribute, SOIF_MVAttributeParse, SOIFAVPair_IsMV, SOIFAVPair_NthValid, SOIFAVPair_NthValue, SOIFAVPair_NthVsize | Multi-valued Attribute Routines |
| SOIF_ParseInitFile, SOIF_ParseInitStr, SOIF_PrintInitFile, SOIF_PrintInitFn, SOIF_PrintInitStr, SOIFStream_Finish, SOIFStream_GetAllowed, SOIFStream_GetDenied, SOIFStream_IsAllowed, SOIFStream_IsEOS, SOIFStream_IsParsing, SOIFStream_IsPrinting, SOIFStream_Parse, SOIFStream_Print, SOIFStream_SetAllowed, SOIFStream_SetDenied, SOIFStream_SetFinishFn | Stream Routines for Parsing and Printing SOIFs |
SOIF Structure
A SOIF has a schema name and associates a URL with a collection of attribute-value pairs. The schema name identifies how to interpret the attribute-value pairs. SOIF supports text and binary data, and attributes can have multiple values.
The following is an example SOIF:
@DOCUMENT { http://www.siroe.com/
title{17}: Welcome to Siroe!
author{13}: Dot Punchcard
}
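In the example above, the number in braces is the byte count of the value that follows. The sketch below shows one way to split a single text line of that form into its parts. It is only an illustration: parse_av_line is a hypothetical helper and not part of soif.h, it assumes a single space or tab after the colon, and it handles text values only (binary values can contain '\0' bytes, which strlen() cannot measure).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of soif.h: splits one line of the form
 *     name{size}: value
 * into the attribute name, the declared byte count, and a pointer to
 * the value. Returns 0 on success, or -1 if the line is malformed or
 * the value is shorter than the declared size.
 */
int parse_av_line(const char *line, char *name, size_t name_cap,
                  size_t *vsize, const char **value)
{
    const char *open;
    char *end;
    unsigned long n;
    size_t name_len;

    assert(line && name && vsize && value);

    open = strchr(line, '{');
    if (open == NULL || open == line)
        return -1;                      /* no size field or empty name */
    name_len = (size_t)(open - line);
    if (name_len >= name_cap)
        return -1;                      /* name does not fit in the buffer */

    if (!isdigit((unsigned char)open[1]))
        return -1;                      /* size must start with a digit */
    n = strtoul(open + 1, &end, 10);
    if (*end != '}')
        return -1;                      /* size field is not closed */
    if (end[1] != ':' || (end[2] != ' ' && end[2] != '\t'))
        return -1;                      /* assumed separator after '}' */
    if (strlen(end + 3) < n)
        return -1;                      /* fewer bytes than declared */

    memcpy(name, line, name_len);
    name[name_len] = '\0';
    *vsize = (size_t)n;
    *value = end + 3;
    return 0;
}
```

Calling parse_av_line() on the title line of the example yields the name title, a size of 17, and a value pointer to "Welcome to Siroe!".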
A SOIF object has url and schema_name fields to store its URL and schema name:
char *url; /* The URL */
char *schema_name; /* The schema name, such as @document or @RDMHeader */
A SOIF object contains a collection of SOIFAVPair objects, each of which contains an attribute and one or more values. To access attribute values in a SOIF, use SOIF_Find() to retrieve the AVPair for the given attribute, or use SOIF_Findval() to retrieve the value string for a given attribute. Use all-lowercase attribute names with the SOIF_Find*() macros, because only exact attribute name lookups are supported.
You can create SOIF objects by using the SOIF_Create() function. You can also read SOIF objects from a SOIF stream.
-
SOIF_Create
-
NSAPI_PUBLIC SOIF *SOIF_Create(char *schema_name, char *url)
Creates a SOIF structure with the given schema name and URL.
-
SOIF_Free
-
NSAPI_PUBLIC void SOIF_Free(SOIF *)
Frees the given SOIF structure.
-
SOIF_GetTotalSize
-
NSAPI_PUBLIC int SOIF_GetTotalSize(SOIF *s)
Gets the estimated total size of the SOIF in bytes.
-
SOIF_GetAttributeCount
-
NSAPI_PUBLIC int SOIF_GetAttributeCount(SOIF *s)
Gets the number of attributes in the SOIF.
-
SOIF_GetAttributeSize
-
NSAPI_PUBLIC int SOIF_GetAttributeSize(SOIF *s)
Gets the size of the attributes only.
-
SOIF_GetValueSize
-
NSAPI_PUBLIC int SOIF_GetValueSize(SOIF *s)
Gets the size of the values only.
-
SOIF_GetValueCount
-
NSAPI_PUBLIC int SOIF_GetValueCount(SOIF *s)
Gets the number of values only.
-
SOIF_Merge
-
NSAPI_PUBLIC int SOIF_Merge(SOIF *dst, SOIF *src);
Merges two SOIF objects (performs a union of their attribute-value pairs). Returns non-zero on error; otherwise, returns zero, and the dst SOIF object contains all the attribute-value pairs from the src SOIF object.
If the dst object already contains an attribute that is also in src, the attribute becomes a multi-valued attribute and all of its values are copied over to dst. This applies only to multi-valued attributes. For a single-valued attribute, the value in dst is discarded. Currently, only classification is a multi-valued attribute.
-
SOIF_Find
-
#define SOIF_Find(soif, attribute-name)
Retrieves the AVPair for the given attribute in the given
soif. For example, the following statement gets the AVPair for
the title attribute in the soif s:
SOIFAVPair *avp = SOIF_Find(s, "title");
-
SOIF_Findval
-
#define SOIF_Findval(soif, attribute-name)
Retrieves the value string for the given attribute in the given soif. For example,
the following statement prints the value of the title attribute of the soif s:
printf("Title = %s\n", SOIF_Findval(s, "title"));
-
SOIF_Remove
-
#define SOIF_Remove(soif, attribute-name)
Removes the given attribute from the given soif.
-
SOIF_Insert
-
#define SOIF_Insert(soif, attribute-name, value, value-size)
Inserts the given attribute and the value of the given size as an AVPair into the soif.
-
SOIF_InsertAVP
-
#define SOIF_InsertAVP(soif, avpair)
Inserts the given AVPair into the given soif.
-
SOIF_Apply
-
#define SOIF_Apply(soif, function, user-data)
Applies the given function with the given argument (user-data) to each AVPair in the given soif. For example:
void print_av(SOIF *s, SOIFAVPair *avp, void *unused)
{
    printf("%s = %s\n", avp->attribute, avp->value);
}
/* Print every attribute and value in the SOIF s. */
SOIF_Apply(s, print_av, NULL);
Attribute-Value Pair Routines
Attribute-value pairs contain an attribute and an associated value. The value
often is a simple null-terminated string; however, the value can also be binary data.
Attribute-value pairs are stored as SOIFAVPair structures.
The important fields in a SOIFAVPair structure are the following:
char *attribute; /* Attribute string; '\0' terminated */
char *value; /* Primary value; may be '\0' terminated */
size_t vsize; /* Number of bytes (8 bits) for primary value */
char **values; /* Multiple values for multivalued attributes */
size_t *vsizes; /* The sizes for the values */
int nvalues; /* Number of values associated with attribute */
int last_slot; /* Last valid slot - array may contain holes */
-
SOIFAVPair_Create
-
NSAPI_PUBLIC SOIFAVPair * SOIFAVPair_Create(char *a, char *v, int vsz);
Creates an AVPair structure with the given attribute a and
value v. The value v is a buffer of vsz bytes.
-
SOIFAVPair_Free
-
NSAPI_PUBLIC void SOIFAVPair_Free(SOIFAVPair *avp);
Frees the memory used by the given SOIFAVPair structure.
-
SOIF_Replace
-
NSAPI_PUBLIC int SOIF_Replace(SOIF *s, char *att, char *val, int valsz);
Replaces the value of an existing attribute att with a new
value val of size valsz in the SOIF s.
-
SOIF_InsertStr
-
#define SOIF_InsertStr(soif, attribute, value)
Inserts the given attribute with the given value into the soif.
-
SOIF_ReplaceStr
-
#define SOIF_ReplaceStr(soif, attribute, value)
Replaces the existing value of the given attribute in the soif with the given
value.
-
SOIF_Rename
-
NSAPI_PUBLIC int SOIF_Rename(SOIF *s, char *old_attr, char *new_attr);
Renames the given attribute to the given new name.
-
SOIF_AttributeCompare
-
NSAPI_PUBLIC int SOIF_AttributeCompare(const char *a1, const char *a2);
Compares two attribute names. Returns 0 (zero) if they are equal, or non-zero
if they are different. Case (upper and lower) and trailing -s are
ignored when comparing attribute names. The following table illustrates the results
of comparing some attribute names.
| Attribute A | Attribute B | Does SOIF_AttributeCompare() consider them to be the same? |
| title | Title | yes |
| title | Title | yes |
| title | title | yes |
| title | title-page | no |
| title | title | no |
| author | title | no |
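The comparison rule for SOIF_AttributeCompare() can be sketched as follows. This is an illustration, not the library's implementation: attr_compare_sketch is a hypothetical name, and the sketch assumes that "trailing -s" means trailing hyphen characters, so that title and title- would match while title and title-page would not.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Length of the name once any trailing '-' characters are dropped. */
static size_t trimmed_length(const char *a)
{
    size_t n = strlen(a);
    while (n > 0 && a[n - 1] == '-')
        n--;
    return n;
}

/*
 * Hypothetical stand-in for the rule described above. Returns 0 when
 * the two names match after ignoring letter case and trailing hyphens
 * (an assumed reading of the rule), and non-zero otherwise.
 */
int attr_compare_sketch(const char *a1, const char *a2)
{
    size_t n1, n2, i;

    assert(a1 != NULL && a2 != NULL);
    n1 = trimmed_length(a1);
    n2 = trimmed_length(a2);
    if (n1 != n2)
        return 1;
    for (i = 0; i < n1; i++) {
        if (tolower((unsigned char)a1[i]) != tolower((unsigned char)a2[i]))
            return 1;
    }
    return 0;
}
```

Because the function returns 0 for a match, callers test it the same way they would test strcmp().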
Multi-valued Attribute Routines
A SOIF attribute can have multiple values. SOIF supports the convention of using a -NNN suffix to indicate a multivalued attribute, for example, Title-1, Title-2, Title-3, and so on. The -NNN suffixes do not need to be sequential positive integers.
The Search Engine supports searching on multi-valued attributes such as the classification attribute. In SOIF, this attribute is written as classification-1, classification-2, and so on. For example:
classification-1{5}: robot
classification-2{5}: siroe
classification-3{11}: web crawler
-
SOIF_AttributeCompareMV
-
NSAPI_PUBLIC int SOIF_AttributeCompareMV(const char *a1, const char *a2);
Compares two attribute names. Returns 0 (zero) if they are equal, or non-zero if they are different. If neither attribute is multi-valued, the comparison is the same as for SOIF_AttributeCompare(). If one or both of the attributes are multi-valued, the base name of the multi-valued attribute is used for the comparison. The base name of a multi-valued attribute is the portion of the name before the "-". For example, the base name of classification-3 is classification.
-
SOIF_MVAttributeParse
-
NSAPI_PUBLIC int SOIF_MVAttributeParse(char *a)
Returns the multi-valued number of the given attribute, and strips the attribute
string of its -NNN indicator; otherwise, returns zero in the case
of a normal attribute name. For example, classification-3 returns the number 3.
-
SOIF_IsMVAttribute
-
NSAPI_PUBLIC char *SOIF_IsMVAttribute(const char *a);
Returns NULL if the given attribute is not a multi-valued attribute; otherwise
returns a pointer to where the multi-valued number occurs in the attribute string.
For example, for the multi-valued attribute classification-3, it will return the pointer
to 3.
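The following sketch shows one way to locate a -NNN suffix and the base name it follows. The names mv_suffix and mv_base_length are hypothetical and are not part of soif.h, and the sketch assumes that the suffix consists of decimal digits after the last hyphen in the name.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper, not part of soif.h: returns a pointer to the
 * digits of a trailing "-NNN" suffix, or NULL when the name has none.
 */
const char *mv_suffix(const char *a)
{
    const char *dash, *p;

    assert(a != NULL);
    dash = strrchr(a, '-');
    if (dash == NULL || dash == a || dash[1] == '\0')
        return NULL;            /* no '-', empty base name, or no digits */
    for (p = dash + 1; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return NULL;        /* e.g. "title-page" is not multi-valued */
    }
    return dash + 1;
}

/* Length of the base name: the part before "-NNN", or the whole name. */
size_t mv_base_length(const char *a)
{
    const char *digits = mv_suffix(a);
    return digits ? (size_t)(digits - 1 - a) : strlen(a);
}
```

For classification-3, mv_suffix() points at the 3 and mv_base_length() covers classification. For title-page, mv_suffix() returns NULL because the text after the hyphen is not a number.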
-
SOIF_InsertMV
-
NSAPI_PUBLIC int SOIF_InsertMV(SOIF *s, char *a, int slot, char *v, int vsz, int useval)
Inserts a new value v at index slot for the given attribute a (in non-multivalue form). If set, the useval flag tells the function to use the given value buffer rather than creating its own copy. For example, the following call:
SOIF_InsertMV(s, "classification", 3, "web crawler", strlen("web crawler"), 0);
inserts:
classification-3{11}: web crawler
-
SOIF_ReplaceMV
-
NSAPI_PUBLIC int SOIF_ReplaceMV(SOIF *s, char *a, int slot, char *v, int vsz, int useval);
Replaces the value at index slot for the given attribute a with the new value v of vsz bytes.
-
SOIF_DeleteMV
-
NSAPI_PUBLIC int SOIF_DeleteMV(SOIF *s, char *a, int slot)
Deletes the value at the index slot in the attribute a. For
example:
SOIF_DeleteMV(s, "classification", 3)
Deletes classification-3.
-
SOIF_FindvalMV
-
NSAPI_PUBLIC const char *SOIF_FindvalMV(SOIF *s, const char *a, int slot)
Finds the value at the index slot in the attribute a. For
example:
SOIF_FindvalMV(s, "classification", 3)
Returns web crawler (using the previous example).
-
SOIF_SqueezeMV
-
NSAPI_PUBLIC void SOIF_SqueezeMV(SOIF *s)
Forces a renumbering to ensure that the multi-value indexes are sequentially
increasing (for example, 1, 2, 3,...). This function can be used to fill in any holes
that might have occurred during SOIF_InsertMV() invocations. For
example, to insert values explicitly for the multivalue attribute author-*:
SOIF_InsertMV(s, "author", 1, "John", 4, 0);
SOIF_InsertMV(s, "author", 2, "Kevin", 5, 0);
SOIF_InsertMV(s, "author", 6, "Darren", 6, 0);
SOIF_InsertMV(s, "author", 9, "Tommy", 5, 0);
SOIF_FindvalMV(s, "author", 9); /* == "Tommy" */
SOIF_SqueezeMV(s);
SOIF_FindvalMV(s, "author", 9); /* == NULL */
SOIF_FindvalMV(s, "author", 4); /* == "Tommy" */
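The renumbering in the example above can be pictured with a small stand-alone sketch. SparseValues and its functions are hypothetical and do not reflect how the library stores values internally; the sketch only illustrates how holes left by explicit slot numbers are closed so that the values end up in slots 1, 2, 3, and so on.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 16   /* arbitrary capacity for this sketch */

/* Hypothetical sparse value list; slot 0 is unused so slots start at 1. */
typedef struct {
    const char *values[MAX_SLOTS];  /* NULL marks a hole */
    int last_slot;                  /* highest slot that holds a value */
} SparseValues;

void sparse_insert(SparseValues *sv, int slot, const char *v)
{
    assert(sv != NULL && slot >= 1 && slot < MAX_SLOTS);
    sv->values[slot] = v;
    if (slot > sv->last_slot)
        sv->last_slot = slot;
}

const char *sparse_find(const SparseValues *sv, int slot)
{
    assert(sv != NULL);
    if (slot < 1 || slot > sv->last_slot)
        return NULL;
    return sv->values[slot];
}

/* Renumber so that the valid values occupy slots 1, 2, 3, ... in order. */
void sparse_squeeze(SparseValues *sv)
{
    int from, to = 0;

    assert(sv != NULL);
    for (from = 1; from <= sv->last_slot; from++) {
        if (sv->values[from] != NULL) {
            const char *v = sv->values[from];
            sv->values[from] = NULL;    /* vacate the old slot */
            sv->values[++to] = v;       /* move to the next free slot */
        }
    }
    sv->last_slot = to;
}
```

With values inserted at slots 1, 2, 6, and 9, squeezing moves the value from slot 9 to slot 4, which matches the behavior shown for SOIF_SqueezeMV().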
-
SOIFAVPair_IsMV
-
#define SOIFAVPair_IsMV(avp)
Use this to determine if the AVPair has multiple values or
not.
-
SOIFAVPair_NthValid
-
#define SOIFAVPair_NthValid(avp,n)
Use this to determine if the Nth value is valid or not.
-
SOIFAVPair_NthValue
-
#define SOIFAVPair_NthValue(avp,n) ((avp)->values[n])
Use this to access the Nth value. For example:
for (i = 0; i <= avp->last_slot; i++)
    if (SOIFAVPair_NthValid(avp, i))
        printf("%s = %s\n", avp->attribute,
               SOIFAVPair_NthValue(avp, i));
-
SOIFAVPair_NthVsize
-
#define SOIFAVPair_NthVsize(avp,n) ((avp)->vsizes[n])
Use this to get the size of the Nth value.
-
SOIF_Contains
-
NSAPI_PUBLIC boolean_t SOIF_Contains(SOIF *s, char *a, char *v, int vsz);
Indicates if the given attribute contains the given value. It returns B_TRUE if the value matches one or more of the values of the attribute a in the given SOIF s.
Stream Routines for Parsing and Printing SOIFs
A SOIFStream contains one or more SOIF objects.
You use SOIF streams to create and process sequences of many SOIF objects. Given a SOIF stream, you can parse it to get the SOIF objects from it. Use SOIFStream_Parse() to get the next SOIF object in a SOIF stream, and use SOIFStream_IsEOS() to check whether the last object has been parsed.
You can use filtering functions for a SOIF stream to specify that certain SOIF
attributes are allowed or denied. If an attribute is allowed, you can parse and print
that attribute for SOIF objects in the stream. If it is denied, you cannot parse or
print that attribute of SOIF objects in the stream.
SOIF streams can be disk or memory based.
When you create a SOIFStream, you need to specify if you
will be printing or parsing the SOIF stream, and if you will be using a memory- or
disk-based stream. The functions you need to use will depend on what you will be doing
with the SOIF stream.
To create a SOIF stream into which you will print SOIFs, use the following functions:
-
SOIF_PrintInitFile()
-
Creates a disk-based stream ready for printing.
-
SOIF_PrintInitStr()
-
Creates a memory-based stream ready for printing.
-
SOIF_PrintInitFn()
-
Creates a generic application-defined stream ready for printing. The given write_fn is used to print the stream.
To create a SOIF stream from a file or a string containing SOIF, use the following functions:
-
SOIF_ParseInitFile()
-
Creates a disk-based stream ready for parsing. The stream is created
from an input containing SOIF syntax.
-
SOIF_ParseInitStr()
-
Creates a memory-based stream ready for parsing. The stream is created
from an input containing SOIF syntax.
SOIFStream objects have a caller-data field, which you can use as you like:
void *caller_data; /* hook to be used by caller */
Use SOIFStream_Parse() to get the SOIF objects from the SOIF
stream, and use SOIFStream_Print() to write SOIF objects to the
SOIF stream.
When you have finished with the stream, close it by using SOIFStream_Finish(). Use SOIFStream_SetFinishFn() to register a finish_fn function that is called when the stream is finished.
The following example code takes a SOIF stream in stdin and
prints each SOIF in the stream to stdout. Notice that this code
uses SOIF_ParseInitFile() to create the SOIFStream to parse the
input file, and uses SOIF_PrintInitFile() to create the stream
to print the SOIFs to stdout.
SOIFStream *soifin = SOIF_ParseInitFile(stdin);
SOIFStream *soifout = SOIF_PrintInitFile(stdout);
SOIF *s;
while (!SOIFStream_IsEOS(soifin)) {
    if ((s = SOIFStream_Parse(soifin)) != NULL) {
        SOIFStream_Print(soifout, s);
        SOIF_Free(s);
    }
}
/* Close both streams when done. */
SOIFStream_Finish(soifin);
SOIFStream_Finish(soifout);
-
SOIF_PrintInitFile
-
NSAPI_PUBLIC SOIFStream *SOIF_PrintInitFile(FILE *file)
Creates a disk-based stream ready for printing.
-
SOIF_PrintInitStr
-
NSAPI_PUBLIC SOIFStream *SOIF_PrintInitStr(SOIFBuffer *memory)
Creates a memory-based stream ready for printing.
-
SOIF_PrintInitFn
-
NSAPI_PUBLIC SOIFStream *SOIF_PrintInitFn(int (*write_fn)(void *data,char *buf, int bufsz), void *data)
Creates a generic application-defined stream ready for printing. The given write_fn is used to print the stream.
This function allows you to hook up your own routine for printing.
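As an illustration, a write routine with the parameter types shown above might collect the printed bytes in memory. The MemorySink type and the sink_write name are hypothetical, and the return convention used here (the number of bytes consumed on success, -1 on failure) is an assumption, because the expected return value of write_fn is not described in this chapter.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical caller-owned sink that collects everything written to it. */
typedef struct {
    char  *bytes;
    size_t used;
    size_t cap;
} MemorySink;

/*
 * Has the same parameter types as the write_fn argument. The return
 * convention is assumed: bytes consumed on success, -1 on failure.
 */
int sink_write(void *data, char *buf, int bufsz)
{
    MemorySink *sink = data;

    assert(sink != NULL);
    if (buf == NULL || bufsz < 0)
        return -1;
    if (bufsz == 0)
        return 0;                       /* nothing to copy */
    if (sink->used + (size_t)bufsz > sink->cap) {
        size_t new_cap = sink->cap ? sink->cap : 64;
        char *grown;
        while (new_cap < sink->used + (size_t)bufsz)
            new_cap *= 2;               /* grow geometrically */
        grown = realloc(sink->bytes, new_cap);
        if (grown == NULL)
            return -1;                  /* the old block stays valid */
        sink->bytes = grown;
        sink->cap = new_cap;
    }
    memcpy(sink->bytes + sink->used, buf, (size_t)bufsz);
    sink->used += (size_t)bufsz;
    return bufsz;
}
```

A routine like this would be passed to SOIF_PrintInitFn() together with a pointer to the sink as the data argument. The caller remains responsible for freeing sink.bytes.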
-
SOIF_ParseInitFile
-
NSAPI_PUBLIC SOIFStream *SOIF_ParseInitFile(FILE *fp)
Creates a disk-based stream ready for parsing. The file must contain SOIF-formatted
data. The function reads SOIF data from the file object fp.
-
SOIF_ParseInitStr
-
NSAPI_PUBLIC SOIFStream *SOIF_ParseInitStr(char *buf, int bufsz)
Creates a memory-based stream ready for parsing. The character buffer must contain
SOIF-formatted data.
-
SOIFStream_Finish
-
NSAPI_PUBLIC int SOIFStream_Finish(SOIFStream *)
Closes the stream when you have finished with it.
-
SOIFStream_SetFinishFn
-
NSAPI_PUBLIC int SOIFStream_SetFinishFn(SOIFStream *, int (*finish_fn)(SOIFStream *))
Allows you to hook up a function for cleaning up after the SOIF stream finishes
its business. The finish_fn will be called when SOIFStream_Finish() has finished executing.
-
SOIFStream_Print
-
#define SOIFStream_Print(ss, s)
Prints another SOIF object to the SOIF stream ss. Returns
0 on success, or non-zero on error.
-
SOIFStream_Parse
-
#define SOIFStream_Parse(ss)
Parses and returns the next SOIF object in the SOIF stream.
-
SOIFStream_IsEOS
-
#define SOIFStream_IsEOS(s)
Returns 1 (true) if the SOIF stream has been exhausted.
-
SOIFStream_IsPrinting
-
#define SOIFStream_IsPrinting(s)
Returns 1 (true) if the SOIF stream was set up for printing by SOIF_PrintInitFile() or SOIF_PrintInitStr().
-
SOIFStream_IsParsing
-
#define SOIFStream_IsParsing(s)
Returns 1 (true) if the SOIF stream was set up for parsing by SOIF_ParseInitFile() or SOIF_ParseInitStr().
Filtering SOIF Objects
To support targeted parsing and printing, you can use the attribute filtering
mechanisms in the SOIF stream. For each SOIF stream object, you can associate a list
of allowed attributes. When printing a SOIF stream, only the attributes that match
the allowed attributes will be printed. When parsing a SOIF stream, only the attributes
that match the allowed attributes will be parsed.
SOIFStream_SetAllowed() allows attributes and SOIFStream_SetDenied() denies them. SOIFStream_IsAllowed() tests whether a given attribute is allowed. You can allow or deny an attribute, but not both.
-
SOIFStream_IsAllowed
-
NSAPI_PUBLIC boolean_t SOIFStream_IsAllowed(SOIFStream *ss, char *attribute);
Indicates whether the given attribute is allowed (that is, whether it can be printed or parsed).
-
SOIFStream_SetAllowed
-
NSAPI_PUBLIC int SOIFStream_SetAllowed(SOIFStream *ss, char *allowed_attrs[])
Sets all the attributes in the allowed_attrs array to allowed.
-
SOIFStream_SetDenied
-
NSAPI_PUBLIC int SOIFStream_SetDenied(SOIFStream *ss, char *denied_attrs[]);
Sets all the attributes in the denied_attrs array to be denied (that is, they cannot be parsed or printed).
-
SOIFStream_GetAllowed
-
NSAPI_PUBLIC char **SOIFStream_GetAllowed(SOIFStream *ss)
Returns an array of all the attributes that are allowed.
-
SOIFStream_GetDenied
-
NSAPI_PUBLIC char **SOIFStream_GetDenied(SOIFStream *ss);
Returns an array of all the attributes that are denied.
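The effect of an allow list or a deny list can be sketched as a simple membership test. AttrFilter and filter_allows are hypothetical names, not part of soif.h. The sketch assumes that attribute arrays are terminated by a NULL entry, that names are compared exactly, and that a stream with neither list lets every attribute through. None of these details is stated in this chapter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical filter; a NULL list pointer means "no list was set". */
typedef struct {
    const char *const *allowed;   /* NULL-terminated array, or NULL */
    const char *const *denied;    /* NULL-terminated array, or NULL */
} AttrFilter;

/* Returns 1 if attr appears in the list, 0 otherwise. */
static int list_has(const char *const *list, const char *attr)
{
    size_t i;
    for (i = 0; list != NULL && list[i] != NULL; i++) {
        if (strcmp(list[i], attr) == 0)
            return 1;
    }
    return 0;
}

/*
 * Returns 1 if attr passes the filter. Assumed policy for this sketch:
 * an allow list admits only its members, a deny list rejects only its
 * members, and with no list everything passes.
 */
int filter_allows(const AttrFilter *f, const char *attr)
{
    assert(f != NULL && attr != NULL);
    if (f->allowed != NULL)
        return list_has(f->allowed, attr);
    if (f->denied != NULL)
        return !list_has(f->denied, attr);
    return 1;
}
```

Only one of the two lists is consulted for a given filter, which mirrors the rule that an attribute is either allowed or denied, but not both.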
Memory Buffer Management
You can use SOIF buffers in parsing or printing routines. They take care of
memory allocation for inserting and appending. They are basically memory blocks that
are easy for SOIF routines to use.
A SOIF buffer is represented by a SOIFBuffer structure, which is created with the SOIFBuffer_Create() function and freed with the SOIFBuffer_Free() function. The SOIFBuffer structure provides the append(), increase(), and reset() functions for manipulating the data in the buffer.
-
SOIFBuffer_Create
-
NSAPI_PUBLIC SOIFBuffer *SOIFBuffer_Create(int default_sz);
Creates a SOIFBuffer, using default_sz as its default size. The SOIFBuffer is used in SOIF_PrintInitStr(SOIFBuffer *memory). Before you can print SOIF to memory, you need to create a buffer for output.
-
SOIFBuffer_Free
-
NSAPI_PUBLIC void SOIFBuffer_Free(SOIFBuffer *sb);
Releases the memory buffer created by SOIFBuffer_Create().
-
append
-
void (*append)(SOIFBuffer *sb, char *data, int n)
Copies n bytes of data into the buffer.
-
increase
-
void (*increase)(SOIFBuffer *sb, int add_n)
Increases the size of the data buffer by add_n bytes.
-
reset
-
void (*reset)(SOIFBuffer *sb)
Resets the size of the data buffer and invalidates all currently valid data.
A buffer can be reused by resetting it this way.
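A buffer that carries its own append, increase, and reset function pointers can be sketched as follows. GrowBuffer and the gb_ functions are hypothetical. The real SOIFBuffer layout is not shown in this chapter, so the field names, the const qualifier on the data argument, and the choice to abort on allocation failure are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer; this is not the SOIFBuffer definition. */
typedef struct GrowBuffer GrowBuffer;
struct GrowBuffer {
    char  *data;
    size_t used;       /* bytes of valid data */
    size_t cap;        /* bytes allocated */
    void (*append)(GrowBuffer *gb, const char *data, int n);
    void (*increase)(GrowBuffer *gb, int add_n);
    void (*reset)(GrowBuffer *gb);
};

static void gb_increase(GrowBuffer *gb, int add_n)
{
    char *grown;
    assert(gb != NULL && add_n >= 0);
    if (add_n == 0)
        return;
    grown = realloc(gb->data, gb->cap + (size_t)add_n);
    if (grown == NULL)
        abort();                  /* sketch only: no error reporting */
    gb->data = grown;
    gb->cap += (size_t)add_n;
}

static void gb_append(GrowBuffer *gb, const char *data, int n)
{
    assert(gb != NULL && data != NULL && n >= 0);
    if (gb->used + (size_t)n > gb->cap)
        gb->increase(gb, (int)(gb->used + (size_t)n - gb->cap));
    if (n > 0)
        memcpy(gb->data + gb->used, data, (size_t)n);
    gb->used += (size_t)n;
}

static void gb_reset(GrowBuffer *gb)
{
    assert(gb != NULL);
    gb->used = 0;                 /* keep the allocation for reuse */
}

GrowBuffer *gb_create(int default_sz)
{
    GrowBuffer *gb = calloc(1, sizeof *gb);
    if (gb == NULL)
        return NULL;
    gb->append = gb_append;
    gb->increase = gb_increase;
    gb->reset = gb_reset;
    if (default_sz > 0)
        gb->increase(gb, default_sz);
    return gb;
}

void gb_free(GrowBuffer *gb)
{
    if (gb != NULL) {
        free(gb->data);
        free(gb);
    }
}
```

Resetting only clears the count of valid bytes, so the same allocation can be filled again without another call to the allocator.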