Getting Started with Mini-XML
This chapter describes how to write programs that use Mini-XML to
access data in an XML file. Mini-XML provides the following
functionality:
- Functions for creating and managing XML documents in memory.
- Reading of UTF-8 and UTF-16 encoded XML files and strings.
- Writing of UTF-8 encoded XML files and strings.
- Support for arbitrary element names, attributes, and attribute
values with no preset limits, just available memory.
- Support for integer, real, opaque ("CDATA"), and text data types in
"leaf" nodes.
- "Find", "index", and "walk" functions for easily accessing data in
an XML document.
Mini-XML doesn't do validation or other types of processing on the
data based upon schema files or other sources of definition
information, nor does it support character entities other than those
required by the XML specification.
Mini-XML provides a single header file which you include:
#include <mxml.h>
The Mini-XML library is linked with your program using the
-lmxml option:
gcc -o myprogram myprogram.c -lmxml
If you have the pkg-config(1) software installed, you can
use it to determine the proper compiler and linker options for your
installation:
pkg-config --cflags mxml
pkg-config --libs mxml
Every piece of information in an XML file is stored in memory in
"nodes". Nodes are defined by the
mxml_node_t structure. Each node has a typed value, optional
user data, a parent node, sibling nodes (previous and next), and
potentially child nodes.
For example, if you have an XML file like the following:
<?xml version="1.0" encoding="utf-8"?>
<data>
    <node>val1</node>
    <node>val2</node>
    <node>val3</node>
    <group>
        <node>val4</node>
        <node>val5</node>
        <node>val6</node>
    </group>
    <node>val7</node>
    <node>val8</node>
</data>
the node tree for the file would look like the following in memory:
    ?xml version="1.0" encoding="utf-8"?
      |
    data
      |
    node - node - node - group - node - node
      |      |      |      |       |      |
    val1   val2   val3     |     val7   val8
                           |
                         node - node - node
                           |      |      |
                         val4   val5   val6
where "-" is a pointer to the sibling node and "|" is a pointer to
the first child or parent node.
The mxmlGetType
function gets the type of a node, one of MXML_CUSTOM,
MXML_ELEMENT, MXML_INTEGER, MXML_OPAQUE,
MXML_REAL, or MXML_TEXT. The parent and sibling nodes are
accessed using the
mxmlGetParent, mxmlGetNextSibling,
and mxmlGetPrevSibling functions.
The mxmlGetUserData
function gets any user data associated with the node.
CDATA (MXML_ELEMENT) nodes are created using the
mxmlNewCDATA function. The
mxmlGetCDATA function retrieves the CDATA string pointer
for a node.
Note:
CDATA nodes are currently stored in memory as special elements. This
will be changed in a future major release of Mini-XML.
Custom (MXML_CUSTOM) nodes are created using the
mxmlNewCustom function or using a custom load callback
specified using the
mxmlSetCustomHandlers function. The
mxmlGetCustom function retrieves the custom value pointer
for a node.
Comment (MXML_ELEMENT) nodes are created using the
mxmlNewElement function. The
mxmlGetElement function retrieves the comment string
pointer for a node, including the surrounding "!--" and "--"
characters.
Note:
Comment nodes are currently stored in memory as special elements.
This will be changed in a future major release of Mini-XML.
Element (MXML_ELEMENT) nodes are created using the
mxmlNewElement function. The
mxmlGetElement function retrieves the element name, the
mxmlElementGetAttr function retrieves the value string for
a named attribute associated with the element, and the
mxmlGetFirstChild and
mxmlGetLastChild functions retrieve the first and last
child nodes for the element, respectively.
Integer (MXML_INTEGER) nodes are created using the
mxmlNewInteger function. The
mxmlGetInteger function retrieves the integer value for a
node.
Opaque (MXML_OPAQUE) nodes are created using the
mxmlNewOpaque function. The
mxmlGetOpaque function retrieves the opaque string pointer
for a node. Opaque nodes are like text nodes but are stored as a
single string that preserves all whitespace.
Text (MXML_TEXT) nodes are created using the
mxmlNewText and
mxmlNewTextf functions. Each text node consists of a text
string and (leading) whitespace value - the
mxmlGetText function retrieves the text string pointer and
whitespace value for a node.
Processing instruction (MXML_ELEMENT) nodes are created
using the mxmlNewElement
function. The
mxmlGetElement function retrieves the processing instruction
string for a node, including the surrounding "?" characters.
Note:
Processing instruction nodes are currently stored in memory as
special elements. This will be changed in a future major release of
Mini-XML.
Real number (MXML_REAL) nodes are created using the
mxmlNewReal function. The
mxmlGetReal function retrieves the real number value for
a node.
XML declaration (MXML_ELEMENT) nodes are created using the mxmlNewXML function. The mxmlGetElement
function retrieves the XML declaration string for a node, including the
surrounding "?" characters.
Note:
XML declaration nodes are currently stored in memory as special
elements. This will be changed in a future major release of Mini-XML.
You can create and update XML documents in memory using the various
mxmlNew functions. The following code will create the XML document
described in the previous section:
mxml_node_t *xml; /* <?xml ... ?> */
mxml_node_t *data; /* <data> */
mxml_node_t *node; /* <node> */
mxml_node_t *group; /* <group> */
xml = mxmlNewXML("1.0");
data = mxmlNewElement(xml, "data");
node = mxmlNewElement(data, "node");
mxmlNewText(node, 0, "val1");
node = mxmlNewElement(data, "node");
mxmlNewText(node, 0, "val2");
node = mxmlNewElement(data, "node");
mxmlNewText(node, 0, "val3");
group = mxmlNewElement(data, "group");
node = mxmlNewElement(group, "node");
mxmlNewText(node, 0, "val4");
node = mxmlNewElement(group, "node");
mxmlNewText(node, 0, "val5");
node = mxmlNewElement(group, "node");
mxmlNewText(node, 0, "val6");
node = mxmlNewElement(data, "node");
mxmlNewText(node, 0, "val7");
node = mxmlNewElement(data, "node");
mxmlNewText(node, 0, "val8");
We start by creating the declaration node common to all XML files
using the mxmlNewXML
function:
xml = mxmlNewXML("1.0");
We then create the <data> node used for this document using
the mxmlNewElement
function. The first argument specifies the parent node (xml)
while the second specifies the element name (data):
data = mxmlNewElement(xml, "data");
Each <node>...</node> in the file is created using the
mxmlNewElement and
mxmlNewText functions. The first argument of mxmlNewText
specifies the parent node (node). The second argument
specifies whether whitespace appears before the text - 0 or false in
this case. The last argument specifies the actual text to add:
node = mxmlNewElement(data, "node");
mxmlNewText(node, 0, "val1");
The resulting in-memory XML document can then be saved or processed
just like one loaded from disk or a string.
You load an XML file using the
mxmlLoadFile function:
FILE *fp;
mxml_node_t *tree;
fp = fopen("filename.xml", "r");
tree = mxmlLoadFile(NULL, fp,
MXML_TEXT_CALLBACK);
fclose(fp);
The first argument specifies an existing XML parent node, if any.
Normally you will pass NULL for this argument unless you are
combining multiple XML sources. The XML file must contain a complete
XML document including the ?xml element if the parent node is
NULL.
The second argument specifies the stdio file to read from, as opened
by fopen() or popen(). You can also use stdin
if you are implementing an XML filter program.
The third argument specifies a callback function which returns the
value type of the immediate children for a new element node:
MXML_CUSTOM, MXML_IGNORE, MXML_INTEGER,
MXML_OPAQUE, MXML_REAL, or MXML_TEXT. Load
callbacks are described in detail in
Chapter 3. The example code uses the MXML_TEXT_CALLBACK
constant which specifies that all data nodes in the document contain
whitespace-separated text values. Other standard callbacks include
MXML_IGNORE_CALLBACK, MXML_INTEGER_CALLBACK,
MXML_OPAQUE_CALLBACK, and MXML_REAL_CALLBACK.
The mxmlLoadString
function loads XML node trees from a string:
char buffer[8192];
mxml_node_t *tree;
...
tree = mxmlLoadString(NULL, buffer,
MXML_TEXT_CALLBACK);
The first and third arguments are the same as used for
mxmlLoadFile(). The second argument specifies the string or
character buffer to load and must be a complete XML document including
the ?xml element if the parent node is NULL.
You save an XML file using the
mxmlSaveFile function:
FILE *fp;
mxml_node_t *tree;
fp = fopen("filename.xml", "w");
mxmlSaveFile(tree, fp, MXML_NO_CALLBACK);
fclose(fp);
The first argument is the XML node tree to save. It should normally
be a pointer to the top-level ?xml node in your XML document.
The second argument is the stdio file to write to, as opened by
fopen() or popen(). You can also use stdout if
you are implementing an XML filter program.
The third argument is the whitespace callback to use when saving the
file. Whitespace callbacks are covered in detail in
Chapter 3. The previous example code uses the MXML_NO_CALLBACK
constant to specify that no special whitespace handling is required.
The
mxmlSaveAllocString and
mxmlSaveString functions save XML node trees to strings:
char buffer[8192];
char *ptr;
mxml_node_t *tree;
...
mxmlSaveString(tree, buffer, sizeof(buffer),
MXML_NO_CALLBACK);
...
ptr = mxmlSaveAllocString(tree, MXML_NO_CALLBACK);
The first and last arguments are the same as used for
mxmlSaveFile(). The mxmlSaveString function takes pointer
and size arguments for saving the XML document to a fixed-size buffer,
while mxmlSaveAllocString() returns a string buffer that was
allocated using malloc().
When saving XML documents, Mini-XML normally wraps output lines at
column 75 so that the text is readable in terminal windows. The
mxmlSetWrapMargin function overrides the default wrap
margin:
/* Set the margin to 132 columns */
mxmlSetWrapMargin(132);
/* Disable wrapping */
mxmlSetWrapMargin(0);
Once you are done with the XML data, use the
mxmlDelete function to recursively free the memory that is
used for a particular node or the entire tree:
mxmlDelete(tree);
You can also use reference counting to manage memory usage. The
mxmlRetain and
mxmlRelease functions increment and decrement a node's use
count, respectively. When the use count goes to 0, mxmlRelease
will automatically call mxmlDelete to actually free the memory
used by the node tree. New nodes automatically start with a use count
of 1.
The mxmlWalkPrev
and mxmlWalkNext
functions can be used to iterate through the XML node tree:
mxml_node_t *node;
node = mxmlWalkPrev(current, tree,
MXML_DESCEND);
node = mxmlWalkNext(current, tree,
MXML_DESCEND);
In addition, you can find a named element/node using the
mxmlFindElement function:
mxml_node_t *node;
node = mxmlFindElement(tree, tree, "name",
"attr", "value",
MXML_DESCEND);
The name, attr, and value arguments can be
passed as NULL to act as wildcards, e.g.:
/* Find the first "a" element */
node = mxmlFindElement(tree, tree, "a",
NULL, NULL,
MXML_DESCEND);
/* Find the first "a" element with "href"
attribute */
node = mxmlFindElement(tree, tree, "a",
"href", NULL,
MXML_DESCEND);
/* Find the first "a" element with "href"
to a URL */
node = mxmlFindElement(tree, tree, "a",
"href",
"http://www.easysw.com/",
MXML_DESCEND);
/* Find the first element with a "src"
attribute */
node = mxmlFindElement(tree, tree, NULL,
"src", NULL,
MXML_DESCEND);
/* Find the first element with a "src"
= "foo.jpg" */
node = mxmlFindElement(tree, tree, NULL,
"src", "foo.jpg",
MXML_DESCEND);
You can also iterate with the same function:
mxml_node_t *node;
for (node = mxmlFindElement(tree, tree,
"name",
NULL, NULL,
MXML_DESCEND);
node != NULL;
node = mxmlFindElement(node, tree,
"name",
NULL, NULL,
MXML_DESCEND))
{
... do something ...
}
The MXML_DESCEND argument can actually be one of three
constants:
- MXML_NO_DESCEND means not to look at any child nodes in
the element hierarchy, just look at siblings at the same level or
parent nodes until the top node or top-of-tree is reached.
The previous node from "group" would be the "node" element to the
left, while the next node from "group" would be the "node" element to
the right.
- MXML_DESCEND_FIRST means that it is OK to descend to the
first child of a node, but not to descend further when searching.
You'll normally use this when iterating through direct children of a
parent node, e.g. all of the "node" and "group" elements under the
"data" parent node in the example above.
This mode is only applicable to the search function; the walk
functions treat it as MXML_DESCEND because every walk call is
effectively a first call.
- MXML_DESCEND means to keep descending until you hit the
bottom of the tree. The previous node from "group" would be the "val3"
node and the next node would be the first node element under "group".
If you were to walk from the root node "?xml" to the end of the tree
with mxmlWalkNext(), the order would be:
?xml data node val1 node val2 node val3 group node val4 node val5
node val6 node val7 node val8
If you started at "val8" and walked using mxmlWalkPrev(),
the order would be reversed, ending at "?xml".
You can find specific nodes in the tree using the
mxmlFindPath function, for example:
mxml_node_t *value;
value = mxmlFindPath(tree, "path/to/*/foo/bar");
The second argument is a "path" to the parent node. Each component of
the path is separated by a slash (/) and represents a named element in
the document tree or a wildcard (*) path representing 0 or more
intervening nodes.