Using MSXML's DOM model to process XML

By: iSee, Date: Sat, 18 Jul 2009, Time: 9:56 PM

I first used MSXML to process XML documents quite a while ago, and I wrote an essay about it called "XML documents considering". By chance I searched Google for that essay and found that many sites have reproduced it, which is a little embarrassing. In fact I did not go to Google on purpose: the referrer links on my mblogger.cn blog showed that many visitors reached the essay through Google searches. Rereading it now makes me break out in a cold sweat! Before explaining its shortcomings, it is worth taking the time to describe the main structure of the DOM, which makes the rest easier to follow. If you do not yet know how to get started with MSXML, the essay just mentioned is still of some use. Address: http://ms.mblogger.cn/ohahu/posts/4563.aspx

In the MSXML library, the DOM model works by loading the XML document into memory to form an IXMLDOMDocument; each component of the document then has a corresponding interface. I have not yet used MSXML to format XML documents with XSLT, so that part is only mentioned in passing. First, let us see which basic components an XML text has:

(1)<?xml version='1.0' encoding='GB2312'?>
(2)<?xml-stylesheet type='text/xsl' href='/expert/Xsl/2.xsl'?>
(3)<body>
(4)  <code1 id="text">text</code1>
(5)  <code2 id="cdata">
(6)  <![CDATA[
(7)  CDATA,'<'XML
(8)  ]]>
(9)  </code2>
(10)</body>

To make the explanation easier, a line number is placed at the start of each line. Lines (1) to (10) together make up a Document, which corresponds to the IXMLDOMDocument interface in MSXML. Lines (1) and (2) correspond to IXMLDOMProcessingInstruction. Lines (3) to (10) are the root Element, which corresponds to IXMLDOMElement; the code1 and code2 elements it contains are Elements too, one level below body. Note that there can be only one root. Below the root, however deep, any number of elements is allowed in theory (?), and their names may repeat. The id="..." on lines (4) and (5) is an Attribute (note: version and encoding on line (1), and type and href on line (2), count as well), corresponding to IXMLDOMAttribute in MSXML.

The "text" on line (4) and the content of line (7) both look like plain text, and many people think both can be fetched directly with get_text() (I thought so too until recently). That is wrong. The "text" can be obtained by calling get_text() directly on the element on line (4), and it corresponds to IXMLDOMText. Calling get_text() directly on the element on line (5) goes wrong. Why? CDATA is a different type from "text" and corresponds to IXMLDOMCDATASection; how to obtain it is covered later. The basic framework of an XML document now looks like this:

<IXMLDOMDocument>
  <IXMLDOMProcessingInstruction />
  <IXMLDOMProcessingInstruction />
  <IXMLDOMElement>
    <IXMLDOMElement IXMLDOMAttribute>
      IXMLDOMText
    </IXMLDOMElement>
    <IXMLDOMElement IXMLDOMAttribute>
      IXMLDOMCDATASection
    </IXMLDOMElement>
  </IXMLDOMElement>
</IXMLDOMDocument>

However, many MSXML tutorials use another interface, IXMLDOMNode. What is it, and how does it relate to IXMLDOMElement above? For a long time this was exactly where I got confused when using MSXML, and I made no progress. Recently I had to parse XML under C++ Builder 5, which has no TXMLDocument component wrapper, so I had to use MSXML directly, and that is how I uncovered some of the mystery. In the DOM model, Document, ProcessingInstruction, Attribute, Element, TextNode and CDATASection are all regarded as Nodes. In MSXML, the interfaces that represent these objects are all derived from the Node interface, IXMLDOMNode, so the key to understanding how the interfaces relate is to look at their derivation relationships. Why add a Node interface at all? So that the components of an XML document are linked together rather than left loose.

There are two further interfaces, IXMLDOMNodeList and IXMLDOMNamedNodeMap. IXMLDOMNamedNodeMap is mainly used to link the Attributes of an Element, because the names of an Element's Attributes must not conflict with one another. IXMLDOMNodeList links the other components (ProcessingInstruction, Element, TextNode, CDATASection and so on), and components such as Elements may share a name within the same level. These links do not include the Document node: the Document is the first Node of the XML and cannot appear in any IXMLDOMNodeList or IXMLDOMNamedNodeMap below it. I confirmed this with code! Starting from the MSXML Document as the first Node, I iterated over all Attributes at each level through the NamedNodeMap and recursed through the NodeList over all Nodes of the next level. The result is the following tree structure (attributes are listed in parentheses on the same line; indentation indicates the level):

#document
  xml (version; encoding)
  xml-stylesheet (type; href)
  body
    code1 (id)
      #text
    code2 (id)
      #cdata-section
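The traversal that produces such a tree can be sketched in portable C++ without MSXML. This is only a model under stated assumptions: ModelNode, dumpTree and buildSample are hypothetical names, the attribute list stands in for IXMLDOMNamedNodeMap, and the children vector stands in for IXMLDOMNodeList.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM node; this is not an MSXML type.
struct ModelNode {
    std::string name;                                  // e.g. "body" or "#text"
    std::vector<std::string> attributes;               // attribute names only
    std::vector<std::unique_ptr<ModelNode>> children;  // nodes one level down

    explicit ModelNode(std::string n, std::vector<std::string> attrs = {})
        : name(std::move(n)), attributes(std::move(attrs)) {}

    // Appends a child node and returns a reference to it.
    ModelNode& add(std::string n, std::vector<std::string> attrs = {}) {
        children.push_back(
            std::make_unique<ModelNode>(std::move(n), std::move(attrs)));
        return *children.back();
    }
};

// Appends one line per node to `out`: the node name, then its attributes in
// parentheses, then the children recursively with two more spaces of indent.
void dumpTree(const ModelNode& node, int depth, std::string& out) {
    assert(depth >= 0);
    out.append(static_cast<std::size_t>(depth) * 2, ' ');
    out += node.name;
    if (!node.attributes.empty()) {
        out += " (";
        for (std::size_t i = 0; i < node.attributes.size(); ++i) {
            if (i != 0) out += "; ";
            out += node.attributes[i];
        }
        out += ")";
    }
    out += "\n";
    for (const auto& child : node.children)
        dumpTree(*child, depth + 1, out);
}

// Builds a model of the sample document discussed in the text.
std::unique_ptr<ModelNode> buildSample() {
    auto doc = std::make_unique<ModelNode>("#document");
    doc->add("xml", {"version", "encoding"});
    doc->add("xml-stylesheet", {"type", "href"});
    ModelNode& body = doc->add("body");
    body.add("code1", {"id"}).add("#text");
    body.add("code2", {"id"}).add("#cdata-section");
    return doc;
}
```

Calling dumpTree(*buildSample(), 0, out) fills out with the same eight lines as the tree above. With real MSXML the loop bodies would read the attributes and childNodes properties of each node instead of the two vectors.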

The printout above shows that TextNode and CDATASection are handled as Nodes one level below their Element. get_text() is only a convenient shortcut for a TextNode; accessing a CDATASection through get_text() goes wrong. Instead, obtain the content through the first child Node of the element:

MSXML::IXMLDOMNodePtr child;
// The text or CDATA content is the first child node of the element.
child = parent->childNodes->get_item(0);
if (child != NULL)
    text = child->get_nodeValue();

During a recursive traversal we usually cannot predict the type of a Node. To find out whether a Node is an Element, a TextNode, or something else, use IXMLDOMNode::nodeType. It is an enumerated type with the following values (which also shows that the XML text above does not cover every kind of XML component):

NODE_ELEMENT (1)
NODE_ATTRIBUTE (2)
NODE_TEXT (3)
NODE_CDATA_SECTION (4)
NODE_ENTITY_REFERENCE (5)
NODE_ENTITY (6)
NODE_PROCESSING_INSTRUCTION (7)
NODE_COMMENT (8)
NODE_DOCUMENT (9)
NODE_DOCUMENT_TYPE (10)
NODE_DOCUMENT_FRAGMENT (11)
NODE_NOTATION (12)
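As a rough illustration of branching on the node type, the sketch below reads the value of an element's first child only when that child is a text or CDATA node. It is a model in standard C++, not MSXML code: ModelNodeType, ModelNode, makeNode and firstChildValue are hypothetical names, and only the numeric values are taken from the list above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical enum; the numeric values mirror the list above.
enum class ModelNodeType {
    Element = 1,
    Attribute = 2,
    Text = 3,
    CDataSection = 4,
    ProcessingInstruction = 7,
    Comment = 8,
    Document = 9
};

// Hypothetical stand-in for a DOM node.
struct ModelNode {
    ModelNodeType type;
    std::string value;                // used by Text and CDataSection nodes
    std::vector<ModelNode> children;  // nodes one level down
};

// Convenience constructor for the model.
ModelNode makeNode(ModelNodeType type, std::string value,
                   std::vector<ModelNode> children = {}) {
    return ModelNode{type, std::move(value), std::move(children)};
}

// Returns the value of the first child if it is a Text or CDataSection node,
// and an empty string in every other case.
std::string firstChildValue(const ModelNode& element) {
    if (element.children.empty()) return "";
    const ModelNode& child = element.children.front();
    switch (child.type) {
        case ModelNodeType::Text:
        case ModelNodeType::CDataSection:
            return child.value;
        default:
            return "";
    }
}
```

The same switch shape applies when the type comes from IXMLDOMNode::nodeType; the model only removes the COM plumbing so that the branching is easy to see.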

With that background, I can now list a number of the problems in "XML documents considering":

First problem (the biggest): finding a deeper node through a path

In that essay I implemented this with recursive function calls. That was not necessary at all; it was superfluous. The DOM model comes with IXMLDOMNode::selectSingleNode and IXMLDOMNode::selectNodes (XPath), which do the job more completely than hand-written recursion. Here is the simple use of selectSingleNode (msdom is an IXMLDOMDocument instance):

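MSXML::IXMLDOMNodePtr parent, child;
parent = msdom->documentElement;
// Paths are relative to parent (the body element); a leading '/' would
// start the search at the document root instead.
// child is the code1 Element
child = parent->selectSingleNode("code1");
// child is the id Attribute of code1
child = parent->selectSingleNode("code1/@id");
// child is the code1 Element whose id Attribute is 'text'
child = parent->selectSingleNode("code1[@id='text']");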

... More advanced usage will have to wait until I have studied XPath further.

Reference Address: http://sqq876.blogchina.com/2486119.html

Second problem: TXMLDocument in C++ Builder 6 is not a simple wrapper around MSXML

In order to build a similar component wrapper under C++ Builder 5, I looked for information and found that DOM parsers such as MSXML and OpenXML implement the same set of interfaces (I hope I am not mistaken) but differ internally. TXMLDocument can use a different parser by changing its vendor setting, yet in C++ Builder it is used in exactly the same way. The default appears to be the MSXML parser. Comparing advantages and disadvantages: MSXML requires a newer msxml.dll to be registered on the client; SAX needs a larger DLL shipped alongside; OpenXML is used directly as .pas source files, so it is compiled straight into the executable.

Third problem: the essay used an IEnum interface to traverse a NodeList

That is rather like using an ox cleaver to kill a chicken. The traversal can simply be:

for (int i = 0; i < nodelist->get_length(); i++)
{
    child = nodelist->get_item((long)i);
    name = child->get_nodeName();
}

Thinking back, this is the method I should have used from the start. At the time, though, I did not understand the relationship between Node and Element, and I wrongly believed that the conversion below would leave element == NULL:

IXMLDOMElementPtr element = (IXMLDOMNodePtr) node;

Fourth problem: later on, for operations such as appendChild, I converted the Element returned by createElement directly into a Node and then passed it to appendChild. This is really among the most basic C++ knowledge (a derived type converts to its base type); the point is that the COM interfaces inside MSXML support the same C++ technique.
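The Node and Element relationship behind the last two problems can be modelled with ordinary C++ inheritance. The sketch below is an analogy only and makes several assumptions: ModelNode, ModelElement, ModelText, asNode and asElement are hypothetical names, and std::dynamic_pointer_cast stands in for the interface query that COM smart pointers perform through QueryInterface.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Plays the role of IXMLDOMNode in this model.
struct ModelNode {
    virtual ~ModelNode() = default;
    virtual std::string nodeName() const = 0;
};

// Plays the role of IXMLDOMElement: every element is also a node.
struct ModelElement : ModelNode {
    explicit ModelElement(std::string tag) : tag_(std::move(tag)) {}
    std::string nodeName() const override { return tag_; }
private:
    std::string tag_;
};

// A node that is not an element.
struct ModelText : ModelNode {
    std::string nodeName() const override { return "#text"; }
};

// Derived to base: always valid, so no explicit cast is needed.
std::shared_ptr<ModelNode> asNode(std::shared_ptr<ModelElement> element) {
    return element;
}

// Base to derived: yields an empty pointer when the node is not an element.
std::shared_ptr<ModelElement> asElement(const std::shared_ptr<ModelNode>& node) {
    return std::dynamic_pointer_cast<ModelElement>(node);
}
```

In this model, converting an element to a node always succeeds, which is what makes passing a newly created element to an appendChild-style function natural. Converting a node back to an element succeeds only when the node really is one, and gives an empty pointer otherwise.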

Now let us look at how to use the MSXML library in C++ Builder 5, which does not have the TXMLDocument component.

My first thought was to use the #import directive, that is:

#import "C:\Windows\system32\MSXML.DLL" named_guids

However, although the import generated the .tlh and .tli files, they would not compile: there were always errors about something missing or some function not being found (I also tried running the import in VC++ and then copying the generated .tli and .tlh files into the C++ Builder project). So I fell back on C++ Builder's own TVariant to instantiate the COM class and call its functions and properties (OleFunction, OlePropertyGet, OlePropertySet and so on). For example:

TVariant varMSDOM = CreateOleObject("MSXML.DOMDocument");

varMSDOM.OleFunction(L"load", L"c:\\tmp.xml");

TVariant varDoc = varMSDOM.OlePropertyGet("documentElement");

Intuitively, calls made this way are slower than calls made through #import. The difference seems to be that #import calls find the function pointer directly through the virtual function table, whereas OleFunction and its relatives go through the Invoke function of the IDispatch interface, which is an indirect call. They also cannot call the functions that #import with named_guids exposes, such as get_nodeName(BSTR *). In any case, TVariant basically meets the need to call functions.
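The difference between the two call paths can be illustrated with a small model in standard C++. It is a sketch under stated assumptions, not COM code: ModelDocument, documentElementName and invokeByName are hypothetical names, and the string-keyed table only imitates the idea of a run-time lookup followed by a call. Real IDispatch resolves names to numeric identifiers with GetIDsOfNames and then calls Invoke.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a COM object that offers both call paths.
class ModelDocument {
public:
    virtual ~ModelDocument() = default;

    // Early-bound path: the compiler resolves this call through the vtable.
    virtual std::string documentElementName() const { return "body"; }

    // Late-bound path: look the member up by name at run time, then call it.
    std::string invokeByName(const std::string& member) const {
        static const std::map<std::string,
                              std::function<std::string(const ModelDocument&)>>
            table = {
                {"documentElement",
                 [](const ModelDocument& d) { return d.documentElementName(); }},
            };
        auto it = table.find(member);
        if (it == table.end())
            throw std::invalid_argument("unknown member: " + member);
        return it->second(*this);
    }
};
```

Both paths reach the same member, but the late-bound one pays for a name lookup on every call and reports a misspelled name only at run time, which fits the intuition that the TVariant route is the slower of the two.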

Then I suddenly noticed the Import From Type Library menu item in C++ Builder and tried it. It generates files from the type library that compile once they are added to the project, but the definitions and calling conventions in them differ a little from what VC's #import produces:

1. In VC, you instantiate directly through the smart pointer, for example:

MSXML::IXMLDOMDocumentPtr msdom;

msdom.CreateInstance(__uuidof(MSXML::DOMDocument));

In C++ Builder, instantiation goes through another wrapper object that the importer generates:

TCOMIXMLDOMDocument i_xmldocument = CoDOMDocument::Create();

IXMLDOMDocumentPtr msdom = (IXMLDOMDocumentPtr) i_xmldocument;

The definition of TCOMIXMLDOMDocument can be found in MSXML2_TLB.h:

typedef TComInterface<IXMLDOMDocument> TCOMIXMLDOMDocument;

2. C++ Builder also has IXMLDOMDocumentPtr msdom, but the pointer cannot be compared with NULL directly. The compiler reports an error, and the test should be written as:

if ((IXMLDOMDocument *) msdom == NULL)

That is my experience of using MSXML under C++ Builder. It should extend to other kinds of COM automation without trouble, shouldn't it?

