Tutorials to .com


PHP and XML-based PDF document generation technology

By: iSee, Date: Thu, 16 Apr 2009, Time: 1:48 AM


This paper briefly introduces PHP, XML, and PDF and how they are currently applied, and uses PHP's object-oriented features to build an online system that generates PDF documents dynamically from PHP and XML. It discusses in detail the components of the whole system and how each one is implemented, and closes with an example of using the system to create a report dynamically.

Key words



1. Introduction

In an era of rapidly developing information technology, governments, enterprises, and individuals all care about how to use that technology to work more efficiently and save money. They urgently need a way to turn traditional paper documents (reports, bills, manuals, application forms, and so on) into electronic documents that can be generated automatically, distributed, downloaded, viewed, and printed over an intranet or the Internet. Today's popular "paperless office" and "e-commerce" will both be built on such a technology.

The document format in question is Adobe's PDF (Portable Document Format), a widely used open standard for distributing electronic documents worldwide. Anyone who has installed Acrobat Reader 5.0 or its browser plug-in can browse, download, and print PDF documents free of charge. For print-oriented documents, PDF has clear advantages over other electronic document formats.

The B/S (browser/server) architecture is currently the most popular software architecture and is likely to remain so, and it is well suited to all kinds of browser-based applications. PHP is an excellent Web programming language, especially for front-end applications such as handling form input and displaying database query results in the browser. Because PHP is open source, it is more widely used than comparable Web scripting languages, and its functionality is continually being extended and improved. Recent versions of PHP offer good support for PDF and XML. Through PHP's API we can generate PDF documents quickly. Most attractively, we can insert the results of database queries or data from XML files into a PDF document to create reports, documents, and manuals that are well suited to viewing and printing.

It is easy to see that combining PHP, XML, and PDF into an online system that generates PDF documents dynamically is of great practical value. Its main benefits are:

- Documents can be generated and distributed over the network, which saves considerable labor and materials. The printed output is attractive and precise, making a truly paperless office achievable.

- The various instruments and certificates used in e-commerce transactions can be generated online by PHP scripts and sent to the customer as PDF.

- The reports of enterprise MIS systems can be generated for printing and accessed directly through a browser, with no client software to install.

- Documents used to be "printed first, distributed later", and the annual printing cost was a heavy burden on governments and enterprises. PDF documents are "distributed first, printed later": recipients read them on screen and print only what they need. This greatly reduces printing costs and also benefits the environment.

2. Topic Overview

In several software development projects we ran into a critical problem: generating large numbers of reports and documents for printing. HTML is fine for browsing, but it has no standardized page format suitable for printing. We therefore needed a document format that PHP could generate dynamically and that prints well, and this need is the most direct motivation for this work. With that in mind, PDF was a natural choice, together with PDFLib, the library that gives PHP its PDF support. Through the API that PDFLib provides, a PHP script can create PDF documents dynamically. But this API is very basic: it outputs only simple objects such as lines, text, and rectangles, and the coordinates of every object must be specified before it is output. Using it directly for practical applications such as complex reports is extremely difficult, because we cannot reasonably compute the coordinates of every element in advance and draw each table cell as a separate rectangle.
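To give a feel for why hand-placing every element is tedious, here is a minimal sketch in plain PHP that only computes the rectangle of each cell in a simple grid. No PDF library is called. The function name `cellRects`, its parameters, and the bottom-left page origin are assumptions made for this illustration; they are not part of PDFLib or of the system described in this article.

```php
<?php
/**
 * Compute the rectangle of every cell in a fixed grid.
 * Hypothetical helper, for illustration only.
 * Assumes the page origin is at the bottom-left corner,
 * so rows are laid out downward from $top.
 */
function cellRects(float $left, float $top, array $colWidths, float $rowHeight, int $rows): array
{
    $rects = [];
    for ($r = 0; $r < $rows; $r++) {
        $x = $left;
        // Lower edge of row $r.
        $y = $top - ($r + 1) * $rowHeight;
        foreach ($colWidths as $c => $w) {
            $rects[$r][$c] = ['x' => $x, 'y' => $y, 'w' => (float) $w, 'h' => $rowHeight];
            $x += $w; // the next cell starts where this one ends
        }
    }
    return $rects;
}

// A table with 3 rows and 3 columns whose top-left corner is at (50, 800).
$rects = cellRects(50, 800, [100, 60, 80], 20, 3);
printf("cell[1][2] starts at (%.0f, %.0f)\n", $rects[1][2]['x'], $rects[1][2]['y']);
```

Even this fixed grid needs bookkeeping for every cell. Variable row heights, wrapped text, and nested tables make the arithmetic much worse, which is the motivation for wrapping the low-level API in higher-level objects.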

Therefore, the first step is to wrap this basic API using PHP's object-oriented programming, producing a set of independent, useful object modules (for example, page objects, table objects, and text objects). This is the most basic and most important part of the project. Drawing on reference material and on similar open-source programs from abroad, I developed a more powerful library on top of them. It greatly simplifies PDF generation. In particular, its table object can be nested arbitrarily, just like TABLE tags in HTML, which makes it quick and easy to lay out complex forms (very useful for dynamically generated reports).

With PDF generation solved, we faced a new problem: how do we pass the result set of a database query, which may contain a large amount of information, along with other information to the page that generates the PDF? At first we considered passing the data through a text file. The database query page would write the data to a text file, using a set of markers to distinguish the different types of data, and the PDF-generating page would read the file and insert its content into the PDF. This is not reliable, however. The text file uses a specific character (or a space) to separate data items, so what happens if the data itself contains that character or a space? Passing data this way carries a hidden risk. In fact, using different markers in a text file to distinguish different types of data is precisely the idea behind XML, so why not go one step further and use XML for the data transfer? Moreover, PHP has good support for XML and XSLT: we can extract data from XML documents with the expat parser, and transform arbitrary XML documents with PHP's XSLT extension, which is based on the Sablotron engine.
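The delimiter problem described above can be reproduced in a few lines of PHP. The sketch below is only an illustration: the sample values and the choice of a comma as the separator are assumptions, and it uses PHP's built-in `htmlspecialchars` and SimpleXML functions rather than the classes built for this system.

```php
<?php
// A record whose first field happens to contain the delimiter and markup characters.
$record = ['W2308, rev. <B>', '1234', '200'];

// Naive approach: join the fields with a comma, then split them again.
$line   = implode(',', $record);
$fields = explode(',', $line);
echo count($fields), "\n"; // 4 fields come back instead of 3

// XML approach: each field gets its own element, and special characters
// are escaped, so field content cannot be mistaken for a separator or a tag.
$xml = '<record>';
foreach (['llprod', 'lloc', 'lnum'] as $i => $tag) {
    $xml .= "<$tag>" . htmlspecialchars($record[$i], ENT_XML1 | ENT_QUOTES, 'UTF-8') . "</$tag>";
}
$xml .= '</record>';

$parsed = simplexml_load_string($xml);
echo (string) $parsed->llprod, "\n"; // W2308, rev. <B>
```

The text-file round trip silently changes the number of fields, while the XML round trip returns the original value intact.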

First, the "XML generator" converts the data (from a database, user input, and so on) into an XML document that conforms to a predefined DTD. This document describes only the data content and contains no formatting information. Next, the "XML converter" transforms it into another XML document that also carries display style information. Finally, the "PDF generator" reads that document and produces the PDF according to its content and the corresponding display styles. My task in this process was again to use PHP's object-oriented features to build reusable classes: XMLWriter (generates XML documents), XMLParser (parses XML documents), and XMLTransformer (wraps the XSLT functions).
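As a rough illustration of the "XML generator" stage, the sketch below builds a data-only report document with PHP's DOM extension. It is not the author's XMLWriter class, whose source is not shown in this article. The function `buildReportXml` and the sample values are hypothetical, and the element names (`report`, `report_param`, `report_records`, `record`) follow the report example given later in this article. Note that current PHP versions ship a built-in class that is also called XMLWriter, so the sketch uses a plain function instead of redefining that name.

```php
<?php
/**
 * Build the raw, data-only report document from plain PHP arrays.
 * Hypothetical stand-in for the "XML generator" stage.
 */
function buildReportXml(array $params, array $records): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;

    $report = $doc->appendChild($doc->createElement('report'));

    // Report-level parameters such as title and date.
    $paramNode = $report->appendChild($doc->createElement('report_param'));
    foreach ($params as $name => $value) {
        $el = $paramNode->appendChild($doc->createElement($name));
        // Text nodes are escaped automatically when the document is serialized.
        $el->appendChild($doc->createTextNode((string) $value));
    }

    // One <record> element per database row.
    $recordsNode = $report->appendChild($doc->createElement('report_records'));
    foreach ($records as $row) {
        $recordNode = $recordsNode->appendChild($doc->createElement('record'));
        foreach ($row as $name => $value) {
            $el = $recordNode->appendChild($doc->createElement($name));
            $el->appendChild($doc->createTextNode((string) $value));
        }
    }
    return $doc->saveXML();
}

$reportXml = buildReportXml(
    ['title' => 'Stock History', 'date' => '20020611'],
    [
        ['llprod' => 'W2308', 'lloc' => '1234', 'lnum' => '200'],
        ['llprod' => 'W2307', 'lloc' => '4321', 'lnum' => '100'],
    ]
);
echo $reportXml;
```

In a real application the rows would come from a database query rather than a literal array; the output carries no layout information, which is left to the later conversion stage.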

Once the system was built, it was put to use in a specific application: dynamically generating the various reports and documents of an invoicing system.

3. Feasibility Analysis

Developing a powerful, adaptable online PDF generation system requires a highly flexible development model. Our technique of generating PDF documents online from PHP and XML offers a new approach for print-oriented applications such as reports, bills, and manuals. We use PHP to query the database and handle user input, and generate the original XML document from the results. That XML document is then transformed by XSLT into a new XML document that carries display-layer information. Finally, the "PDF generator" converts the new XML file into the corresponding PDF document. The first XML document can be reused, because it contains all the useful data and is easy for other applications to process. Changing how information is displayed in the PDF document is also easy: someone with the relevant skills only needs to modify the XSL stylesheet, and no other part of the pipeline has to change, which makes the system very flexible. In addition, PHP, XML, and PDF are all highly portable and can be used across platforms. The system is not built on thin air; it rests directly on these foundations. The technique has already been put to practical use with very satisfactory results, and practice shows that an online PDF generation system based on PHP and XML has broad and very practical applications.

4. Design

The main work of this project was the design and programming of four modules: PDFCreator, XMLWriter, XMLTransformer, and XMLParser. Each is a link in the system with its own independent function and role, and together they form the core of the system (see chart).

System Components Chart

As the chart shows, the four modules are closely linked into an organic whole. XMLWriter serves as the input interface of the system and is responsible for generating the original XML data file. We define the document format (DTD) in advance, and XMLWriter generates XML documents that conform to it. The XML document is then processed by XMLTransformer, which is in fact a wrapper around the XSLT functions that PHP provides. It generally takes two parameters: the XML document to be converted and the corresponding XSL stylesheet file. Following the stylesheet, XMLTransformer converts the original XML document into another XML document that includes the layout information for the PDF document. The new XML file is then handed to the "PDF generator", which works in two steps. First, the XML document must be parsed to extract the required data. This step is performed by XMLParser, which parses the document into an object tree: each node of the XML document becomes an object, and each object has its own attributes (all the information of the corresponding node). This makes it easy to access any part of the XML document. Second, the information read from the XML document (both format information and content) is passed to PDFCreator, which outputs the final PDF document.
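The object-tree idea can be sketched with PHP's expat-based xml extension. This is an assumption-laden illustration, not the author's XMLParser: the class `XmlNode` and the function `parseXmlTree` are hypothetical names, and only the property names `name`, `attrs`, `nodes`, and `n` are modelled on the program fragment shown later in this article. Because expat's case folding is on by default, element and attribute names arrive in upper case.

```php
<?php
/** One element of the parsed document (hypothetical class for this sketch). */
class XmlNode
{
    public string $name;
    public array $attrs;
    public array $nodes = [];  // child elements, in document order
    public int $n = 0;         // number of child elements
    public string $text = '';  // concatenated character data

    public function __construct(string $name, array $attrs)
    {
        $this->name  = $name;
        $this->attrs = $attrs;
    }
}

/** Parse an XML string into a tree of XmlNode objects. */
function parseXmlTree(string $xml): XmlNode
{
    $stack  = [];   // chain of currently open elements
    $root   = null;
    $parser = xml_parser_create();

    xml_set_element_handler(
        $parser,
        // Start tag: create a node and attach it to the innermost open element.
        function ($p, string $name, array $attrs) use (&$stack, &$root) {
            $node = new XmlNode($name, $attrs);
            if ($stack) {
                $parent = end($stack);
                $parent->nodes[] = $node;
                $parent->n++;
            } else {
                $root = $node;
            }
            $stack[] = $node;
        },
        // End tag: the innermost element is now closed.
        function ($p, string $name) use (&$stack) {
            array_pop($stack);
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, string $data) use (&$stack) {
            if ($stack) {
                $current = end($stack);
                $current->text .= $data;
            }
        }
    );

    if (xml_parse($parser, $xml, true) !== 1) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $root;
}

$root = parseXmlTree(
    '<pdfreport pagetype="a4"><head><line size="50%"/><text>Title</text></head></pdfreport>'
);
echo $root->attrs['PAGETYPE'], "\n";         // a4
echo $root->nodes[0]->nodes[1]->name, "\n";  // TEXT
```

With the tree in memory, any part of the document can be reached by index, which is what the PDF-generating code relies on.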

5. Application Example

Here we use the system described above to create a print-oriented report, the "Stock Transaction History Table". The report contains the report title (Stock Transaction History Table), the unit, the report creation date, and the data extracted from the database: name (LLPROD), lot (LLOC), level (LCLS), warehouse (LWHS), digital library (LLOCT), quantity (LNUM), date (LDATE), and so on. Suppose we have already used XMLWriter to generate the following original XML document (report.xml):

<?xml version="1.0" encoding="gb2312"?>
<report>
  <report_param>
    <title>Stock Transaction History Table</title>
    <unit>square meters</unit>
    <date>20020611</date>
  </report_param>
  <report_records>
    <record>
      <llprod>W2308</llprod>
      <lloc>1234</lloc>
      <lcls>a</lcls>
      <lwhs>01</lwhs>
      <lloct>0001</lloct>
      <lnum>200</lnum>
      <ldate>20020609</ldate>
    </record>
    <record>
      <llprod>W2307</llprod>
      <lloc>4321</lloc>
      <lcls>a</lcls>
      <lwhs>01</lwhs>
      <lloct>0001</lloct>
      <lnum>100</lnum>
      <ldate>20020609</ldate>
    </record>
  </report_records>
</report>

This document contains all the useful data of the report. We now need a specific XSL stylesheet to add formatting information to it. The XMLTransformer code that performs the conversion is as follows:

<?php
$xslt = new XMLTransformer("report.xsl", "report.xml");
$xslt->apply("pdfreport.xml");
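The source of XMLTransformer and of report.xsl is not shown in this article. As a hedged sketch of what such a conversion step can look like in current PHP, the example below uses the `XSLTProcessor` class from PHP's xsl extension instead of the Sablotron-based functions mentioned earlier. The helper `transformXml`, the shortened input document, and the small stylesheet are all assumptions made for illustration; only the element names are taken from this article's examples.

```php
<?php
/**
 * Apply an XSL stylesheet to an XML document and return the result.
 * Hypothetical helper; requires PHP's dom and xsl extensions.
 */
function transformXml(string $xml, string $xsl): string
{
    $source = new DOMDocument();
    $source->loadXML($xml);

    $style = new DOMDocument();
    $style->loadXML($xsl);

    $proc = new XSLTProcessor();
    $proc->importStylesheet($style);

    $result = $proc->transformToXml($source);
    if ($result === false || $result === null) {
        throw new RuntimeException('XSLT transformation failed');
    }
    return $result;
}

// Data-only input, shortened to two fields per record.
$dataXml = <<<'XML'
<report>
  <report_param><title>Stock History</title></report_param>
  <report_records>
    <record><llprod>W2308</llprod><lnum>200</lnum></record>
    <record><llprod>W2307</llprod><lnum>100</lnum></record>
  </report_records>
</report>
XML;

// The stylesheet decides layout: a centered title and one table row per record.
$styleXsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/report">
    <pdfreport pagetype="a4">
      <head>
        <text fontsize="30" align="center"><xsl:value-of select="report_param/title"/></text>
      </head>
      <body>
        <table>
          <xsl:for-each select="report_records/record">
            <tr>
              <td><xsl:value-of select="llprod"/></td>
              <td><xsl:value-of select="lnum"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </pdfreport>
  </xsl:template>
</xsl:stylesheet>
XSL;

$layoutXml = transformXml($dataXml, $styleXsl);
echo $layoutXml;
```

Changing the report's appearance then means editing only the stylesheet string (or file); the data document and the PHP code stay the same.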


Applying report.xsl to report.xml in this way produces a new XML document, pdfreport.xml:

<?xml version="1.0" encoding="gb2312"?>
<pdfreport pagetype="a4" pagesize="25" top="20" bottom="20" left="20" right="20">
  <head>
    <line top="5" bottom="5" size="50%" linetype="single" show="false"/>
    <text fontsize="30" fontlaguage="cn" align="center">Stock Transaction History Table</text>
    <line top="5" bottom="30" size="80%" linetype="double" show="true"/>
    <text fontsize="12" fontlaguage="cn" align="left">unit: m</text>
  </head>
  <body>
    <table>
      <tr><th>Name</th><th>Lot</th><th>Level</th><th>Warehouse</th><th>Digital library</th><th>Quantity</th><th>Date</th></tr>
      <tr><td>W2308</td><td>1234</td><td>a</td><td>01</td><td>0001</td><td>200</td><td>20020609</td></tr>
      <tr><td>W2307</td><td>4321</td><td>a</td><td>01</td><td>0001</td><td>100</td><td>20020609</td></tr>
    </table>
  </body>
  <foot>
    <line top="5" bottom="5" size="50%" linetype="single" show="false"/>
    <text fontsize="12" fontlaguage="cn" align="center">Report date: 20020611</text>
  </foot>
</pdfreport>

After XMLParser parses this XML document, we obtain an object tree that contains all of its information, so its contents are easy to access. The generated PDF report is as follows:

[Figure: the generated PDF report]

The program fragment is as follows:

<?php include("../include/pc_init.inc"); ?>
<?php include("xmlparser.inc");

// Parse the converted document (pdfreport.xml) into an object tree
$xmlobject = getRootNode("pdfreport.xml");

// Get the attrs of root element
$pageSet = $xmlobject->attrs;

// Get the report-head
$head = $xmlobject->nodes[0];

// Code ignored ...

// Draw one <line> node; $node carries the attributes parsed from the XML
function draw_line(&$parent, $node) {
    $line = &pc_create_object($parent, "line");
    $line->pc_set_linestyle($node->attrs["LINETYPE"]);
    $line->pc_set_width($node->attrs["SIZE"]);
    $line->pc_set_alignment("center");
    // Attribute values are strings, so compare with "false"
    if ($node->attrs["SHOW"] == "false") {
        $line->pc_set_linecolor("white");
    }
    $line->pc_set_margin(array("top" => $node->attrs["TOP"], "bottom" => $node->attrs["BOTTOM"], "left" => 0, "right" => 0));
}

function draw_text(&$parent, $text) {
    // Code ignored ...
}

function draw_table(&$parent, $table) {
    // Code ignored ...
}

// Draw every child node of the report head
function addhead(&$parent, $head) {
    for ($i = 0; $i < $head->n; $i++) {
        switch ($head->nodes[$i]->name) {
            case "LINE": draw_line($parent, $head->nodes[$i]); break;
            case "TEXT": draw_text($parent, $head->nodes[$i]); break;
        }
    }
}

// ..

// Create a PDF Document
$PDF = &pc_create_pdf(array("Author" => "cyman", "Title" => "a report example"));

// Create an A4-format page
$Page1 = &pc_create_page($PDF, $pageSet["PAGETYPE"]);

addhead($Page1, $head);

$PDF->pc_draw();

6. Summary

The few months of this graduation project were busy but very rewarding. Working through a practical problem from analysis and research to feasibility study and implementation taught me a great deal. The system is now in use and has produced very satisfactory results: it makes it easy to create attractive, practical reports and documents. However, because time was short and my own skills are limited, the system still has many shortcomings. The most regrettable is that it does not define a common set of XML tags that could describe all kinds of documents (reports, bills, manuals, and so on), together with a program that converts such generic XML documents into PDF, much as a browser renders HTML. With such a tag set, we would not have to define XML tags and write a corresponding conversion program for each type of document, which would greatly improve efficiency.

Although the graduation project has come to an end, I will continue to study this subject in the days ahead.
