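File Types

2021-07-29 22:33:25 Author: Internet

XML stands for Extensible Markup Language.

What XML is used for

XML document structure

1. The XML declaration, when present, must be the first line of the document (it is optional in XML 1.0 but recommended).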

<?xml version="1.0" encoding="UTF-8"?>
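version: the XML version number, 1.0 or 1.1
encoding: the character set; UTF-8 supports Chinese characters

2. There must be exactly one root node.

3. XML tags follow writing rules similar to HTML's, only stricter: every tag must be closed, and tag names are case-sensitive.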
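The structure rules above can be checked mechanically. Below is a minimal sketch, assuming the JDK's built-in JAXP parser rather than any particular third-party library; the class name WellFormedCheck and the helper isWellFormed are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
final class WellFormedCheck {
    /** Returns true if the text parses as a well-formed XML document. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (Exception e) {
            throw new IllegalStateException(e); // parser setup or I/O problem
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<hr><employee/></hr>"));   // true
        System.out.println(isWellFormed("<hr/><hr/>"));             // false: two root nodes
        System.out.println(isWellFormed("<hr><name></Name></hr>")); // false: names are case-sensitive
    }
}
```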

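Handling special characters

When special characters such as >, < or & appear in a tag body, they can break the document structure.

Solution 1: use entity references.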
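XML predefines five entity references: &lt; for <, &gt; for >, &amp; for &, &apos; for ' and &quot; for ". The sketch below shows a parser turning them back into plain characters. It assumes the JDK's built-in JAXP parser; EntityDemo and textOf are names invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
final class EntityDemo {
    /** Parses a small document and returns the text of its root element. */
    static String textOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The entity references keep the markup intact; the parser decodes them.
        System.out.println(textOf("<content>1 &lt; 2 &amp;&amp; 3 &gt; 2</content>"));
        // prints: 1 < 2 && 3 > 2
    }
}
```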

Solution 2: use a CDATA section.

CDATA is text data that the XML parser should not parse as markup.

It starts with "<![CDATA[" and ends with "]]>".

<lesson>
    <content>
        <![CDATA[
            In this lesson we learn how to use the <a> tag in HTML:
            <body>
                <a href="index.html">Home</a>
            </body>
        ]]>
    </content>
</lesson>
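To see that a parser really treats the CDATA body as plain text, here is a small sketch. It assumes the JDK's built-in JAXP parser, and CdataDemo is a made-up class name; the embedded HTML comes back as a string and no a element is created.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
final class CdataDemo {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<content><![CDATA[<a href=\"index.html\">Home</a>]]></content>");
        // The markup inside CDATA is returned as text, not parsed into nodes.
        System.out.println(doc.getDocumentElement().getTextContent());
        // prints: <a href="index.html">Home</a>
        System.out.println(doc.getElementsByTagName("a").getLength()); // prints: 0
    }
}
```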

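Ordered child elements

XML semantic constraints

1. An XML document can be structurally correct (well-formed) and still not be valid.

2. There are two ways to define XML semantic constraints: DTD and XML Schema.

DTD

A Document Type Definition (DTD) is a simple, easy-to-use way of expressing semantic constraints.

Defining nodes in a DTD

The <!ELEMENT> declaration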

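1. Define how many employee child nodes may appear under the hr node.

<!ELEMENT hr (employee)>    exactly one child node

<!ELEMENT hr (employee+)>   at least one child node

<!ELEMENT hr (employee)*>    zero to n child nodes

<!ELEMENT hr (employee)?>   at most one child node

2. The employee node must contain the following four nodes, in this order.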

<!ELEMENT employee (name,age,salary,department)>

3. Define the body of the name tag as text only; #PCDATA (parsed character data) stands for text content.

<!ELEMENT name (#PCDATA)>

Referencing a DTD file from XML

<!DOCTYPE root-node SYSTEM "path-to-dtd-file">

<!DOCTYPE emp SYSTEM "a.dtd">

Exercise:

emp.xml

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE emp SYSTEM "a.dtd">
<emp>
    <employee num="666">
        <name>li</name>
        <age>18</age>
        <salary>8000</salary>
        <department>
            <dname>人事</dname>
            <address>xx大厦</address>
        </department>
    </employee>
    <employee num="777">
        <name>liu</name>
        <age>22</age>
        <salary>6000</salary>
        <department>
            <dname>财务</dname>
            <address>xx大厦</address>
        </department>
    </employee>
    <employee num="888">
        <name>hou</name>
        <age>33</age>
        <salary>1500</salary>
        <department>
            <dname>后勤</dname>
            <address>xx大厦</address>
        </department>
    </employee>

</emp>

a.dtd

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT emp (employee+)>
<!ELEMENT employee (name,age,salary,department)>
<!ATTLIST  employee num CDATA "">
<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT salary (#PCDATA)>
<!ELEMENT department (dname,address)>
<!ELEMENT dname (#PCDATA)>
<!ELEMENT address (#PCDATA)>
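A DTD only takes effect when the parser is asked to validate. The following is a rough sketch of that step, assuming the JDK's built-in JAXP parser. To stay self-contained it embeds a shortened version of a.dtd as an internal subset instead of reading the file; DtdValidation and isValid are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
final class DtdValidation {
    // Shortened a.dtd (name and age only), embedded as an internal subset.
    static final String DOCTYPE =
          "<!DOCTYPE emp [\n"
        + "<!ELEMENT emp (employee+)>\n"
        + "<!ELEMENT employee (name,age)>\n"
        + "<!ATTLIST employee num CDATA \"\">\n"
        + "<!ELEMENT name (#PCDATA)>\n"
        + "<!ELEMENT age (#PCDATA)>\n"
        + "]>\n";

    /** Returns true if the given root element satisfies the DTD above. */
    static boolean isValid(String body) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        final boolean[] valid = {true};
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                valid[0] = false; // validity violations are reported here
            }
        });
        builder.parse(new InputSource(new StringReader(DOCTYPE + body)));
        return valid[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid(
            "<emp><employee num=\"666\"><name>li</name><age>18</age></employee></emp>")); // true
        System.out.println(isValid(
            "<emp><employee><age>18</age><name>li</name></employee></emp>")); // false: wrong order
    }
}
```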

XML Schema

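XML Schema is more complex than DTD and offers more features.

XML Schema provides data types, format restrictions, value ranges, and similar features.

XML Schema is a W3C standard.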

<?xml version="1.0" encoding="UTF-8"?>
<!-- Human resources management system -->
<hr xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:noNamespaceSchemaLocation="hr.xsd">
    <employee no="3309">
        <name>张三</name>
        <age>31</age>
        <salary>4000</salary>
        <department>
            <dname>会计部</dname>
            <address>XX大厦-B103</address>
        </department>
    </employee>
    <employee no="3310">
        <name>李四</name>
        <age>23</age>
        <salary>3000</salary>
        <department>
            <dname>工程部</dname>
            <address>XX大厦-B104</address>
        </department>
    </employee>
</hr>

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
    <element name="hr">
        <!-- complexType marks a complex node; it is required whenever the node contains child nodes -->
        <complexType>
            <sequence>
                <element name="employee" minOccurs="1" maxOccurs="9999">
                    <complexType>
                        <sequence>
                            <element name="name" type="string"></element>
                            <element name="age">
                                <simpleType>
                                    <restriction base="integer">
                                        <minInclusive value="18"></minInclusive>
                                        <maxInclusive value="60"></maxInclusive>
                                    </restriction>
                                </simpleType>
                            </element>
                            <element name="salary" type="integer"></element>
                            <element name="department">
                                <complexType>
                                    <sequence>
                                        <element name="dname" type="string"></element>
                                        <element name="address" type="string"></element>
                                    </sequence>
                                </complexType>
                            </element>
                        </sequence>
                        <attribute name="no" type="string" use="required"></attribute>                    
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>    
</schema>
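As with a DTD, the schema does nothing until a validator applies it. Here is a minimal sketch using the JDK's javax.xml.validation package. The schema is cut down to just the age rule (an integer from 18 to 60) so the example fits in a few lines; XsdValidation and isValid are names invented for this illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
final class XsdValidation {
    // Cut-down schema: only the age element and its 18..60 range.
    static final String XSD =
          "<schema xmlns=\"http://www.w3.org/2001/XMLSchema\">"
        + "<element name=\"age\">"
        + "<simpleType><restriction base=\"integer\">"
        + "<minInclusive value=\"18\"/><maxInclusive value=\"60\"/>"
        + "</restriction></simpleType>"
        + "</element>"
        + "</schema>";

    /** Returns true if the document is valid against the schema above. */
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        try {
            // Without a custom error handler, the first violation throws SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<age>31</age>"));  // true
        System.out.println(isValid("<age>17</age>"));  // false: below minInclusive
        System.out.println(isValid("<age>abc</age>")); // false: not an integer
    }
}
```

This is where XML Schema goes beyond DTD: a DTD could only say that age contains text, not that it must be a number in a given range.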

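Parsing XML

Dom4j

Dom4j is an easy-to-use, open source library for parsing XML. It runs on the Java platform and is known for good performance, rich features, and ease of use.

Dom4j treats an XML file as a Document object.

Dom4j represents each XML tag as an Element object.

Traversing XML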

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

import java.io.File;
import java.util.List;

public class EmpReader {
    public void readXml() {

        String file="E:\\emp.xml";
        File file1=new File(file);
        SAXReader reader=new SAXReader();
        try {
            Document document=reader.read(file1);
            // Get the root node of the XML document
            Element root=document.getRootElement();
            List<Element> employees= root.elements("employee");
            for (Element employee : employees) {
                Element name=employee.element("name");
                String empName=name.getText();
                System.out.println(empName);
                System.out.println(employee.elementText("salary"));
                Element department=employee.element("department");
                System.out.println(department.elementText("dname"));
                System.out.println(department.elementText("address"));
                Attribute att=employee.attribute("num");
                System.out.println(att.getText());

            }
        } catch (DocumentException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        EmpReader empReader=new EmpReader();
        empReader.readXml();
    }
}

Updating XML

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

import java.io.*;
import java.util.List;

public class EmpWrite {
    public void writeXml() {
        String file="E:\\IdeaProjects\\web基础\\com.lr\\src\\emp.xml";
        SAXReader reader=new SAXReader();
        try {
            Document document=reader.read(file);
            // Get the root node of the XML document
            Element root=document.getRootElement();
            Element employee=root.addElement("employee");
            employee.addAttribute("num","555");
            Element name=employee.addElement("name");
            name.setText("zxx");
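            // a.dtd requires name, age, salary, department in that order; the age value is only illustrative
            employee.addElement("age").setText("25");
            employee.addElement("salary").setText("3600");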
            Element department=employee.addElement("department");
            department.addElement("dname").setText("销售");
            department.addElement("address").setText("xxx大厦");
            // try-with-resources closes the writer even if write() throws
            try (Writer writer=new OutputStreamWriter(new FileOutputStream(file),"UTF-8")) {
                document.write(writer);
            }

        } catch (DocumentException | FileNotFoundException | UnsupportedEncodingException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        EmpWrite empWrite=new EmpWrite();
        empWrite.writeXml();
    }
}

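XPath path expressions

XPath is a language for locating data in an XML document.

Knowing XPath can greatly speed up development when extracting data.

Learning XPath essentially comes down to mastering the different forms of expressions.

Basic XPath expressions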

import java.io.File;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.io.SAXReader;

public class XPathTestor {
    public void xpath(String xpathExp){
        String file = "E:\\hr.xml";
        File file1=new File(file);
        SAXReader reader = new SAXReader();
        try {
            Document document = reader.read(file1);

            List<Node> nodes = document.selectNodes(xpathExp);
            for(Node node : nodes){
                Element emp = (Element)node;
                System.out.println(emp.attributeValue("no"));
                System.out.println(emp.elementText("name"));
                System.out.println(emp.elementText("age"));
                System.out.println(emp.elementText("salary"));
                System.out.println("==============================");
            }


        } catch (DocumentException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        XPathTestor testor = new XPathTestor();
//        testor.xpath("/hr/employee");
        testor.xpath("//employee");
        testor.xpath("//employee[salary<4000]");
        testor.xpath("//employee[name='李铁柱']");
        testor.xpath("//employee[@no=3304]");
        testor.xpath("//employee[1]");
        testor.xpath("//employee[last()]");
        testor.xpath("//employee[position()<3]");
        testor.xpath("//employee[3] | //employee[8]");

    }
}
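The same expressions also run without Dom4j. The sketch below uses the JDK's javax.xml.xpath package on a small inline document, so the result of each expression is easy to check by hand. The sample data is hypothetical (shaped like hr.xml, with a third employee added), and XPathSketch and select are made-up names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
final class XPathSketch {
    // Hypothetical sample data, shaped like hr.xml but shortened.
    static final String XML =
          "<hr>"
        + "<employee no=\"3309\"><name>A</name><salary>4000</salary></employee>"
        + "<employee no=\"3310\"><name>B</name><salary>3000</salary></employee>"
        + "<employee no=\"3311\"><name>C</name><salary>5000</salary></employee>"
        + "</hr>";

    /** Returns the "no" attribute of every employee matched by the expression. */
    static List<String> select(String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(((Element) nodes.item(i)).getAttribute("no"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(select("//employee[salary<4000]"));   // [3310]
        System.out.println(select("//employee[@no=3309]"));      // [3309]
        System.out.println(select("//employee[last()]"));        // [3311]
        System.out.println(select("//employee[position()<3]"));  // [3309, 3310]
    }
}
```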

Source: https://www.cnblogs.com/liurui12138/p/15077418.html