
How do I remove extra empty lines from an XML file?

2019-09-30 10:01:52 Author: Internet

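In short: I have many empty lines being generated in an XML file, and I am looking for a way to remove them in order to slim the file down. How can I do that?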

In detail: I currently have this XML file:

<recent>
  <paths>
    <path>path1</path>
    <path>path2</path>
    <path>path3</path>
    <path>path4</path>
  </paths>
</recent>

I use this Java code to delete all the <path> tags and add new ones in their place:

public void savePaths( String recentFilePath ) {
    ArrayList<String> newPaths = getNewRecentPaths();
    Document recentDomObject = getXMLFile( recentFilePath );  // Load the XML file as a DOM document.
    NodeList pathNodes = recentDomObject.getElementsByTagName( "path" );   // Get all <path> nodes.

    //1. Remove all old path nodes :
        for ( int i = pathNodes.getLength() - 1; i >= 0; i-- ) { 
            Element pathNode = (Element)pathNodes.item( i );
            pathNode.getParentNode().removeChild( pathNode );
        }

    //2. Save all new paths :
        Element pathsElement = (Element)recentDomObject.getElementsByTagName( "paths" ).item( 0 );   // Get the first <paths> node.

        for( String newPath: newPaths ) {
            Element newPathElement = recentDomObject.createElement( "path" );
            newPathElement.setTextContent( newPath );
            pathsElement.appendChild( newPathElement );
        }

    //3. Save the XML changes :
        saveXMLFile( recentFilePath, recentDomObject ); 
}

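After running this method several times, I get an XML file with the right content, but with many empty lines after the opening "paths" tag and before the first "path" tag, like this: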

<recent>
  <paths>





    <path>path5</path>
    <path>path6</path>
    <path>path7</path>
  </paths>
</recent>

Does anyone know how to fix this?

EDIT: Adding the getXMLFile(...) and saveXMLFile(...) code.

public Document getXMLFile( String filePath ) { 
    File xmlFile = new File( filePath );

    try {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document domObject = db.parse( xmlFile );
        domObject.getDocumentElement().normalize();

        return domObject;
    } catch (Exception e) {
        e.printStackTrace();
    }

    return null;
}

public void saveXMLFile( String filePath, Document domObject ) {
    File xmlOutputFile = null;
    FileOutputStream fos = null;

    try {
        xmlOutputFile = new File( filePath );
        fos = new FileOutputStream( xmlOutputFile );
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty( OutputKeys.INDENT, "yes" );
        transformer.setOutputProperty( "{http://xml.apache.org/xslt}indent-amount", "2" );
        DOMSource xmlSource = new DOMSource( domObject );
        StreamResult xmlResult = new StreamResult( fos );
        transformer.transform( xmlSource, xmlResult );  // Save the XML file.
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (TransformerConfigurationException e) {
        e.printStackTrace();
    } catch (TransformerException e) {
        e.printStackTrace();
    } finally {
        if (fos != null)
            try {
                fos.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
    }
}

Solution:

First, an explanation of why this happens, which might be slightly off since you didn't include the code used to load the XML file into a DOM object.

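When you read an XML document from a file, the whitespace between tags actually constitutes valid DOM nodes, according to the DOM specification. The XML parser therefore treats each such whitespace sequence as a DOM node (of type TEXT). Removing the <path> elements leaves those whitespace nodes behind, so each load, modify, and save cycle most likely carries the old whitespace forward while the indenting transformer adds new line breaks, which would explain the growing run of empty lines.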
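
To make the explanation above concrete, here is a minimal sketch that parses a small document shaped like the one in the question and counts the whitespace-only TEXT children of <paths>, before and after removing every <path> element. The class name WhitespaceNodesDemo, the SAMPLE constant, and the helper method names are made up for illustration and are not part of the original code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the code in the question.
class WhitespaceNodesDemo {

    // Assumed sample input, shaped like the file in the question.
    static final String SAMPLE =
        "<recent>\n  <paths>\n    <path>path1</path>\n    <path>path2</path>\n  </paths>\n</recent>";

    // Counts the whitespace-only TEXT children of a node.
    static int countBlankTextChildren(Node parent) {
        NodeList children = parent.getChildNodes();
        int blank = 0;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE && child.getNodeValue().trim().isEmpty()) {
                blank++;
            }
        }
        return blank;
    }

    // Returns { blank TEXT children of <paths> before removal, the same count after removing every <path> }.
    static int[] blankCountsBeforeAndAfterRemoval(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node paths = doc.getElementsByTagName("paths").item(0);
            int before = countBlankTextChildren(paths);

            // Same backwards removal loop as in the question.
            NodeList pathNodes = doc.getElementsByTagName("path");
            for (int i = pathNodes.getLength() - 1; i >= 0; i--) {
                Node pathNode = pathNodes.item(i);
                pathNode.getParentNode().removeChild(pathNode);
            }
            return new int[] { before, countBlankTextChildren(paths) };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = blankCountsBeforeAndAfterRemoval(SAMPLE);
        System.out.println("blank text nodes before: " + counts[0] + ", after: " + counts[1]);
    }
}
```

With this sample the two counts should be equal: removeChild() takes out the elements but leaves the surrounding whitespace nodes in place, and those are what end up in the saved file as empty lines.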

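To get rid of it, I can think of three approaches:

1. Associate the XML with a schema, and then use setValidating(true) together with setIgnoringElementContentWhitespace(true) on the DocumentBuilderFactory.

   (Note: setIgnoringElementContentWhitespace only takes effect when the parser is in validating mode, which is why setValidating(true) is required. setValidating(true) essentially controls DTD validation, so the schema in question would normally be a DTD.)
2. Write an XSL stylesheet that processes all nodes and filters out the whitespace-only TEXT nodes.
3. Do it in Java code: use XPath to find all whitespace-only TEXT nodes, iterate over them, and remove each one from its parent (using getParentNode().removeChild()). Something like this would do (doc being your DOM document object):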

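import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Select every whitespace-only text node; evaluate() can throw XPathExpressionException.
XPath xp = XPathFactory.newInstance().newXPath();
NodeList nl = (NodeList) xp.evaluate("//text()[normalize-space(.)='']", doc, XPathConstants.NODESET);

for (int i = 0; i < nl.getLength(); ++i) {
    Node node = nl.item(i);
    node.getParentNode().removeChild(node);
}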
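
For the XSL approach (the second option), a minimal sketch could look like the following. It assumes an identity stylesheet with xsl:strip-space and works on strings rather than on the DOM object from the question. The class name StripSpaceDemo, the STRIP_SPACE_XSL constant, and the stripBlankLines method are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; not part of the code in the question.
class StripSpaceDemo {

    // Identity transform plus xsl:strip-space, which drops whitespace-only text nodes from the input.
    static final String STRIP_SPACE_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:strip-space elements='*'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the XML text through the stylesheet and returns the re-indented result.
    static String stripBlankLines(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STRIP_SPACE_XSL)));
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<recent>\n  <paths>\n\n\n\n    <path>path5</path>\n    <path>path6</path>\n  </paths>\n</recent>";
        System.out.println(stripBlankLines(xml));
    }
}
```

In saveXMLFile the same stylesheet could presumably be passed to newTransformer(...) in place of the no-argument call, but whether whitespace stripping is honored for a DOMSource may depend on the XSLT implementation, so that is worth checking before relying on it.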

Tags: code-cleanup, java, xml, carriage-return
Source: https://codeday.me/bug/20190930/1835543.html