python - How do I get around the reserved parent attribute in BeautifulSoup for a tag that is actually named parent?
OK, this is an interesting one. Here is the XML:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<parent>
<groupId>com.parent</groupId>
<artifactId>parent</artifactId>
<version>1.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>
<build>
<sourceDirectory>src</sourceDirectory>
</build>
</project>
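I want to reach the node that is actually named <parent> using BeautifulSoup's simple hierarchical (dot) notation. But parent is a reserved attribute in this API: on every tag it already refers to the enclosing node in the tree.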
from bs4 import BeautifulSoup  # assuming Beautiful Soup 4

with open(pom) as pomHandle:
    soup = BeautifulSoup(pomHandle)

# this returns the proper build node
buildNode = soup.project.build
# this does not return the <parent> element but the tree parent of the project node
# (which is the whole doc) because 'parent' is reserved
parentNode = soup.project.parent
How can I get around this limitation?
Solution:
You can use find() instead:
soup.project.find('parent')
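As a minimal sketch of that call, the snippet below parses a trimmed copy of the POM from the question. It assumes Beautiful Soup 4 and the standard-library "html.parser" backend; the names POM, tree_parent, and parent_tag are made up for illustration, and recursive=False is an optional extra that limits the search to direct children of <project>.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the POM from the question (namespace attributes omitted).
POM = """<project>
  <parent>
    <artifactId>parent</artifactId>
    <version>1.0-SNAPSHOT</version>
  </parent>
  <build>
    <sourceDirectory>src</sourceDirectory>
  </build>
</project>"""

soup = BeautifulSoup(POM, "html.parser")
project = soup.project

# Dot notation: 'parent' is a real attribute, so this is the enclosing
# node in the tree (here the BeautifulSoup document object itself).
tree_parent = project.parent

# find() looks for a child element whose tag name is 'parent'.
parent_tag = project.find("parent", recursive=False)

# From there, the usual navigation works again.
version = parent_tag.find("version").get_text()
```

The same approach should apply to any element whose name collides with an existing Tag attribute, such as name or contents.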
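find() does essentially the same thing as the dot notation, because BeautifulSoup uses find() under the hood in the __getattr__() method of the Tag class. Python only calls __getattr__() when normal attribute lookup fails, and parent is a real attribute on every tag, so soup.project.parent never reaches that fallback.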
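The mechanism can be illustrated with a toy model. ToyTag below is a hypothetical stand-in written for this example, not Beautiful Soup's actual Tag class; it only mirrors the idea of a __getattr__() fallback that delegates to find(), and its find() is simplified to search direct children only.

```python
class ToyTag:
    """Hypothetical stand-in for a tree node; not Beautiful Soup's Tag."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # real instance attribute: the enclosing node
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def find(self, name):
        """Return the first direct child called `name`, or None."""
        for child in self.children:
            if child.name == name:
                return child
        return None

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup
        # fails, so it never runs for 'parent', 'name', or 'children'.
        if name.startswith("__"):
            raise AttributeError(name)
        return self.find(name)


doc = ToyTag("[document]")
project = ToyTag("project", parent=doc)
parent_tag = ToyTag("parent", parent=project)
build_tag = ToyTag("build", parent=project)

via_dot_build = project.build            # falls through to find("build")
via_dot_parent = project.parent          # real attribute: the document node
via_find_parent = project.find("parent") # the child that is named 'parent'
```

Here via_dot_build is the build child, via_dot_parent is the document node, and only via_find_parent is the child element named parent, which matches the behaviour described in the question.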
Hope that helps.
Tags: dom, beautifulsoup, xml, python, xml-parsing Source: https://codeday.me/bug/20191122/2058184.html