Extracting data from JavaScript
Author: from the internet
<script language="JavaScript" type="text/javascript+gk-onload">
    SKART = (SKART) ? SKART : {};
    SKART.analytics = SKART.analytics || {};
    SKART.analytics["category"] = "television";
    SKART.analytics["vertical"] = "television";
    SKART.analytics["supercategory"] = "homeentertainmentlarge";
    SKART.analytics["subcategory"] = "television";
</script>
You can use the Selector's built-in support for regular expressions through re():
pattern = r'SKART\.analytics\["category"\] = "(\w+)";'
response.xpath('//script[@type="text/javascript+gk-onload"]').re(pattern)
Demo (using scrapy shell):
$ scrapy shell index.html
In [1]: pattern = r'SKART\.analytics\["category"\] = "(\w+)";'
In [2]: response.xpath('//script[@type="text/javascript+gk-onload"]').re(pattern)
Out[2]: [u'television']
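If you need every SKART.analytics entry rather than just "category", the same idea can be extended with a second capture group for the key. The following is only a rough sketch using Python's standard `re` module on the script text directly, not Scrapy's `Selector.re()`; the names `SCRIPT_TEXT`, `PAIR_PATTERN` and `extract_analytics` are made up for illustration, and the pattern assumes the values contain only word characters, as they do in the example page.

```python
import re

# Script body copied from the example page above (an assumption: in a real
# spider this text would come from the <script> element you selected).
SCRIPT_TEXT = (
    'SKART = (SKART) ? SKART : {}; '
    'SKART.analytics = SKART.analytics || {}; '
    'SKART.analytics["category"] = "television"; '
    'SKART.analytics["vertical"] = "television"; '
    'SKART.analytics["supercategory"] = "homeentertainmentlarge"; '
    'SKART.analytics["subcategory"] = "television";'
)

# Hypothetical pattern: group 1 is the key inside the brackets,
# group 2 is the quoted value. Values are assumed to match \w+.
PAIR_PATTERN = re.compile(r'SKART\.analytics\["(\w+)"\]\s*=\s*"(\w+)";')


def extract_analytics(script_text):
    """Return a dict of every SKART.analytics["key"] = "value" assignment."""
    return dict(PAIR_PATTERN.findall(script_text))


analytics = extract_analytics(SCRIPT_TEXT)
print(analytics["category"])  # television
```

The initialisation line `SKART.analytics = SKART.analytics || {};` is skipped because it has no bracketed key, so only the four assignments end up in the dict.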
https://stackoverflow.com/questions/29163395/scrapy-and-xpath-to-extract-data-from-javascript-code
Tags: category, television, extraction, re, pattern, js, analytics, SKART, data
Source: https://www.cnblogs.com/bamboozone/p/10411704.html