Ziroom: recognizing listing price images
Author: Internet
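Testing shows that the URLs of the listing price images repeat across listings.
Most image-recognition libraries and APIs are slow, and many of them charge fees, so they are a poor fit for our programs.
Testing turned up only 20 distinct price images, so we can enumerate them once and use a lookup table keyed by image URL to read a listing's price quickly. Each image holds the ten digits in a fixed shuffled order, and the page shows one digit of the price by shifting the image horizontally by a multiple of 31.24px.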
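To make the lookup concrete, here is a minimal sketch of the decoding step on its own, without any page parsing. It assumes each digit cell in the image is 31.24px wide (the step between the offsets used in this post's code) and that the image URL and offsets have already been extracted. The names `decode_price`, `SPRITE_DIGITS`, and `CELL_WIDTH_PX` are hypothetical and are not part of the original code.

```python
from typing import Dict, List, Optional

CELL_WIDTH_PX = 31.24  # assumed width of one digit cell in the image

# One entry copied from the table in this post: image URL -> digit order.
SPRITE_DIGITS: Dict[str, str] = {
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/"
    "73ac03bb4d5857539790bde4d9301946.png": "7190864523",
}


def decode_price(sprite_url: str, offsets_px: List[float],
                 table: Dict[str, str] = SPRITE_DIGITS) -> Optional[str]:
    """Return the price as a digit string, or None if the image is unknown."""
    digits_in_sprite = table.get(sprite_url)
    if digits_in_sprite is None:
        return None  # unseen image: the table needs a new entry
    price = ""
    for offset in offsets_px:
        index = round(abs(offset) / CELL_WIDTH_PX)  # which cell is showing
        price += digits_in_sprite[index]
    return price


url = ("http://static8.ziroom.com/phoenix/pc/images/2019/price/"
       "73ac03bb4d5857539790bde4d9301946.png")
print(decode_price(url, [0, -124.96, -93.72, -93.72]))  # 7800
```

Dividing the offset by the cell width avoids matching offset strings exactly, which may be more tolerant if the page formats the numbers differently. Returning None for an unknown URL makes it easy to notice when the site adds a new image.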
Here is the code:
import re

# Runs inside a Scrapy callback, where `response` is the listing page response.

# Maps each price image URL to the order of the ten digits drawn in that image.
rand_num_dic = {
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/73ac03bb4d5857539790bde4d9301946.png": "7190864523",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/1b68fa980af5e85b0f545fccfe2f8af1.png": "8916702453",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/de345d4e39fa7325898a8fd858addbb8.png": "7263840195",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/939205287b8e01882b89273e789a77c5.png": "8015739624",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/a68621a4bca79938c464d8d728644642.png": "7034615982",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/6f8787069ac0a69b36c8cf13aacb016b.png": "6197450832",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/4eb5ebda7cc7c3214aebde816b10d204.png": "9570863124",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/486ff52ed774dbecf6f24855851e3704.png": "4780169253",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/a822d494f1e8421a2fb2ec5e6450a650.png": "3165849720",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/19003aac664523e53cc502b54a50d2b6.png": "4928730651",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/8e7a6d05db4a1eb58ff3c26619f40041.png": "3871290645",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/234a22e00c646d0a2c20eccde1bbb779.png": "1205837649",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/7995074a73302d345088229b960929e9.png": "0742138659",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/7ce54f64c5c0a425872683e3d1df36f4.png": "5137689402",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/bdf89da0338b19fbf594c599b177721c.png": "3164795280",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/93959ce492a74b6617ba8d4e5e195a1d.png": "5430879621",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/477571844175c1058ece4cee45f5c4b3.png": "2158097436",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/eb0d3275f3c698d1ac304af838d8bbf0.png": "3650489217",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/5c6750e29a7aae17288dcadadb5e33b1.png": "4593162870",
    "http://static8.ziroom.com/phoenix/pc/images/2019/price/b2451cc91e265db2a572ae750e8c15bd.png": "9162853470",
}

# Background offsets of the ten digit cells; the list index is the cell position.
img_px_list = ['-0px', '-31.24px', '-62.48px', '-93.72px', '-124.96px',
               '-156.2px', '-187.44px', '-218.68px', '-249.92px', '-281.16px']

# Pull the price image URL out of the first <i> style attribute.
try:
    prices_url = response.xpath("//div[@class='Z_price']/i[1]/@style").get()
    pattern1 = re.compile(r'//static8.*?\.png')
    prices_url = "http:" + pattern1.search(prices_url).group(0)
except (TypeError, AttributeError):
    # No <i> element, or its style carries no image URL.
    prices_url = ""

price = ''
picture = rand_num_dic.get(prices_url, "")
if picture:
    # Each <i> shows one digit; its offset tells which cell of the image is visible.
    pattern = re.compile(r'-\d+(?:\.\d+)?px')
    prices_lable = response.xpath("//div[@class='Z_price']/i/@style").extract()
    for style in prices_lable:
        num_1 = pattern.search(style).group(0)
        num_1 = img_px_list.index(num_1)
        price += picture[num_1]
else:
    # No known price image: fall back to the plain-text price.
    price = response.xpath('//span[@class="price single"]/text()').extract()[0].strip()
Source: https://blog.csdn.net/weixin_44145864/article/details/118724223