Fetching web page content with Selenium in Python to get around anti-scraping measures
Author: Internet
Install the selenium package, then confirm the installation:
pip install selenium
pip show selenium
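The check that pip show performs can also be done from Python, which is convenient at the top of a script. A minimal sketch, assuming Python 3.8 or later (importlib.metadata is part of the standard library from that version); the helper name installed_version is made up for illustration:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("selenium")
    if version is None:
        print("selenium is not installed; run: pip install selenium")
    else:
        print("selenium", version)
```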
Browser download page:
Chrome browser and Chrome extension downloads (chromedownloads.net)
Add the Python installation directory to the PATH environment variable: C:\Program Files\Python38
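Whether a PATH change took effect can be checked from Python before launching the browser. The sketch below assumes the driver executable is named chromedriver and sits in a directory on PATH (the text only gives the directory, so that part is a guess); find_on_path is a hypothetical helper. Selenium 4.6 and later ship Selenium Manager, which can fetch a matching driver on its own, so a missing chromedriver is not necessarily an error there.

```python
import shutil


def find_on_path(names):
    """Map each executable name to its full path, or None if PATH does not contain it."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, location in find_on_path(["python", "chromedriver"]).items():
        print(name, "->", location or "not found on PATH")
```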
Complete code:
import time

from bs4 import BeautifulSoup
from selenium.webdriver import Chrome, ChromeOptions

opt = ChromeOptions()  # create the Chrome options object
opt.add_argument('--headless')  # run Chrome headless (no visible window); works on Windows and Linux
driver = Chrome(options=opt)
# driver = Chrome()  # use this line instead to watch the browser window
driver.get('http://emweb.eastmoney.com/PC_HSF10/OperationsRequired/Index?type=web&code=SH601600')
time.sleep(2)  # give the page's JavaScript time to render before reading the source
source = driver.page_source
html = BeautifulSoup(source, 'html.parser')
# Selenium 4 removed the find_element_by_* helpers; the equivalents
# (with "from selenium.webdriver.common.by import By") are:
# driver.find_element(By.ID, ...).send_keys(...)
# driver.find_element(By.NAME, ...)
# driver.find_elements(By.CLASS_NAME, ...)
# Save the rendered page; an explicit encoding avoids write errors on Windows for Chinese text
with open('rrBand.html', 'w', encoding='utf-8') as f:
    f.write(source)
# print(html)
# Collect every <div class="sckrox"> from the rendered page
div_list = html.find_all('div', attrs={'class': 'sckrox'})
print(div_list)
driver.quit()
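The fixed time.sleep(2) in the script can be too short on a slow connection and wasted time on a fast one. One alternative is to poll until the expected content shows up. A minimal sketch using only the standard library; wait_until is a hypothetical helper, not part of Selenium (Selenium provides its own WebDriverWait class for this purpose), and the usage comment assumes the 'sckrox' class name only appears once the data has rendered:

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or timeout seconds pass.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage with the driver from the script (not executed here):
# if wait_until(lambda: 'sckrox' in driver.page_source, timeout=10):
#     source = driver.page_source
```

Returning False on timeout rather than raising leaves it to the caller to decide whether to retry, save the partial page, or give up.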
Tags: python, selenium, driver, Chrome, html, anti-scraping, import, find
Source: https://blog.csdn.net/u010719791/article/details/120224310