Fetching web page content with Selenium in Python to get around anti-scraping systems

Author: the Internet (reposted)

Install the selenium package and confirm the installation:

pip install selenium
pip show selenium
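The same check that `pip show selenium` performs can also be done from Python. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of the original post.

```python
from importlib import metadata


def installed_version(package="selenium"):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report whether selenium is visible to the interpreter running this script.
version = installed_version()
if version is None:
    print("selenium is not installed for this interpreter")
else:
    print(f"selenium {version} is installed")
```

This is useful when several Python installations coexist, because it reports on the interpreter that actually runs the script rather than on whichever `pip` happens to be first on the PATH.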

Browser download page:

Chrome browser and Chrome extension downloads (chromedownloads.net)

Add the following directory to the PATH environment variable: C:\Program Files\Python38
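To confirm that the PATH change took effect, a short check like the one below may help. It is only a sketch: the helper `on_path` is hypothetical, and looking up `chromedriver` is an assumption, since the post does not say which executables it expects to find in that directory.

```python
import os
import shutil


def on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries of the PATH value."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = os.path.normcase(os.path.normpath(directory))
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return any(os.path.normcase(os.path.normpath(entry)) == target for entry in entries)


# The directory named in the post; adjust it if Python is installed elsewhere.
print("Python38 on PATH:", on_path(r"C:\Program Files\Python38"))
# shutil.which returns the full path of an executable found via PATH, or None.
print("python:", shutil.which("python"))
print("chromedriver (assumed to be needed):", shutil.which("chromedriver"))
```

Open a new terminal before running the check, because an already open terminal keeps the old PATH value.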

Full code:


import time

from bs4 import BeautifulSoup
from selenium.webdriver import Chrome, ChromeOptions

opt = ChromeOptions()            # create the Chrome options object
opt.add_argument('--headless')   # run Chrome headless (no visible window); works on Windows and Linux
driver = Chrome(options=opt)
# driver = Chrome()              # use this instead to watch the browser window
driver.get('http://emweb.eastmoney.com/PC_HSF10/OperationsRequired/Index?type=web&code=SH601600')
time.sleep(2)                    # give the page's JavaScript time to render before reading it
source = driver.page_source
html = BeautifulSoup(source, 'html.parser')
# Element lookups in Selenium 4 (needs: from selenium.webdriver.common.by import By):
# driver.find_element(By.ID, ...).send_keys(...)
# driver.find_element(By.NAME, ...)
# driver.find_elements(By.CLASS_NAME, ...)

# save the rendered page for offline inspection
with open('rrBand.html', 'w', encoding='utf-8') as f:
    f.write(source)
# print(html)
blocks = html.find_all('div', attrs={'class': 'sckrox'})
print(blocks)
driver.quit()

# str=['中国铝业','中国核电','中国']
# print(str[1])
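A fixed `time.sleep(2)` can be too short on a slow connection and wasteful on a fast one. As a hedged alternative, the sketch below uses Selenium's explicit wait (`WebDriverWait`) to block until the target element exists. The function name `fetch_rendered_html`, the 10-second timeout, and the CSS selector derived from the `sckrox` class in the code above are assumptions for illustration, not part of the original post.

```python
URL = 'http://emweb.eastmoney.com/PC_HSF10/OperationsRequired/Index?type=web&code=SH601600'
CSS_SELECTOR = 'div.sckrox'  # assumed from the class name used in the code above


def fetch_rendered_html(url=URL, selector=CSS_SELECTOR, timeout=10):
    """Load `url` in headless Chrome and return the page source once `selector` appears."""
    # Imported here so that defining the function does not require a browser setup.
    from selenium.webdriver import Chrome, ChromeOptions
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    opt = ChromeOptions()
    opt.add_argument('--headless')
    driver = Chrome(options=opt)
    try:
        driver.get(url)
        # Raises selenium.common.exceptions.TimeoutException if the element never shows up.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, selector))
        )
        return driver.page_source
    finally:
        # Always close the browser, even when the wait times out.
        driver.quit()
```

Calling `fetch_rendered_html()` returns a string that can be passed to `BeautifulSoup(..., 'html.parser')` exactly as in the full code. If the class name on the live page differs, the wait raises a timeout instead of silently returning an incomplete page.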

 

Source: https://blog.csdn.net/u010719791/article/details/120224310