Scraping Web Images with bs4
I have recently been learning web scraping and used bs4 to batch-download images. Because the child links reached through each parent link follow different naming formats, I have not yet found a way to download the images behind every child link of every parent link; for now the script only downloads the image from the first child page behind each parent link (a rough sketch of one possible extension follows the script below).
import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup
url = "https://desk.zol.com.cn/"
resp = requests.get(url)
# print(resp.text)
page = BeautifulSoup(resp.text, "html.parser")
# Every <a> inside the "pic-list2 clearfix" list points to a parent (group) page
alist = page.find("ul", class_="pic-list2 clearfix").find_all("a")
# print(alist)
# Make sure the output directory exists before writing files into it
os.makedirs("下载的图片", exist_ok=True)

for a in alist:
    # print(a.get("href"))
    # urljoin handles hrefs that start with "/" and avoids doubled slashes
    href = urljoin(url, a.get("href"))
    print(href)
    # Fetch the child page's source code
    child_resp = requests.get(href)
    child_page = BeautifulSoup(child_resp.text, "html.parser")
    # Grab the image download URL from the <img id="bigImg"> tag
    image = child_page.find("img", id="bigImg")
    if image is None:
        # Skip child pages that do not contain a big image
        continue
    image_src = image.get("src")
    # Download the image and write it to a file named after the last part of the URL
    image_resp = requests.get(image_src)
    image_name = "下载的图片/" + image_src.split("/")[-1]
    with open(image_name, mode="wb") as f:
        f.write(image_resp.content)
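As noted above, this only grabs the first wallpaper behind each parent link. Below is a minimal, unverified sketch of how the same logic could be reused to walk every wallpaper page inside a group. It assumes, purely as an example and not something confirmed by the original article or the site, that each group page lists its sibling pages in a <ul id="showImg"> element; the selector would need to be adjusted to whatever the site actually uses.

from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def download_big_image(page_url, out_dir="下载的图片"):
    # Same download logic as the script above: fetch the page,
    # find <img id="bigImg"> and save its file under out_dir
    resp = requests.get(page_url)
    soup = BeautifulSoup(resp.text, "html.parser")
    image = soup.find("img", id="bigImg")
    if image is None:
        return
    image_src = image.get("src")
    image_resp = requests.get(image_src)
    with open(out_dir + "/" + image_src.split("/")[-1], mode="wb") as f:
        f.write(image_resp.content)

def download_group(group_url):
    # Hypothetical selector: assume the group page links its sibling
    # wallpaper pages inside <ul id="showImg"> (an assumption, not verified)
    resp = requests.get(group_url)
    soup = BeautifulSoup(resp.text, "html.parser")
    thumb_list = soup.find("ul", id="showImg")
    if thumb_list is None:
        # Fall back to the behaviour of the original script
        download_big_image(group_url)
        return
    for a in thumb_list.find_all("a"):
        sub_href = a.get("href")
        if sub_href:
            download_big_image(urljoin(group_url, sub_href))

If that assumption holds, calling download_group(href) in place of the child-page block inside the main loop would download every wallpaper in a group instead of only the first one.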
Source: https://blog.csdn.net/weixin_43496049/article/details/121052126