Scraping Web Images with bs4
I have recently been learning web scraping and used bs4 to batch-download images. Because the child links reached through each parent link follow different naming formats, I have not yet found a way to download the images behind every child link of every parent link; for now the script only downloads the image from the first child page behind each parent link (a rough sketch of one possible extension follows the script below).
import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup
url = "https://desk.zol.com.cn/"
resp = requests.get(url)
# print(resp.text)
page = BeautifulSoup(resp.text, "html.parser")
# Every <a> inside the "pic-list2 clearfix" list points to a parent (group) page
alist = page.find("ul", class_="pic-list2 clearfix").find_all("a")
# print(alist)
# Make sure the output directory exists before writing files into it
os.makedirs("下载的图片", exist_ok=True)

for a in alist:
    # print(a.get("href"))
    # urljoin handles hrefs that start with "/" and avoids doubled slashes
    href = urljoin(url, a.get("href"))
    print(href)
    # Fetch the child page's source code
    child_resp = requests.get(href)
    child_page = BeautifulSoup(child_resp.text, "html.parser")
    # Grab the image download URL from the <img id="bigImg"> tag
    image = child_page.find("img", id="bigImg")
    if image is None:
        # Skip child pages that do not contain a big image
        continue
    image_src = image.get("src")
    # Download the image and write it to a file named after the last part of the URL
    image_resp = requests.get(image_src)
    image_name = "下载的图片/" + image_src.split("/")[-1]
    with open(image_name, mode="wb") as f:
        f.write(image_resp.content)
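As noted above, this only grabs the first wallpaper behind each parent link. Below is a minimal, unverified sketch of how the same logic could be reused to walk every wallpaper page inside a group. It assumes, purely as an example and not something confirmed by the original article or the site, that each group page lists its sibling pages in a <ul id="showImg"> element; the selector would need to be adjusted to whatever the site actually uses.

from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def download_big_image(page_url, out_dir="下载的图片"):
    # Same download logic as the script above: fetch the page,
    # find <img id="bigImg"> and save its file under out_dir
    resp = requests.get(page_url)
    soup = BeautifulSoup(resp.text, "html.parser")
    image = soup.find("img", id="bigImg")
    if image is None:
        return
    image_src = image.get("src")
    image_resp = requests.get(image_src)
    with open(out_dir + "/" + image_src.split("/")[-1], mode="wb") as f:
        f.write(image_resp.content)

def download_group(group_url):
    # Hypothetical selector: assume the group page links its sibling
    # wallpaper pages inside <ul id="showImg"> (an assumption, not verified)
    resp = requests.get(group_url)
    soup = BeautifulSoup(resp.text, "html.parser")
    thumb_list = soup.find("ul", id="showImg")
    if thumb_list is None:
        # Fall back to the behaviour of the original script
        download_big_image(group_url)
        return
    for a in thumb_list.find_all("a"):
        sub_href = a.get("href")
        if sub_href:
            download_big_image(urljoin(group_url, sub_href))

If that assumption holds, calling download_group(href) in place of the child-page block inside the main loop would download every wallpaper in a group instead of only the first one.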
Source: https://blog.csdn.net/weixin_43496049/article/details/121052126