
Web scraping: scraping wallpapers for practice

2022-02-02 23:01:59 Author: Internet

Scraping some wallpapers for practice

I have just started learning web scraping, so I tried it on a site with very nice images: https://wallhaven.cc/hot

It is a good fit for beginners. Feedback and discussion are welcome.

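import os
import time

import requests
from bs4 import BeautifulSoup

# Target page URL
url = "https://wallhaven.cc/hot"

# Create the output directory if it does not exist yet
os.makedirs("wallpaper", exist_ok=True)

# Fetch the listing page
resp = requests.get(url, timeout=10)
resp.encoding = "utf-8"

# Parse the listing page and collect the preview links
bsobj = BeautifulSoup(resp.text, "html.parser")
imglist = bsobj.find("section", attrs={"class": "thumb-listing-page"}).find_all("a", attrs={"class": "preview"})
# print(imglist[:4])
for img in imglist:
    # Read the detail-page link from the href attribute
    # instead of slicing the tag's string form
    child_url = img.get("href")
    # print(child_url)
    child_resp = requests.get(child_url, timeout=10)
    child_bsobj = BeautifulSoup(child_resp.text, "html.parser")
    child_resp.close()
    # The full-size image is the <img id="wallpaper"> element on the detail page
    before_src = child_bsobj.find("img", attrs={"id": "wallpaper"})
    if before_src is None:
        # Skip detail pages where the image element is missing
        time.sleep(1)
        continue
    # print(before_src.get("src"))
    src = before_src.get("src")
    src_file = requests.get(src, timeout=10)

    # Use the last path segment of the image URL as the file name
    img_name = src.split("/")[-1]
    with open("wallpaper/" + img_name, mode="wb") as f:
        f.write(src_file.content)
    print("finished", img_name)
    # Pause between requests to avoid hammering the site
    time.sleep(1)

print("all finished!")
# print(resp)

resp.close()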
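
The two fragile steps in the script, pulling the detail-page link out of each preview tag and turning the image URL into a local file name, can be tried offline without touching the site. Below is a minimal sketch using only the standard library (`html.parser` instead of BeautifulSoup, swapped in here so the snippet has no third-party dependency). The names `PreviewLinkParser`, `extract_preview_links` and `target_path` are hypothetical helpers, and the sample markup and example.com URLs are made up for illustration; the real page structure may differ.

```python
from html.parser import HTMLParser
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


class PreviewLinkParser(HTMLParser):
    """Collect href values of <a> tags whose class list contains "preview"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "a" and "preview" in classes and attr_map.get("href"):
            self.links.append(attr_map["href"])


def extract_preview_links(html):
    """Return the detail-page links found in a listing page's HTML."""
    parser = PreviewLinkParser()
    parser.feed(html)
    return parser.links


def target_path(src, out_dir="wallpaper"):
    """Build the local path for an image URL, ignoring any query string."""
    name = PurePosixPath(urlparse(src).path).name
    return Path(out_dir) / name


# Made-up sample markup (assumption), not copied from the real site
sample_html = """
<section class="thumb-listing-page">
  <a class="preview" href="https://example.com/w/abc123" target="_blank"></a>
  <a class="other" href="https://example.com/ignored"></a>
</section>
"""

links = extract_preview_links(sample_html)
path = target_path("https://example.com/full/ab/wallpaper-abc123.jpg?download=1")
print(links)
print(path)
```

Reading the `href` attribute directly avoids depending on where `" target` happens to appear in the tag's text, and parsing the URL before taking the last path segment keeps a query string such as `?download=1` out of the saved file name.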

Source: https://blog.csdn.net/idl1ng/article/details/122772910