
Scraping Specific Content from a Web Page with Java

2022-03-03 09:33:25 Author: Internet

Scraping text from a web page:

import org.jsoup.Jsoup;

import java.io.IOException;

public class Crawling {


    // Fetch the page and print the text of each element with class "list-item".
    public static void Test() throws IOException {
        Jsoup.connect("https://soccer.hupu.com/").get().body()
                .getElementsByClass("list-item") // class="list-item-title"
                .forEach(e -> System.out.println(e.text()));
    }

    public static void main(String[] args) {
        try {
            Test();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

}

Scraping image URLs from a web page:

import org.jsoup.Jsoup;

import java.io.IOException;

public class Crawling {

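    // Fetch the page and print the src attribute of each element with class "list-item-img".
    public static void Test() throws IOException {
        Jsoup.connect("https://soccer.hupu.com/").get().body()
                .getElementsByClass("list-item-img")
                .forEach(e -> System.out.println(e.attr("src"))); // the src attribute holds the image URL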
    }

    public static void main(String[] args) {
        try {
            Test();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

}
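
The value printed by e.attr("src") is whatever the page author wrote, so it may be root-relative ("/img/a.png") or protocol-relative ("//host/pic.jpg") rather than a full URL. The following is a minimal sketch, using only java.net.URI from the standard library, of how such values could be resolved against the page URL. The class name ImageUrlResolver, the resolve helper, and the sample src values are hypothetical and do not come from the original post. Jsoup's own Element.absUrl("src") performs a similar resolution when the document was fetched with a base URI.

```java
import java.net.URI;
import java.util.Optional;

public class ImageUrlResolver {

    // Hypothetical helper (not part of Jsoup or the original post):
    // resolves a raw src value against the URL of the page it was found on.
    public static Optional<String> resolve(String pageUrl, String src) {
        if (src == null || src.trim().isEmpty()) {
            return Optional.empty(); // no usable src attribute
        }
        try {
            URI base = URI.create(pageUrl);
            // URI.resolve handles absolute, root-relative and protocol-relative references.
            return Optional.of(base.resolve(src.trim()).toString());
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // src is not a syntactically valid URI
        }
    }

    public static void main(String[] args) {
        String page = "https://soccer.hupu.com/";
        // Sample src values, made up for illustration.
        String[] samples = {
            "/img/a.png",
            "//i1.example.com/pic.jpg",
            "https://cdn.example.com/x.png",
            ""
        };
        for (String src : samples) {
            System.out.println(resolve(page, src).orElse("(skipped)"));
        }
    }
}
```

In the scraping loop above, the printed value could be passed through such a helper, with the page URL as the first argument, before being used to download the image.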

Source: https://www.cnblogs.com/subtlman/p/15958233.html