Scraping Specific Content from a Web Page with Java
Author: Internet
- Scrape text from a web page:
import org.jsoup.Jsoup;
import java.io.IOException;

public class Crawling {
    // Fetches the page and prints the text of every element with class "list-item".
    public static void printListItems() throws IOException {
        Jsoup.connect("https://soccer.hupu.com/").get().body()
                .getElementsByClass("list-item") // or another class, e.g. "list-item-title"
                .forEach(e -> System.out.println(e.text()));
    }

    public static void main(String[] args) {
        try {
            printListItems();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
- Scrape image URLs from a web page:
import org.jsoup.Jsoup;
import java.io.IOException;

public class Crawling {
    // Fetches the page and prints the src attribute of every element with class "list-item-img".
    public static void printImageUrls() throws IOException {
        Jsoup.connect("https://soccer.hupu.com/").get().body()
                .getElementsByClass("list-item-img")
                .forEach(e -> System.out.println(e.attr("src"))); // the src attribute holds the image URL
    }

    public static void main(String[] args) {
        try {
            printImageUrls();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
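
The src values printed above are not always complete URLs: a page may use root-relative paths, protocol-relative paths, or leave the attribute empty. The sketch below shows one possible way to turn such values into absolute URLs using only the JDK's java.net.URI. The class ImageUrlResolver, its methods, and the sample values are hypothetical and not part of the original post; jsoup also provides absUrl("src") on an element for the same purpose.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the original post): turns raw src values into absolute URLs.
final class ImageUrlResolver {

    // Resolves one src value against the page URL; returns null when src is empty or malformed.
    static String resolve(String pageUrl, String src) {
        if (src == null || src.trim().isEmpty()) {
            return null;
        }
        try {
            return URI.create(pageUrl).resolve(src.trim()).toString();
        } catch (IllegalArgumentException ex) {
            return null; // src (or pageUrl) is not a valid URI
        }
    }

    // Resolves every src value, dropping those that cannot be resolved.
    static List<String> resolveAll(String pageUrl, List<String> srcValues) {
        List<String> result = new ArrayList<>();
        for (String src : srcValues) {
            String absolute = resolve(pageUrl, src);
            if (absolute != null) {
                result.add(absolute);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample values of the kinds a src attribute may contain.
        List<String> samples = Arrays.asList(
                "/img/a.png", "//img.example.com/b.jpg", "thumb/c.png", "");
        resolveAll("https://soccer.hupu.com/news/index.html", samples)
                .forEach(System.out::println);
    }
}
```

In the scraping example, the values collected with e.attr("src") could be passed to resolveAll together with the URL given to Jsoup.connect.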
Source: https://www.cnblogs.com/subtlman/p/15958233.html