java - How do I remove the query string and hash from a URL using JSOUP or ColdFusion?
Author: Internet
Here is an example:
When I parse an HTML page, I get duplicate URL values such as
> https://stackoverflow.com/questions/tagged/java?sort=featured&pageSize=50
> https://stackoverflow.com/questions/tagged/java#comments
> https://stackoverflow.com/questions/tagged/java#comment212
How can I avoid these duplicates?
I only need this URL: https://stackoverflow.com/questions/tagged/java
Solution:
I created a helper method, processURL(), that takes a URL and returns everything before the question mark (?) or the hash sign (#):
String processURL(String theURL) {
    // Cut at whichever marker comes first: the query (?) or the fragment (#).
    int queryPos = theURL.indexOf('?');
    int hashPos = theURL.indexOf('#');
    int endPos = theURL.length();
    if (queryPos >= 0) {
        endPos = queryPos;
    }
    if (hashPos >= 0 && hashPos < endPos) {
        endPos = hashPos;
    }
    return theURL.substring(0, endPos);
}
String urlOne = "https://stackoverflow.com/questions/tagged/java?sort=featured&pageSize=50";
String urlTwo = "https://stackoverflow.com/questions/tagged/java#comments";
System.out.println(processURL(urlOne));
System.out.println(processURL(urlTwo));
Output:
https://stackoverflow.com/questions/tagged/java
https://stackoverflow.com/questions/tagged/java
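As an alternative to cutting the string by hand, the same result can probably be reached with the standard java.net.URI class, and the duplicates themselves can be dropped by collecting the cleaned URLs in a Set. The following is only a sketch: the class name UrlCleaner and the methods stripQueryAndFragment and uniqueBaseUrls are made up for illustration and are not part of Jsoup or ColdFusion. It assumes every input is an absolute, syntactically valid URL.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class UrlCleaner {

    // Hypothetical helper: rebuilds the URL from scheme, authority and path only,
    // leaving out the query string and the fragment.
    // Caveat: getPath() returns the decoded path, so percent-encoded characters
    // in the path may come back in a different form.
    static String stripQueryAndFragment(String url) {
        try {
            URI uri = new URI(url);
            return new URI(uri.getScheme(), uri.getAuthority(), uri.getPath(), null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URL: " + url, e);
        }
    }

    // Hypothetical helper: cleans every URL and keeps each result once,
    // in the order in which it was first seen.
    static Set<String> uniqueBaseUrls(List<String> urls) {
        Set<String> unique = new LinkedHashSet<>();
        for (String url : urls) {
            unique.add(stripQueryAndFragment(url));
        }
        return unique;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
                "https://stackoverflow.com/questions/tagged/java?sort=featured&pageSize=50",
                "https://stackoverflow.com/questions/tagged/java#comments",
                "https://stackoverflow.com/questions/tagged/java#comment212");
        // prints [https://stackoverflow.com/questions/tagged/java]
        System.out.println(uniqueBaseUrls(urls));
    }
}
```

When the links come from a Jsoup document, the strings passed to uniqueBaseUrls would be the absolute href values extracted from the page; relative links would need to be resolved first.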
Tags: coldfusion, jsoup, java
Source: https://codeday.me/bug/20191119/2036640.html