java - How do I remove the query string and hash from a URL using JSOUP or ColdFusion?

Author: Internet

Here is an example.

When I parse an HTML page, I get duplicate URL values such as:

> https://stackoverflow.com/questions/tagged/java?sort=featured&pageSize=50
> https://stackoverflow.com/questions/tagged/java#comments
> https://stackoverflow.com/questions/tagged/java#comment212

How can I avoid these duplicates?

I only need this URL: https://stackoverflow.com/questions/tagged/java

Solution:

I created a helper method, processURL(), that takes a URL and returns only the part that precedes the question mark (?) or hash sign (#):

String processURL(String theURL) {
    // Cut at whichever of '?' or '#' appears first; keep the whole URL if neither is present.
    int endPos = theURL.length();
    int queryPos = theURL.indexOf('?');
    int hashPos = theURL.indexOf('#');
    if (queryPos >= 0) {
        endPos = queryPos;
    }
    if (hashPos >= 0 && hashPos < endPos) {
        endPos = hashPos;
    }
    return theURL.substring(0, endPos);
}

String urlOne = "https://stackoverflow.com/questions/tagged/java?sort=featured&pageSize=50";
String urlTwo = "https://stackoverflow.com/questions/tagged/java#comments";

System.out.println(processURL(urlOne));
System.out.println(processURL(urlTwo));

Output:

https://stackoverflow.com/questions/tagged/java
https://stackoverflow.com/questions/tagged/java
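
Stripping the query string and hash makes the three links identical, but they still appear three times unless the results are collected into a set. The following is a minimal sketch of that last step, not part of the original answer. It assumes every input is an absolute http(s) URL (for example, the absolute href values extracted with JSOUP), and it uses java.net.URI instead of manual index searching. The class name UrlCleaner and the method names stripQueryAndFragment and uniquePages are hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; the names are illustrative and not from the original answer.
class UrlCleaner {

    // Keeps only scheme, authority and path, so the query string and the fragment are dropped.
    // Assumes an absolute URL such as https://host/path; URI.create throws
    // IllegalArgumentException if the string is not a valid URI.
    static String stripQueryAndFragment(String url) {
        URI uri = URI.create(url);
        if (uri.getScheme() == null || uri.getRawAuthority() == null) {
            throw new IllegalArgumentException("Expected an absolute URL: " + url);
        }
        String path = uri.getRawPath() == null ? "" : uri.getRawPath();
        return uri.getScheme() + "://" + uri.getRawAuthority() + path;
    }

    // Cleans every URL and removes duplicates while preserving first-seen order.
    static List<String> uniquePages(List<String> urls) {
        Set<String> seen = new LinkedHashSet<>();
        for (String url : urls) {
            seen.add(stripQueryAndFragment(url));
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> links = List.of(
            "https://stackoverflow.com/questions/tagged/java?sort=featured&pageSize=50",
            "https://stackoverflow.com/questions/tagged/java#comments",
            "https://stackoverflow.com/questions/tagged/java#comment212");
        // Prints one cleaned URL per line; the three inputs collapse to a single entry.
        for (String page : uniquePages(links)) {
            System.out.println(page);
        }
    }
}
```

A LinkedHashSet is used here so that the cleaned URLs come back in the order in which they were first seen on the page; a plain HashSet would also remove the duplicates if the order does not matter.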

Tags: coldfusion, jsoup, java
Source: https://codeday.me/bug/20191119/2036640.html