Unshortening Flic.kr URLs
Author: Internet
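I have a Python script that unshortens URLs, based on the answer posted here. So far it has worked well, for example with youtu.be, goo.gl, t.co, bit.ly and tinyurl.com. But now I have noticed that it does not work for Flickr's own URL shortener, flic.kr.
For example, when I enter the URL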
https://flic.kr/p/qf3mGd
into a browser, I am correctly redirected to
https://www.flickr.com/photos/106783633@N02/15911453212/
However, when I unshorten the same URL with the Python script, I get the following redirects:
https://flic.kr/p/qf3mgd
http://www.flickr.com/photo.gne?short=qf3mgd
http://www.flickr.com/signin/?acf=%2Fphoto.gne%3Fshort%3Dqf3mgd
https://login.yahoo.com/config/login?.src=flickrsignin&.pc=8190&.scrumb=[...]
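and so I end up on the Yahoo login page. Incidentally, Unshort.me unshortens the URL correctly. What am I missing here?
Here is the full source code of my script. I stumbled upon some pathological cases with the original script: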
# Python 2 (urlparse and httplib were renamed in Python 3)
import urlparse
import httplib


def unshorten_url(url, max_tries=10):
    return __unshorten_url(url, [], max_tries)


def __unshorten_url(url, check_urls, max_tries):
    # Out of tries: fall back to the first URL that was checked
    if max_tries == 0:
        if len(check_urls) > 0:
            return check_urls[0]
        return url
    # Redirect loop: this URL has been visited before
    if url in check_urls:
        return url
    unshortended = ''
    try:
        parsed = urlparse.urlparse(url)
        # Note: a plain HTTP connection is opened whatever the URL scheme is
        h = httplib.HTTPConnection(parsed.netloc)
        h.request('HEAD', url)
    except:
        return None
    try:
        response = h.getresponse()
    except:
        return url
    # Follow only 3xx responses that carry a Location header
    if response.status/100 == 3 and response.getheader('Location'):
        unshortended = response.getheader('Location')
    else:
        return url
    #print max_tries, unshortended
    if unshortended != url:
        if 'http' not in unshortended:
            return url
        check_urls.append(url)
        return __unshorten_url(unshortended, check_urls, (max_tries-1))
    else:
        return unshortended


print unshorten_url('http://t.co/5skmePb7gp')
Edit: complete working example with a t.co URL.
Solution:
I use Requests [0] instead of httplib, like this, and it works well with URLs such as https://flic.kr/p/qf3mGd:
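For comparison, here is a minimal Python 3 sketch of the same hop-by-hop idea. The script in the question opens an HTTPConnection whatever the scheme is and passes the full URL as the request target. The sketch instead picks the connection class from the scheme, sends only the path and query, and resolves relative Location headers. The names max_hops and timeout are assumptions made for illustration, and unlike the original it returns the last URL reached when it gives up. Whether these changes alone fix the flic.kr case is not confirmed by the question.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def unshorten_url(url, max_hops=10, timeout=10):
    """Follow HTTP redirects hop by hop and return the last URL reached."""
    seen = set()
    for _ in range(max_hops):
        if url in seen:  # redirect loop detected
            break
        seen.add(url)

        parts = urlsplit(url)
        # Choose the connection class that matches the URL scheme.
        if parts.scheme == "https":
            conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
        elif parts.scheme == "http":
            conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
        else:
            break  # not an HTTP(S) URL, nothing to follow

        # Send only the path and query as the request target.
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query

        try:
            conn.request("HEAD", target)
            response = conn.getresponse()
            status = response.status
            location = response.getheader("Location")
        except (OSError, http.client.HTTPException):
            break  # network or protocol error: keep the last known URL
        finally:
            conn.close()

        # Stop unless this is a 3xx response with a Location header.
        if status // 100 != 3 or not location:
            break

        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return url
```

Calling unshorten_url('https://flic.kr/p/qf3mGd') would exercise the HTTPS branch on the first hop. The result depends on what the server returns for a HEAD request.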
>>> import requests
>>> requests.head("https://flic.kr/p/qf3mGd", allow_redirects=True, verify=False).url
u'https://www.flickr.com/photos/106783633@N02/15911453212/'
[0] http://docs.python-requests.org/en/latest/
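Note that verify=False in the call above turns off TLS certificate verification in Requests. If installing Requests is not an option, the standard library's urllib.request also follows redirects automatically. The following Python 3 sketch is a substitute technique, not the method from the answer: it records each hop so the redirect chain can be inspected. The names RecordingRedirectHandler and resolve are hypothetical. It sends GET requests and never reads the response body. A 4xx or 5xx response raises urllib.error.HTTPError, which is left to the caller.

```python
import urllib.request


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Redirect handler that remembers every URL it is redirected to."""

    def __init__(self):
        self.hops = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # newurl has already been resolved against the current URL.
        self.hops.append(newurl)
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def resolve(url, timeout=10):
    """Return (final_url, hops) after urllib has followed all redirects."""
    handler = RecordingRedirectHandler()
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.geturl(), handler.hops
```

A call such as final_url, hops = resolve("https://flic.kr/p/qf3mGd") returns the last URL reached together with the intermediate redirects, which makes it easy to see where a chain diverges from what the browser shows.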
Tags: url, url-shortener, flickr, python. Source: https://codeday.me/bug/20191120/2046298.html