Question: when simulating a login to 启信宝 (qixin.com) with a Python urllib crawler, the request body is JSON. How do I handle that?
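网络老鼠 2018-03-16 07:43:10 I only want to simulate a login and then crawl the home page that comes back. I have already added the browser headers and other data. The sticking point is that the username and password are submitted as JSON, and I am not sure my code is correct. From what I found online, the payload should be serialized with json.dumps(), but the server still returns a 500 error.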
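The login request as shown in the F12 developer tools:
{acc: "13961700000", pass: "12345", captcha: {isTrusted: true}}
Clicking "view source" shows: {"acc":"13961700000","pass":"12345","captcha":{"isTrusted":true}}
There are a few extra quotation marks, and I don't know why.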
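The two views most likely show the same payload: the first is the browser's parsed display of the object, and "view source" is the raw JSON text, in which every key must be quoted. A minimal sketch, using the placeholder values from the request above, suggests that json.dumps() produces that same raw text (the `separators` argument is only there to drop the spaces json.dumps() adds by default):

```python
import json

# Same structure as the login payload shown in the developer tools
# (placeholder values copied from the post).
payload = {"acc": "13961700000", "pass": "12345", "captcha": {"isTrusted": True}}

# separators=(",", ":") removes the default spaces after "," and ":",
# so the result matches the compact "view source" form.
body = json.dumps(payload, separators=(",", ":"))
print(body)
# {"acc":"13961700000","pass":"12345","captcha":{"isTrusted":true}}
```

Note that Python's `True` is serialized as JSON `true`, so the extra quotation marks are expected and are probably not the cause of the 500 error.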
My code is as follows:
import urllib.request
import urllib.error
import json
url = "http://www.qixin.com/api/user/login"
postdata ={"acc":"13961740000","pass":"12345","captcha":{"isTrusted":True}}
headers={"User-Agent":"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.119 Safari/537.36"}
headers["Referer"]="www.qixin.com"
headers["Host"]="www.qixin.com"
headers["Accept-Encoding"]="gzip, deflate"
headers["Accept-Language"]="zh-CN,zh;q=0.9"
headers["dc49417fe4f34f86b0fe"]="25c5e641e251292018090b3407797217f8c779abc6ca19dfda2706c53e0d784a4bd768b38e83732691cec6b1d2000aa6925e05b29338e02230922ae0c37c633a"
headers["Content-Type"]="application/json;charset=UTF-8"
headers["Origin"]="http://www.qixin.com"
headers["X-Requested-With"]="XMLHttpRequest"
req = urllib.request.Request(url,json.dumps(postdata).encode(encoding='UTF8'),headers,method='POST')
print("Starting crawl...")
try:
    data = urllib.request.urlopen(req).read()
    print(len(data))
    # Save the raw response bytes for inspection
    fh = open(r"C:\Users\emouse\Desktop\test.html", "wb")
    fh.write(data)
    fh.close()
except urllib.error.URLError as e:
    # HTTPError (a subclass of URLError) carries an HTTP status code
    if hasattr(e, "code"):
        print(e.code)
    if hasattr(e, "reason"):
        print(e.reason)
Any pointers from the experts would be much appreciated!
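One way to narrow down a 500 like this is to read the body of the error response instead of printing only the status code, since the server sometimes explains the failure there. The sketch below is not a confirmed fix for this site. The helper names `build_login_request`, `decode_body`, and `fetch` are hypothetical, introduced only for illustration. It also assumes the response may be gzip-compressed, because the code above sends an Accept-Encoding header and urllib does not decompress responses automatically (deflate is not handled here).

```python
import gzip
import json
import urllib.error
import urllib.request


def build_login_request(url, payload, extra_headers=None):
    """Build a POST request whose body is the JSON-encoded payload."""
    headers = {"Content-Type": "application/json;charset=UTF-8"}
    headers.update(extra_headers or {})
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def decode_body(raw, content_encoding):
    """Return response bytes as text, gunzipping first if the server compressed them."""
    if (content_encoding or "").lower() == "gzip":
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")


def fetch(req):
    """Send the request; return (status, text) for success and for HTTP errors."""
    try:
        with urllib.request.urlopen(req) as resp:
            encoding = resp.headers.get("Content-Encoding")
            return resp.status, decode_body(resp.read(), encoding)
    except urllib.error.HTTPError as e:
        # HTTPError doubles as a response object, so its body can be read too.
        encoding = e.headers.get("Content-Encoding")
        return e.code, decode_body(e.read(), encoding)
```

Calling `fetch(build_login_request(url, postdata, headers))` with the variables from the post and printing the returned text may show what the server objects to, for example a missing cookie or a rejected captcha value, though that is only a guess about this particular site.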