Web scraping - Why does a simulated login to appannie in Python keep returning a 503 code?
Problem description
#-*-encoding:utf-8-*-
import requests, xlwt, sys
from bs4 import BeautifulSoup
reload(sys)
referer = 'https://www.appannie.com/account/login/?_ref=header'
user_agent = ('Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36')
sys.setdefaultencoding('utf-8')
header = {
    'User-Agent': user_agent,
    'Referer': referer,
    'Host': 'www.appannie.com',
    'Connection': 'keep-alive',
    'Accept': 'application/json, text/plain,*/*',
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'zh-CN,zh;q=0.8',
    'X-NewRelic-ID': 'VwcPUFJXGwEBUlJSDgc=',
    'X-Requested-With': 'XMLHttpRequest',
}

def main():
    url = 'https://www.appannie.com/account/login/'
    # content = requests.get(url,headers = header).content
    # soup = BeautifulSoup(content,'lxml')
    # key = soup.select()
    s = requests.Session()
    s.get(url, headers=header)
    key = s.cookies['csrftoken']
    data = {
        'csrfmiddlewaretoken': key,
        'next': '/dashboard/home/',
        'username': '1195615991@qq.com',
        'password': 'xxxxx'
    }
    req = s.post(url, data=data)
    if 2 != req.status_code / 100:
        raise Exception('Error while logging in, code: %d' % (req.status_code))
    cookies = req.cookies
    n = '2017-04-11'
    url_1 = 'https://www.appannie.com/apps/google-play/top-chart/?country=US&category=game&device=&date={}'.format(n)
    req_1 = s.get(url_1, headers=header, cookies=cookies).content
    #print req_1
    soup = BeautifulSoup(req_1, 'lxml')
    print soup
    # ids = soup.find_all('span')
    # for id in ids :
    #     name = id.get('title')
    #     print name

if __name__ == '__main__':
    main()
Answers
Answer 1: There are two key points:
1. the User-Agent request header
2. the csrfmiddlewaretoken form parameter
# coding: utf-8
import requests

url = 'https://www.appannie.com/account/login'
session = requests.Session()
# Set the User-Agent on the session so that every request, including the POST, sends it
session.headers['user-agent'] = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36'
# Load the login page first so the csrftoken cookie is stored on the session
session.get(url)
token = session.cookies.get('csrftoken')
data = {
    'csrfmiddlewaretoken': token,
    'next': '/dashboard/home/',
    'username': 'XXXX',
    'password': 'XXXX'
}
r = session.post(url, data)
print(r.status_code)
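
The two points can also be wrapped into small helpers, as in the rough sketch below. One difference worth noting: the code in the question passes its headers to the GET requests but not to the POST, whereas the answer sets the User-Agent on the session so the POST carries it too. That may be what the 503 is reacting to, though this is a guess rather than something confirmed here. The names login, build_login_payload and is_success are hypothetical and exist only for illustration; session is assumed to be a requests.Session (or any object with the same headers, cookies, get and post members).

```python
LOGIN_URL = "https://www.appannie.com/account/login"
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
)


def is_success(status_code):
    """Return True for any 2xx status.

    Floor division (//) gives the same result on Python 2 and 3, unlike
    the plain / used in the question, which yields a float on Python 3.
    """
    return status_code // 100 == 2


def build_login_payload(csrf_token, username, password, next_url="/dashboard/home/"):
    """Build the form fields for the login POST (hypothetical helper)."""
    if not csrf_token:
        # Without the token there is no point in sending the POST at all.
        raise ValueError("login page did not set a csrftoken cookie")
    return {
        "csrfmiddlewaretoken": csrf_token,
        "next": next_url,
        "username": username,
        "password": password,
    }


def login(session, username, password):
    """Log in with a session-wide User-Agent and the CSRF token from the cookie."""
    # Session-level header: sent with the GET and with the POST.
    session.headers.update({"User-Agent": USER_AGENT})
    # The first GET stores the csrftoken cookie on the session.
    session.get(LOGIN_URL)
    token = session.cookies.get("csrftoken")
    payload = build_login_payload(token, username, password)
    response = session.post(LOGIN_URL, data=payload)
    if not is_success(response.status_code):
        raise RuntimeError("Error while logging in, code: %d" % response.status_code)
    return response


# Usage (performs real network requests, so it is left commented out):
# import requests
# response = login(requests.Session(), "XXXX", "XXXX")
# print(response.status_code)
```

Because the helpers only rely on the session's headers, cookies, get and post, they can be exercised with a stub object before being pointed at the real site.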
