Everyone knows Weibo is full of zombie followers that fake traffic numbers for celebrities. As a data girl with a strong survival instinct, I thought it over several times and did not dare to scrape a traffic celebrity, so the target of my crawler helper today is:
" next ***** 8 "
1. Runtime environment
Windows, Python 3
Install the required packages from cmd: pip install ***
Then import them at the top of the code: import ***
import requests
from html.parser import HTMLParser
import json
import time
from bs4 import BeautifulSoup
import pandas as pd
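Before running the crawler, it can help to check that the third-party packages among the imports above are actually installed. The sketch below is only an illustration: the helper name `missing_packages` and the `REQUIRED` mapping are hypothetical, and the pip names (`requests`, `beautifulsoup4`, `pandas`) are the usual PyPI names for these modules rather than anything stated in the original.

```python
import importlib.util

# Hypothetical mapping: import name -> name to pass to "pip install".
REQUIRED = {
    "requests": "requests",
    "bs4": "beautifulsoup4",
    "pandas": "pandas",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of modules that cannot be found locally."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    # Print the command to run in cmd; nothing is installed automatically.
    print("pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```

The standard-library modules (`json`, `time`, `html.parser`) ship with Python 3 and need no installation.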
2. Getting the Cookie and page source of the Weibo followers page
First open his Weibo homepage, press F12 to enter developer mode, choose Network, and press F5 to refresh the page, like this:
Select the "Doc" tab, click the page URL, and find "Request Headers", like this:
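Typing each "Request Headers" field into a Python dict by hand is error-prone, so one option is to paste the raw block as text and convert it. The sketch below assumes the fields were copied as plain `Name: value` lines; the helper name `parse_raw_headers` and the sample values are hypothetical and only stand in for your own headers.

```python
def parse_raw_headers(raw: str) -> dict:
    """Turn 'Name: value' lines copied from the browser into a dict."""
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and HTTP/2 pseudo-headers such as ':authority'.
        if not line or line.startswith(":"):
            continue
        # Split on the first colon only, so colons inside values survive.
        name, sep, value = line.partition(":")
        if not sep:
            continue
        headers[name.strip()] = value.strip()
    return headers


# Placeholder values; paste your own "Request Headers" block here.
raw = """
Accept: text/html
Accept-Language: zh-CN,zh;q=0.9
Cookie: name1=value1; name2=value2
"""
header = parse_raw_headers(raw)
print(header["Cookie"])
```

The resulting dict can be passed to `requests.get` through its `headers` argument. The Cookie identifies your logged-in session, so treat it like a password.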
With the "Request Headers" you can fetch the page source:
# Function: fetch a page and return its HTML source as a string
def get_html(header, the_url):
    r = requests.get(url=the_url, headers=header)
    parser = HTMLParser()
    parser.feed(r.text)
    html_str = r.text
    return html_str

# First parameter of the function: paste in the fields from "Request Headers",
# replaced with those of the address you want to crawl
header = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Connection": "keep-alive",
    "Cookie": " SINAGLOBAL=8942290497361.736.1542873594539; UM_distinctid=1684f2253455f-07fefa1f3a5ef8-47e1039-144000-1684f2253463f8; UOR= , , www.duba.com; Un=yuzidesky1128@sina.cn; Wvr=6; Ugrow-G0=9642b0b34b4c0d569ed7a372f8823a8e; ALF=1587692149; SSOLoginState=1556156149; SCF=AmrcjCjvFI3VTtRcnw5XSEpu50N99C78GjWRXyTgpAiZyfrYPOUKsG3XcpfQFXmoHSYhPc9zkby1VsW0nEGa35o. ; SUB=_2A25xxX6lDeRhGedJ7FAX8CnPyz2IHXVSs9dtrDV8PUNbmtAKLU_YkW9NUdIDy4VrqO7uz81FV-aLXHbJc1pOGAyW; SUBP=0033WrSXqPxfM725Ws9jqgMF55529P9D9W5.xNDWU.nUY18EZ3DRMKKg5JpX5KzhUgL.Fo2NS0zcehM0eh22dJLoIp9jdN9Li- - NiK.piKLhi- - Fi-82iK.7; SUHB=0tKrG81Etv1yct; TC-V5-G0=841d8e04c4761f733a87c822f72195f3; _s_tentry=login.sina.com.cn; Apache=6519093007221.037.1556156193431; Wb_view_log_1772607301=1536*8641.125; ULV=1556156193448:33:4: 2: 6519093007221.037.1556156193431:1556094384932; TC-Page-G0=cdcf495cbaea129529aa606e7629fea7 | 1556165109 | 1556164910; Webim_unReadCount=%7B%22time%22%3A1556167207612%2C%22dm_pub_total%22%3A2%2C%22chat_group_pc%22%3A0%2C%22allcountNum%22%3A2%2C%22msgbox%22%3A0%7D