Accessing Web Pages Through a Proxy in Python - 泓源视野


Use a random proxy to access the target web page. The HTTP versus HTTPS protocol issue has not been resolved yet.
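One possible angle on the HTTP/HTTPS issue mentioned above: in requests, the keys of the `proxies` dict are the scheme of the target URL, while the scheme inside each value is the protocol used to talk to the proxy itself. The sketch below assumes the pool returns plain HTTP proxies in `host:port` form, which may not hold for every proxy; `build_proxies` is a hypothetical helper name, not part of the original script.

```python
def build_proxies(proxy, proxy_scheme='http'):
    """Build a requests-style proxies mapping for a 'host:port' proxy.

    Keys are the scheme of the *target* URL; the scheme inside each value
    is the protocol used to reach the proxy itself.
    Assumption: the pool hands out plain HTTP proxies, so 'http' is the default.
    """
    proxy_url = f'{proxy_scheme}://{proxy}'
    # Route both http:// and https:// targets through the same proxy.
    return {'http': proxy_url, 'https': proxy_url}
```

It could then be passed to a request as `requests.get(url, proxies=build_proxies(proxy), timeout=15)`. If a proxy really does accept TLS connections itself, `proxy_scheme='https'` can be passed instead.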
import requests
import time

time1 = time.time()  # script start time
proxypool_url = 'http://129.151.235.55:5555/random'
target_url = 'https://byy3.com'
request_timeout = 15  # seconds, applied to every request


def get_random_proxy():
    """
    get random proxy from proxypool
    :return: proxy
    """
    return requests.get(proxypool_url, timeout=request_timeout).text.strip()


def crawl(url, proxy, headers):
    """
    use proxy to crawl page
    :param url: page url
    :param proxy: proxy, such as 8.8.8.8:8888
    :param headers: request headers
    :return: html
    """
    # Only https:// targets go through the proxy here;
    # the HTTP/HTTPS handling is still unresolved.
    proxies = {'https': 'https://' + proxy}
    return requests.get(url, proxies=proxies, headers=headers,
                        timeout=request_timeout).text


def main():
    """
    main method, entry point
    :return: none
    """
    headers = {
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
    }
    proxy = get_random_proxy()
    print('get random proxy', proxy)
    html = crawl(target_url, proxy, headers)
    print(html)


if __name__ == '__main__':
    main()
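The script above makes a single attempt, and a randomly chosen proxy may well be dead. A minimal sketch of retrying with a fresh proxy on each failure follows. `fetch_with_retries` and its parameters are hypothetical names; `get_proxy` and `fetch` are passed in as callables so that the script's own `get_random_proxy` and `crawl` can be plugged in.

```python
def fetch_with_retries(url, get_proxy, fetch, max_attempts=3):
    """Try up to max_attempts proxies and return the first successful result.

    get_proxy: zero-argument callable returning a 'host:port' string.
    fetch: callable taking (url, proxy) and returning the page HTML,
           raising an exception when the request fails.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        proxy = get_proxy()
        try:
            return fetch(url, proxy)
        except Exception as error:  # with requests, requests.RequestException is narrower
            last_error = error
            print(f'attempt {attempt} with proxy {proxy} failed: {error}')
    raise RuntimeError(f'all {max_attempts} attempts failed') from last_error
```

With the script's functions this might be called as `fetch_with_retries(target_url, get_random_proxy, lambda u, p: crawl(u, p, headers))`.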
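This article was published on 泓源视野 by admin, and its copyright belongs to 泓源视野. The content reflects the author's personal views and does not mean that 泓源视野 agrees with or supports them. If you repost it, please credit the source.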