Python urllib.error.HTTPError: HTTP Error 403: Forbidden
Python Requests 403 Forbidden
raise HTTPError(req.full_url, code, msg, hdrs, fp) urllib.error.HTTPError: HTTP Error 403: Forbidden
Python 3.4 urllib.request error (HTTP 403)

Question (asked by Belial, Feb 8 '15 at 15:57; edited by falsetru, Feb 8 '15 at 16:35):

I'm trying to open and parse an HTML page. In Python 2.7.8 I have no problem:

    import urllib
    url = "https://ipdb.at/ip/66.196.116.112"
    html = urllib.urlopen(url).read()

and everything is fine. However, I want to move to Python 3.4, and there I get HTTP error 403 (Forbidden). My code:

    import urllib.request
    html = urllib.request.urlopen(url)  # same URL as before

Traceback:

      File "C:\Python34\lib\urllib\request.py", line 153, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Python34\lib\urllib\request.py", line 461, in open
        response = meth(req, response)
      File "C:\Python34\lib\urllib\request.py", line 574, in http_response
        'http', request, response, code, msg, hdrs)
      File "C:\Python34\lib\urllib\request.py", line 499, in error
        return self._call_chain(*args)
      File "C:\Python34\lib\urllib\request.py", line 433, in _call_chain
        result = func(*args)
      File "C:\Python34\lib\urllib\request.py", line 582, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 403: Forbidden

It works for other URLs that don't use HTTPS. For example, url = 'http://www.stopforumspam.com/ipcheck/212.91.188.166' is OK.

Tags: python, python-3.x, urllib

Accepted answer:

It seems like the site does not like the user agent of Python 3.x. Specifying User-Ag
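The accepted answer is cut off here, but the direction it points in (sending an explicit User-Agent header) can be sketched in Python 3 roughly as follows. This is only an illustration, not the answerer's exact code: the helper names `build_request` and `fetch`, the default `"Mozilla/5.0"` string, and the timeout value are assumptions chosen for the example.

```python
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request that carries an explicit User-Agent header.

    The helper name and the default user-agent string are illustrative
    assumptions; some servers may require a fuller browser-like value.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent="Mozilla/5.0", timeout=10):
    """Open the URL with the custom header and return the raw response bytes."""
    req = build_request(url, user_agent)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


# Example call (needs network access, so it is left commented out):
# html = fetch("https://ipdb.at/ip/66.196.116.112")
```

Whether a given site accepts the request still depends on what that server checks, so a 403 can persist even with a browser-like User-Agent.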
Urllib2 User Agent
Yolk urllib2.HTTPError: HTTP Error 403: Must Access Using HTTPS Instead Of HTTP
urllib2.HTTPError: HTTP Error 403: Forbidden

Source: http://stackoverflow.com/questions/13303449/urllib2-httperror-http-error-403-forbidden
Related question: http://stackoverflow.com/questions/28396036/python-3-4-urllib-request-error-http-403

Question (asked by kumar, Nov 9 '12 at 6:51; edited by Sudar, Nov 9 '12 at 7:14):

I am trying to automate the download of historic stock data using Python. The URL I am trying to open responds with a CSV file, but I am unable to open it using urllib2. I have tried changing the user agent as suggested in a few earlier questions, and I even tried to accept response cookies, with no luck. Can you please help? Note: the same method works for Yahoo Finance.

Code:

    import urllib2,cookielib

    site = "http://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/getHistoricalData.jsp?symbol=JPASSOCIAT&fromDate=1-JAN-2012&toDate=1-AUG-2012&datePeriod=unselected&hiddDwnld=true"
    hdr = {'User-Agent': 'Mozilla/5.0'}
    req = urllib2.Request(site, headers=hdr)
    page = urllib2.urlopen(req)

Error:

      File "C:\Python27\lib\urllib2.py", line 527, in http_error_default
        raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    urllib2.HTTPError: HTTP Error 403: Forbidden

Thanks for your assistance.

Tags: python, http, urllib

Comment (Denis, Nov 9 '12 at 7:08): Are you using Windows as your platform?

Accepted answer:

By adding a few more headers I was able to get the data:

    import urllib2

    site = "http://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/getHistoricalData.jsp?symbol=JPASSOCIAT&fromDate=1-JAN-2012&toDate=1-AUG-2012&datePeriod=unselected&hiddDwnld=true"
    hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
           'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
           'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
           'Accept-Encoding': 'none',
           'Accept-Language': 'en-US,en;q=0.8',
           'Connection': 'keep-alive'}

    req = urllib2.Request(site, headers=hdr)

    try:
        page = urllib2.urlopen(req)
    except urllib2.HTTPError as e:
        # Show the body of the error response to see why the server refused.
        print e.fp.read()
    else:
        # Read the page only if the request succeeded; otherwise `page` is undefined.
        content = page.read()
        print content

Actually, it works with just this one additional header: 'Accept': 'text/html,application/xht
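For readers on Python 3, where urllib2 no longer exists, the same idea (browser-like headers plus reading the body of the error response) might look like the sketch below. It is a translation under assumptions, not the answerer's code: the names `BROWSER_HEADERS`, `build_browser_request`, and `fetch_or_error_body` are made up for the example, and the header set is a shortened version of the one in the answer.

```python
import urllib.error
import urllib.request

# Shortened, illustrative header set modelled on the answer above.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.8",
}


def build_browser_request(url, headers=None):
    """Return a Request carrying browser-like headers (hypothetical helper)."""
    return urllib.request.Request(url, headers=headers or BROWSER_HEADERS)


def fetch_or_error_body(url, timeout=10):
    """Return (status, body).

    On an HTTP error such as 403, return the error code together with the
    body of the error response, which often explains why the server refused.
    """
    try:
        with urllib.request.urlopen(build_browser_request(url), timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # HTTPError is also a file-like response object, so its body can be read.
        return err.code, err.read()
```

Returning the error body instead of only printing it makes it easier to inspect what the server actually objected to before adding more headers by trial and error.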
HTTPError: HTTP Error 403: Forbidden

Source: http://stackoverflow.com/questions/13055208/httperror-http-error-403-forbidden
Related question: http://stackoverflow.com/questions/34957748/http-error-403-forbidden

Question:

I am making a Python script for personal use, but it's not working for Wikipedia.

This works:

    import urllib2, sys
    from bs4 import BeautifulSoup

    site = "http://youtube.com"
    page = urllib2.urlopen(site)
    soup = BeautifulSoup(page)
    print soup

This does not work:

    import urllib2, sys
    from bs4 import BeautifulSoup

    site = "http://en.wikipedia.org/wiki/StackOverflow"
    page = urllib2.urlopen(site)
    soup = BeautifulSoup(page)
    print soup

This is the error:

    Traceback (most recent call last):
      File "C:\Python27\wiki.py", line 5, in
HTTP Error 403: Forbidden

Question (asked by Z.Chen, Jan 22 at 23:38):

I am trying to download a PDF, but I get the following error: HTTP Error 403: Forbidden. I am aware that the server is blocking the request for whatever reason, but I can't seem to find a solution. Please help.

    import urllib.request
    import urllib.parse
    import requests

    def download_pdf(url):
        full_name = "Test.pdf"
        urllib.request.urlretrieve(url, full_name)

    try:
        url = ('http://papers.xtremepapers.com/CIE/Cambridge%20IGCSE/Mathematics%20(0580)/0580_s03_qp_1.pdf')
        print('initialized')
        hdr = {}
        hdr = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36',
            'Content-Length': '136963',
        }
        print('HDR recieved')
        req = urllib.request.Request(url, headers=hdr)
        print('Header sent')
        resp = urllib.request.urlopen(req)
        print('Request sent')
        respData = resp.read()
        download_pdf(url)
        print('Complete')
    except Exception as e:
        print(str(e))

Tags: python, http, python-requests, urllib

Comment (Zizouz212, Jan 22 at 23:41): If the server is blocking, there's probably not an easy way through. Forbidden means that you are not allowed.

Accepted answer:

You seem to have already realised this: the remote server is apparently checking the User-Agent header and rejecting requests from Python's urllib.

urllib.request.urlretrieve() doesn't allow you to change the HTTP headers; however, you can use urllib.request.URLopener.retrieve():

    import urllib.request

    opener = urllib.request.URLopener()
    opener.addheader('User-Agent', 'whatever')
    filename, headers = opener.retrieve(url, 'Test.pdf')

N.B. You are using Python 3, and these functions are now considered part of the "Legacy interface"; URLopener has been deprecated. For tha
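Because URLopener is deprecated (and, as far as I know, no longer available in the newest Python releases), one alternative is to skip the legacy interface and stream the response of a normal `Request` into a file. The sketch below is an assumption about how that could be done, not the continuation of the answer above: the function name `download`, the default user-agent string, and the timeout are all illustrative.

```python
import shutil
import urllib.request


def download(url, filename, user_agent="Mozilla/5.0", timeout=30):
    """Save the resource at `url` to `filename`, sending a custom User-Agent.

    Hypothetical helper: it replaces urlretrieve() with a Request that can
    carry headers, then copies the response stream to disk in chunks.
    """
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=timeout) as resp, open(filename, "wb") as out:
        shutil.copyfileobj(resp, out)
    return filename


# Example call (needs network access, so it is left commented out):
# download("http://papers.xtremepapers.com/CIE/Cambridge%20IGCSE/"
#          "Mathematics%20(0580)/0580_s03_qp_1.pdf", "Test.pdf")
```

This issues a single request instead of opening the URL once with headers and then fetching it again without them, which is what the code in the question does when it calls urlretrieve() after urlopen().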