Tuesday, March 13, 2012

Bad Github api v2

Python basic authentication using urllib2.HTTPBasicAuthHandler() did not work for me on the Github api v2. On the users page I finally found an easy alternate authentication method:

http://github.com/api/v2/json/user/show?login=defunkt&token=XXX

Thank you very much, Dustin, for doing such a great job on py-github, where I found this gem deeply buried.

I wonder if this would have been harder or easier in Java?

Update (2012-03-13): Adding a header with the key "Authorization" and the value "Basic " followed by the base64 encoding of "username:password" also works, but why?

>>> import base64
>>> import urllib2
>>> url = "http://your.url.com"
>>> user = "your-username"
>>> token = "your-password"
>>> req = urllib2.Request(url)
>>> req.add_header('Authorization',
...   'Basic ' + base64.b64encode("%s:%s" % (user + '/token',
...   token)).strip())
>>> data = urllib2.urlopen(req).read()

This makes me think that HTTPBasicAuthHandler should work, at least according to this site on basic authentication and this Python page on fetching internet resources.
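For reference, here is a minimal sketch of the handler setup those pages describe. It is written for Python 3, where urllib2 became urllib.request, which is an assumption on my part since everything else in this post is Python 2. The function name and the example URL are made up for illustration.

```python
import urllib.request


def build_basic_auth_opener(url, user, password):
    """Hypothetical helper: return an opener with basic-auth credentials registered."""
    # Passing None as the realm makes the credentials apply to any realm under this URL.
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


# Placeholder URL and credentials; nothing is sent until opener.open() is called.
opener = build_basic_auth_opener("http://example.invalid/api", "your-username", "your-password")
```

Calling opener.open(url) would then perform the request. Building the opener does not touch the network.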

I also found PycURL (*), which probably would have helped a lot! I've never had any problems with libcurl. Another resource is ask/python-github2, which uses httplib2 (**). I have also read on Stack Overflow about Requests, which actually has a GitHub API example on its PyPI page. New is always better, so that's probably the way to go. In fact the Requests site goes so far as to say that the urllib2 API is "thoroughly broken", which in my limited experience is the truth.

(*) Update (2012-03-14): PycURL is not actively maintained; this package (7.19) was last updated in 2008. You must install libcurl to use PycURL. If you try to use the tarball from the PycURL website, it requires the install option --curl-dir=c:\your\src\dir, which is the folder of your libcurl files. The setup.py file looks for libcurl.lib, but in the current release of libcurl (7.24), this file is renamed, so the setup fails. Same for ssl setup. An alternative installation for Windows (32/64-bit Python 2.6/7) is available as an executable from Christoph Gohlke's Python extensions. On Ubuntu Linux, python-pycurl is maintained on launchpad.

OK, one more totally obvious way to send basic authentication is to include it in the url and open it with urllib not urllib2.

>>> import urllib
>>> urllib.urlopen("http://username:password@your.url.com")
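For credentials embedded in the URL to reach the server, something has to pull them out and turn them into an Authorization header. The sketch below shows roughly that step in Python 3 (an assumption, since the snippet above is Python 2). The function name split_credentials is hypothetical, and I am not claiming this is how urllib does it internally.

```python
import base64
from urllib.parse import urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_credentials, authorization_value_or_None)."""
    parts = urlsplit(url)
    if parts.username is None:
        return url, None
    # Rebuild the host without the "user:password@" prefix.
    # Note: this simple version does not re-add brackets for IPv6 hosts
    # and does not percent-decode the credentials.
    host = parts.hostname
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)
    clean_url = urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
    raw = "%s:%s" % (parts.username, parts.password or "")
    value = "Basic " + base64.b64encode(raw.encode("utf-8")).decode("ascii")
    return clean_url, value
```

The returned value can be passed to Request.add_header('Authorization', value) along with the cleaned URL.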

Python, I'm not as psyched as I once was. Something this simple shouldn't be so totally non-obvious to so many people. I've starred no fewer than ten SO posts all related to the same thing, and not once has anyone gotten urllib2 to work the way it's supposedly meant to.

Answered! I finally figured it out! Here is the Github v2 api response:

Server: nginx/1.0.13
Date: Wed, 14 Mar 2012 07:37:16 GMT
Content-Type: application/json; charset=utf-8
Connection: close
Status: 401 Unauthorized
X-RateLimit-Limit: 60
X-Frame-Options: deny
X-RateLimit-Remaining: 59
X-Runtime: 7
Content-Length: 26
Cache-Control: no-cache

Python urllib2 only sends credentials after the server answers 401 with a "WWW-Authenticate" header. The response above has the 401 status but no such header, so HTTPBasicAuthHandler never sends the login info. Problem solved. I saw this SO post, but it was also in the missing urllib2 manual all along.
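One way around a server that never sends the challenge is a handler that attaches the credentials up front instead of waiting for a 401. This is only a sketch in Python 3 (urllib.request rather than urllib2, which is my assumption), and the class name PreemptiveBasicAuthHandler is made up. It is not part of the standard library.

```python
import base64
import urllib.request


class PreemptiveBasicAuthHandler(urllib.request.BaseHandler):
    """Hypothetical handler: add Basic credentials to every request before it is sent."""

    def __init__(self, user, password):
        raw = ("%s:%s" % (user, password)).encode("utf-8")
        self.header_value = "Basic " + base64.b64encode(raw).decode("ascii")

    def http_request(self, req):
        # The opener calls this pre-processing hook before sending the request.
        if not req.has_header("Authorization"):
            # Unredirected, so the credentials are not forwarded on a redirect.
            req.add_unredirected_header("Authorization", self.header_value)
        return req

    # Use the same hook for https URLs.
    https_request = http_request


# Placeholder credentials; the opener is built but nothing is requested here.
opener = urllib.request.build_opener(PreemptiveBasicAuthHandler("your-username", "your-password"))
```

Since this sends the password with every request through the opener, it only makes sense for an opener dedicated to one trusted host.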

(**) Update (2012-03-15): BTW, httplib2 has the exact same problem as urllib2 according to the SO post I mentioned above; the site must return "WWW-Authenticate" in the header or no credentials are sent.