More Details of the Requests Module

Once we run requests.get, we get back a response object. It’s an instance of a class called Response that is defined in the requests module. We won’t look at its definition; indeed, we haven’t even learned how to define our own classes at this point in the course. Think of it as analogous to the Turtle class. Each instance of the class has some attributes, and different instances can have different values for the same attribute. All instances can also invoke the methods that are defined for the class.

Previously, we saw that a response object has an attribute (instance variable) .text, which contains the contents of the page, the stuff after all the HTTP headers. Response objects have some other useful attributes and methods that we can access. A few are used and explained below. Others will be introduced in later chapters.

import requests

page1 = requests.get("https://github.com/presnick/runestone")
page2 = requests.get("https://github.com/presnick/nonsense")
page3 = requests.get("http://github.com/presnick/runestone")

for p in [page1, page2, page3]:
    print("********")
    print("url:", p.url)
    print("status:", p.status_code)
    print("content type:", p.headers['Content-type'])
    # Show a short slice of the body, but only if the page is long enough
    if len(p.text) > 1040:
        print("content snippet:", p.text[1000:1040])
    # .history is non-empty only when the request was redirected
    if len(p.history) > 0:
        print("redirection history")
        for h in p.history:
            print("  ", h.url, h.status_code)

Here’s the output that is produced when I run that code.

$ python fetching.py
********
url: https://github.com/presnick/runestone
status: 200
content type: text/html; charset=utf-8
content snippet: ontent="@github" name="twitter:site" /><
********
url: https://github.com/presnick/nonsense
status: 404
content type: application/json; charset=utf-8
********
url: https://github.com/presnick/runestone
status: 200
content type: text/html; charset=utf-8
content snippet: ontent="@github" name="twitter:site" /><
redirection history
   http://github.com/presnick/runestone 301

First, consider the .url attribute. It is the URL that was actually accessed. For page3, it is the https URL we were redirected to, not the http URL we asked for. We will see in a later chapter that requests.get lets us pass additional parameters that are used to construct the full URL, so .url will be useful for checking what the full URL turned out to be.

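Next, consider the .status_code attribute. It holds the HTTP status code that the server sent back. A code of 200 means the server is sending back what was requested, as for page1. A code of 404 means the requested page was not found, as for page2. A code of 301 means the page has moved permanently to a different URL. For page3, the requests module followed that redirect automatically, so the final status is 200.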
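HTTP status codes are grouped into families by their first digit, which gives a quick way to interpret a value of .status_code. The following is a minimal sketch of that idea; the function name describe_status is hypothetical and is not part of the requests module.

```python
def describe_status(status_code):
    """Return a rough description of an HTTP status code, based on its first digit."""
    families = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    # Integer division by 100 keeps only the leading digit of a three-digit code
    return families.get(status_code // 100, "unknown")


# The three status codes that appear in the output above
for code in [200, 301, 404]:
    print(code, describe_status(code))
```

With a real response object, you could call describe_status(p.status_code) inside the loop from the earlier program.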

The .headers attribute has as its value a dictionary of header names and their values. To see all the headers, you can add the statement print(p.headers.keys()) to the code and run it. One of the headers is ‘Content-type’. For page1 and page3 its value is text/html; charset=utf-8. For page2, where we got an error, the contents are of type application/json; charset=utf-8.
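A Content-type value packs a media type and optional parameters, such as the character set, into a single string. Here is a minimal sketch of pulling those pieces apart with ordinary string methods. The helper name split_content_type is hypothetical, and the header value is hard-coded from the output above so that no network access is needed.

```python
def split_content_type(value):
    """Split a Content-type header value into (media_type, params dict)."""
    parts = [part.strip() for part in value.split(";")]
    media_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" in part:
            # partition splits on the first "=" only
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
    return media_type, params


media_type, params = split_content_type("text/html; charset=utf-8")
print(media_type)
print(params.get("charset"))
```

With a real response object, you would pass p.headers['Content-type'] as the argument. The dictionary in .headers ignores case when looking up keys, so p.headers['content-type'] finds the same value.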

The .text attribute we have seen before. It contains the contents of the file (or sometimes the error message).

The .history attribute contains a list of previous responses, if there were redirects. That list is empty except for page3. For page3, we can see what happened in the original request: what the URL was and that the response code was 301.
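One way to picture .history is as the earlier hops in a chain that ends with the response you were handed. The sketch below builds that chain as a list of (url, status code) pairs. The function name redirect_chain is hypothetical, and the stand-in objects are an assumption made so that the example runs without network access; they copy only the attributes and values shown for page3 in the output above.

```python
from types import SimpleNamespace


def redirect_chain(response):
    """Return (url, status_code) pairs for every hop, ending with the final response."""
    hops = [(h.url, h.status_code) for h in response.history]
    hops.append((response.url, response.status_code))
    return hops


# Stand-in objects that mimic page3; a real program would pass the
# object returned by requests.get instead.
first_hop = SimpleNamespace(url="http://github.com/presnick/runestone", status_code=301)
page3_like = SimpleNamespace(
    url="https://github.com/presnick/runestone",
    status_code=200,
    history=[first_hop],
)

for url, status in redirect_chain(page3_like):
    print(status, url)
```

For a response that was not redirected, .history is empty and the chain contains only the final response.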

To summarize, a Response object has the following useful attributes that can be accessed in your program:

  • .text
  • .url
  • .status_code
  • .headers
  • .history