Caching Requests (Python)

Python’s Requests module is a great way of interacting with data over the internet. However, designing and building a Python program that uses Requests can mean lots (and lots) of calls just to test a function. That is not great if you are testing over a poor internet connection or on limited bandwidth, or if you are making calls to a website that is not expecting the sudden increase in traffic.

As with most things Python, though, there is a solution: the requests-cache module.

Installing Requests-Cache

Depending on which pip command points at Python 3 on your system, use one of:

pip install requests-cache
pip3 install requests-cache

Using Requests-Cache

If you have already implemented the requests module in your Python program, then the install_cache() function is probably the easiest way of using requests-cache, as it takes just two lines to amend an existing program:

import requests_cache

to import the requests-cache module and:

requests_cache.install_cache("cache_name", backend="backend_type", expire_after=EXPIRE_TIME_IN_SECONDS)

For example, for my Covid-19 data research I imported the module and then added the following line just after the list of imports:

requests_cache.install_cache("covid_data", backend="sqlite", expire_after=600)

This line created a SQLite database called covid_data.sqlite in the root of my project folder (the directory the program was run from), and each cached response is valid for 600 seconds (10 minutes). Depending on how often the data changes, the expire_after value could be anything, but I would recommend keeping it to a number of seconds that is easy for humans to figure out (e.g. 60 seconds = 1 minute, 300 seconds = 5 minutes, 600 seconds = 10 minutes, 3600 seconds = 1 hour, 86400 seconds = 1 day).
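One way to keep those values readable is to build them from datetime.timedelta rather than typing raw numbers. The sketch below is only an illustration: the CACHE_LIFETIMES dictionary and the lifetime_in_seconds helper are hypothetical names, not part of requests-cache. It uses the standard library only, and the result is a plain number of seconds that can be passed as expire_after.

```python
from datetime import timedelta

# Hypothetical names for common cache lifetimes; pick whichever suit your data.
CACHE_LIFETIMES = {
    "one_minute": timedelta(minutes=1),
    "five_minutes": timedelta(minutes=5),
    "ten_minutes": timedelta(minutes=10),
    "one_hour": timedelta(hours=1),
    "one_day": timedelta(days=1),
}


def lifetime_in_seconds(name: str) -> int:
    """Return the named lifetime as whole seconds, ready to use as expire_after."""
    return int(CACHE_LIFETIMES[name].total_seconds())


# Print each lifetime alongside its value in seconds.
for name in CACHE_LIFETIMES:
    print(name, lifetime_in_seconds(name))
```

With this in place, the install line could read expire_after=lifetime_in_seconds("ten_minutes") instead of expire_after=600. As far as I am aware, requests-cache also accepts a timedelta directly for expire_after, but check the documentation for the version you have installed.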

Note: The expiry time is only checked when the same request is made again. If a request is only made once, then its data could stay in the cache (using up space) until that request is made again. If this is an issue then use:

requests_cache.remove_expired_responses()

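to remove any responses that have gone past their expiration time. Newer releases of requests-cache (1.0 onwards, as far as I can tell) deprecate this function in favour of a delete call with expired=True, so check the documentation for the version you have installed.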

If a live response is needed without using the cache, then the requests_cache.disabled() context manager can be used:

with requests_cache.disabled():
    r = requests.get("https://www.geektechstuff.com")

To clear the cache (e.g. if the data has been updated before the expiration time is up):

requests_cache.clear()

Back End Types

requests_cache supports a few different backend types to store its cache:

sqlite – default

mongodb – requires pymongo module

redis – requires the redis module

memory – stores the cache in a Python dict in memory. This backend is not persistent.

Want To Know More?

Check out: https://requests-cache.readthedocs.io/en/latest/index.html