
How to Download Image from URL using Python


Recently, I wanted to download some images using Python. Here is what I learned after some research.

Using urllib package

The native, naive way is to use the urllib.request module from the standard library to download an image.

import urllib.request

url = "https://cdn.pixabay.com/photo/2020/05/12/17/04/wind-turbine-5163993_960_720.jpg"

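# Open the URL and write the response body to a local file.
# The with statement closes both the response and the file.
with urllib.request.urlopen(url) as r, open("wind_turbine.jpg", "wb") as f:
    f.write(r.read())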

However, the above code may fail with the following error:

urllib.error.HTTPError: HTTP Error 403: Forbidden

In this case, the server is rejecting the request, and we need to add a User-Agent HTTP header to it:

import urllib.request

# The following way works. Ref: https://stackoverflow.com/a/45358832/6064933
req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
# Open the URL first, so a failed request does not leave an empty file behind.
with urllib.request.urlopen(req) as r:
    with open("wind_turbine.jpg", "wb") as f:
        f.write(r.read())
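
The two steps above (setting a User-Agent header, then writing the response to disk) can be wrapped in a small helper. The following is only a sketch: download_image, its parameters, and the 10-second timeout are illustrative choices of mine, not part of urllib. It copies the response with shutil.copyfileobj so the body is not held in memory all at once.

```python
import shutil
import urllib.request


def download_image(url, dest, user_agent="Mozilla/5.0", timeout=10):
    """Download `url` to the local path `dest`; return the number of bytes written.

    `download_image` and its parameters are hypothetical names for this sketch.
    """
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses (e.g. 403),
    # and it runs before open(), so a failed request creates no file.
    with urllib.request.urlopen(req, timeout=timeout) as r, open(dest, "wb") as f:
        # Copy in chunks rather than reading the whole body into memory.
        shutil.copyfileobj(r, f)
        return f.tell()
```

It could then be called as download_image(url, "wind_turbine.jpg").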

Using requests package

A better way is to use the requests package. Here is a simple example that downloads an image with requests:

import requests

url = "https://cdn.pixabay.com/photo/2020/05/12/17/04/wind-turbine-5163993_960_720.jpg"

r = requests.get(url)
with open("wind-turbine.jpg", "wb") as f:
    f.write(r.content)
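
Note that requests.get does not raise an exception for a 403 or 404 response on its own, so the snippet above would write whatever body the server returned to disk. Here is a minimal sketch that checks the status first, assuming a hypothetical helper named fetch_image and an arbitrary 10-second timeout:

```python
import requests


def fetch_image(url, dest, timeout=10):
    """Download `url` to `dest`; raise requests.HTTPError on a 4xx/5xx status.

    `fetch_image` is a hypothetical helper name for this sketch.
    """
    r = requests.get(url, timeout=timeout)
    # Without this check, an error body (e.g. a 403 page) would be saved as the image.
    r.raise_for_status()
    with open(dest, "wb") as f:
        f.write(r.content)
    return len(r.content)
```

Because the status check runs before the file is opened, a failed request leaves no file behind.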

Downloading large files with streaming

response.iter_content

In the above code, the entire image is read into memory at once. If the image is large, this may consume too much memory.

Alternatively, we can set the stream parameter to True to stream the response. In this case, only the response headers are downloaded at first, and the connection stays open until the body is read. We can then retrieve the whole image at once via response.content [1], or chunk by chunk with the response.iter_content method:

# Using requests to download large files.
with requests.get(url, stream=True) as r:
    with open("wind-turbine.jpg", "wb") as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
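
One weakness of streaming straight into the final file is that an interrupted download leaves a truncated image behind. A possible variation, sketched here under my own assumptions (the name stream_download, the ".part" suffix, the 8192-byte chunk size, and the timeout are all illustrative), writes to a temporary file and renames it only once the whole body has arrived:

```python
import os

import requests


def stream_download(url, dest, chunk_size=8192, timeout=10):
    """Stream `url` to `dest` in chunks; return the number of bytes written.

    `stream_download` and the ".part" suffix are hypothetical choices for this sketch.
    """
    tmp = dest + ".part"
    written = 0
    with requests.get(url, stream=True, timeout=timeout) as r:
        r.raise_for_status()
        with open(tmp, "wb") as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                # f.write returns the number of bytes it wrote.
                written += f.write(chunk)
    # Rename only after the loop finished, so `dest` is never a truncated file.
    os.replace(tmp, dest)
    return written
```

If the download fails partway, the ".part" file is left in place and dest is not touched.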

response.raw

When stream is True, we can also use response.raw to stream the download. response.raw is a file-like object. With the help of shutil.copyfileobj(), we can save the image like this:

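# Using r.raw
import shutil

with requests.get(url, stream=True) as r:
    with open("wind-turbine.jpg", "wb") as f:
        # Decode gzip/deflate content encoding while reading the raw stream.
        r.raw.decode_content = True
        shutil.copyfileobj(r.raw, f)
        # or f.write(r.raw.read()), which reads the whole body into memory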

References


1. You may want to avoid this for large files!
