My notes on using concurrent.futures in Python. VS executor.submit()

There are mainly two different ways to use executor for parallel processing, the first is via, and the second way is via executor.submit() combined with concurrent.futures.as_completed().

Here is a simple example to demonstrate this:

import time
from contextlib import contextmanager
from concurrent.futures import ThreadPoolExecutor
import concurrent.futures

def report_time(des):
    start = time.time()
    end = time.time()

    print(f"Time for {des}: {end-start}")

def square(x):
    return x*x

def main():
    num = 10

    with report_time("using"):
        # with ThreadPoolExecutor(max_workers=10) as executor:
        with ThreadPoolExecutor() as executor:
            res =, range(num))
        res = list(res)

    with report_time('using executor.submit'):
        with ThreadPoolExecutor() as executor:
            my_futures = [executor.submit(square, x) for x in range(num)]
            res = []
            for future in concurrent.futures.as_completed(my_futures):

if __name__ == "__main__":

Note that will return an iterator instead of plain list, and the order of results corresponds to the order that arguments are provided for the function we want to run in parallel.

If we use executor.submit(), it will return a future object. We can later access the function return value via the future object. Unlike map(), we then use concurrent.futures.as_completed(my_futures) to make sure that the functions are actually executed and return the results. Any future that gets finished first will be returned first. So there is no guarantee of the result order anymore! For example, we may get the following result for res:

[4, 9, 1, 16, 25, 49, 36, 0, 64, 81]


For ThreadPoolExecutor(), there is parameter max_worker to specify the max number of threads to use. According to the official doc, it is set to min(32, os.cpu_count() + 4) for Python 3.8 and os.cpu_count() * 5 for Python version below 3.8 and above 3.5.

In some cases, the default max_worker may be too large to cause serious issues. For example, when I use ThreadPoolExecutor() to request a web service with default parameters, my code runs for a few requests, then it hangs indefinitely without any progress. I have to reduce max_worker to about 50 to run the code smoothly. So in our real projects, we should tweak the value of max_workers to fit our needs.