I'm working on a Python script where I need to make an initial request to obtain an ID. Once I have the ID, I need to make several additional requests to get data related to that ID. I understand that these subsequent requests can be made asynchronously to improve performance. However, I'm not sure how to implement this effectively.
Here's a simplified version of my current synchronous approach:
import requests
# Initial request to get the ID
response = requests.get('https://api.example.com/get_id')
id = response.json()['id']
# Subsequent requests to get data related to the ID
data1 = requests.get(f'https://api.example.com/data/{id}/info1').json()
data2 = requests.get(f'https://api.example.com/data/{id}/info2').json()
data3 = requests.get(f'https://api.example.com/data/{id}/info3').json()
# Processing the data
process_data(data1, data2, data3)
I would like to make the requests to info1, info2, and info3 asynchronously. How can I achieve this using asyncio or any other library?
I've looked into httpx, but I'm not sure how to structure the code correctly. Any help or example code would be greatly appreciated!
A similar approach, using a task group (requires Python >= 3.11):
import httpx
import asyncio
async def main():
    async with httpx.AsyncClient() as client:
        response = await client.get('https://api.example.com/get_id')  
        id = response.json()['id']
        async with asyncio.TaskGroup() as tg:
            task1 = tg.create_task(client.get(f'https://api.example.com/data/{id}/info1'))
            task2 = tg.create_task(client.get(f'https://api.example.com/data/{id}/info2'))
            task3 = tg.create_task(client.get(f'https://api.example.com/data/{id}/info3'))
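    # Process the data; the task group block only exits once all three tasks have finished
    process_data(task1.result().json(), task2.result().json(), task3.result().json())
asyncio.run(main())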
Nothing prevents you from using an async library sequentially: you simply await each action that must finish before the next one starts.
The harder part is making the rest of an application async-compatible, which can be significantly more challenging.
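To illustrate the difference, here is a minimal sketch that needs no network access. `fake_request` is a hypothetical stand-in for an HTTP call such as `client.get(...)`, and the 0.2 second delay is an arbitrary assumption. Awaiting each call in turn runs them one after another, while creating the tasks first lets them overlap.

```python
import asyncio
import time

NAMES = ("info1", "info2", "info3")

async def fake_request(name: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in for an HTTP request; it only sleeps."""
    await asyncio.sleep(delay)
    return name

async def sequential() -> list:
    # Each await completes before the next request is even started.
    return [await fake_request(name) for name in NAMES]

async def concurrent() -> list:
    # Schedule every request first, then await the results.
    tasks = [asyncio.create_task(fake_request(name)) for name in NAMES]
    return [await task for task in tasks]

def timed(coro_fn):
    """Run an async function to completion and return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start

seq_result, seq_time = timed(sequential)
con_result, con_time = timed(concurrent)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

Both versions return the same results in the same order; only the elapsed time should differ.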
Below I have implemented the structure for the task you described. If you plan to use this in a long-term project, it would be wise to think more broadly about how the surrounding code is structured.
import httpx
import asyncio
async def main():
    async with httpx.AsyncClient() as client:
        # Initial request to get the ID
        response = await client.get('https://api.example.com/get_id')  
        id = response.json()['id']
        print('done with sync task')
        
        # Subsequent requests to get data related to the ID
        task1 = asyncio.create_task(client.get(f'https://api.example.com/data/{id}/info1'))
        task2 = asyncio.create_task(client.get(f'https://api.example.com/data/{id}/info2'))
        task3 = asyncio.create_task(client.get(f'https://api.example.com/data/{id}/info3'))
        
        # Only await the tasks after having scheduled them all. With larger task counts asyncio.gather() should be considered.
        data1 = (await task1).json()
        data2 = (await task2).json()
        data3 = (await task3).json()
        print(data1, data2, data3)
asyncio.run(main())
Note that the initial API call is effectively sequential: it is awaited immediately, so nothing else in main() runs until it returns. It could also be performed with the blocking call response = httpx.get('https://api.example.com/get_id'), which needs no await. Since we are already inside an httpx AsyncClient, though, that would be a bit of an anti-pattern.
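The code comment above mentions asyncio.gather() for larger task counts. A minimal sketch of that variant follows, under stated assumptions: `fake_get` is a hypothetical stand-in for `(await client.get(url)).json()` so the example runs without network access, and `fetch_all` and its parameters are illustrative names, not part of httpx.

```python
import asyncio

async def fake_get(url: str) -> dict:
    """Hypothetical stand-in for `(await client.get(url)).json()`."""
    await asyncio.sleep(0.1)  # simulate network latency
    return {"url": url}

async def fetch_all(item_id: str, names: list) -> list:
    urls = [f"https://api.example.com/data/{item_id}/{name}" for name in names]
    # gather() runs the coroutines concurrently and returns their
    # results in the same order as the inputs.
    return await asyncio.gather(*(fake_get(url) for url in urls))

results = asyncio.run(fetch_all("42", ["info1", "info2", "info3"]))
print(results)
```

With a real client, the generator passed to gather() would yield `client.get(url)` calls instead, and each result would be a response object on which to call `.json()`.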
I strongly recommend reading through the httpx documentation; it was an excellent entry point when I was first introduced to asynchronous programming. The asyncio docs are also useful for reference.