Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Download OECD API data using Python and SDMX

Tags:

python

sdmx

I'm trying to execute the following SDMX query using Python from the OECD's database.

https://stats.oecd.org/restsdmx/sdmx.ashx/GetData/LAND_COVER_FUA/AUS+AUT.FOREST+GRSL+WETL+SHRUBL+SPARSE_VEGETATION+CROPL+URBAN+BARE+WATER.THOUSAND_SQKM+PCNT/all?startTime=1992&endTime=2019

I've looked around but can't get this to work with the existing packages and tutorials out there (I've looked at this link for example).

Would you be able to help? I can't seem to get the various libraries (pandasdmx, cif, etc.) to work with it.

Thanks so much in advance!

like image 756
user2679611 Avatar asked Dec 21 '25 06:12

user2679611


1 Answers

Using sdmx1:

import sdmx

OECD = sdmx.Client("OECD_JSON")
key = dict(
    COU="AUS AUT".split(),
    FUA=[],
    VARIABLE="FOREST GRSL WETL SHRUBL SPARSE_VEGETATION CROPL URBAN BARE WATER".split(),
    MEAS="THOUSAND_SQKM PCNT".split(),
)

# Assemble into a string
key_str = ".".join("+".join(values) for values in key.values())
print(f"{key_str = }")

# Commented: these keys are invalid
# key_str = "AUS+AUT.FOREST+GRSL+WETL+SHRUBL+SPARSE_VEGETATION+CROPL+URBAN+BARE+WATER.THOUSAND_SQKM+PCNT"
# key_str = "AUS+AUT.FOREST+GRSL+WETL+SHRUBL+SPARSE_VEGETATION+CROPL+URBAN+BARE+WATER.THOUSAND_SQKM+PCNT."

# Retrieve a data message
dm = OECD.data(
    "LAND_COVER_FUA",
    key=key_str,
    params=dict(startPeriod=1992, endPeriod=2019),
)

# Retrieve the first data set in the message
ds = dm.data[0]

# Convert to pandas
print(sdmx.to_pandas(ds))

This gives output like:

$ python q.py 
key_str = 'AUS+AUT..FOREST+GRSL+WETL+SHRUBL+SPARSE_VEGETATION+CROPL+URBAN+BARE+WATER.THOUSAND_SQKM+PCNT'

COU  FUA      VARIABLE  MEAS           TIME_PERIOD
AUS  AUS01    BARE      THOUSAND_SQKM  1992           0.000618
                                       2004           0.000618
                                       2015           0.000618
                                       2018           0.000541
                                       2019           0.000541
                                                        ...   
AUT  AT005L4  WATER     THOUSAND_SQKM  1992           0.001314
                                       2004           0.001314
                                       2015           0.001314
                                       2018           0.001314
                                       2019           0.002319
Name: value, Length: 2160, dtype: float64

A few key points to understand:

  • The URL you give (with /GetData/) indicates an SDMX 2.0 API. These are very old and few tools support them.

  • OECD provides both an SDMX-ML 2.1 and an SDMX-JSON API.

    • The package documentation above includes some notes on these.
    • This particular data flow (LAND_COVER_FUA) does not appear to be available from the "OECD" (SDMX-ML) source, only from the "OECD_JSON" one.
    • The package can only automatically construct a key from a dict when SDMX-ML is available. Since we have to use the JSON endpoint, we manually construct the key.
  • Parameters like "startTime=1992" are outdated/incorrect; the current form for SDMX 2.1 and later is "startPeriod=1992".

  • Your key only has 3 parts, but this data flow has 4 dimensions. If you uncomment the first line above that sets key_str explicitly, the web service responds:

    Semantic Error - Wrong number of non-time dimensions provided. Expected 4. Query contains 3

    If we randomly guess and put an extra period at the end (second commented key_str), we get the error:

    Semantic Error - Dimension 'FUA' does not contain code(s) 'FOREST,GRSL,WETL,SHRUBL,SPARSE_VEGETATION,CROPL,URBAN,BARE,WATER'

    This indicates that (a) the missing dimension is the second one and (b) it has an id "FUA". So we insert this in key. Notice that the resulting, valid key string contains +AUT..FOREST+. These two consecutive periods (..) indicate "No specific labels/return all labels for this dimension."

like image 67
Paul Natsuo Kishimoto Avatar answered Dec 23 '25 19:12

Paul Natsuo Kishimoto