I would like to get all images in a category in Wikimedia Commons. Let's say category X, but exclude those which are also in category Y. I do not understand if I can actually do this.
https://commons.wikimedia.org/w/api.php?action=query&list=categorymembers&cmtype=file&cmtitle=Category:X
This will get all of them, but how to exclude some?
moreover I would like in the result to have the description of the images, not just the name of the file, is that possible?
MediaWiki has - by default - no built-in support for category building and querying intersections. To accomplish this task, extensions or external tools or multiple API queries and result processing is required.
On Wikimedia Commons, like on the whole Wikimedia Wiki farm, CirrusSearch powers filtered search, including search for category intersections and is also available through API (action=query&list=search&srsearch=incategory:A+-incategory:B
, this is Category:A
minus Category:B
).
One of the tools I can recommend (because it's a dedicated high-performance solution and actually running) is fastcci, developed by Daniel Schwen; specifically for Wikimedia Commons, there is already a database maintained and a webservice running but it's possible to set it up for any wiki, provided the tool set has a host to run on and has database access.
Consider the following query URL:
https://fastcci.wmflabs.org/?c1=3302993&c2=15516712&d1=0&d2=0&s=200&a=not&t=js
t=js
for a JSONP response; otherwise assumes being used as websocket)fastcciCallback( [ 'RESULT 27572680,0,0|1675043,0,0|27577015,0,0|27577043,0,0|27577106,0,0|27576896,0,0|27576790,0,0|23481936,0,0|17560964,0,0|11009066,0,0', 'OUTOF 10', 'DBAGE 378310', 'DONE'] );
RESULT
followed by a |
separated list of up to 50 integer triplets of the form pageId,depth,tag
. Each triplet stands for one image or category
Good pictures
button in any category page.
FilesOf('Category:Saaleck')
- FilesOf('Category:Rapeseed fields in Saxony-Anhalt')
GET
/w/api.php?action=query&pageids=page_IDs_separated_by_pipe
GET
/w/api.php?action=query&titles=Titles_separated_by_pipe
AFAIK, there is no way to get that directly using the API. But, assuming both categories are reasonably small, you could get all images from both of them and then compute the complement in your code.
To retrieve the description, you can use prop=imageinfo&iiprop=extmetadata&iiextmetadatafilter=ImageDescription
.
In the context of your example query, it would look like this:
https://commons.wikimedia.org/w/api.php?action=query&generator=categorymembers&gcmtype=file&gcmtitle=Category:X&prop=imageinfo&iiprop=extmetadata&iiextmetadatafilter=ImageDescription
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With