
Wikipedia API: parse a table as JSON?

Is it possible to parse this table as a JSON array of arrays?

The output I want is something like:

[
  ["Northwest Caucasian", "Abkhaz", "аҧсуа бызшәа, аҧсшәа", "ab", "abk", "abk", "abk", "abks"], 
  ["Afro-Asiatic", "Afar", "Afaraf", "aa", "aar", "aar", "aar", "aars"],
  ...
]   

The best I've got is like this, this, or this, which isn't helpful at all.

I need not only the ISO 639 table but also some other Wikipedia tables, so I need a general method of parsing wiki tables as JSON. Any ideas?

est asked Sep 18 '25 06:09


2 Answers

Okay, I found that the simplest way is to run some JavaScript in the Chrome Developer Console:

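// Run on the Wikipedia page itself; assumes jQuery is available as $.
$('table.sortable tr').map(function () {
    // Wrap each row in an outer array, because jQuery's .map() flattens returned arrays.
    return [$('td', this).map(function () {
        return $(this).text();
    }).slice(2, 5).get()];  // keep only the cells at index 2 to 4
}).get();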

It's a pity Wikipedia doesn't provide an API like this.
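
For other tables, a jQuery-free variant of the same idea might look like the sketch below. It assumes you pass in a table element (for example from `document.querySelector('table.sortable')`) and relies only on the standard `rows`, `cells`, and `textContent` DOM properties. The name `tableToArrays` and its `start`/`end` parameters are made up for illustration. Unlike the snippet above, header rows are not empty here, because `cells` includes `th` as well as `td`.

```javascript
// Hypothetical helper: converts a table element into an array of row arrays.
// `start` and `end` select a column range, like Array.prototype.slice.
function tableToArrays(table, start, end) {
    return Array.from(table.rows, function (row) {
        return Array.from(row.cells, function (cell) {
            // Trim whitespace and newlines around the cell text.
            return cell.textContent.trim();
        }).slice(start, end);
    });
}
```

In the console, `tableToArrays(document.querySelector('table.sortable'), 2, 5)` should give roughly the same columns as above; leaving out the last two arguments keeps every column.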

est answered Sep 20 '25 20:09


@est thanks, this helped me a lot.

You can also extend the previous code to save the result as a JSON string: just wrap it in JSON.stringify() (modern browsers only).

// Same row extraction as in the previous answer, wrapped in JSON.stringify() to get a JSON string.
JSON.stringify($('table.sortable tr').map(function () {
    return [$('td', this).map(function () {
        return $(this).text();
    }).slice(2, 5).get()];
}).get());

darklow answered Sep 20 '25 20:09
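
For what it's worth, `JSON.stringify` also accepts an indentation argument, and `JSON.parse` reverses it. A small sketch, using sample rows shortened from the desired output in the question (the variable names are only for illustration):

```javascript
// Sample rows copied from the question's desired output (shortened).
const rows = [
    ["Northwest Caucasian", "Abkhaz", "ab"],
    ["Afro-Asiatic", "Afar", "aa"]
];

// Compact JSON string, suitable for saving or sending.
const compact = JSON.stringify(rows);

// The third argument sets the indentation for a human-readable version.
const pretty = JSON.stringify(rows, null, 2);

// JSON.parse turns the string back into an array of arrays.
const roundTrip = JSON.parse(compact);
```

In the Chrome console, `copy(pretty)` should put the string on the clipboard so it can be pasted into a file.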