Israel has released it's budget for all to see, and there is an API to extract the data. However, I don't know how to parse it into a txt/csv format.
Here is an example link to make a call for data.
Here is the output:
[
{
"parent": [
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 6075053,
"year": 2003,
"title": "השכלה גבוהה",
"gross_amount_used": 5942975,
"gross_amount_revised": 5942975,
"budget_id": "0021",
"net_amount_used": 5936491,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 5861591,
"gross_amount_allocated": 5861591
},
{
"parent": [
{
"budget_id": "0021",
"title": "השכלה גבוהה"
},
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 5364976,
"year": 2003,
"title": "השתתפות בתקציב המוסדות להשכלה גבוהה",
"gross_amount_used": 5337585,
"gross_amount_revised": 5337584,
"budget_id": "002102",
"net_amount_used": 5331101,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 4985915,
"gross_amount_allocated": 4985915
},
{
"parent": [
{
"budget_id": "0021",
"title": "השכלה גבוהה"
},
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 565495,
"year": 2003,
"title": "השתתפות בפעולות",
"gross_amount_used": 462490,
"gross_amount_revised": 462490,
"budget_id": "002103",
"net_amount_used": 462490,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 559293,
"gross_amount_allocated": 559293
},
{
"parent": [
{
"budget_id": "0021",
"title": "השכלה גבוהה"
},
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 0,
"year": 2003,
"title": "רזרבה להתייקרויות",
"gross_amount_used": 0,
"gross_amount_revised": null,
"budget_id": "002105",
"net_amount_used": null,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 171801,
"gross_amount_allocated": 171801
},
{
"parent": [
{
"budget_id": "0021",
"title": "השכלה גבוהה"
},
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 108000,
"year": 2003,
"title": "פיתוח מוסדות להשכלה גבוהה",
"gross_amount_used": 108000,
"gross_amount_revised": 108000,
"budget_id": "002106",
"net_amount_used": 108000,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 108000,
"gross_amount_allocated": 108000
},
{
"parent": [
{
"budget_id": "0021",
"title": "השכלה גבוהה"
},
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 23634,
"year": 2003,
"title": "תחום פעולה כללי",
"gross_amount_used": 23634,
"gross_amount_revised": 23634,
"budget_id": "002101",
"net_amount_used": 23634,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 23634,
"gross_amount_allocated": 23634
},
{
"parent": [
{
"budget_id": "0021",
"title": "השכלה גבוהה"
},
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 12948,
"year": 2003,
"title": "פעולות עם משרדים ומוסדות אחרים",
"gross_amount_used": 11266,
"gross_amount_revised": 11266,
"budget_id": "002104",
"net_amount_used": 11266,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 12948,
"gross_amount_allocated": 12948
}
]
What would be the way to parse this into a table format?
Thanks!
Tal
If you install the rjson package You should be able to do:
do.call( 'rbind', fromJSON( file="http://budget.yeda.us/0021?year=2003&depth=1" ) )
[edit]
Actually.. that has problems with the variable length internal parent list, but should get you half way there
Yep, it's JSON. fromJSON will turn it into a list for you
resp <- getURL("http://budget.yeda.us/0021?year=2003&depth=1")
library(rjson)
resp <- fromJSON(resp)
That gets you to list form. For a data frame, try:
library(plyr)
resp <- llply(resp, function(x) llply(x, function(y) ifelse(is.null(y), "NULL", y)))
budget <- data.frame()
for(i in 1:length(resp)) {
budget <- rbind.fill(budget, data.frame(resp[[i]]))
}
The nested llply take care of some unpleasantness when creating data frames that include null values.
Looks like JSON. Try the rjson package, but some looping or tricky listy fiddling may be required.
Lunchtime now, otherwise I'd have a solution pasted in. Give the non-lunching portion of the hive mind a few minutes....
This is similar to a previous answer but also creates columns for the second budget_id and title fields in parent, not just the first and is structured slightly differently operating on the parent and the rest separately and then putting them back together.
library(rjson)
library(plyr)
js <- fromJSON(file = "http://budget.yeda.us/0021?year=2003&depth=1")
toDF <- function(x) do.call("rbind.fill", lapply(x, as.data.frame))
Null2NA <- function(x) if (is.null(x)) NA else x
parent1 <- lapply(js, "[[", "parent")
rest <- lapply(js, function(x) lapply(x[-1], Null2NA))
cbind(toDF(parent1), toDF(rest))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With