
Tidyjson: is there an 'exit_object()' equivalent?

Tags: json, r

I'm using the tidyjson package to parse a JSON string and extract the key values into columns. The JSON is nested, and while I can drill down into a node, I can't figure out a way to go back up to the previous level. The code is below:

library(tidyjson)
library(data.table)
library(dplyr)


input <- '{
      "name": "Bob",
      "age": 30,
      "social": {
            "married": "yes",
            "kids": "no"
      },
      "work": {
            "title": "engineer",
            "salary": 5000
      } 
}'


output <- input %>% as.tbl_json() %>%
      spread_values(name = jstring("name"),
                    age = jnumber("age")) %>%
      enter_object("social") %>% 
      spread_values(married = jstring("married"),
                    kids = jstring("kids")) %>%
      #### I would need an exit_object() here
      enter_object("work") %>%
      spread_values(title = jstring("title"),
                    salary = jnumber("salary"))

BogdanC asked Nov 22 '25 08:11


1 Answer

There's a note in the documentation:

"Note that there are often situations where there are multiple arrays or objects of differing types that exist at the same level of the JSON hierarchy. In this case, you need to use enter_object() to enter each of them in separate pipelines to create separate data.frames that can then be joined relationally."

As such I've been staging my tidyjson commands and putting the outputs together with merge, e.g.:

# first the high-level values (input_tbl_json is the JSON already parsed with as.tbl_json())
output_table <- input_tbl_json %>%
    spread_values(val1 = jstring('val1'),
                  val2 = jnumber('val2'))

# then enter an object and get something from inside, merging it as a new column
output_table <- merge(output_table, 
                      input_tbl_json %>%
                        enter_object('thing') %>%
                          spread_values(val3 = jstring('thing1')),
                      by = c('document.id'))

The output table's columns should look like: | document.id | val1 | val2 | val3 |

That workflow may fall over with operations like gather_keys() that add rows, but I haven't had call to test it.
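
Applied to the JSON in the question, the staged approach might look like the sketch below. It assumes a tidyjson version in which as_tibble() returns a plain tibble without the JSON column. The names doc, top, social and work are only illustrative, and dplyr's left_join() stands in for merge().

```r
library(tidyjson)
library(dplyr)

input <- '{
  "name": "Bob",
  "age": 30,
  "social": {"married": "yes", "kids": "no"},
  "work": {"title": "engineer", "salary": 5000}
}'

# Parse once, then start a separate pipeline from the top level for each branch
doc <- input %>% as.tbl_json()

# Top-level values
top <- doc %>%
  spread_values(name = jstring("name"),
                age = jnumber("age")) %>%
  as_tibble()  # assumption: drops the JSON column, leaving an ordinary tibble

# Values inside "social"
social <- doc %>%
  enter_object("social") %>%
  spread_values(married = jstring("married"),
                kids = jstring("kids")) %>%
  as_tibble()

# Values inside "work"
work <- doc %>%
  enter_object("work") %>%
  spread_values(title = jstring("title"),
                salary = jnumber("salary")) %>%
  as_tibble()

# document.id identifies the source document in every piece, so join on it
output <- top %>%
  left_join(social, by = "document.id") %>%
  left_join(work, by = "document.id")
```

Because each branch starts again from doc, no step has to climb back out of an object, which is what exit_object() would have been needed for.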

obrl_soil answered Nov 24 '25 22:11


