Extract JSON key-value map with a regex from a file

Tags: java, json, regex

I'm not very strong with regexes, so I tried to solve this problem with a hand-written parser of my own, but it failed on some inputs I had not anticipated. The problem is as follows: I have JavaScript i18n files that, along with translations, contain other configuration code that may appear anywhere in the file (which is the main reason the problem is hard to handle with a hand-made parser). A file looks something like this:

(function() {
    'use strict';
    //some configuration stuff (some other stuff may be inserted)
    var translations = angular.module('module.translations.languages.enUs', []);

    translations.constant('translationsName', {
        "first_label":"first_label_value",
        "second_label":"second_label_value"
        //etc
    });

}());

The example above is only one of several possible templates, but they all have one thing in common: the translation labels are defined as a key-value JSON object, which is nothing but a Java Map serialized to JSON. My goal is to extract only this key-value JSON from a file, deserialize it to a Map, do some operations on it, and insert it back. So the question is: does anyone have a ready and proven regex that can handle this kind of situation, that is, find a key-value JSON map in a text? If so, I would be really grateful. Thanks, Andrey

Andrey Yaskulsky asked Oct 15 '25 21:10


1 Answer

You could use this regex to find "key":"value" pairs:

"([^"]+)"\s*:\s*"([^"]+)",?

Group 1 is the key and group 2 is the value.

It will also find "key": "value", "key" :"value" or "key" : "value" pairs.
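
As a minimal sketch of how the pattern above could be applied from Java, the snippet below collects every match into a map. The class name TranslationExtractor, the method extract and the sample string are illustrative assumptions, not part of the original question. Note that the pattern as written does not match empty values or strings containing escaped double quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class TranslationExtractor {

    // The pattern from this answer: group 1 is the key, group 2 is the value.
    private static final Pattern PAIR =
            Pattern.compile("\"([^\"]+)\"\\s*:\\s*\"([^\"]+)\",?");

    // Collects every "key":"value" pair found in the text, in order of appearance.
    static Map<String, String> extract(String fileContent) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(fileContent);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input modelled on the file in the question (assumed content).
        String sample = "translations.constant('translationsName', {\n"
                + "  \"first_label\":\"first_label_value\",\n"
                + "  \"second_label\" : \"second_label_value\"\n"
                + "});";
        // prints {first_label=first_label_value, second_label=second_label_value}
        System.out.println(extract(sample));
    }
}
```

Single-quoted strings such as 'use strict' are skipped because the pattern only looks for double-quoted keys and values.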

Demo on regexplanet (click the Java button then click Test button)

Also a demo on regex101
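
The question also asks about putting the modified values back into the file. One possible approach, sketched here under assumptions, is to reuse the same pattern with extra capture groups so that everything around the value (key, spacing, trailing comma) is copied back unchanged. The class TranslationRewriter, the method rewriteValues and the regrouped pattern are illustrative and not from the original answer; the new value is assumed to contain no double quotes.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class TranslationRewriter {

    // Variation of the answer's pattern with different grouping:
    // group 1 = everything up to the value's opening quote, group 2 = key,
    // group 3 = value, group 4 = closing quote plus optional comma.
    private static final Pattern PAIR =
            Pattern.compile("(\"([^\"]+)\"\\s*:\\s*\")([^\"]+)(\",?)");

    // Replaces each value with transform(value) and leaves the rest of the text as is.
    static String rewriteValues(String fileContent, UnaryOperator<String> transform) {
        Matcher m = PAIR.matcher(fileContent);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = m.group(1) + transform.apply(m.group(3)) + m.group(4);
            // quoteReplacement keeps '$' and '\' in the new value from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String sample = "{\"a\": \"x\",\n\"b\":\"y\"}";
        System.out.println(rewriteValues(sample, String::toUpperCase));
    }
}
```

Because only group 3 is changed, the surrounding configuration code and the original spacing around each colon are preserved.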

Explanation

"([^"]+)" : Capture one or more characters other than a double quote, between double quotes (this is the key)

\s* : Followed by zero or more whitespace characters

: : Followed by a colon

\s* : Followed by zero or more whitespace characters

"([^"]+)" : Capture one or more characters other than a double quote, between double quotes (this is the value)

,? : Optionally followed by a comma

Stephane Janicaud answered Oct 18 '25 10:10