I am trying to load a list of words from a YAML file. In the file there is an entry
- on
Ruby is loading this as "true", instead of "on". Similarly "off" is loaded as "false". A quick check on the Psych code shows "yes" and "no" are treated the same way.
Is there any way I can change this behaviour, other than adding quotes around on and off?
I am able to see the values if I read the file and parse, instead of load_file.
# test.yaml
- true
- false
- yes
- no
- on
- off
- y
- n
- Y
- N
I get a Psych document by parsing instead of loading, which has the text before transformation to native.
YAML.parse_file('test.yaml')
Wondering how to extract it correctly.
From the docs
"The representation stage means data which has been composed into YAML::BaseNode objects. In this stage, the document is available as a tree of node objects. You can perform YPath queries and transformations at this level. (See YAML::parse.)"
I need help writing a comprehensive YPath query to extract the data.
(PS: This may seem a bit roundabout, but that cleans up a lot of things in data management for me)
As already explained in other answers, on is considered a "truthy" value. This behavior is intentionally coded in Psych.
The best solution to the problem, as explained by Arup Rakshit and Mikhail P, is to quote the value. However, given that your question asks for an alternative, here's an alternative.
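To make the effect of quoting concrete, here is a minimal sketch that loads the same words with and without quotes. The inline YAML strings stand in for the contents of test.yaml and are only an illustration.

```ruby
require "yaml"

# Unquoted words are resolved to booleans by Psych.
unquoted = YAML.load("- on\n- off\n- yes\n- no\n")

# Quoted words (single or double quotes) stay strings.
quoted = YAML.load("- \"on\"\n- 'off'\n")

p unquoted
p quoted
```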
Scalar conversion in Psych is hard-coded in Psych::ScalarScanner#tokenize. A possible (but strongly discouraged) option is to monkey patch this method to change this case statement:
when /^(yes|true|on)$/i
  true
when /^(no|false|off)$/i
  false
As you will probably realize from looking at the source code, the method is quite long, so the monkey patch would force you to copy and paste a large chunk of code. There is no easy way around it: the options are hard-coded into the case statement (one more sign that this is not a good idea).
Personally, I would never go that way. Modifying the core behavior of Psych may lead to several unexpected side effects, since other libraries may depend on this behavior.
Another option, if you don't want to modify the original file physically, is to write a proxy that changes it at runtime.
In practice, you can create a CustomYaml parser that implements a parse_file method. The method reads the content of the file into memory, performs a "search & replace" of any occurrence of unquoted on into "on", then feeds the result to YAML.load(). This tricks the YAML parser into interpreting each "on" token as a string scalar.
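A minimal sketch of such a proxy could look like the following. CustomYaml, BOOLEAN_WORDS, and quote_boolean_words are hypothetical names, not part of Psych, and the regular expression assumes the words appear as plain sequence items ("- on") on their own line. Mapping values or flow sequences would need a more careful pattern.

```ruby
require "yaml"

# Hypothetical helper module; nothing here is provided by Psych itself.
module CustomYaml
  # Matches a sequence item consisting only of an unquoted yes/no/on/off.
  # Assumption: one word per "- item" line, as in the question's test.yaml.
  BOOLEAN_WORDS = /^([ \t]*-[ \t]+)(yes|no|on|off)[ \t]*$/i

  # Wrap each matching word in double quotes so YAML sees a string.
  def self.quote_boolean_words(text)
    text.gsub(BOOLEAN_WORDS) { "#{$1}\"#{$2}\"" }
  end

  # Pre-process the YAML text, then hand it to the regular loader.
  def self.load(text)
    YAML.load(quote_boolean_words(text))
  end

  # Same as load, but reads the YAML text from a file first.
  def self.parse_file(path)
    load(File.read(path))
  end
end
```

With this in place, CustomYaml.parse_file('test.yaml') would be used instead of YAML.load_file('test.yaml'). Words not listed in the pattern, such as true and false, still go through the normal conversion.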
Similar to this pre-processing approach, you can adopt a post-processing approach by traversing the YAML AST returned by Psych.
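As a rough sketch of the post-processing idea: YAML.parse returns a tree of Psych::Nodes objects in which every scalar still carries its original text in value. The helper names below (raw_scalars, load_keeping_words, KEEP_AS_STRING) are hypothetical. The second helper relies on the assumption that Psych returns the raw text for a scalar whose quoted flag is set, so marking a node as quoted before calling to_ruby should keep it a string.

```ruby
require "yaml"

# Hypothetical list of words that should never become booleans.
KEEP_AS_STRING = /\A(yes|no|on|off)\z/i

# Return the untransformed text of every scalar in the document.
def raw_scalars(text)
  doc = YAML.parse(text)
  return [] unless doc # YAML.parse returns false for empty input
  doc.grep(Psych::Nodes::Scalar).map(&:value)
end

# Convert to Ruby objects, but keep the listed words as strings.
def load_keeping_words(text)
  doc = YAML.parse(text)
  return nil unless doc
  doc.each do |node|
    next unless node.is_a?(Psych::Nodes::Scalar)
    next if node.quoted || node.tag # already a string or explicitly tagged
    # Flag the scalar as quoted so it is not run through the scalar scanner.
    node.quoted = true if node.value.match?(KEEP_AS_STRING)
  end
  doc.to_ruby
end
```

For a file, YAML.parse_file('test.yaml') can replace YAML.parse(text). If only the raw words are needed, raw_scalars alone may be enough, since it skips the conversion to native types entirely.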