Taking control of YAML loading — Or what happened to my ActionController::Parameters YAML
In Rails 5, ActionController::Parameters no longer inherits from Ruby’s Hash. This broke loading YAML serialized from older Rails versions. Let’s find out why!
First, here’s what the serialization of some params looks like in Rails 5. We’ll be using YAML.dump ActionController::Parameters.new(key: :value) throughout.
--- !ruby/object:ActionController::Parameters
parameters: !ruby/hash:ActiveSupport::HashWithIndifferentAccess
  key: :value
permitted: false
Here parameters and permitted are just the instance variables of the params object.
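The same shape shows up for any plain Ruby object, which makes it easy to try outside Rails. Below is a minimal sketch, assuming a made-up PlainParams class (hypothetical, not part of Rails) that holds the same two instance variables:

```ruby
require "yaml"

# Hypothetical stand-in for the Rails 5 params class: a plain object, not a Hash.
class PlainParams
  def initialize(parameters)
    @parameters = parameters
    @permitted = false
  end
end

# Psych dumps a plain object as a `!ruby/object:` mapping of its instance variables.
yaml = YAML.dump(PlainParams.new({ "key" => :value }))
puts yaml
```

The printed document should carry a !ruby/object tag followed by one entry per instance variable, with the leading @ stripped from the names.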
This format isn’t the same in Rails 4.2 because the params object itself was an ActiveSupport::HashWithIndifferentAccess. Look:
--- !ruby/hash-with-ivars:ActionController::Parameters
elements:
  key: :value
ivars:
  :@permitted: false
Looks pretty similar!
The YAML dumper has correctly noticed that the params object is a hash subclass with ivars and used its standard format for hash subclasses.
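You can see that standard format without Rails, too. Here is a small sketch, assuming a made-up FlaggedHash class (hypothetical) and a Psych version recent enough to know the hash-with-ivars tag:

```ruby
require "yaml"

# Hypothetical Hash subclass that carries an instance variable,
# much like the Rails 4.2 params class did.
class FlaggedHash < Hash
  def initialize
    super
    @permitted = false
  end
end

flagged = FlaggedHash.new
flagged["key"] = :value

# A Hash subclass with ivars is dumped under the hash-with-ivars tag.
yaml = YAML.dump(flagged)
puts yaml
```

The order of the elements and ivars sections may differ between Psych versions, so it is safer not to rely on it.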
But if we try to load the 4.2 YAML on Rails 5, it blows up. When YAML sees hash-with-ivars it tries to revive the params as it would any other hash subclass. The YAML parser will allocate an instance of ActionController::Parameters and then use []= to assign values. That would be fine, except the params’ initialize has never been called. Guess what happens here:
# actionpack/lib/action_controller/metal/strong_parameters.rb
# Not the actual implementation, but paraphrased to make it easier to digest.
def initialize(params = {})
  @parameters = params
  @permitted = false
end

def []=(key, value)
  @parameters[key] = value # BOOM. If only `initialize` had been called!
end
We’ll get a nice exception for that: @parameters is still nil, so calling []= on it raises a NoMethodError. To fix this, we need to know more about how YAML works under the hood.
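Before digging in, the failure itself can be reproduced in plain Ruby. This is only a sketch built on the paraphrased code above; FakeParams is a hypothetical class, and calling allocate by hand imitates what the YAML loader does:

```ruby
# Hypothetical class mirroring the paraphrased params implementation.
class FakeParams
  def initialize(params = {})
    @parameters = params
    @permitted = false
  end

  def []=(key, value)
    @parameters[key] = value
  end
end

# `allocate` creates the object without running `initialize`,
# so @parameters was never assigned and reads as nil.
params = FakeParams.allocate

error =
  begin
    params["key"] = :value
    nil
  rescue NoMethodError => e
    e
  end

puts error.class
```

The assignment ends up calling []= on nil, which is where the exception comes from.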
In Ruby, YAML is implemented by the Psych library, which is bundled with Ruby. Whenever you call YAML.load or any other YAML method, Psych steps in and does the work.
When loading, Psych will first parse the YAML syntax into a tree of nodes it can work with. If you haven’t heard of a tree before, it’s just objects that have references to each other. There’s a root, it can have many children, and its children can have many children and so on.
Once YAML has its tree structure, it will visit each node (the word for an object in the tree) and revive it.
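To make the tree less abstract, you can ask Psych for it directly instead of reviving it. A small sketch, using YAML.parse on the 4.2-style document from earlier; parsing alone does not need the Rails class to be loaded:

```ruby
require "yaml"

legacy_yaml = <<~YAML_DOC
  --- !ruby/hash-with-ivars:ActionController::Parameters
  elements:
    key: :value
  ivars:
    :@permitted: false
YAML_DOC

# YAML.parse stops after building the node tree; nothing is revived yet.
document = YAML.parse(legacy_yaml)
root = document.root

# A mapping node keeps its keys and values as alternating children.
top_level_keys = root.children.each_slice(2).map { |key_node, _value_node| key_node.value }

puts root.class
puts root.tag
puts top_level_keys.inspect
```

The root is a mapping node that carries the tag, and its children are the nodes for elements and ivars along with their nested mappings.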
The good news is YAML lets us hook into it whenever we load the tree (same goes for dumping). We just need to give it two pieces of information.
1. Which class should I use for this node?
2. What’s the implementation for that?
To satisfy 1, we need to insert an element into YAML.load_tags. Nodes are referenced by a specific tag. You’ve seen them already, and !ruby/hash-with-ivars:ActionController::Parameters is the one we need. So we tell YAML:
YAML.load_tags['!ruby/hash-with-ivars:ActionController::Parameters'] = 'ActionController::Parameters'
Then the parser will allocate an ActionController::Parameters when it sees that tag, and will let us override the initialization routine if we implement an init_with method.
That method is passed a coder that carries the tag we’re currently initializing and a hash, map, of what the YAML data was. For the hash-with-ivars example the map would be:
def init_with(coder)
  coder.tag             # => '!ruby/hash-with-ivars:ActionController::Parameters'
  coder.map['elements'] # => { 'key' => :value }
  coder.map['ivars']    # => { :@permitted => false }
end
That gives us everything we need to replicate the missing setup from initialize, and when done correctly, ActionController::Parameters YAML from Rails 4.x can be loaded without errors on Rails 5.
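Put together, the two pieces might look like the sketch below. It is not the code from the Rails pull request: LegacyParams is a hypothetical stand-in class, and the init_with body is just one way to rebuild the state that initialize would have set up.

```ruby
require "yaml"

# Hypothetical stand-in for ActionController::Parameters; not the Rails class.
class LegacyParams
  attr_reader :parameters

  def initialize(params = {})
    @parameters = params
    @permitted = false
  end

  def permitted?
    @permitted
  end

  # Psych calls this on the allocated object instead of `initialize`.
  def init_with(coder)
    @parameters = coder.map["elements"]
    @permitted = coder.map["ivars"][:@permitted]
  end
end

# Piece 1: map the old tag to the class that should be allocated.
legacy_tag = "!ruby/hash-with-ivars:#{LegacyParams.name}"
YAML.load_tags[legacy_tag] = LegacyParams.name

legacy_yaml = <<~YAML_DOC
  --- #{legacy_tag}
  elements:
    key: :value
  ivars:
    :@permitted: false
YAML_DOC

# Newer Psych versions make YAML.load safe by default and reject custom classes,
# so this sketch prefers unsafe_load when it exists.
loader = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
params = YAML.public_send(loader, legacy_yaml)

puts params.parameters.inspect
puts params.permitted?
```

Keep in mind that YAML.load_tags is global, so registering a tag affects every load in the process, and unsafe loading should only ever be used on YAML you trust.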
There were several complications to this, including a second format depending on the Psych version you used. To learn more, here’s the original pull request, https://github.com/rails/rails/pull/26017, and here’s all the commits for details: https://github.com/rails/rails/compare/6b44155^…70b995a
This will ship in Rails 5.0.1, coming somewhat soon!