Adrien Thebo, Puppet Labs Ops Engineer extraordinaire, started a series on his personal blog about diving into the source code of Puppet. He’s kindly agreed to cross-post the first piece to the Puppet Labs blog, in hopes of getting more collaboration on his dive into the depths.
Diving into the source of Puppet can be a complex endeavor. While Puppet and Puppet Enterprise can greatly simplify your sysadmin world, the code underneath can be overwhelming without proper instruction or background. In light of this complexity, I’ve decided that I’m going to try to blog on each module/class that I manage to decipher on my personal blog. All of this source exploration is done against 2.7.x.
As a caveat, this is what I’ve been able to derive while reading the source, and I could be wrong. If you find something erroneous, please comment or find me in #puppet on freenode (finch) and let me know.
Getting started: Puppet::Configurer
The Configurer is the heart of the normal Puppet agent. When you think about the different stages of a normal agent run, it’s all kicked off by the Configurer. It handles pluginsync, uploading facts, retrieving a catalog, applying the catalog, and then submitting the report.
The Configurer class doesn’t seem to be designed as a general-use class. From what I’ve gleaned, the expectation is that you’ll instantiate the object, call `#run` on it, and call it a day. But considering that it’s the class that drives pretty much everything, it’s definitely good to be familiar with it.
It’s also worth noting that the Configurer might eventually become obsolete. With the advent of Puppet Faces, the work that the Configurer does now can probably be replaced by assembling Faces. In fact, I believe the secret agent face does just this. It does make sense to see things moving from the monolithic, one-shot architecture used by the Configurer to behavior more akin to the secret agent face.
That being said, if you’re running `puppet agent`, then you’re using this code.
Before we get started, this code makes heavy use of the indirector. If you aren’t familiar with the indirector, you should read Masterzen’s blog post on the indirector.
(Grossly oversimplified) example:
```ruby
c = Puppet::Configurer.new
c.run # OMG PUPPET RUN!
```

No, really, this is basically all you need to do a run.
The `#run` method is where the magic happens. There’s a pattern that pops up in Puppet fairly frequently: a number of normal methods, and one method that basically runs everything else. Nothing too unusual; it just means that there’s one point that ties together all the class logic. This method does a lot, so I’ll summarize.
- Set up reporting
The first thing we do is create a new `Puppet::Transaction::Report` object and add it as a log destination, so that all logged actions end up in it. This way, the report that’ll be submitted to the master is populated the same way that logging is done to syslog, or to the console if you’re using `puppet agent -t`.
- Prepare storage and sync plugins
Some basic prep is done with the `#prepare` method. It sets up caching for the application. If pluginsync is turned on, `#prepare` will download our plugins – Facter facts, types, providers, etc.
After that, facts for catalog compilation are gathered with the `#facts_for_uploading` method.
- Retrieve and apply the catalog
Once we have our facts, we have everything we need to actually perform the run. The `#retrieve_and_apply_catalog` method is called with the facts we just retrieved.
- Upload the report
After we’ve applied the catalog, then the run is complete. The report generated at the beginning of the run is then sent with the `#send_report` method.
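The stages above can be sketched as a toy Ruby skeleton. The method names mirror the real ones on `Puppet::Configurer`, but the bodies here are simplified stand-ins I made up for illustration, not Puppet’s actual implementation:

```ruby
# A toy sketch of the control flow in Puppet::Configurer#run. The method
# names mirror the real ones; the bodies are illustrative stand-ins.
class ToyConfigurer
  def run
    report = new_report                        # 1. set up reporting
    prepare                                    # 2. prepare storage, pluginsync
    facts = facts_for_uploading                # 3. gather facts
    retrieve_and_apply_catalog(facts, report)  # 4. fetch and apply the catalog
    send_report(report)                        # 5. upload the report
  end

  def new_report; { logs: [] }; end
  def prepare; end
  def facts_for_uploading; { "hostname" => "agent01" }; end

  def retrieve_and_apply_catalog(facts, report)
    report[:logs] << "applied catalog for #{facts['hostname']}"
  end

  def send_report(report); report; end
end
```

Calling `ToyConfigurer.new.run` walks the same one-method-drives-everything path the real `#run` does.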
Whew, this method does a lot. Starting from the top, let’s work down through the methods that `#run` calls to see what’s done at a lower level.
The `#prepare` method handles two things – setting up a cache for Puppet, and running pluginsync if necessary.
The first part instantiates the `Puppet::Util::Storage` singleton object for the rest of the run. This way, the rest of the system can use that for caching, and not have to worry about how it gets there.
Have you ever CTRL-C’d a puppet run, re-run it, and got an error about a corrupt state file? This is where it whines, and then nukes the old statefile.
(Taken from the aforementioned code:)

```ruby
Puppet.err "State got corrupted"
```
Familiar? If some part of Puppet was writing to the statefile when Puppet was terminated, this statefile might get mangled. If this file exists and is corrupted, it’s deleted.
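The load-or-nuke pattern is easy to sketch in plain Ruby. This is a minimal stand-in, not Puppet’s actual statefile code; the `load_state` name and the warning text are just following the post:

```ruby
require 'yaml'

# A minimal sketch of the corrupt-statefile recovery pattern: try to load
# the YAML state, and if parsing fails, warn and delete the file so the
# next run starts from a clean slate.
def load_state(statefile)
  YAML.load_file(statefile) || {}
rescue Psych::SyntaxError, ArgumentError
  warn "State got corrupted"   # the message Puppet logs via Puppet.err
  File.unlink(statefile)
  {}
end
```

Feed it a half-written YAML file and you get an empty state back, with the mangled file removed.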
The other part of `#prepare` is pluginsync. It’s been entirely delegated to `Puppet::Configurer::PluginHandler`, which in turn uses `Puppet::Configurer::Downloader`. We’ll discuss this later; just know that the first thing that’s really done in a Puppet run is the pluginsync, and it’s kicked off by this method.
This is the part where we go out and grab our facts. Fact retrieval has actually been indirected, so we don’t directly go and grab the facts from Facter. Instead, the indirector is called, which defaults to Facter itself on the agent. This behavior does allow for some interesting injection of behavior, such as storing your facts in PuppetDB.
So you know the `b64_zlib_yaml` format mentioned all over the place when you’re running `puppet agent -t --debug`? It turns out that this is a custom format built for handling facts. It’s YAML (a standard Puppet serialization format) that’s been compressed with zlib and then Base64 encoded. This compressed format was added to work around a size limit on the fact upload, which has since been fixed.
So we have these facts, and they might be really hefty. We attempt to use the aforementioned b64_zlib_yaml format on them, and fall back to uncompressed YAML if that fails. After this is done, the format used to store the facts is returned, along with the CGI-escaped facts. The goal of all of this is to have our facts in a format that’s best suited to send to the master.
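The b64_zlib_yaml pipeline is just three standard-library transformations stacked up. Here is a sketch of what the format amounts to, using Ruby’s stdlib rather than Puppet’s own format classes:

```ruby
require 'yaml'
require 'zlib'
require 'base64'
require 'cgi'

# What b64_zlib_yaml amounts to: serialize to YAML, compress with zlib,
# then Base64-encode. The result is CGI-escaped before going on the wire.
facts = { "hostname" => "agent01", "osfamily" => "Debian" }

yaml       = facts.to_yaml
compressed = Zlib::Deflate.deflate(yaml)
encoded    = Base64.encode64(compressed)
escaped    = CGI.escape(encoded)

# Decoding reverses each step:
decoded = YAML.load(Zlib::Inflate.inflate(Base64.decode64(CGI.unescape(escaped))))
decoded == facts  # => true
```

The round trip is lossless, which is the whole point: the master gets exactly the facts the agent gathered, just in a smaller envelope.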
The logic for all of this is implemented in the `Puppet::Configurer::FactHandler` module, and it’s mixed into Configurer.
We have all our plugins, we have our facts, and we’re ready to roll. We need to run our pre-run command if it exists, apply the catalog, and then run the post-run command.
Getting a catalog is more complex than it looks, because Puppet can either fetch a new catalog, or apply an existing catalog. Once we have it, we do `catalog.apply` and we’re off to the races. After the catalog is applied, we send the report. And that’s it! That’s a Puppet run!
The logic for catalog retrieval is split into a few methods, so I’ll address them individually.
The `#retrieve_catalog` method tries to get a catalog from *somewhere*. We’ve got the two cases mentioned above – by default, get a new catalog, or reuse an existing one. This method delegates a lot of the work to two other methods.
The default behavior implemented in this method is to do a standard REST call to the master. This REST call uploads the facts generated earlier, which the master uses to compile a new catalog. This is then downloaded and cached on the client.
If the configuration indicates that a cached catalog should be used, or if catalog retrieval fails and `:usecacheonfailure` is enabled, we’ll try to use the catalog that we cached on the last successful run. This is where catalogs cached on the client in `$vardir/yaml` come into play.
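A toy sketch of that fall-back behavior looks like this. The `CACHE` constant, the helper methods, and the `$master_down` flag are inventions for the example, not Puppet code; only the `usecacheonfailure` idea comes from the post:

```ruby
CACHE = {}

# Stand-in for the REST call to the master.
def fetch_new_catalog
  raise "master unreachable" if $master_down
  { "resources" => ["File[/etc/motd]"] }
end

def cache_catalog(catalog)
  CACHE[:catalog] = catalog
end

def cached_catalog
  CACHE[:catalog] or raise "no cached catalog available"
end

# Try the master first; if that fails and usecacheonfailure is set,
# fall back to the catalog cached on the last successful run.
def retrieve_catalog(use_cache_on_failure: true)
  catalog = fetch_new_catalog
  cache_catalog(catalog)
  catalog
rescue StandardError => e
  raise unless use_cache_on_failure
  warn "Could not retrieve catalog; using cached catalog (#{e.message})"
  cached_catalog
end
```

With the master reachable, each run refreshes the cache; with it down, the previous catalog is applied instead of failing the run outright.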
After the run has been completed, the resulting report data needs to be handled in one of a number of ways. If the `:summarize` option is turned on in Puppet, then the last run summary will be displayed to the console. A copy of the run report will be saved to `/var/lib/puppet/state`, and if reporting is turned on then a copy of that report will be sent to the master.
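Those three outcomes are independent of each other, which a small stand-in makes obvious. The method and option names here follow the post; the implementation is illustrative, not Puppet’s:

```ruby
# A sketch of what can happen to a finished report: print a summary if
# :summarize is on, always save a local copy, and send the report to the
# master if reporting is enabled.
def handle_report(report, summarize: false, report_to_master: true)
  actions = []
  actions << :summarized if summarize   # would print report.summary to the console
  actions << :saved                     # would write a copy to /var/lib/puppet/state
  actions << :sent if report_to_master  # would POST the report to the master
  actions
end
```

Note that the local copy is saved regardless of the other two options.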
In summary, when you think of a typical Puppet agent run, this is where it’s done. Pluginsync is performed, facts are prepared, they’re sent to the master when the catalog is retrieved, that catalog is applied, and then the report of this all is sent to the master. This is enough of a view from 50,000 feet that you’ll be able to see how other parts fit in later.