Community

Welcome to the community homepage! We depend upon our community to engage with one another and evolve the project. Here are a number of ways to get involved:

  1. Join a list
  2. Chat with us
  3. Contribute

Latest Blog Posts

Managing F5 BIG-IP Network Devices with Puppet

By Nan Liu in Blog, Community, DevOps, Extending Puppet, How to, Modules, Open Source, Puppet Enterprise, Services, Tips

Management of network devices is one of the exciting new features in Puppet Enterprise 2.0 and Puppet 2.7. In the initial release, support is limited to Cisco devices, but because Puppet is extensible via modules, we are able to build upon the existing framework and add support for F5 BIG-IP. Like most network appliances, BIG-IP prohibits the installation of third-party software, which eliminates the ability to run an agent. Instead, Puppet uses the F5 iControl API to interact with and manage the device. F5 BIG-IP network appliances are capable of load balancing, SSL offloading, application monitoring, and many other advanced features, and now Puppet can manage this functionality. Compared to traditional management methodologies and other third-party tools that interact with F5, Puppet not only bridges the gap between deploying applications and bringing the service online for your customers, it also brings the unique benefit of the Puppet resource model to network devices. More specifically, the integration lets you check whether the running configuration matches the desired configuration, and then enforce the changes once they’ve been reviewed.

In this blog post, we will step through the process of installing the F5 module, configuring connectivity, and writing a simple manifest to manage an F5 device with Puppet. If you are unfamiliar with BIG-IP devices, you may want to consult devcentral.f5.com for more information on F5 features such as iRules and the iControl API.

In the following output, Puppet detects that the F5 device has the wrong iRule. We are running in simulation mode (--noop option) so Puppet only shows the changes that would be applied if we were enforcing the configuration.

$ puppet device --noop
...
notice: /Stage[main]//F5_rule[redirect_404]/definition: current_value: when HTTP_RESPONSE {
if { [HTTP::status] eq "404" } {
redirect to "http://www.puppetlabs.com/404/"
}
}, should be when HTTP_RESPONSE {
if { [HTTP::status] eq "404" } {
redirect to "http://www.puppetlabs.com/redirect/404/"
}
} (noop)
notice: Finished catalog run in 5.69 seconds

When we run Puppet without the --noop option, the iRule is changed to match our resource declaration.

$ puppet device
…
notice: /Stage[main]//F5_rule[redirect_404]/definition: definition changed when HTTP_RESPONSE {
if { [HTTP::status] eq "404" } {
redirect to "http://www.puppetlabs.com/404/"
}
}, to when HTTP_RESPONSE {
if { [HTTP::status] eq "404" } {
redirect to "http://www.puppetlabs.com/redirect/404/"
}
}
notice: Finished catalog run in 5.74 seconds
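
The compare-then-enforce behaviour seen in the two runs above can be sketched in a few lines of Python. This is only an illustration of the resource model, not how Puppet or the F5 module is implemented: `sync_property` and the in-memory `device` dictionary are hypothetical stand-ins for a real provider and a real BIG-IP.

```python
def sync_property(device, name, desired, noop=True):
    """Compare one property against its desired value; enforce unless noop.

    `device` is a plain dict standing in for the appliance (an assumption
    made for illustration; a real provider would talk to the iControl API).
    Returns a log message, or None when the device is already in sync.
    """
    current = device.get(name)
    if current == desired:
        return None  # nothing to do: running config matches desired config
    if noop:
        # Simulation mode: report the drift but leave the device untouched.
        return f"{name}: current_value {current!r}, should be {desired!r} (noop)"
    device[name] = desired  # enforcement mode: apply the change
    return f"{name}: changed {current!r} to {desired!r}"


device = {"redirect_404": "http://www.puppetlabs.com/404/"}
desired = "http://www.puppetlabs.com/redirect/404/"

print(sync_property(device, "redirect_404", desired, noop=True))
print(sync_property(device, "redirect_404", desired, noop=False))
print(sync_property(device, "redirect_404", desired, noop=False))  # already in sync
```

The first call only reports the difference, the second applies it, and the third finds nothing left to change, which is the same progression as the --noop run followed by the enforcing run.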

In addition to iRules, the initial release supports key features to configure the device certificate, manage application pools, pool members, and virtual servers, and monitor application health. The module is published on the Puppet Forge, along with comprehensive documentation of supported F5 resources. The latest development release is available on GitHub. The ability to extend Puppet is not limited to network devices, and the commands shown later in this post to install the module are applicable to other Puppet modules available on the Puppet Forge and GitHub.

The puppet device command is a new application mode in Puppet intended to manage devices that can’t install Ruby/Puppet and run puppet agent. If you haven’t checked it out yet, I recommend reviewing Brice’s introduction to network devices first. Here is a high-level overview of the entire communication process:

Devices are managed through an intermediate proxy system where the Puppet agent is installed. The proxy system stores a certificate on behalf of the device. In the case of F5, the proxy should have the iControl gem installed, as well as the account information in device.conf needed to communicate with the device. The proxy connects to the Puppet master to retrieve the catalog on behalf of the F5 and applies changes as necessary.

F5 module installation

Before we get started with the installation process, note that there are two ways to install the F5 module. The first is via the puppet-module tool, which retrieves stable releases of the module from forge.puppetlabs.com. The second is via git, which gives access to the latest development release on GitHub. The instructions below are specific to Puppet Enterprise, but open source users can also install the module with some changes to the Puppet module path. Puppet Enterprise currently ships with the puppet-module gem, and it’s freely available on rubygems.org. Eventually this will become a Puppet Face and turn into the command ‘puppet module’ (expected in a later release of 2.7). Onwards to the install process:

# Puppet Enterprise:
cd /etc/puppetlabs/puppet/modules
puppet-module install puppetlabs-f5

This should create a directory called f5. Older versions of puppet-module tool might create a puppetlabs-f5 directory in the modules directory—in that case, change it to f5.

Installing from GitHub:

# Puppet Enterprise:
cd /etc/puppetlabs/puppet/modules
git clone git@github.com:puppetlabs/puppetlabs-f5.git
ln -s puppetlabs-f5 f5

In Puppet 2.7, the transport between the proxy agent and the device supports telnet and ssh; however, neither is suitable for F5 devices. Instead, we rely on F5’s iControl API. The iControl gem should be installed on both the master and the proxy system. This gem is available in the F5 module’s files directory.

# Puppet Enterprise:
/opt/puppet/bin/gem install /etc/puppetlabs/puppet/modules/f5/files/f5-icontrol-10.2.0.2.gem

Configuration and management

At this point we have the module installed, so we should configure connectivity for the device. By default, the configuration for network devices is stored in /etc/puppet/device.conf:

[f5.puppetlabs.lan]
type f5
url https://username:password@f5.puppetlabs.lan/partition
[f5.dev.puppetlabs.lan]
type f5
url https://username:password@f5.dev.puppetlabs.lan/partition

You can also break out each device into its own configuration file, such as /etc/puppet/f5_device1.conf, /etc/puppet/f5_device2.conf …, which is especially helpful if you wish to run against each device separately. The name in square brackets is the device certificate name, and the certificate management process is the same as for puppet agent certificates. The device type is f5, and the url uses https instead of telnet/ssh. Because F5 supports different partitions, we can optionally specify one at the end of the url; it defaults to the ‘Common’ partition if none is provided. In the module, the f5::config defined resource type simplifies management of this configuration file on the proxy system:

f5::config { 'f5.puppetlabs.lan':
    username => 'admin',
    password => 'password',
    url      => 'f5.puppetlabs.lan',
    target   => '/etc/puppetlabs/puppet/device/f5.puppetlabs.lan.conf',
}
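
To make the url convention from device.conf concrete, here is a small Python sketch that splits such a url into its parts and applies the ‘Common’ default when no partition is given. It only illustrates the format described above; `parse_device_url` is a hypothetical helper and is not part of Puppet or the F5 module.

```python
from urllib.parse import urlsplit


def parse_device_url(url):
    """Split a device.conf style url into its connection details.

    The partition is taken from the url path and falls back to 'Common'
    when the path is empty, mirroring the default described in the text.
    """
    parts = urlsplit(url)
    partition = parts.path.strip("/") or "Common"
    return {
        "host": parts.hostname,
        "username": parts.username,
        "password": parts.password,
        "partition": partition,
    }


print(parse_device_url("https://admin:password@f5.puppetlabs.lan/Common"))
print(parse_device_url("https://admin:password@f5.dev.puppetlabs.lan"))
```

Both calls report the ‘Common’ partition: the first because it is named explicitly, the second because nothing follows the host name.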

Once this configuration file is in place, we can initiate a puppet device run on the proxy server.

# execute on proxy server
$ puppet device --deviceconf /etc/puppetlabs/puppet/device/f5.puppetlabs.lan.conf

This should generate a certificate request on the master which should be signed:

# execute on puppet master
$ puppet cert -l
f5.puppetlabs.lan (2A:0C:A0:F8:C6:EE:EF:9B:B3:49:74:D1:27:31:1B:60)
$ puppet cert -s f5.puppetlabs.lan

At this point the master should have a node named f5.puppetlabs.lan in site.pp with the appropriate f5 resources:

node f5.puppetlabs.lan {
  f5_rule { 'redirect_404':
    ensure     => 'present',
    definition => 'when HTTP_RESPONSE {
if { [HTTP::status] eq "404" } {
redirect to "http://www.puppetlabs.com/redirect/404"
}
}',
  }
 
  f5_pool { 'webapp':
    ensure                          => 'present',
    action_on_service_down          => 'SERVICE_DOWN_ACTION_NONE',
    allow_nat_state                 => 'STATE_ENABLED',
    allow_snat_state                => 'STATE_ENABLED',
    lb_method                       => 'LB_METHOD_ROUND_ROBIN',
    member                          => {
      '10.10.0.1:80' => {'connection_limit' => '0',
                         'dynamic_ratio'    => '1',
                         'priority'         => '0',
                         'ratio'            => '1'},
      '10.10.0.2:80' => {'connection_limit' => '0', 
                         'dynamic_ratio'    => '1', 
                         'priority'         => '0',
                         'ratio'            => '1'},
      '10.10.0.3:80' => {'connection_limit' => '0',
                         'dynamic_ratio'    => '1',
                         'priority'         => '0',
                         'ratio'            => '1'}
    },
    minimum_active_member           => '1',
    minimum_up_member               => '0',
  }
}

When the puppet device command is executed again this will update the iRule and ensure the appropriate members are in the webapp pool:

$ puppet device --deviceconf /etc/puppetlabs/puppet/device/f5.puppetlabs.lan.conf

A limitation to watch out for in the current Puppet release is that ‘puppet apply’ and ‘puppet resource’ cannot modify network resources. However, we implemented a feature to allow puppet resource to query an F5 device. (For authors of types and providers: making changes to resources isn’t supported until apply_to_device in resource types is handled differently by the puppet apply and puppet resource commands.) For now, we use url facts to establish connectivity to specific F5 devices:

export RUBYLIB=/etc/puppetlabs/puppet/modules/f5/lib/
export FACTER_url=https://admin:password@f5.puppetlabs.lan/Common
puppet resource f5_rule
f5_rule { '_sys_https_redirect':
  ensure     => 'present',
  definition => '    when HTTP_REQUEST {
set host [HTTP::host]
HTTP::respond 302 Location "https://$host/"
}',
}
f5_rule { '_sys_auth_ssl_cc_ldap':
  ensure     => 'present',
  definition => '    when CLIENT_ACCEPTED {
set tmm_auth_ssl_cc_ldap_sid 0
set tmm_auth_ssl_cc_ldap_done 0
}
when CLIENTSSL_CLIENTCERT {
…

If you don’t have Puppet Enterprise 2.0 in your environment yet, you can download either the Learning Puppet VM or the Puppet Enterprise installation packages. F5 also provides F5 LTM Virtual Edition (VE) for trial on VMware. Please report any issues or bugs to http://projects.puppetlabs.com/projects/modules/issues under the modules section.

First Look: Installing and Using Hiera (part 1 of 2)

By Hunter Haugen in Blog, Community, DevOps, Extending Puppet, Hiera, How to, Tips

In a previous blog post, we introduced use cases for separating configuration data from Puppet code. This post (part one of a two-part series) will go in-depth with installing, configuring, and using Hiera, but let’s first look at WHY we would need Hiera.

Introduction to the SSH module

One of the benefits of Hiera is its ability to take an existing module and adapt it to a hierarchical-based lookup system. Typically, one of the first modules that people adapt to Puppet code is the SSH module. Let’s look at a simple ssh class definition:

class ssh {
  $ssh_packages      = ['openssh','openssh-clients','openssh-server']
  $permit_root_login = 'no'
  $ssh_users         = ['root','jeff','gary','hunter']
 
  package { $ssh_packages:
    ensure => present,
    before => File['/etc/ssh/sshd_config'],
  }
 
  file { '/etc/ssh/sshd_config':
    ensure  => present,
    owner   => 'root',
    group   => 'root',
    mode    => '0644',
    # Template uses $permit_root_login and $ssh_users
    content => template('ssh/sshd_config.erb'),
  }
 
  service { 'sshd':
    ensure     => running,
    enable     => true,
    hasstatus  => true,
    hasrestart => true,
  }
}

The template used above looks like the following:

Protocol 2
SyslogFacility AUTHPRIV
PasswordAuthentication yes
ChallengeResponseAuthentication no
GSSAPIAuthentication yes
GSSAPICleanupCredentials yes
 
# PermitRootLogin Setting
PermitRootLogin <%= permit_root_login %>
 
# Allow individual Users
<% ssh_users.each do |user| -%>
AllowUsers <%= user %>
<% end -%>
 
# Accept locale-related environment variables
AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
AcceptEnv LC_IDENTIFICATION LC_ALL
X11Forwarding yes
Subsystem	sftp	/usr/libexec/openssh/sftp-server

This module declares three packages (openssh, openssh-clients, openssh-server), ensures a proper sshd_config file, and starts the sshd service. While this works fine for RedHat distributions, there will be a problem with this module if we try to use it on other Linux variants (such as Debian or Ubuntu). Normally, logic is introduced into the module that decides which package names to use based on the operating system of the node. Instead of doing that, let’s use Hiera to solve our problem by changing three lines:

$ssh_packages      = hiera('ssh_packages')
$permit_root_login = hiera('permit_root_login')
$ssh_users         = hiera('ssh_users')

Instead of providing a simple array, we’re now going to utilize Hiera and do a data lookup for the packages to declare in our module, the users to permit, and the permit_root_login parameter that will be used in the sshd_config file. An array will still be returned by Hiera for the $ssh_packages and $ssh_users variables, but the elements in that array will change depending on the operating system of the node. Before we can do this, though, we need to set up Hiera, its hierarchy, and the data directory that it will use for parameter lookups.

Install Hiera

As of this writing, Hiera is not installed with Puppet or Puppet Enterprise and must be installed using RubyGems—though it will be included in the next version of Puppet. Hiera has two separate gems: hiera and hiera-puppet. The hiera gem contains the hiera library source code, the default YAML backend, and the hiera binary that can be used to execute lookups from the command line. The hiera-puppet gem contains the custom functions necessary to call Hiera from Puppet. To install these libraries, do the following:

gem install hiera hiera-puppet

(Note that if you’re running Puppet Enterprise, you will need to use the gem binary that’s located in /opt/puppet/bin)

The last step that’s necessary is to get the custom Hiera functions that Puppet needs to do a parameter lookup loaded into Puppet itself. These functions come bundled with the hiera-puppet gem, but they currently are placed into your system’s $GEMPATH and are not loaded by Puppet. To remedy this, let’s download a copy of hiera-puppet from source and place it in our Puppet Master’s modulepath so it can make the functions available from within Puppet.

    1. Get your Puppet Master’s module path by entering puppet master --configprint modulepath
    2. Change to the modulepath directory that was output from the previous step
    3. Enter the following command to download a tarball of the hiera-puppet source code, create a directory called ‘hiera-puppet’, expand the contents of the tarball to the ‘hiera-puppet’ directory, and remove the ‘hiera-puppet’ tarball:


curl -L https://github.com/puppetlabs/hiera-puppet/tarball/master -o \
'hiera-puppet.tar.gz' && mkdir hiera-puppet && tar -xzf hiera-puppet.tar.gz \
-C hiera-puppet --strip-components 1 && rm hiera-puppet.tar.gz

Now the custom Hiera functions are available to be used by the Puppet Master. Let’s move on to configuring Hiera.

Configuring Hiera with YAML & hiera.yaml

Hiera is configured through the /etc/puppetlabs/puppet/hiera.yaml configuration file. This file is written in the markup language called YAML which is simple, human-readable, and is widely supported by scripting languages. (You can read more about YAML here.)

The hiera.yaml configuration file is what Hiera uses to determine the order of its lookup, and the location of the data directory where the YAML files are located. Let’s look at an example hiera.yaml configuration file that we can drop into place for our ssh module and break it down piece by piece:

---
:hierarchy:
    - "%{operatingsystem}"
    - common
:backends:
    - yaml
:yaml:
    :datadir: '/etc/puppetlabs/puppet/hieradata'

We see that our chosen backend is YAML, and that our data will be stored in /etc/puppetlabs/puppet/hieradata instead of embedding it in our modules. This is looking promising!

The last, and also the most important, piece is the hierarchy itself. We’ve chosen to have two levels: a common level that is common to all hosts, and a higher-priority level that contains any operating-system-specific data.

When we query a hiera() function in Puppet, Hiera looks in its hiera.yaml configuration file for backends to query, and for the directory where the backend data is kept. Let’s look at how we might add configuration data to Hiera’s datadir.

Introduction to the YAML data backend

The YAML data backend is the quickest Hiera backend to begin using, and is included with Hiera. YAML is an extremely readable data serialization format, so it makes sense to utilize it if you don’t have a specific need for another format. In the hiera.yaml configuration file above, we created a hierarchy of two levels: %{operatingsystem} and common. Assuming that we are configuring a RedHat system, Hiera will look in the datadir directory for two files in this order: RedHat.yaml and common.yaml. Why? The highest level in the hierarchy queries Facter for the operatingsystem fact (which, in this case, returns ‘RedHat’), and then searches for a YAML file of that name. The second level is just the string common, so it looks for a file called ‘common.yaml’. Let’s take a look at those files:

RedHat.yaml

---
ssh_packages:
  - 'openssh'
  - 'openssh-clients'
  - 'openssh-server'

common.yaml

---
permit_root_login : 'no'
ssh_users         :
  - root
  - jeff
  - gary
  - hunter
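
To see how a hierarchy such as the one in hiera.yaml turns into an ordered list of file names, consider the following Python sketch. It is an illustration of the idea only and makes no claim about Hiera’s internals; `resolve_hierarchy` and the `facts` dictionary are assumptions made for the example.

```python
import re


def resolve_hierarchy(hierarchy, facts):
    """Turn hierarchy entries into the ordered list of YAML file names.

    Each %{name} token is replaced with the matching fact value. An unknown
    fact becomes an empty string here, which is a choice made for this
    sketch rather than a statement about Hiera's behaviour.
    """
    def interpolate(level):
        return re.sub(r"%\{(\w+)\}", lambda m: str(facts.get(m.group(1), "")), level)

    return [interpolate(level) + ".yaml" for level in hierarchy]


hierarchy = ["%{operatingsystem}", "common"]
print(resolve_hierarchy(hierarchy, {"operatingsystem": "RedHat"}))
print(resolve_hierarchy(hierarchy, {"operatingsystem": "Debian"}))
```

For a RedHat node the list is RedHat.yaml followed by common.yaml, which is the search order described above.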

With the hiera.yaml configuration file set up and our Hiera data directory containing YAML files, we can actually begin performing lookups and inspecting the resultant data.

Hiera data lookups

Using our RedHat node and the current Hiera setup, what would be the value of $permit_root_login in this line from our ssh Puppet manifest:

$permit_root_login = hiera('permit_root_login')

The answer is ‘no’. How did we get that? Hiera performed a lookup for ‘permit_root_login’ and searched the highest priority file in the hierarchy, RedHat.yaml (based on the node’s ‘operatingsystem’ fact being the string ‘RedHat’). Hiera didn’t find the parameter in that file so it moved to the next, and final, level of the hierarchy and searched common.yaml. Because the parameter is defined in common.yaml, it returned the value back to Puppet.

What if we wanted all RedHat nodes to set the value of $permit_root_login to be ‘without-password’? Using Hiera, we would modify the RedHat.yaml file and add the following line:

permit_root_login : 'without-password'

Because the RedHat.yaml file is queried BEFORE the common.yaml file, RedHat nodes would get this value, while all other nodes would get the value of ‘no’ from common.yaml. Taking this example one step further, what if we wanted all Debian nodes to have the value of $permit_root_login set to ‘yes’? We would need to create a file called Debian.yaml, place it in the Hiera data directory, and enter the following:

---
permit_root_login : 'yes'

Now, when a Debian node contacted Puppet, Hiera would query the Debian.yaml file BEFORE common.yaml, and the value of $permit_root_login would get the value set in Debian.yaml (which, in this case, would be ‘yes’).

This logic could be repeated over and over for any parameter and with as many hierarchy levels as you desire.
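
The priority lookup walked through above can be modelled in a few lines of Python. This is a sketch under stated assumptions, not Hiera’s implementation: the `data` dictionary stands in for the YAML files in the data directory, `priority_lookup` is a hypothetical name, and the ‘Solaris’ node is invented to show the fall-through to common.

```python
def priority_lookup(key, hierarchy, data):
    """Return the value from the first hierarchy level that defines `key`."""
    for level in hierarchy:
        level_data = data.get(level, {})
        if key in level_data:
            return level_data[key]
    raise KeyError(key)


# In-memory stand-ins for RedHat.yaml, Debian.yaml and common.yaml.
data = {
    "RedHat": {"permit_root_login": "without-password"},
    "Debian": {"permit_root_login": "yes"},
    "common": {"permit_root_login": "no"},
}

for osname in ("RedHat", "Debian", "Solaris"):
    print(osname, priority_lookup("permit_root_login", [osname, "common"], data))
```

RedHat and Debian nodes get their operating-system-specific value, while a node with no file of its own falls through to the value in common.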

Beyond Basic Lookups: Concatenating Values With Hiera

By default, Hiera uses a priority lookup—which means that the first time it encounters a parameter in the hierarchy it accepts that value and returns it to Puppet. This is how higher levels in the hierarchy can override values that might be set in lower levels of the hierarchy. What if you wanted to search through ALL levels of the hierarchy and return EVERY value for a specific parameter? Hiera has that ability with the hiera_hash() and hiera_array() functions.

There are two variables that currently return arrays: $ssh_packages and $ssh_users. Right now, the variables are being set with a priority lookup, so the ENTIRE contents of each array are set from the first level at which Hiera encounters the ‘ssh_users’ or ‘ssh_packages’ parameter. What if we wanted $ssh_users to always contain the root user, with the other users changing depending on the operating system of the node? The best way to do this would be to use the hiera_array() function, which searches ALL hierarchy levels and returns an array containing the value of ssh_users from EVERY hierarchy level in which it encountered the parameter. Let’s modify our Hiera YAML files to reflect this change:

common.yaml

---
permit_root_login : 'no'
ssh_users         :
  - root

RedHat.yaml

---
ssh_packages:
  - 'openssh'
  - 'openssh-clients'
  - 'openssh-server'
ssh_users:
  - 'gary'
  - 'jeff'

Debian.yaml

---
permit_root_login : 'yes'
ssh_users         :
  - 'hunter'

Finally, modify the following line in the ssh module:

$ssh_users         = hiera_array('ssh_users')

After making the changes, which users will be added to /etc/ssh/sshd_config file on a RedHat node? The answer is root, gary, and jeff. Why? The root user will ALWAYS be included in /etc/ssh/sshd_config because the common.yaml file that EVERY node evaluates contains the value of ‘root’ for the ssh_users parameter. Next, because this is a RedHat node, Hiera will concatenate the values of ‘gary’ and ‘jeff’ to the array because those are the values for the ssh_users parameter in RedHat.yaml. What if we run this on a Debian node? The answer is root and hunter (because the value of the ssh_users parameter in the Debian.yaml file is ‘hunter’).
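
The difference between this array lookup and the priority lookup can be sketched in Python as follows. As before, this is an illustration rather than Hiera’s code: `array_lookup` and the in-memory `data` dictionary are assumptions, and dropping duplicates is a choice made for the sketch.

```python
def array_lookup(key, hierarchy, data):
    """Collect `key` from every hierarchy level into one flat list.

    Values are gathered in hierarchy order (most specific level first).
    Duplicates are dropped while keeping the first occurrence, which is an
    assumption of this sketch.
    """
    merged = []
    for level in hierarchy:
        value = data.get(level, {}).get(key, [])
        if not isinstance(value, list):
            value = [value]  # treat a single value as a one-element list
        for item in value:
            if item not in merged:
                merged.append(item)
    return merged


# In-memory stand-ins for RedHat.yaml, Debian.yaml and common.yaml.
data = {
    "RedHat": {"ssh_users": ["gary", "jeff"]},
    "Debian": {"ssh_users": ["hunter"]},
    "common": {"ssh_users": ["root"]},
}

print(array_lookup("ssh_users", ["RedHat", "common"], data))
print(array_lookup("ssh_users", ["Debian", "common"], data))
```

In this sketch the entries appear in hierarchy order, so root comes last in each list; the set of users is the same as in the answers given above.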

Hiera Best Practices

Hiera is still new to many people, and the concept of a hierarchical lookup system can seem a bit foreign initially. Because of this, there are a couple of best practices that are important to observe when getting started with Hiera and Puppet.

Keep hierarchies to a minimum

This is the time-proven rule of “Just because you can, doesn’t mean you should.” Hierarchy levels are incredibly dynamic tools that will allow you to do a number of things that were previously difficult, but too many of them can lead to problems when debugging (i.e. “Where was that parameter set, again?”). Three to four hierarchy levels should be enough for most sites; if you have more than that, you might want to re-think your approach.

Version control your Hiera data directory separately from your Puppet repository

The benefit of the :datadir: parameter in hiera.yaml is that you can use Facter fact values to determine the path of your Hiera data directory. For example, a site using two Puppet environments called ‘development’ and ‘production’ that has implemented the ssh module we outlined above might have the following directory tree at /etc/puppetlabs/puppet/environments:

environments/
    |-- development
    |   |-- hieradata
    |   |   |-- Debian.yaml
    |   |   |-- RedHat.yaml
    |   |   `-- common.yaml
    |   |-- manifests
    |   |   `-- site.pp
    |   `-- modules
    |       `-- ssh
    `-- production
        |-- hieradata
        |   |-- Debian.yaml
        |   |-- RedHat.yaml
        |   `-- common.yaml
        |-- manifests
        |   `-- site.pp
        `-- modules
            `-- ssh

This site’s hiera.yaml configuration file would look like the following:

---
:hierarchy:
    - "%{operatingsystem}"
    - common
:backends:
    - yaml
:yaml:
    :datadir: '/etc/puppetlabs/puppet/environments/%{environment}/hieradata'

Hiera automatically substitutes the value of the current environment for %{environment} in hiera.yaml and allows for a Hiera data directory that’s completely separate from Puppet manifests/modules.

What now?

This post serves as an introduction to using Hiera with Puppet and familiarizes you with the concepts of hierarchical lookup systems, priority lookups, multilevel lookups, and data separation. The concepts in this post will walk you through getting a working Hiera setup, but there is much more that can be done (Hiera as an ENC, custom backends, etc…). The next post in this series will introduce these advanced Hiera concepts and much more. Until then, enjoy experimenting with Hiera!

The Problem with Separating Data from Puppet Code

By Gary Larizza in Blog, Extending Puppet, How to, Systems Management

You’ve bought Pro Puppet, downloaded a couple of modules from the Puppet Forge (and have written some of your own too), and you’re on your way to implementing your Puppet environment when it hits you: something feels bulky with the way you’ve designed your Puppet code. Your modules may not be portable between environments (development, testing, production) without significant tweaks, each of your node declarations may require a number of variables in order for the code to work, or you’re constantly needing to open up your modules to account for changes in your environment.

There’s GOT to be an easier way to do this, right?

We hear stories from many customers about problems in their Puppet environments, and many of them can be traced back to the way their configuration data is integrated with their Puppet code. Configuration data is the term we use for the environment-specific data that needs to be plugged in to your Puppet code (i.e. variables, class parameters). Take the following bit of Puppet code for example:

$dnsserver    = '8.8.8.8'
$searchdomain = 'puppetlabs.vm'
 
file { '/etc/resolv.conf':
  ensure  => present,
  content => "search ${searchdomain}\n nameserver ${dnsserver}\n",
}

The configuration data in this example would be the hard-coded variables $dnsserver and $searchdomain and the Puppet code would be the file resource block declaring /etc/resolv.conf. This example is intentionally kept simple in order to highlight the methods by which you will separate your configuration data from your Puppet code, but imagine code that needs to set different variables in different environments (MySQL servers, databases, usernames, and passwords, for example) and you can see how the above example can quickly become unwieldy. How else can this be done?

Legacy Method – Node Inheritance

The first method that people usually tried was node inheritance. By defining variables in separate node definition blocks, and inheriting from a nested list of definitions, you could SIMULATE data separation with this method. This was the go-to method before Puppet 2.6 was released, and as such we consider it to be a legacy solution that we don’t recommend using with versions of Puppet newer than 0.25 (note that if you’re still using node inheritance, please read this advisory on dynamic scoping).

node common {
  $dnsserver    = '8.8.8.8'
  $searchdomain = 'puppetlabs.vm'
}
 
node production inherits common {
  $dnsserver = '10.13.1.3'
}
 
node 'agent.puppetlabs.vm' inherits production {
  file { '/etc/resolv.conf':
    content => "search ${searchdomain}\n nameserver ${dnsserver}\n",
  }
}

PROS

  • It was the easiest method to employ.
  • Your data was in one location and, technically, separate from your modules.

CONS

  • There was no easy way to find the value of a variable for a specific node.
  • FINDING the value of a variable required “human parsing,” or reading through each and every node declaration to trace variable values.
  • The data still resided in your Puppet code repository.
  • There are better ways to implement this strategy, and this should be considered a legacy solution provided solely for information purposes.

Parameterized Classes

Puppet version 2.6 gave us the ability to pass parameters with class declarations. This allows you to completely remove configuration data from your classes and provide ‘sane’ default values should a class declaration not pass a parameter. While this is an entry-level step in beginning to separate your configuration data from your Puppet code (the data is now in its own class—in this case dns::params), the configuration data is STILL in your Puppet code repository (and thus isn’t a full separation). See below for an example:

class dns::params {
  $dnsserver    = '8.8.8.8'
  $searchdomain = 'puppetlabs.vm'
}
 
class dns(
  $dnsserver    = $dns::params::dnsserver,
  $searchdomain = $dns::params::searchdomain
) inherits dns::params {
 
  file { '/etc/resolv.conf':
    content => "search ${searchdomain}\n nameserver ${dnsserver}\n",
  }
}

PROS

  • Class parameters can be defaulted back to a ‘sane’ value as outlined in our Smart Parameter Defaults document.
  • Modules that utilize this methodology are more portable—parameters need only be changed in a single ‘params’ class.

CONS

  • All logic must be embedded in each module’s ‘params’ class.
  • If you use this methodology to keep your configuration data separate, every module must have a ‘params’ class and any logic you introduce (picking different values based on operating system, for example) must be repeated in every module.
  • The data isn’t truly separate from your Puppet code as it still resides INSIDE the module (and, technically, your Puppet code repository).

External Node Classifier

Many large sites decide to use an External Node Classifier script to solve the problem of looking up configuration data. External Node Classifiers (also known as ENCs) allow you to provide class declarations, parameters, and variables to Puppet in the form of YAML. The previous example would look like this in YAML:

classes:
  - dns
parameters:
  searchdomain : 'puppetlabs.vm'
  dnsserver    : '8.8.8.8'

PROS

  • Flexible – you design how the information lookup is done (query a database, parse a hostname or other Facter fact, etc).
  • Can be written in any language: shell, perl, ruby, python, etc…
  • Plugs into your existing CMDB (Configuration Management Database) to retrieve information that already exists in another source of truth

CONS

  • You are responsible for writing and maintaining the External Node Classifier Script
  • If the script breaks, your Puppet runs are endangered
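
Since an ENC can be written in any language, here is a minimal sketch of one in Python. Puppet calls the script with the node name as its argument and reads YAML from its standard output; everything else here is an assumption for illustration, including the `NODES` table (standing in for a CMDB or database query) and the hand-written `to_yaml` helper, which only handles the simple values used in this example.

```python
import sys

# Hypothetical lookup table standing in for a CMDB or database query.
NODES = {
    "agent.puppetlabs.vm": {
        "classes": ["dns"],
        "parameters": {"searchdomain": "puppetlabs.vm", "dnsserver": "8.8.8.8"},
    },
}


def classify(node_name):
    """Return the classification for a node, or an empty one if unknown."""
    return NODES.get(node_name, {"classes": [], "parameters": {}})


def to_yaml(classification):
    """Render the two-key structure as YAML by hand (no third-party library)."""
    lines = ["---"]
    if classification["classes"]:
        lines.append("classes:")
        lines += [f"  - {name}" for name in classification["classes"]]
    else:
        lines.append("classes: []")
    if classification["parameters"]:
        lines.append("parameters:")
        lines += [f"  {key}: '{value}'" for key, value in classification["parameters"].items()]
    else:
        lines.append("parameters: {}")
    return "\n".join(lines)


if __name__ == "__main__":
    # The node name arrives as the first argument; the default here only
    # makes the sketch runnable on its own.
    node = sys.argv[1] if len(sys.argv) > 1 else "agent.puppetlabs.vm"
    print(to_yaml(classify(node)))
```

Run for agent.puppetlabs.vm, the script prints the same classes and parameters as the YAML shown earlier. A production script would also need error handling, because a failing ENC endangers every Puppet run.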

Extlookup

Extlookup was introduced in Puppet version 2.6.0 as a hierarchical way to look up values of parameters or variables based on a Facter fact value. To use Extlookup, you would first define a data directory that Extlookup would search based on a specific fact value (location, environment, operatingsystem, etc), and then you would specify a lookup precedence (look for a parameter/variable in a file named after the node’s certname FIRST, and then search in a file named after the node’s environment SECOND, and so on). Finally, you would assign a parameter/variable’s value by invoking Extlookup with the built-in (as of Puppet version 2.6.0) ‘extlookup()’ function.

$extlookup_datadir    = "/etc/puppetlabs/puppet/data"
$extlookup_precedence = [$environment, 'common']
 
node 'agent.puppetlabs.vm' {
  include dns
}
 
class dns {
  $dnsserver    = extlookup('dnsserver')
  $searchdomain = extlookup('searchdomain')
 
  file { '/etc/resolv.conf':
    content => "search ${searchdomain}\nnameserver ${dnsserver}\n",
  }
}

Sample common.csv file used with Extlookup

dnsserver,8.8.8.8
searchdomain,puppetlabs.vm
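The precedence rule is easier to see in code. The following Ruby sketch imitates a first-match lookup across CSV files; it is not Extlookup's actual implementation. The `first_match` helper, the `production.csv` file, and the `10.0.0.1` address are all assumptions made for illustration, and the comma splitting is deliberately simplistic.

```ruby
require 'tmpdir'

# Hypothetical helper: walk the precedence list and return the value
# from the first file that contains the key.
def first_match(key, datadir, precedence)
  precedence.each do |name|
    path = File.join(datadir, "#{name}.csv")
    next unless File.exist?(path)

    File.foreach(path) do |line|
      found_key, value = line.chomp.split(',', 2)
      return value if found_key == key
    end
  end
  nil # the key was not found at any level
end

Dir.mktmpdir do |datadir|
  # Made-up data: production overrides dnsserver only.
  File.write(File.join(datadir, 'common.csv'),
             "dnsserver,8.8.8.8\nsearchdomain,puppetlabs.vm\n")
  File.write(File.join(datadir, 'production.csv'), "dnsserver,10.0.0.1\n")

  precedence = %w[production common]
  puts first_match('dnsserver', datadir, precedence)
  puts first_match('searchdomain', datadir, precedence)
end
```

With this made-up data, `dnsserver` resolves from the production file and `searchdomain` falls through to `common.csv`. The search stops at the first hit, so values from lower levels are never merged in.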

PROS

  • Extlookup supports a dynamic and hierarchical lookup based on a node’s Facter fact values.
  • There could be a single node declaration that would use Extlookup to look up the value of every variable/parameter used in Puppet.
  • The extlookup() function is built into Puppet as of version 2.6.0.

CONS

  • You must use comma-separated value files (CSV) ONLY for your lookups (i.e. variable, value), so structured data (like arrays and hashes) is not supported.
  • Data lookups only return the first-matched value.
  • It doesn’t have the ability to concatenate a list of matches returned throughout the full hierarchy.

Introducing: Hiera

Hiera, short for “hierarchy” and written by R.I. Pienaar, is a pluggable, hierarchical database that can query YAML and JSON files (and any other data serialization for which you write a custom backend), as well as Puppet manifests, for configuration data. Hiera builds upon the model that Extlookup created and also adds support for structured data. With Hiera, you can dynamically look up parameters based on a node’s Facter facts. Let’s look at configuring Hiera for use with the previous example:

The hiera.yaml configuration file:

---
:backends:
  - yaml
:hierarchy:
  - "%{environment}"
  - common
:yaml:
  :datadir: /etc/puppetlabs/puppet/hieradata

The common.yaml file that Hiera uses for parameter lookup:

---
dnsserver    : '8.8.8.8'
searchdomain : 'puppetlabs.vm'

Puppet code using Hiera:

class dns {
  $dnsserver    = hiera('dnsserver')
  $searchdomain = hiera('searchdomain')
 
  file { '/etc/resolv.conf':
    content => "search ${searchdomain}\nnameserver ${dnsserver}\n",
  }
}
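For readers curious about what happens behind the `hiera()` call, here is a rough Ruby sketch of the idea: interpolate facts into each hierarchy level, then return the first level that defines the key. This is not Hiera's real code. The `resolve` and `lookup` helpers, the `DATA_FILES` hash standing in for files under the datadir, and the production values are assumptions for illustration only.

```ruby
require 'yaml'

# Same hierarchy as the hiera.yaml example above.
HIERARCHY = ['%{environment}', 'common'].freeze

# Hypothetical stand-ins for the YAML files in the data directory.
DATA_FILES = {
  'common'     => "---\ndnsserver: '8.8.8.8'\nsearchdomain: 'puppetlabs.vm'\n",
  'production' => "---\ndnsserver:\n  - '10.0.0.1'\n  - '10.0.0.2'\n"
}.freeze

# Replace %{fact} placeholders with values from the node's scope.
def resolve(level, scope)
  level.gsub(/%\{(\w+)\}/) { scope.fetch(Regexp.last_match(1), '') }
end

# Return the value from the first hierarchy level that defines the key.
def lookup(key, scope)
  HIERARCHY.each do |level|
    source = DATA_FILES[resolve(level, scope)]
    next unless source

    data = YAML.safe_load(source) || {}
    return data[key] if data.key?(key)
  end
  nil
end

p lookup('dnsserver', 'environment' => 'production')
p lookup('searchdomain', 'environment' => 'production')
```

In this sketch the production level returns an array for `dnsserver`, which is the kind of structured value that a CSV-based lookup cannot express.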

PROS

  • Data is truly separated from your Puppet code—it exists in an entirely separate directory structure.
  • Parameter lookup is hierarchical and dynamic based on Facter facts that describe your node.
  • Hiera supports structured data—like arrays and hashes—that can be fed back to Puppet.
  • Using Hiera, your Puppet modules contain zero proprietary data (which makes the module much more portable).
  • Hiera will be integrated with the next version of Puppet (codenamed Telly).

CONS

  • As of this writing, Hiera is not YET built into Puppet, so using it requires an initial installation step.

Conclusion

While there are a myriad of options to solve the problem of configuration data and Puppet code separation, we recommend using Hiera for its ability to adapt to every situation. This post only gives a brief glimpse of its awesome functionality. Stay tuned for a post dedicated to Hiera, where we will be looking in-depth at its usage, flexibility, and advanced features that can simplify the management of your environment whether you’re a sysadmin of 10, 100, or 10,000 nodes!

Additional Resources

R.I. Pienaar Joins Puppet Labs

Posted on
By
Jason
in
Blog, Company, General News, MCollective
Responses
4 Comments »

I am very pleased to announce that R.I. Pienaar, founder and lead developer of the widely used Marionette Collective (MCollective) orchestration tools, has joined Puppet Labs as a Software Architect. R.I.’s message-based orchestration tools have become some of the most widely used tools in systems management, and have literally changed the way that people handle ad-hoc command and control, orchestration, and parallel job management. Having R.I. join the Puppet Labs team is a significant milestone for us, as R.I. will help shape product efforts in MCollective, Puppet, and Puppet Enterprise.

Puppet Labs acquired MCollective from R.I. in late 2010. Since then, MCollective has become a critical piece of the Puppet infrastructure, with direct integration in Puppet Enterprise for Live Management as well as standard orchestration functionality. R.I.’s efforts in this regard have had a significant impact on the company’s products, including Puppet Enterprise 2.0.

R.I. will continue to work on MCollective (since MCollective is now directly integrated into our products), but will also work on some of our new projects to be announced in the future. We are very glad to have R.I.’s considerable creativity and industry experience on the team helping with new products that will give sysadmins new tools and delight Puppet users.

In R.I.’s own words:

I’m really excited to finally be part of the Puppet Labs team. I’ve been part time member for over a year and it’s great to finally be a full time team member. I look forward to the opportunities this new venture brings in helping me further the DevOps eco-system.

Verifying Puppet: Checking Syntax and Writing Automated Tests

Posted on
By
Adrien
in
Blog, Community, DevOps, How to, Systems Management, Tips
Responses
6 Comments »

One of the issues that crops up when working with Puppet is ensuring that your manifests do what you expect. Errors are bound to happen. A missed brace can keep a manifest from compiling, and forgetting to include a module or set a variable may mean that running Puppet on the host fails to enforce the expected state. All in all, it would help to have some tools to make sure we’re writing valid code, that it does what we expect, and that if it doesn’t we catch it as soon as possible.

Syntax Checking

At the lowest level of checking, you can use the Puppet parser to do syntax validation. Typos and errors are bound to creep into code, so syntax checking at the end of a long day can go far to improve the quality of your life.

There are a couple of places where you can insert syntax validation. One method is to run `puppet parser validate selinux.pp` manually to make sure that the manifest can be parsed before you commit your changes or deploy them to a live environment. If I left out a curly brace in a manifest and then ran that command, I would see:

    % puppet parser validate selinux.pp
    err: Could not parse for environment production: Syntax error at '{'; expected '}' at /Users/adrien/puppetlabs-mrepo/manifests/repo.pp:252
    err: Try 'puppet help parser validate' for usage

Puppet parser tells me what went wrong, and which line contains the error.
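To run this check over many files before a commit, one option is a small wrapper script. The Ruby sketch below is only an illustration: it assumes the `puppet` executable is on your PATH, and the `manifests` and `failed_manifests` helpers and the `PUPPET_VALIDATE` constant are hypothetical names. The runner is injectable so the logic can be exercised without Puppet installed.

```ruby
# Keep only Puppet manifests from a list of paths.
def manifests(paths)
  paths.select { |path| File.extname(path) == '.pp' }
end

# Default runner: shell out to the command shown above.
# `system` returns true only when the command exits successfully.
PUPPET_VALIDATE = ->(file) { system('puppet', 'parser', 'validate', file) }

# Return the manifests that failed validation.
def failed_manifests(paths, runner: PUPPET_VALIDATE)
  manifests(paths).reject { |file| runner.call(file) }
end

# Validate every path given on the command line.
failures = failed_manifests(ARGV)
unless failures.empty?
  warn "Syntax check failed: #{failures.join(', ')}"
  exit 1
end
```

Exiting with a non-zero status is what lets a version control hook or CI job block the change.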

In addition, you can integrate syntax checking into your editor. Vim has built-in code compilation functionality that can be used to run error checking, so you can quickly validate your code and jump to sections of code with syntax errors. There are plugins like Syntastic that will do continuous checking so you’re immediately alerted when syntax errors are made. For example, if you used Syntastic with the same syntax error, you would see this:

[Screenshot: Syntastic error output]

Syntastic output identifies your syntax errors.

Lastly, there’s the puppet-lint tool developed by GitHub’s Tim Sharpe that will analyze your manifests and look for deviations from the Puppet style guide. It’s a quick and easy way to ensure that everybody is following a common set of conventions, so as your module collection grows, you’ll have a consistent set of modules instead of sections with cobwebs. Running puppet-lint against a manifest could produce something like the following:

    % puppet-lint init.pp
    WARNING: top-scope variable being used without an explicit namespace on line 79
    WARNING: top-scope variable being used without an explicit namespace on line 81
    WARNING: define defined inside a class on line 59
    ERROR: single quoted string containing a variable found on line 124
    WARNING: string containing only a variable on line 81
    WARNING: => on line isn't properly aligned for resource on line 71
    ERROR: two-space soft tabs not used on line 50
    WARNING: line has more than 80 characters on line 83
    WARNING: line has more than 80 characters on line 84
    ERROR: trailing whitespace found on line 163
    WARNING: mode should be represented as a 4 digit octal value on line 55

Be warned that these steps only validate syntax. If you have variables that are incorrectly spelled or have a bad value, your code will still be completely valid, yet it will not do what you want. That’s why we have additional tools available to make sure your code actually works as intended.

Writing Automated Tests

Automated testing is one of the key ways to ensure that your libraries and manifests are meeting your expectations. Out of all the ways that you could test your manifests, I’ll highlight two: testing modules and their catalogs, and testing entire systems.

Testing Modules

As mentioned in a previous post by our Release team, you can add rspec and cucumber tests to ensure that your modules are creating resources as you expect.

For example, you can write tests that ensure that when including a module to install Apache, the package Apache is installed and the service is started. If you were to further develop on that Apache module, you could move forward knowing that the tests would always ensure those basic behaviors would still exist, and if something changed by accident you would definitively know it had changed.

The puppet-apt module has been a focus of a lot of testing, and is a great example of how you can test your modules. An rspec test like this:

   it { should create_exec("apt_update")\
      .with_subscribe(['File[sources.list]','File[sources.list.d]'])\
      .with_refreshonly(true)}

would produce output like this:

  apt
    Apt class with no parameters, basic test
      should create Class["apt"]
      should create Exec["apt_update"]
 
Finished in 0.36027 seconds
2 examples, 0 failures

Writing tests is a good way to verify your modules are functional and reusable. Testing also serves as an indicator of quality, demonstrating that you have taken the time to ensure the module does what you want it to do. Additionally, you can verify modules in the Puppet Forge if they have tests, and more easily check them for correctness.

Testing Systems

Unit testing individual modules is a great step to take, but at the end of the day you want to know that running Puppet on a host will build the host the way you want, and that it will have the behavior you expect. Being able to programmatically verify that services like SSH, Postgres, and nginx are running and serving resources is powerful stuff.

You can use Cucumber in a standalone manner to ensure that if you run Puppet on a host, you get the host you asked for when all is said and done. You can couple unit tests with system tests, and run everything in a testing environment before your changes go to production systems.
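Whatever framework drives the system tests, the underlying check is often as simple as "does this service accept connections?" Here is a minimal Ruby sketch of that check using only the standard library. The `port_open?` and `down_services` helpers and the `SERVICES` port map are hypothetical names and values chosen for illustration; a real test suite would likely also verify what the service returns.

```ruby
require 'socket'

# Return true if a TCP connection to host:port succeeds within the timeout.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false # refused, unreachable, unresolvable, or timed out
end

# Hypothetical service-to-port map for the host under test.
SERVICES = { 'ssh' => 22, 'postgres' => 5432, 'nginx' => 80 }.freeze

# Return the names of services that are not accepting connections.
def down_services(host, services, checker: method(:port_open?))
  services.reject { |_name, port| checker.call(host, port) }.keys
end

# Check the host given on the command line, if any.
host = ARGV.first
if host
  down = down_services(host, SERVICES)
  puts down.empty? ? 'all services up' : "down: #{down.join(', ')}"
end
```

A check like this can run right after a Puppet run in a test environment, so a broken service is caught before the change reaches production.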

Martin Englund has blogged about his experiences with “Behavior Driven Infrastructure” (a play on Behavior Driven Development or BDD) with Cucumber, and did an excellent PuppetConf presentation about his experiences with Puppet and Cucumber.

Benefits of Testing Puppet Code

Everybody wants to have as smooth and seamless a workflow as possible. Deploying changes to your Puppet manifests only to discover that you forgot a comma or brace, or writing a manifest that can’t actually run successfully, can require debugging time that would be better spent elsewhere. Adding a few proactive tools will prevent errors from propagating out, and being able to automatically verify your systems means that you can deploy changes fearlessly and become more agile in your day-to-day operations.

Additional Resources