Writing Drupal 7 input filters with hook_filter_info and DOMDocument

Lately I've been working on a new Drupal 7 theme for the university, and while configuring the editor we ran into a few issues with adding classes to some DOM elements. Primarily, we don't want middle users to be able to add classes or other attributes that we haven't approved. We've got a set of predefined classes that they can use in their content, and the wysiwyg_filter module pretty much takes care of enforcing that.

I also needed to provide a filter that adds classes to a few different DOM elements a user might include in a page. For instance, we want to add the img-responsive class to every image a middle user adds to the site, so images are responsive by default. Drupal's built-in input filter system makes it pretty simple to set everything up.

You start with implementing the hook_filter_info() hook with some information about your filter:

/**
 * Implements hook_filter_info().
 */
function mymodule_filter_info() {
  $filters = array();
  $filters['filter_image_classes'] = array(
    'title' => t('Adds default classes to images'),
    'process callback' => '_mymodule_images_process',
  );
  return $filters;
}

After that we need to define the process callback for the filter.

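function _mymodule_images_process($text, $filter) {
  if (!empty($text)) {
    $dom = new DOMDocument();
    // Parse the markup first; without this the document has no <img> nodes.
    // Note: saveHTML() serializes a whole document. Drupal's filter_dom_load()
    // and filter_dom_serialize() helpers return just the snippet instead.
    $dom->loadHTML($text);
    $images = $dom->getElementsByTagName('img');
    foreach ($images as $image) {
      // Put the default classes in front of any classes already on the image.
      $classes = 'img-responsive other-class';
      $existing_classes = $image->getAttribute('class');
      if (!empty($existing_classes)) {
        $classes .= ' ' . $existing_classes;
      }
      $image->setAttribute('class', $classes);
    }
    return $dom->saveHTML();
  }
  return $text;
}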

That's it. Once you add the filter on your input formats page and clear the cache, you have a filter that automatically adds the img-responsive class to every image a middle user creates.

Using Puppet, git, and Vagrant to build Drupal Development Environments

Last Saturday I had a chance to hang out with a group of awesome Drupal folks at the Charlotte Drupal Drive In. It was great chatting with everyone and talking about a few tools I use on a daily basis. One thing I talked about briefly was how I manage my development environments, and how I use Vagrant, git, and Puppet to help me keep everything in order.

First, you need to download and install Vagrant and whatever virtual machine provider you want to use. I prefer VirtualBox, but this also works with most other VM providers, as well as with AWS and DigitalOcean.

After you have everything set up, you should have the vagrant command available in your shell. The command lets you manage your VMs really easily; you can learn more about the command line interface by reading the documentation.
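As a rough illustration of the day-to-day commands, here is a small shell sketch. The function name vm_check is made up for this example; the vagrant subcommands it calls (up, status, and ssh with -c) are part of the standard CLI. Any arguments, such as a machine name, are passed straight through.

```shell
# vm_check: hypothetical helper, not part of Vagrant.
# Boots the machine(s), prints their state, then runs `hostname` over SSH
# to confirm the guest is reachable. Stops at the first command that fails.
vm_check() {
  vagrant up "$@" &&
    vagrant status "$@" &&
    vagrant ssh "$@" -c 'hostname'
}
```

Run it from the directory that holds your Vagrantfile, for example as vm_check on its own for a single-VM setup.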

After you've figured out how to use the command line interface for Vagrant, the next step is defining a Vagrantfile for your environment. You can either use the vagrant init command or write one yourself. The file that vagrant init generates is a decent boilerplate for building your Vagrantfile. If you clean up the commented code, you should have something like this:

# -*- mode: ruby -*-
# vi: set ft=ruby :

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
    config.vm.box = "base"
end

This is all you need to bring up a barebones VM: if you run vagrant up, you should have a VM up and running soon. We're going to make a few changes to it.

Now that we have a Vagrantfile, let's start to think about an architecture for our development environment. I typically like to have a VM for my web server with the hostname web.local, a VM for my database server, db.local, and a VM for memcache, memcache.local. We'll get into why they are named like that later on when I talk about Puppet. We also need to define a network between all of these boxes. It's going to look something like this:

web.local ->
db.local ->
memcache.local ->

Now we can take our Vagrantfile and make it look something like this:

# -*- mode: ruby -*-
# vi: set ft=ruby :

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
    config.vm.define :web do |web_config|
        web_config.vm.box = "precise64"
        web_config.vm.hostname = 'web'
        web_config.vm.network "private_network", ip: ""
    end

    config.vm.define :db do |db_config|
        db_config.vm.box = "precise64"
        db_config.vm.hostname = 'db'
        db_config.vm.network "private_network", ip: ""
    end

    config.vm.define :memcache do |memcache_config|
        memcache_config.vm.box = "precise64"
        memcache_config.vm.hostname = 'memcache'
        memcache_config.vm.network "private_network", ip: ""
    end
end

Now if you run vagrant up, Vagrant will bring up all 3 virtual machines, set their hostnames, and create the private network. Running vagrant up [box] does the same for just that one box.

We also want to run a few commands on the VM when the user runs vagrant up, which runs the provisioners the first time a machine is created (vagrant provision runs them again on demand). To do this we'll use the config.vm.provision option to tell the shell to run a few commands:

[CONFIG].vm.provision :shell, :inline => "/usr/bin/apt-get update"

We also want to forward some ports so we can access the VM from the host computer. We'll use the config.vm.network config options to do this.

[CONFIG].vm.network :forwarded_port, guest: 80, host: 8085

So with this, we're mapping http://localhost:8085 on the host to port 80 on the VM.

Now we need to start talking about how I use git to manage all of this. I start by versioning my Vagrantfile and creating a few subdirectories. The structure should look something like this:


Because I'm on a Mac I can use NFS mounts to mount share/ and share/www to specific places on the virtual machine. For instance, if I'm working on a site on my VM, I will typically mount share/www/docroot to /var/www/docroot on the VM, or wherever the vhost file says it should be.

[CONFIG].vm.synced_folder "share/www/docroot", "/var/www/docroot", type: "nfs"

This lets me keep all of my development work in my vim/tmux session on the host machine. Now, because I'm using git to manage my environment, I can use branches and tags to version it and work on a lot of different tasks. It also makes it really easy to share your environment with others and bring up development boxes extremely quickly.

The final part here is using Puppet and several Puppet modules to maintain the software on each VM. Since we're using descriptive hostnames on each box, we can build our Puppet manifest in a way that automates deploying code to different boxes.

You're going to have to start by creating a folder structure in your repo and adding a few more config options to the Vagrantfile. First, you should add a folder for manifests and modules to your repo and create a default.pp file:
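A minimal sketch of that layout in shell, assuming only the paths named in this post (share/www, manifests/default.pp, and modules/). The function name scaffold_env is hypothetical, and your repo may well carry more than this.

```shell
# scaffold_env: hypothetical helper that creates the directories this post
# refers to inside the given repo root (defaults to the current directory).
scaffold_env() {
  root="${1:-.}"
  mkdir -p "$root/share/www" "$root/manifests" "$root/modules" || return 1
  # Create an empty default manifest only if one is not already there.
  [ -e "$root/manifests/default.pp" ] || : > "$root/manifests/default.pp"
}
```

Running it a second time is harmless: existing directories are left alone and an existing default.pp is not overwritten.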


I prefer to use git submodules to manage my Puppet modules. You should be able to just add the module from within the repo like this:

git submodule add https://github.com/puppetlabs/puppetlabs-stdlib.git modules/stdlib
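One thing to keep in mind is that a plain clone of the environment repo does not fetch the contents of its submodules. A sketch of a clone helper follows; the function name clone_env and its two arguments are hypothetical, while the git commands themselves are standard.

```shell
# clone_env: hypothetical helper. Clones the environment repo and then
# checks out every submodule (the Puppet modules under modules/).
clone_env() {
  repo_url="$1"
  target_dir="$2"
  git clone "$repo_url" "$target_dir" &&
    git -C "$target_dir" submodule update --init --recursive
}
```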

Then we need to tell Vagrant that we're using Puppet and that it should load the default.pp file by default. We also want to pass some custom facts to Facter, a tool that comes with Puppet.

[CONFIG].vm.provision :puppet do |puppet|
    puppet.module_path = "modules"
    puppet.manifests_path = "manifests"
    puppet.manifest_file  = "default.pp"
    puppet.facter = {
        'fqdn' => '[FQDN]',
    }
end

Then all you need to do is define some nodes in your default.pp file and start using puppet.

node 'web.local' {

    exec { 'apt-update':
        command => "/usr/bin/apt-get update"
    }

    exec { 'apt-upgrade':
        command => "/usr/bin/apt-get upgrade -y"
    }

    package { 'zsh':
        ensure => 'latest',
        require => [Exec['apt-update'], Exec['apt-upgrade']],
    }

    host { $fqdn:
        ensure => 'present',
        ip => ''
    }
}

I also tend to set up classes for each functional part of the Puppet manifest. So, I would have something like manifests/classes/common.pp that configures my user info, some basic packages needed for the box, and other random things. I would have something like this set up:

# In manifests/classes/common.pp
class common {

    exec { 'apt-update':
        command => "/usr/bin/apt-get update"
    }

    exec { 'apt-upgrade':
        command => "/usr/bin/apt-get upgrade -y"
    }

    host { $fqdn:
        ensure => 'present',
        ip => ''
    }
}

# In manifests/default.pp
import "classes/*"

node 'web.local' {
    include common
}

node 'db.local' {
    include common
}

node 'memcache.local' {
    include common
}

That's about it. You can configure Puppet to build the VMs pretty much however you want, and you can extend Puppet with different modules. There are also plugins for Vagrant that extend its functionality and let you do a fair number of cool things.

Messing around with Sass @each loops

Loops are one of the best parts of Sass (Syntactically Awesome Style Sheets). They let you define a list and loop over each item in it.

For example, I've got this:

h1 {
    font-weight: normal;

    $font-sizes: (handhelds 120%, medium-screens 150%);
    @each $size in $font-sizes {
        @include respond-to(nth($size, 1)) {
            font-size: nth($size, 2);
        }
    }
}

That code will generate something like:

h1 { font-weight: normal; }
@media (max-width: 480px) {
    h1 {
        font-size: 120%;
    }
}
@media (max-width: 767px) {
    h1 {
        font-size: 150%;
    }
}

It's really cool how you can use commas inside a list to split it into sub-lists and then use the nth() function to grab an item from each sub-list.

How to make fail2ban bans persistent

I've recently started using fail2ban more to ban suspicious traffic on my web servers. It's great because it watches logs, and if an entry matches a regular expression it performs an action on the IP address from that entry. You can make the actions do pretty much anything, but typically the action is an iptables rule that bans the address. The problem is that when you restart the fail2ban service, fail2ban clears the chain for the filter and parses only the current log for matches, not the rotated logs. So any IPs that were banned before logrotate rotated the old log are no longer banned.

You can make the bans persistent by setting up a blacklist and automatically loading it when fail2ban is restarted. First, you need to create a file to store blacklisted IPs.

sudo touch /etc/fail2ban/ip.blacklist

Then you can either edit the /etc/fail2ban/action.d/iptables-multiport.conf file or make a copy of it. I prefer to make a copy because I version all of my configs.

The action config file has a few different directives; we want to focus on two, actionstart and actionban. First, when fail2ban bans an IP, we want to not only ban it but also add the IP address to the ip.blacklist file.

actionban = iptables -I fail2ban-<name> 1 -s <ip> -j DROP
            echo <ip> >> /etc/fail2ban/ip.blacklist

Then we want to be sure that the iptables rules are added back when fail2ban is started, so we add the last line below to the actionstart directive:

actionstart = iptables -N fail2ban-<name>
              iptables -A fail2ban-<name> -j RETURN
              iptables -I INPUT -p <protocol> -m multiport --dports <port> -j fail2ban-<name>
              cat /etc/fail2ban/ip.blacklist | while read IP; do iptables -I fail2ban-<name> 1 -s $IP -j DROP; done

That's it. Once you restart fail2ban, it will automatically ban all of the IPs in your ip.blacklist file.
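One side effect of the actionban line above is that the same address is appended every time it gets banned again, so the file slowly fills with duplicates. A possible guard, sketched as a shell function: the name blacklist_add is hypothetical, and the file path is passed in rather than hard-coded.

```shell
# blacklist_add: hypothetical helper. Appends an IP to the blacklist file
# only when that exact line is not already present.
blacklist_add() {
  ip="$1"
  file="$2"
  # -q: quiet, -x: match the whole line, -F: treat the IP as a fixed string
  # (so the dots are not regex wildcards).
  grep -qxF -- "$ip" "$file" 2>/dev/null || echo "$ip" >> "$file"
}
```

The same grep-then-append idea could be written inline in the action file in place of the plain echo, with <ip> as the argument and /etc/fail2ban/ip.blacklist as the file.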

Managing site maintenance with Varnish 3.x

Recently I had to push out a few updates to a site that required a few big interface changes that I didn't want the public to see while I was making them. The application is running under Apache and we're using Varnish 3.x as a reverse proxy.

I wanted to have a whitelist of IPs that can access the site, and to display a custom error page to everyone else letting them know that the site is undergoing maintenance. If I were only running Apache I could do it easily in the vhost for the site, but we're using Varnish, so we need to stop the request as soon as it hits the server. I could block traffic to ports 80 and 443 with iptables, but then the end user wouldn't see a message telling them that the site is under maintenance.

Varnish makes this really easy. All you have to do is define an access control list and populate it with the IP addresses of the machines you want Varnish to whitelist.

acl admins {
    ""; # The IP address of my machine
}

And inside of your sub vcl_recv function you would put a check in to make sure that the client.ip is not included in the admins acl.

if (!(client.ip ~ admins)) {
    error 503 "Service Unavailable";
}

Finally, we need to display a custom error message to the end user. Because we're using the 503 error code, we can use the vcl_error subroutine to generate a page to return to the user.

sub vcl_error {
    if (obj.status == 503) {
        set obj.http.Content-Type = "text/html; charset=utf-8";
        synthetic {"
            <!DOCTYPE html>
            <html>
                <head><title>Site Maintenance</title></head>
                <body>
                    <h1>We're doing some maintenance!</h1>
                    <p>This site will be back shortly, we're doing a bit of maintenance.</p>
                </body>
            </html>
        "};
    }
    return (deliver);
}

So, there we go: we have our acl defined with the IPs that Varnish will talk to, our check that client.ip is in that acl, and finally our error message. You can put anything in there and even load it from a file if needed.
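To check the setup from a machine that is not on the whitelist, something like the following shell sketch can help. The function name maintenance_active is hypothetical; the curl options it uses (-s, -o, and -w with %{http_code}) are standard.

```shell
# maintenance_active: hypothetical helper. Succeeds (exit 0) when the URL
# answers with HTTP 503, which is what the vcl_recv check returns to
# clients outside the admins acl.
maintenance_active() {
  status="$(curl -s -o /dev/null -w '%{http_code}' "$1")"
  [ "$status" = "503" ]
}
```

From a whitelisted machine the same call should fail, since the request is passed through to the site as usual.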