Variations around the execution of code blocks
31 Mar 2015 (or: A list of things that run things. Also, this post is a bit long…)
I recently stumbled upon the following quote from Umberto Eco:
The list is the origin of culture. It’s part of the history of art and literature. What does culture want? To make infinity comprehensible. It also wants to create order — not always, but often. And how, as a human being, does one face infinity? How does one attempt to grasp the incomprehensible? Through lists, through catalogs, through collections in museums and through encyclopedias and dictionaries.
Inspired by it, I decided to try to make one around a topic that I care about a lot: how to specify how to run something.
The following, then, is an enumeration of systems and tools which in the end result in the execution of a code block by means of some configuration format, DSL, HTTP request, CLI tool, etc…
I’ll use the term code block to imply that the code snippet is being defined outside of the runtime where it is intended to be run.
Also, I decided to narrow the scope down to those in the list below, and will be updating this post to cover them all eventually.
Domain | Name |
Configuration Management | Chef, Puppet, Ansible |
Infrastructure | Packer, Terraform, Cloud Init, NixOps |
Isolation | Vagrant, Docker, CoreOS ACI |
Workloads scheduling | Kubernetes, Marathon, Aurora, Heroku style, Fleet, Apcera Continuum, Consul checks |
Continuous Integration | Jenkins, Drone.io, Travis, Wercker |
Build tools, Task Runners | Make, Rake, Ant, Maven, Pants, Bazel, Gradle |
SSH based deploy tools | Capistrano, Fabric |
Configuration Management
The purpose of this set of technologies is to make changes to a server, like installing packages, creating users, or creating configuration files. A common theme in this tooling is that runs should be idempotent, meaning that each command has a check to verify whether it has already been executed, and is skipped if it has.
Chef
- Uses: A Ruby DSL
- Docs:
Chef has some resources which help in the execution of code blocks. The configuration is driven by Ruby, which means we are not limited to the declarative resources alone but can mix in plain Ruby as well.
The following is based on the example found here. We have two code blocks in this case: the first one makes a change on the server, and the second one checks whether that change has already been made, in order to keep the resource idempotent.
%w{rover fido bubbers}.each do |pet_name|
execute "feed_pet_#{pet_name}" do
command "echo 'Feeding: #{pet_name}'; touch '/tmp/#{pet_name}'"
# not_if { ::File.exists?("/tmp/#{pet_name}")}
not_if "cat /tmp/#{pet_name} | grep Feeding"
end
end
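Assuming the recipe above is saved as pets.rb (an illustrative file name), it can be run standalone with chef-apply, or as part of a cookbook via chef-client:
$ chef-apply pets.rb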
Puppet
- Uses: A special DSL
- Docs:
Puppet also tries to achieve idempotent runs of the code blocks. According to the docs:
There are three main ways for an exec to be idempotent:
- The command itself is already idempotent. (For example, apt-get update.)
- The exec has an onlyif, unless, or creates attribute, which prevents Puppet from running the command unless some condition is met.
- The exec has refreshonly => true, which only allows Puppet to run the command when some other resource is changed.
Here the execution is driven via a specialized (albeit Ruby-inspired) DSL in order to make things declarative.
An example of its usage:
$my_file_arg = '/tmp/myarg.txt'
file { $my_file_arg:
content => "Hello now",
}
exec {"Use $my_file_arg":
require => File[$my_file_arg],
command => "/bin/sed -i s/Hello/Bye/g $my_file_arg",
}
Here we have two types of code blocks:
- One expressing the contents of a file
- Another expressing the execution of a sed command
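None of the guard attributes from the quote above appear in that example, so here is a minimal sketch of using creates to make an exec idempotent (the paths are illustrative):
exec { 'extract_app':
  command => '/bin/tar -xzf /tmp/app.tar.gz -C /opt',
  creates => '/opt/app/bin/start',
}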
Ansible
- Uses: YAML
- Docs:
Ansible uses YAML for its configuration.
Example from the docs
# Execute the command in remote shell; stdout goes to the specified
# file on the remote.
- shell: somescript.sh >> somelog.txt
# Change the working directory to somedir/ before executing the command.
- shell: somescript.sh >> somelog.txt chdir=somedir/
# You can also use the 'args' form to provide the options. This command
# will change the working directory to somedir/ and will only run when
# somedir/somelog.txt doesn't exist.
- shell: somescript.sh >> somelog.txt
args:
chdir: somedir/
creates: somelog.txt
Here each one of the code blocks is executed by using the shell configuration directive, and its execution is modified by setting options like creates, which triggers an idempotency check and aborts the execution of the command if the file already exists.
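These tasks would normally live in a playbook, which is then run against an inventory of hosts with the ansible-playbook CLI (the file names here are illustrative):
$ ansible-playbook -i hosts site.yml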
Infrastructure
These days there is an increasing number of cloud APIs which streamline the acquisition of computing resources. This also means that the number of layers has increased, and thus new types of configuration and declarative approaches need to be found to orchestrate what we want to do with those resources.
A typical use case is making calls to a cloud API such as AWS or Google Compute Engine to acquire resources, then chaining the result to the execution of a code block which further configures what we want to do with the resource, or yet again persisting those changes back to create a new type of resource (a new container or instance type, for example).
Packer
- Uses: JSON
- Documentation:
Packer has a shell provisioner.
The description from the website notes:
The shell Packer provisioner provisions machines built by Packer using shell scripts. Shell provisioning is the easiest way to get software installed and configured on a machine.
As an example, we can have JSON express what we want to do with the execution of the code block:
{
"type": "shell",
"inline": ["echo foo"]
}
The execution on the remote resource, then, is driven by the JSON format.
Here is a more complex example, one that invokes Ansible.
"provisioners": [
{
"type": "shell",
"inline": [
"mkdir .ssh",
"echo '{{user `public_key`}}' >> .ssh/authorized_keys"
]
},
{
"type": "shell",
"execute_command": "echo '{{user `ssh_pass`}}' | {{ .Vars }} sudo -E -S sh '{{ .Path }}'",
"inline": [
"add-apt-repository ppa:rquillo/ansible",
"apt-get update",
"apt-get install -y ansible",
"echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers"
]
},
{
"type": "ansible-local",
"playbook_file": "site.yml"
}
]
Here the provisioners are chained sequentially.
One notable aspect is that we are now defining another sub code block named execute_command, which customizes how the original code block is executed.
To many new users, the execute_command is puzzling. However, it provides an important function: customization of how the command is executed. The most common use case for this is dealing with sudo password prompts.
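The template as a whole is then executed with the packer CLI (template.json being an illustrative file name):
$ packer build template.json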
Terraform
- Uses: The special Terraform format
- Docs:
Terraform is an interesting case since it recognizes the limitations of using JSON and YAML to drive the execution of a provisioning run.
The following is an example of applying Puppet, also taken from the docs.
# Run puppet and join our Consul cluster
resource "aws_instance" "web" {
...
provisioner "remote-exec" {
inline = [
"puppet apply",
"consul join ${aws_instance.web.private_ip}"
]
}
}
Here we are expressing that there is going to be a computing resource in AWS, and that when the resource is ready, the code block will be executed in that environment.
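There is also a local-exec provisioner, which runs the code block on the machine running Terraform rather than on the newly created resource; a minimal sketch along the same lines:
resource "aws_instance" "web" {
  ...
  provisioner "local-exec" {
    command = "echo 'web instance created' >> provision.log"
  }
}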
Cloud Config and Cloud Init
- Uses: YAML
- Docs:
Cloud config is an interesting case. Its execution relies on a convention-over-configuration approach, where anything placed under certain paths will be executed on the node.
The execution in this case is driven by YAML, as in Kubernetes.
Here is an example of using runcmd (taken from here):
#cloud-config
# run commands
# default: none
# runcmd contains a list of either lists or a string
# each item will be executed in order at rc.local like level with
# output to the console
# - if the item is a list, the items will be properly executed as if
# passed to execve(3) (with the first arg as the command).
# - if the item is a string, it will be simply written to the file and
# will be interpreted by 'sh'
#
# Note, that the list has to be proper yaml, so you have to escape
# any characters yaml would eat (':' can be problematic)
runcmd:
- [ ls, -l, / ]
- [ sh, -xc, "echo $(date) ': hello world!'" ]
- [ sh, -c, echo "=========hello world'=========" ]
- ls -l /root
- [ wget, "http://slashdot.org", -O, /tmp/index.html ]
NixOps
- Uses: Nix configuration format
- Docs:
- Site: http://nixos.org/nixops/
- Manual: http://nixos.org/nixops/manual/
- There is a paper! http://nixos.org/~eelco/pubs/charon-releng2013-final.pdf
NixOps is a super interesting solution! Here is the description that can be found on the site:
NixOps is a tool for deploying NixOS machines in a network or cloud. It takes as input a declarative specification of a set of “logical” machines and then performs any necessary actions to realise that specification: instantiate cloud machines, build and download dependencies, stop and start services, and so on. NixOps has several nice properties:
Here is an example of using it to set up MediaWiki, and below is an edited version of it. We can find that there is an installPhase block, as well as a script which is used to prepare the Postgres database.
# !!! Cut&paste, extremely ugly.
# Unpack Mediawiki and put the config file in its root directory.
mediawikiRoot = pkgs.stdenv.mkDerivation rec {
name= "mediawiki-1.15.5";
src = pkgs.fetchurl {
url = "http://download.wikimedia.org/mediawiki/1.15/${name}.tar.gz";
sha256 = "1d8afbdh3lsg54b69mnh6a47psb3lg978xpp277qs08yz15cjf7q";
};
buildPhase = "true";
installPhase =
''
mkdir -p $out
cp -r * $out
'';
};
...
jobs.init_mediawiki_db =
{ task = true;
startOn = "started postgresql";
script =
''
mkdir -p /var/lib/psql-schemas
if ! [ -e /var/lib/psql-schemas/mediawiki-created ]; then
${pkgs.postgresql}/bin/createuser --no-superuser --no-createdb --no-createrole mediawiki
${pkgs.postgresql}/bin/createdb mediawiki -O mediawiki
( echo 'CREATE LANGUAGE plpgsql;'
cat ${mediawikiRoot}/maintenance/postgres/tables.sql
echo 'CREATE TEXT SEARCH CONFIGURATION public.default ( COPY = pg_catalog.english );'
echo COMMIT
) | ${pkgs.postgresql}/bin/psql -U mediawiki mediawiki
touch /var/lib/psql-schemas/mediawiki-created
fi
'';
};
...
};
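A specification like this is then driven from the nixops CLI, roughly as follows (the file and deployment names are illustrative):
$ nixops create ./mediawiki.nix -d mediawiki
$ nixops deploy -d mediawiki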
Isolation
(Note: I am not sure if isolation would be the right word for these.)
What these do is automate the creation of another environment within a local environment by using virtualization or container technologies.
Vagrant
- Uses: A Ruby DSL (Vagrantfile)
- Docs:
Vagrant is a very popular tool which helps in the creation of local virtual environments.
Vagrant uses a Vagrantfile to specify the configuration and execution of code blocks within the created resource:
Vagrant.configure("2") do |config|
config.vm.provision "shell", run: "always" do |s|
s.inline = "echo hello"
end
end
There is also a related push option, which can be used to run code blocks locally:
config.push.define "local-exec" do |push|
push.inline = <<-SCRIPT
scp . /var/www/website
SCRIPT
end
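Both the provisioner and the push strategy are then triggered from the vagrant CLI:
$ vagrant up          # create the machine and run the provisioners
$ vagrant provision   # re-run the provisioners on a running machine
$ vagrant push        # run the push strategy defined above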
Docker
- Uses: The Dockerfile format
- Docs:
Docker uses its own basic configuration format. Maybe due to the nature of Docker layers, it emphasizes running one-liners via its RUN directive:
# Comment
RUN echo 'we are running some # of cool things'
But in the end, what will continue to run is what is defined in its ENTRYPOINT:
FROM debian:stable
RUN apt-get update && apt-get install -y --force-yes apache2
EXPOSE 80 443
VOLUME ["/var/www", "/var/log/apache2", "/etc/apache2"]
ENTRYPOINT ["/usr/sbin/apache2ctl", "-D", "FOREGROUND"]
We can see that along with the execution of the code block, the folders (volumes) and the port mappings required to execute it are also being defined.
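The image is then built and its ENTRYPOINT executed with the docker CLI (the image tag is illustrative):
$ docker build -t my-apache .
$ docker run -d -p 80:80 my-apache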
CoreOS Appc
The Application Container specification uses JSON to define an Image Manifest. The command to execute is given as a JSON array, and there is a chain of execution via event handlers.
"app": {
"exec": ["/usr/bin/reduce-worker", "--quiet"],
"user": "100",
"group": "300",
"eventHandlers": [
{
"exec": [
"/usr/bin/data-downloader"
],
"name": "pre-start"
},
{
"exec": [
"/usr/bin/deregister-worker",
"--verbose"
],
"name": "post-stop"
}
],
"workingDirectory": "/opt/work",
"environment": [
{
"name": "REDUCE_WORKER_DEBUG",
"value": "true"
}
],
The specification of the resources that the workload will be using can be found under the isolators key:
"isolators": [
{
"name": "resources/cpu",
"value": {
"request": "250",
"limit": "500"
}
},
{
"name": "resource/memory",
"value": {
"request": "1G",
"limit": "2G"
}
},
{
"name": "os/linux/capabilities-retain-set",
"value": {
"set": ["CAP_NET_BIND_SERVICE", "CAP_SYS_ADMIN"]
}
}
],
Build tools and Task runners
These have the common functionality of chaining together the execution of code blocks into steps, dependencies, or prerequisites.
Some of them also have notions of idempotency, like the configuration management tooling.
The classic example of these tools, I believe, would be make.
Make
- Uses: the Makefile format
- Docs:
- Wikipedia entry: http://en.wikipedia.org/wiki/Makefile
Borrowing the example from Wikipedia as well:
Here is a simple makefile that describes the way an executable file called edit depends on four object files which, in turn, depend on four C source and two header files.
edit : main.o kbd.o command.o display.o
cc -o edit main.o kbd.o command.o display.o
main.o : main.c defs.h
cc -c main.c
kbd.o : kbd.c defs.h command.h
cc -c kbd.c
command.o : command.c defs.h command.h
cc -c command.c
display.o : display.c defs.h
cc -c display.c
clean :
rm edit main.o kbd.o command.o display.o
We invoke a code block using make clean, which will trigger the execution of the clean code block. On the other hand, running make edit (or just make) only executes the code blocks whose target files are out of date with respect to their prerequisites, which is where Make's notion of idempotency comes from.
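Running the default target twice shows this idempotency aspect: the second run does nothing because the targets are already newer than their prerequisites (output sketched roughly as GNU make prints it; the exact wording varies by version):
$ make edit
cc -c main.c
cc -c kbd.c
cc -c command.c
cc -c display.c
cc -o edit main.o kbd.o command.o display.o
$ make edit
make: 'edit' is up to date.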
Rake
- Uses: a Ruby DSL
- Docs:
From its description:
Rake is a Make-like program implemented in Ruby. Tasks and dependencies are specified in standard Ruby syntax.
A simple example from the docs:
The following file task creates an executable program (named prog) given two object files named a.o and b.o. The tasks for creating a.o and b.o are not shown.
file "prog" => ["a.o", "b.o"] do |t|
sh "cc -o #{t.name} #{t.prerequisites.join(' ')}"
end
It is also possible to run the tasks in parallel:
multitask :highlight => FileList["listings/*"]
rule ".html" => ->(f){ FileList[f.ext(".*")].first } do |t|
sh "pygmentize -o #{t.name} #{t.source}"
end
Ant
- Uses: XML
- Docs:
According to Wikipedia:
One of the primary aims of Ant was to solve Make’s portability problems.
Below is an example from the Wikipedia entry:
<?xml version="1.0"?>
<project name="Hello" default="compile">
<target name="clean" description="remove intermediate files">
<delete dir="classes"/>
</target>
<target name="clobber" depends="clean" description="remove all artifact files">
<delete file="hello.jar"/>
</target>
<target name="compile" description="compile the Java source code to class files">
<mkdir dir="classes"/>
<javac srcdir="." destdir="classes"/>
</target>
<target name="jar" depends="compile" description="create a Jar file for the application">
<jar destfile="hello.jar">
<fileset dir="classes" includes="**/*.class"/>
<manifest>
<attribute name="Main-Class" value="HelloProgram"/>
</manifest>
</jar>
</target>
</project>
Within the XML, we can see that the tasks are chained together via depends.
To execute a script, there is an exec task, where each one of the arguments to the command line is defined via an ordered list of arg tags:
<target name="help">
<exec executable="cmd">
<arg value="/c"/>
<arg value="ant.bat"/>
<arg value="-p"/>
</exec>
</target>
Maven
- Uses: XML
- Docs:
Similar to Ant, an exec block in XML can be used:
<project>
...
<build>
<plugins>
<plugin>
...
<executions>
<execution>
...
<configuration>
<tasks>
<exec
dir="${project.basedir}"
executable="${project.basedir}/src/main/sh/do-something.sh"
failonerror="true">
<arg line="arg1 arg2 arg3 arg4" />
</exec>
</tasks>
</configuration>
...
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
Bazel
- Uses: Google magic
- Docs:
Bazel has a great FAQ which succinctly explains the issues Google had delivering its software, which resulted in the creation of the original Blaze.
Quoting it:
What is Bazel best at?
Bazel shines at building and testing projects with the following properties:
- Projects with a large codebase
- Projects written in (multiple) compiled languages
- Projects that deploy on multiple platforms
- Projects that have extensive tests
What should I not use Bazel for?
Bazel tries to be smart about caching. This means it is a bad match for build steps that should not be cached. For example, the following steps should not be controlled from Bazel:
- A compilation step that fetches data from the internet.
- A test step that connects to the QA instance of your site.
- A deployment step that changes your site’s cloud configuration.
Bazel tries to minimize expensive compilation steps. If you are only using interpreted languages directly, such as JavaScript or Python, Bazel will likely not interest you.
Bazel has a large number of rules which define how to do the builds. Here is an example of running a bash script for testing:
sh_test(
name = "foo_integration_test",
size = "small",
srcs = ["foo_integration_test.sh"],
deps = [":foo_sh_lib"],
data = glob(["testdata/*.txt"]),
)
Specifying that the test is small means that it times out after one minute.
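Another rule for running an arbitrary code block is genrule, which executes a shell command to produce declared output files; a minimal sketch (the file names are illustrative):
genrule(
    name = "concat_notes",
    srcs = ["a.txt", "b.txt"],
    outs = ["notes.txt"],
    cmd = "cat $(SRCS) > $@",
)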
Pants
- Uses: Similar BUILD syntax as Bazel
- Docs:
Description from the site:
Pants is similar to make, maven, ant, gradle, sbt, etc.; but pants pursues different design goals. Pants optimizes for:
- building multiple, dependent things from source
- building code in a variety of languages
- speed of build execution
Here is a very basic example from the readme.
python_tests(name='greet',
dependencies=[
'examples/src/python/example/hello/greet:greet',
':prep',
],
sources=globs('*.py'),
)
# Prepare for the 'greet' test. Realistically, you wouldn't set up a
# prep_command just to create an empty temp file. This is meant as a
# simple example.
prep_command(name='prep',
prep_executable='touch',
prep_args=['/tmp/prep_command_result']
)
Gradle
- Uses: Groovy based DSL
- Docs:
- DSL reference: http://gradle.org/docs/current/dsl/
- Exec: http://gradle.org/docs/current/dsl/org.gradle.api.tasks.Exec.html
- Wikipedia entry: http://en.wikipedia.org/wiki/Gradle
- Interesting read: https://gradle.org/gradle-team-perspective-on-bazel/
- Article: http://www.drdobbs.com/jvm/why-build-your-java-projects-with-gradle/240168608
According to Wikipedia:
Gradle was designed for multi-project builds which can grow to be quite large, and supports incremental builds by intelligently determining which parts of the build tree are up-to-date, so that any task dependent upon those parts will not need to be re-executed.
In the Exec example, before executing the command, the working directory where it should run is defined and, more interestingly, a stream that will capture the output.
task stopTomcat(type:Exec) {
workingDir '../tomcat/bin'
//on windows:
commandLine 'cmd', '/c', 'stop.bat'
//on linux
commandLine './stop.sh'
//store the output instead of printing to the console:
standardOutput = new ByteArrayOutputStream()
//extension method stopTomcat.output() can be used to obtain the output:
ext.output = {
return standardOutput.toString()
}
}
Continuous Integration
CI tools help in automating the creation of build artifacts and the running of tests for a project. In a sense, one could say that they are schedulers as well, though specialized in the domain of running tests and creating the steps which result in a release (batches).
Jenkins
- Uses: HTML textareas or XML.
- Docs:
Jenkins is an established open source CI solution with a large number of plugins; it is very extensible.
Although most of its usage happens through HTML forms, there is a way to define Jenkins jobs via XML. Since it is XML, the environment will be a little bit more unnatural than in other solutions: the code has to be escaped, for example, so that it contains entities which make it valid XML.
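As an illustrative sketch (the exact XML depends on the Jenkins version and the plugins installed), the config.xml of a freestyle job with a shell build step looks roughly like this, with the code block living inside an escaped command element:
<project>
  <builders>
    <hudson.tasks.Shell>
      <command>make test &amp;&amp; echo &quot;done&quot;</command>
    </hudson.tasks.Shell>
  </builders>
</project>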
Drone.io
- Uses: An HTML textarea and checkboxes
- Docs:
Via the UI, it is possible to specify the environment variables and then within a text area write the script to be executed.
Travis
- Uses: YAML
- Docs:
Travis is a great CI-as-a-service solution (which is also open source).
Configuration is done via a .travis.yml file located at the root of the repository. In the example from the docs below, we have two code blocks: one that defines a list of install steps which provision the environment so that the script code block can be executed successfully.
install:
- bundle install --path vendor/bundle
- npm install
script: bundle exec thor build
Wercker
- Uses: YAML
- Docs:
From the docs:
The wercker.yml file is a configuration file that specifies how your build and deploy pipelines should be run and which steps should be executed in those pipelines.
And a full example below. As part of a Wercker build pipeline, we can find a series of steps which will be run sequentially. The box option defines the environment, and then code is used to define the code block.
# use the default golang container from Docker Hub
box: golang
# The steps that will be executed in the build pipeline
build:
steps:
# golint step!
- wercker/golint
# Build the project
- script:
name: go build
code: |
go build ./...
# Test the project
- script:
name: go test
code: |
go test ./...
# The steps that will be executed in the deploy pipeline
deploy:
steps:
# Execute the s3sync deploy step, a step provided by wercker
- s3sync:
key_id: $AWS_ACCESS_KEY_ID
key_secret: $AWS_SECRET_ACCESS_KEY
bucket_url: $AWS_BUCKET_URL
source_dir: build/
SSH based deploy tools
What these tools do is automate the remote (and possibly parallel) execution of commands on a number of servers.
Capistrano
- Uses: A Ruby DSL
- Docs:
With Capistrano we can define a list of servers on which we want to execute a task (the commands themselves are run via execute).
server 'example.com', roles: [:web, :app]
server 'example.org', roles: [:db, :workers]
desc "Report Uptimes"
task :uptime do
on roles(:all) do |host|
execute :any_command, "with args", :here, "and here"
info "Host #{host} (#{host.roles.to_a.join(', ')}):\t#{capture(:uptime)}"
end
end
Fabric
- Uses: Python
- Docs:
Similar to Capistrano, here we define a list of the servers and then use a command line tool to execute actions on them.
from fabric.api import run, env
env.hosts = ['host1', 'host2']
def taskA():
run('ls')
def taskB():
run('whoami')
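The tasks are then invoked from the command line with the fab tool, which runs them against every host in env.hosts:
$ fab taskA taskB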
Workloads scheduling
Once the desired infrastructure has been defined, maybe by building upon the technologies in the list above, it is possible to create another abstraction around the computing resources, so that those running a workload can focus on what should be executed rather than on detailing how to prepare the necessary infrastructure for it. These tools are usually referred to as PaaS systems, or, for the simpler ones, just schedulers. Thanks to Mesos there is an increasing number of these, so I am only covering some of them.
Kubernetes
- Uses: YAML (or JSON)
- Docs:
In the case of Kubernetes, the execution is driven via a YAML file.
A couple of examples below:
Example: An Nginx service
- Explicitly say it is a Service
- Describe the ports it will use
- Set a constraint about where to run the service
kind: Service
apiVersion: v1beta1
id: nginx-example
# the port that this service should serve on
port: 8000
# just like the selector in the replication controller,
# but this time it identifies the set of pods to load balance
# traffic to.
selector:
name: nginx
# the container on each pod to connect to, can be a name
# (e.g. 'www') or a number (e.g. 80)
containerPort: 80
It is not very clear what it is running, but it seems that an internal containerPort will be exposed as port 8000, and that traffic will only be routed to pods labeled as running nginx workloads.
The full example is here.
Example: A workload with a Healthcheck
Here we have a container that has a livenessProbe, which can be either a command or an HTTP request.
There are two code blocks: the liveness-exec container, which writes ok into /tmp/health (and, after a while, fail), and its liveness probe, which is another code block that will periodically run cat /tmp/health.
apiVersion: v1beta1
desiredState:
manifest:
containers:
- image: busybox
name: liveness
livenessProbe:
exec:
command:
- "cat"
- "/tmp/health"
initialDelaySeconds: 15
command:
- "/bin/sh"
- "-c"
- "echo ok > /tmp/health; sleep 10; echo fail > /tmp/health; sleep 600"
id: liveness-exec
version: v1beta1
id: liveness-exec
kind: Pod
labels:
test: liveness
We can already see some of the limitations of deciding to use YAML for this, since it looks unnatural that a command now has to be broken apart and fitted into an array structure using YAML lists.
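The manifest is then handed over to the API server through the CLI; assuming the pod definition above is saved as liveness-exec.yaml:
$ kubectl create -f liveness-exec.yaml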
Marathon
- Uses: JSON
- Docs:
In Marathon, scheduling of workloads is done via JSON payloads sent to an HTTP API.
Here is an example of starting a couple of code blocks: one which does a health check and another which is the job itself.
{
"id": "bridged-webapp",
"cmd": "python3 -m http.server 8080",
"cpus": 0.25,
"mem": 64.0,
"instances": 2,
"container": {
"type": "DOCKER",
"docker": {
"image": "python:3",
"network": "BRIDGE",
"portMappings": [
{ "containerPort": 8080, "hostPort": 0, "servicePort": 9000, "protocol": "tcp" },
{ "containerPort": 161, "hostPort": 0, "protocol": "udp"}
]
}
},
"healthChecks": [
{
"protocol": "HTTP",
"portIndex": 0,
"path": "/",
"gracePeriodSeconds": 5,
"intervalSeconds": 20,
"maxConsecutiveFailures": 3
},
{
"protocol": "COMMAND",
"command": { "value": "curl -f -X GET http://$HOST:$PORT" },
"gracePeriodSeconds": 5,
"intervalSeconds": 20,
"maxConsecutiveFailures": 3
}
]
}
Via the JSON configuration we are able to transparently modify the execution of the code block and express that it should be run using a runtime which has python:3 and a certain number of ports open.
For the health check code block, the path and the one-liner that should be executed to decide whether the job is healthy are defined. It is also expressed that after 3 consecutive failures something will happen, though what exactly is not expressed in the configuration.
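The JSON itself is submitted to Marathon's HTTP API; a sketch assuming it is saved as bridged-webapp.json and Marathon is listening on its default port 8080:
$ curl -X POST http://localhost:8080/v2/apps \
    -H "Content-Type: application/json" \
    -d @bridged-webapp.json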
Aurora
- Uses: A sophisticated DSL in Python (according to the description in the readme)
- Docs: https://github.com/apache/aurora
Aurora is another Mesos based scheduler to execute code blocks.
An example from the docs is below.
pkg_path = '/vagrant/hello_world.py'
# we use a trick here to make the configuration change with
# the contents of the file, for simplicity. in a normal setting, packages would be
# versioned, and the version number would be changed in the configuration.
import hashlib
with open(pkg_path, 'rb') as f:
pkg_checksum = hashlib.md5(f.read()).hexdigest()
# copy hello_world.py into the local sandbox
install = Process(
name = 'fetch_package',
cmdline = 'cp %s . && echo %s && chmod +x hello_world.py' % (pkg_path, pkg_checksum))
# run the script
hello_world = Process(
name = 'hello_world',
cmdline = 'python hello_world.py')
# describe the task
hello_world_task = SequentialTask(
processes = [install, hello_world],
resources = Resources(cpu = 1, ram = 1*MB, disk=8*MB))
jobs = [
Service(cluster = 'devcluster',
environment = 'devel',
role = 'www-data',
name = 'hello_world',
task = hello_world_task)
]
The Aurora documentation has a helpful section about what is being defined in the example:
What’s Going On In That Configuration File?
More than you might think.
From a “big picture” viewpoint, it first defines two Processes. Then it defines a Task that runs the two Processes in the order specified in the Task definition, as well as specifying what computational and memory resources are available for them. Finally, it defines a Job that will schedule the Task on available and suitable machines. This Job is the sole member of a list of Jobs; you can specify more than one Job in a config file.
At the Process level, it specifies how to get your code into the local sandbox in which it will run. It then specifies how the code is actually run once the second Process starts.
Fleet
- Uses: The same unit-file style as systemd
- Docs:
The CoreOS guide has a good example of how to customize the way a container is run on it:
[Unit]
Description=My Apache Frontend
After=docker.service
Requires=docker.service
[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill apache1
ExecStartPre=-/usr/bin/docker rm apache1
ExecStartPre=/usr/bin/docker pull coreos/apache
ExecStart=/usr/bin/docker run -rm --name apache1 -p 80:80 coreos/apache /usr/sbin/apache2ctl -D FOREGROUND
ExecStop=/usr/bin/docker stop apache1
[X-Fleet]
Conflicts=apache@*.service
By using ExecStartPre, the lines of a code block accumulate and are executed before running the container which has the Apache service.
It is also specified, via the Conflicts option, that this code block should not run on the same machine as other apache@ units (more options here).
Heroku
- Uses: Procfiles and Buildpacks
- Docs:
Heroku is actually no longer just a hosting option, but a set of practices which inspired other technologies like Flynn, Deis, Dokku and Cloud Foundry.
In the case of Flynn, the code block execution is done via Procfiles (link).
A Procfile-based application modifies the execution of a code block by prepending a tag to the start command. For example:
$ cat Procfile
web: node web.js
In order to modify the environment where that command will be run, buildpacks are used. This is done by calling 3 possible other code blocks: detect, compile and release (docs).
- detect sends to stdout the type of application
- compile makes changes to the environment which will be persisted for code blocks run later in the same environment
- release communicates YAML back to the platform for later reuse
For example, from the Clojure buildpack:
cat <<EOF
---
config_vars:
default_process_types:
  web: lein trampoline run
EOF
Atlas
- Uses: JSON
Atlas is a gestalt of all the products from HashiCorp, which in the end runs a workload on a specified infrastructure.
Below is an example of how something is run (taken from the docs here).
{
"builders": [{
"type": "amazon-ebs",
"access_key": "ACCESS_KEY_HERE",
"secret_key": "SECRET_KEY_HERE",
"region": "us-east-1",
"source_ami": "ami-de0d9eb7",
"instance_type": "t1.micro",
"ssh_username": "ubuntu",
"ami_name": "atlas-example {{timestamp}}"
}],
"push": {
"name": "<username>/example-build-configuration"
},
"provisioners": [
{
"type": "shell",
"inline": [
"sleep 30",
"sudo apt-get update",
"sudo apt-get install apache2 -y"
]
}],
"post-processors": [
{
"type": "atlas",
"artifact": "<username>/example-artifact",
"artifact_type": "aws.ami",
"metadata": {
"created_at": "{{timestamp}}"
}
}
]
}
Apcera Continuum
- Uses: The same configuration format as gnatsd
- Docs:
Continuum is one of my favorite platforms today. It is very futuristic, waaaay ahead of anything else that exists today.
Not only is it possible to specify directives to modify how something is run, it is possible to script the interactions from the platform itself!
To define what is being executed or packaged (example), build blocks are used:
environment { "PATH": "/opt/apcera/go1.3.linux-amd64/bin:$PATH",
"GOROOT": "/opt/apcera/go1.3.linux-amd64",
"GOPATH": "/opt/apcera/go" }
build (
export GOPATH=/opt/apcera/go
(
sudo mkdir -p $GOPATH
sudo chown -R `id -u` $GOPATH
cd $GOPATH
mkdir -p src bin pkg
)
export INSTALLPATH=/opt/apcera/go1.3.linux-amd64
tar -zxf go1.3.linux-amd64.tar.gz
sudo mkdir -p ${INSTALLPATH}
sudo cp -a go/. ${INSTALLPATH}
# Install godeps
export PATH=$INSTALLPATH/bin:$PATH
export GOROOT=$INSTALLPATH
go get github.com/apcera/godep
)
And for the execution of a code block, options like start_cmd and resources are used.
# The command to start the app. If unset the stager will
# attempt to auto detect the start command based on the
# app framework used.
start_cmd: "bundle exec rackup config.ru -p $PORT"
# Resources allocated to the job.
resources {
# CPU allocated to the job. Calculated in ms/s.
# Default: 0, uncapped
cpu: "0"
# Disk space to allow for the application.
# Default: 1024MB
disk_space: "768MB"
# Memory the job can use.
# Default: 256MB
memory: "256MB"
# Network bandwidth allocated to the job.
# Default: 5Mbps
network_bandwidth: "10Mbps"
}
Also interesting is that the platform makes it possible to parameterize files with information about how the job is being run.
In the example below, uuid and name are pieces of information that come directly from the platform.
# Link: https://github.com/apcera/continuum-sample-apps/blob/master/example-ruby-manifest/app.rb#L18
get "/template" do
"scalars:<br />
uuid: {{uuid}}<br />
name: {{name}}<br />
num_instances: {{num_instances}}<br />
cpu: {{cpu}}<br />
memory: {{memory}}<br />
disk: {{disk}}<br />
...*edited*...
"
end
Cron
- Uses: Cron configuration format
Just for completeness, the classic cron syntax. From Wikipedia:
The following specifies that the Apache error log clears at one minute past midnight (00:01) of every day of the month, or every day of the week, assuming that the default shell for the cron user is Bourne shell compliant:
1 0 * * * printf > /var/log/apache/error_log
Consul
- Uses: JSON
- Docs:
More of an honorable mention: Consul, as part of its monitoring features, also periodically executes checks (similar to the liveness probe functionality from Kubernetes, for example). It is interesting that HashiCorp decoupled the health checks from the other parts of their solution.
{
"checks": [
{
"id": "chk1",
"name": "mem",
"script": "/bin/check_mem",
"interval": "5s"
},
{
"id": "chk2",
"name": "/health",
"http": "http://localhost:5000/health",
"interval": "15s"
},
{
"id": "chk3",
"name": "cpu",
"script": "/bin/check_cpu",
"interval": "10s"
},
...
]
}
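The checks are picked up by the agent from its configuration directory (they can also be registered via the HTTP API); for example, with the JSON above saved in a file under the directory passed with -config-dir:
$ consul agent -data-dir /tmp/consul -config-dir /etc/consul.d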
Remarks
Again, what I find interesting about all of these systems and tools is that they are variations around the same idea: wrap some configuration around the execution of a code block to transparently add some behavior to it.
It is impressive that there are so many different takes on this even though, in essence, what is happening is more or less the same.
As an alternative, see for example what is being done in the Jepsen tests, where there are no code blocks and they have been assimilated into the code itself.
(defn db []
(let [running (atom nil)] ; A map of nodes to whether they're running
(reify db/DB
(setup! [this test node]
; You'll need debian testing for this, cuz etcd relies on go 1.2
(debian/install [:golang :git-core])
(c/su
(c/cd "/opt"
(when-not (cu/file? "etcd")
(info node "cloning etcd")
(c/exec :git :clone "https://github.com/coreos/etcd")))
(c/cd "/opt/etcd"
(when-not (cu/file? "bin/etcd")
(info node "building etcd")
(c/exec (c/lit "./build"))))
Something to note as well is that there is a spectrum of repeated functionality among the tools, e.g. chaining (a task that is executed after another task) or piping the output from a command to wrap some logic around it, which makes me wonder whether eventually there will be a tool which picks up the best parts of them and offers the same functionality in an agnostic way.