Building Your First Argo Workflows Pipeline

As a Staff Software Engineer specializing in Site Reliability Engineering (SRE) and pipelines, I’ve found Argo Workflows to be an excellent tool for orchestrating complex, parallel workflows on Kubernetes. In this article, I’ll guide you through creating your first Argo Workflows pipeline, complete with code snippets and explanations.

What is Argo Workflows?

Argo Workflows is a container-native workflow engine for orchestrating parallel jobs on Kubernetes. It allows you to define complex workflows as a series of steps, each running in its own container.

Getting Started

First, ensure you have a Kubernetes cluster and Argo Workflows installed. Then, let’s create a simple pipeline that demonstrates some key features.

Our First Pipeline

We’ll create a pipeline that does the following:

  1. Generates a random number
  2. Performs two parallel operations on that number
  3. Combines the results
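Before reading the YAML, it can help to see the same arithmetic as plain Python. This is only a local sketch, not something Argo runs: the function names mirror the templates defined in this article, and the `seed` parameter is an assumption added here so the run is repeatable. In the workflow, both parallel operations double the number and the final step adds the two results.

```python
import random


def generate(rng):
    """Step 1: pick a random number between 1 and 100."""
    return rng.randint(1, 100)


def process(value):
    """Step 2: the operation each parallel branch applies (doubling)."""
    return value * 2


def combine(a, b):
    """Step 3: add the two branch results."""
    return f"Result: {a + b}"


def run_pipeline(seed=None):
    # seed is a hypothetical convenience for repeatable runs.
    rng = random.Random(seed)
    number = generate(rng)
    return number, combine(process(number), process(number))


number, result = run_pipeline(seed=1)
print(number, result)
```

Because both branches double the same number, the combined result is always four times the generated value.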

Here’s the YAML definition for our workflow:

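apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: my-first-workflow-
spec:
  entrypoint: my-pipeline
  templates:
  - name: my-pipeline
    steps:
    # Each "- -" starts a new group; groups run one after another.
    - - name: generate-number
        template: generate
    # Steps inside the same group run in parallel.
    - - name: process-a
        template: process
        arguments:
          parameters:
          - name: input
            value: "{{steps.generate-number.outputs.result}}"
      - name: process-b
        template: process
        arguments:
          parameters:
          - name: input
            value: "{{steps.generate-number.outputs.result}}"
    - - name: combine
        template: combine
        arguments:
          parameters:
          - name: a
            value: "{{steps.process-a.outputs.result}}"
          - name: b
            value: "{{steps.process-b.outputs.result}}"

  - name: generate
    script:
      image: python:alpine3.6
      command: [python]
      source: |
        import random
        print(random.randint(1, 100))

  - name: process
    inputs:
      parameters:
      - name: input
    script:
      image: python:alpine3.6
      command: [python]
      # Script templates receive no command-line arguments, so the input
      # is substituted into the source instead of being read from sys.argv.
      source: |
        value = int("{{inputs.parameters.input}}")
        print(value * 2)

  - name: combine
    inputs:
      parameters:
      - name: a
      - name: b
    script:
      image: python:alpine3.6
      command: [python]
      source: |
        a = int("{{inputs.parameters.a}}")
        b = int("{{inputs.parameters.b}}")
        print(f"Result: {a + b}")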

Let’s break down this pipeline:

  1. Workflow Structure: The workflow is defined using Kubernetes-style YAML. It consists of a series of templates, each defining a step in our pipeline.
  2. Entrypoint: The entrypoint field specifies which template to run first. In this case, it’s my-pipeline.
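  3. Steps: The my-pipeline template defines the sequence of steps. In Argo, steps is a list of lists: each outer item (written with a double dash, - -) starts a new group, the groups run one after another, and the steps inside a group run in parallel.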
  4. Generate Number: The first step uses the generate template to create a random number between 1 and 100.
  5. Parallel Processing: The next step runs two processes in parallel (process-a and process-b), both using the process template. They take the generated number as input and double it.
  6. Combine Results: The final step uses the combine template to add the results from the parallel processes.
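  7. Templates: Each of the script templates (generate, process, combine) specifies a container image and an inline Python script to run.
  8. Passing Data: Whatever a script prints to standard output is captured as its outputs.result. Later steps reference that value with the {{steps.step-name.outputs.result}} syntax and pass it on as an input parameter.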
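The execution order and data passing described above can be modeled in a few lines of Python. This is a rough sketch of the semantics, not Argo's implementation: `run_steps`, `resolve`, and the lambda stand-ins for the templates are hypothetical names, and the generated number is fixed at 21 so the run is repeatable. Groups run one after another, steps within a group run in parallel threads, and each step's output is a string substituted into later placeholders.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Matches placeholders such as {{steps.generate-number.outputs.result}}.
PLACEHOLDER = re.compile(r"\{\{steps\.([\w-]+)\.outputs\.result\}\}")


def resolve(value, results):
    """Replace each placeholder with the named step's captured output."""
    return PLACEHOLDER.sub(lambda match: results[match.group(1)], value)


def run_steps(step_groups, templates):
    """Run groups sequentially; run the steps inside each group in parallel."""
    results = {}  # step name -> output string, playing the role of outputs.result
    for group in step_groups:
        with ThreadPoolExecutor() as pool:
            futures = {}
            for step in group:
                args = [resolve(v, results) for v in step.get("arguments", [])]
                futures[step["name"]] = pool.submit(templates[step["template"]], *args)
        # Leaving the "with" block waits for every step in the group to finish.
        for name, future in futures.items():
            results[name] = future.result()
    return results


# Stand-ins for the script templates. Inputs and outputs are strings,
# which is why the values are converted with int() before doing arithmetic.
templates = {
    "generate": lambda: "21",  # fixed instead of random, for repeatability
    "process": lambda value: str(int(value) * 2),
    "combine": lambda a, b: f"Result: {int(a) + int(b)}",
}

step_groups = [
    [{"name": "generate-number", "template": "generate"}],
    [
        {"name": "process-a", "template": "process",
         "arguments": ["{{steps.generate-number.outputs.result}}"]},
        {"name": "process-b", "template": "process",
         "arguments": ["{{steps.generate-number.outputs.result}}"]},
    ],
    [{"name": "combine", "template": "combine",
      "arguments": ["{{steps.process-a.outputs.result}}",
                    "{{steps.process-b.outputs.result}}"]}],
]

results = run_steps(step_groups, templates)
print(results["combine"])  # Result: 84
```

The nested list is the important part: moving process-b into its own outer group would make it wait for process-a instead of running alongside it.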

Running the Workflow

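To run this workflow, save it as my-workflow.yaml and submit it with kubectl create (kubectl apply does not work here, because the manifest uses generateName rather than a fixed name):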

kubectl create -f my-workflow.yaml

You can then monitor the workflow’s progress using:

kubectl get workflows
kubectl get pods
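If you want the status in a script rather than on the terminal, one option is to ask kubectl for JSON and read each workflow's status.phase field. The sketch below assumes kubectl is on your PATH and pointed at the right namespace. The helper names, the sample workflow names, and the "Unknown" fallback label are hypothetical and only there to illustrate the idea.

```python
import json
import subprocess


def summarize_phases(workflow_list):
    """Map each workflow name to its phase (for example Running or Succeeded)."""
    return {
        item["metadata"]["name"]: item.get("status", {}).get("phase", "Unknown")
        for item in workflow_list.get("items", [])
    }


def fetch_workflows():
    """Run `kubectl get workflows -o json` and parse the result."""
    completed = subprocess.run(
        ["kubectl", "get", "workflows", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(completed.stdout)


# Hypothetical sample shaped like the kubectl output, so the sketch runs
# without a cluster. A workflow that has not started yet may lack a status.
sample = {
    "items": [
        {"metadata": {"name": "my-first-workflow-abcde"}, "status": {"phase": "Succeeded"}},
        {"metadata": {"name": "my-first-workflow-fghij"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "my-first-workflow-klmno"}},
    ]
}

for name, phase in summarize_phases(sample).items():
    print(f"{name}: {phase}")
```

Against a real cluster, call summarize_phases(fetch_workflows()) instead of passing the sample.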


This simple example demonstrates several key features of Argo Workflows:

  • Defining multi-step workflows
  • Running steps in parallel
  • Passing data between steps
  • Running each task in its own container

As you become more comfortable with Argo Workflows, you can create more complex pipelines, incorporate conditional logic, use loops, and even nest workflows within each other.

Argo Workflows is a powerful tool for creating scalable, reproducible pipelines in Kubernetes environments. Whether you’re processing data, running CI/CD pipelines, or orchestrating complex analytical workflows, Argo Workflows provides the flexibility and power to meet your needs.

