rift-python

Routing In Fat Trees (RIFT) implementation in Python

Configuration File Generator

Introduction

As explained in the Configuration File chapter, when you start the RIFT protocol engine, you can optionally specify a configuration file (also known as a topology file) on the command line.

The configuration file specifies, in excruciating detail:

If the configuration file contains more than a single RIFT node (i.e. when simulating a multi-node topology), then the multiple links in the topology are simulated over a single physical link by using different multicast addresses and port numbers (see the Configuration File chapter for details).

Simplifying the generation of configuration files

Manually creating a configuration file for any non-trivial topology is an extremely tedious and error-prone process.

Instead of manually creating a configuration file, you can use the config_generator tool to generate one for you.

The config_generator takes a meta-configuration file (also known as a meta-topology file) as input and produces a configuration file (also known as a topology file) as output.

The meta-configuration file specifies something along the lines of “I want a 5-stage Clos topology with 8 leaf nodes, 8 spine nodes, and 4 superspine nodes” (the detailed syntax of the meta-configuration file is specified below). The config_generator tool takes this meta-configuration file as input and produces the detailed configuration file as output with all the configuration details for each of the 20 nodes and all the links between them.
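To make the arithmetic in this example concrete, the following Python sketch counts the nodes described by a meta-configuration. The function name count_nodes and the use of a plain dictionary are assumptions made for illustration and are not part of config_generator. The split of the example into 2 PODs of 4 leaf and 4 spine nodes each is also an assumption; the key names are those of the meta-configuration syntax described later in this chapter.

```python
def count_nodes(meta_config):
    """Count the nodes described by a meta-configuration dictionary.

    Illustrative helper only; not part of config_generator.
    """
    nr_pods = meta_config.get("nr-pods", 1)  # documented default is 1
    leaves = nr_pods * meta_config["nr-leaf-nodes-per-pod"]
    spines = nr_pods * meta_config["nr-spine-nodes-per-pod"]
    # nr-superspine-nodes is absent in single-POD topologies
    superspines = meta_config.get("nr-superspine-nodes", 0)
    return {
        "leaf": leaves,
        "spine": spines,
        "superspine": superspines,
        "total": leaves + spines + superspines,
    }


# Assumed breakdown of the "8 leaf, 8 spine, 4 superspine" example
example = {
    "nr-pods": 2,
    "nr-leaf-nodes-per-pod": 4,
    "nr-spine-nodes-per-pod": 4,
    "nr-superspine-nodes": 4,
}
print(count_nodes(example)["total"])  # 20
```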

Using namespaces to simulate multi-node topologies

By default, the config_generator tool generates a single configuration file that contains all the nodes in the topology and that is executed by a single instance of the RIFT protocol engine. There are two issues with this approach:

To address both of these issues, the config_generator tool has a --netns-per-node command-line option to operate in a different mode, called the “network namespace per node” mode. Instead of generating a single topology file that simulates all nodes in a single RIFT-Python instance, the “network namespace per node” mode does the following:

The “network namespace per node” mode is more realistic because it uses a separate Linux interface for each link endpoint, and it runs the nodes in parallel because each RIFT-Python engine runs in a separate Python process.

Command Line Options

Starting the configuration generator

The config_generator.py script is located in the tools subdirectory.

At a minimum, it takes a single command-line argument, which is the name of the meta-configuration file.

The meta_topology directory contains multiple example meta-configuration files. In the example below we use the meta-configuration file 2c_8x8.yaml (2-level Clos topology with 8 leafs and 8 spines).

By default, config_generator.py writes the generated configuration file to standard output:

(env) $ tools/config_generator.py meta_topology/2c_8x8.yaml
shards:
  - id: 0
    nodes:
      - name: leaf1
        level: 0
        systemid: 1
        rx_lie_mcast_address: 224.0.1.1
        interfaces:
          - name: if1
[...]

Help

The -h or the --help command-line option outputs help documentation:

(env) $ tools/config_generator.py --help
usage: config_generator.py [-h] [-n]
                           input-meta-config-file [output-file-or-dir]

RIFT configuration generator

positional arguments:
  input-meta-config-file
                        Input meta-configuration file name
  output-file-or-dir    Output file or directory name

optional arguments:
  -h, --help            show this help message and exit
  -n, --netns-per-node  Use network namespace per node

Output configuration file

config_generator.py takes an optional second argument which specifies where to write the generated configuration. By default, the output is a configuration file:

(env) $ tools/config_generator.py meta_topology/2c_8x8.yaml topology/generated_2c_8x8.yaml
(env) $ 

This generated configuration file can be used as input to the RIFT-Python protocol engine:

(env) $ python rift --interactive topology/generated_2c_8x8.yaml 
leaf1> show nodes
+--------+--------+---------+
| Node   | System | Running |
| Name   | ID     |         |
+--------+--------+---------+
| leaf1  | 1      | True    |
+--------+--------+---------+
| leaf2  | 2      | True    |
+--------+--------+---------+
.        .        .         .
.        .        .         .
.        .        .         .
+--------+--------+---------+
| spine8 | 16     | True    |
+--------+--------+---------+

leaf1> show interfaces
+-----------+-----------------+-----------+-----------+-------------------+-------+
| Interface | Neighbor        | Neighbor  | Neighbor  | Time in           | Flaps |
| Name      | Name            | System ID | State     | State             |       |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-1001a  | spine-1:if-101a | 101       | THREE_WAY | 0d 00h:00m:07.84s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-1001b  | spine-2:if-102a | 102       | THREE_WAY | 0d 00h:00m:07.83s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-1001c  | spine-3:if-103a | 103       | THREE_WAY | 0d 00h:00m:07.82s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-1001d  | spine-4:if-104a | 104       | THREE_WAY | 0d 00h:00m:07.81s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-1001e  | spine-5:if-105a | 105       | THREE_WAY | 0d 00h:00m:07.81s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-1001f  | spine-6:if-106a | 106       | THREE_WAY | 0d 00h:00m:07.80s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-1001g  | spine-7:if-107a | 107       | THREE_WAY | 0d 00h:00m:07.79s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-1001h  | spine-8:if-108a | 108       | THREE_WAY | 0d 00h:00m:07.78s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+

leaf1> set node spine3
spine3> show interfaces
+-----------+-----------------+-----------+-----------+-------------------+-------+
| Interface | Neighbor        | Neighbor  | Neighbor  | Time in           | Flaps |
| Name      | Name            | System ID | State     | State             |       |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-103a   | leaf-1:if-1001c | 1001      | THREE_WAY | 0d 00h:00m:34.49s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-103b   | leaf-2:if-1002c | 1002      | THREE_WAY | 0d 00h:00m:34.48s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-103c   | leaf-3:if-1003c | 1003      | THREE_WAY | 0d 00h:00m:34.47s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-103d   | leaf-4:if-1004c | 1004      | THREE_WAY | 0d 00h:00m:34.46s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-103e   | leaf-5:if-1005c | 1005      | THREE_WAY | 0d 00h:00m:34.46s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-103f   | leaf-6:if-1006c | 1006      | THREE_WAY | 0d 00h:00m:34.45s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-103g   | leaf-7:if-1007c | 1007      | THREE_WAY | 0d 00h:00m:34.44s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+
| if-103h   | leaf-8:if-1008c | 1008      | THREE_WAY | 0d 00h:00m:34.43s | 0     |
+-----------+-----------------+-----------+-----------+-------------------+-------+

Note: If you get the following error, then see the next subsection on file descriptors for a solution.

OSError: [Errno 24] Too many open files

A note on file descriptors

The current implementation of RIFT-Python uses 4 file descriptors per link end-point, plus some additional file descriptors for the CLI, log files, etc.

Note: the number of file descriptors per link end-point will be decreased to 2 or possibly even 1 in the future.

For example, in the 2c_8x8.yaml meta-topology (2-level Clos topology with 8 leafs and 8 spines) there are 16 nodes in total and each node has 8 link end-points, giving a grand total of 128 link end-points. Each link end-point consumes 4 file descriptors, which means a bit more than 512 file descriptors are needed in total.
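The calculation above can be reproduced with a short Python sketch. The function name estimate_link_fds is hypothetical, and the sketch assumes a single-POD 2-level topology in which every leaf is connected to every spine. It does not count the additional file descriptors used for the CLI, log files, etc.

```python
FDS_PER_ENDPOINT = 4  # per link end-point in the current implementation


def estimate_link_fds(nr_leaves, nr_spines, fds_per_endpoint=FDS_PER_ENDPOINT):
    """Estimate the file descriptors consumed by link end-points.

    Assumes every leaf is connected to every spine (illustrative only).
    """
    nr_links = nr_leaves * nr_spines
    nr_endpoints = 2 * nr_links  # each link has two end-points
    return nr_endpoints * fds_per_endpoint


print(estimate_link_fds(8, 8))  # 512
```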

UNIX-based operating systems (including Linux and macOS) have limits on the number of file descriptors that can be open at a given time. There is both a global limit, which is determined when the operating system is compiled, and a per-process limit. In practice, only the per-process limit matters.

To report the maximum number of file descriptors for processes started in the current shell, issue the following command:

(env) $ ulimit -n
256

In this example, the limit of 256 is not sufficient to run the 2c_8x8.yaml meta-topology, which needs a bit more than 512 file descriptors. To increase the limit to 1024, use the following command:

(env) $ ulimit -n 1024
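The same per-process limit can also be inspected and raised from within Python by using the standard resource module, which is available on UNIX-based systems only. The sketch below is an illustration and not something RIFT-Python itself is known to do; the function name ensure_fd_limit is hypothetical. It affects only the calling Python process and its children, not the shell.

```python
import resource


def ensure_fd_limit(required):
    """Raise the soft file descriptor limit of this process to `required`.

    Returns the soft limit in effect afterwards. Illustrative sketch only.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= required:
        return soft  # the current limit is already sufficient
    if hard != resource.RLIM_INFINITY and required > hard:
        raise ValueError(f"required limit {required} exceeds hard limit {hard}")
    # Only the soft limit is changed; the hard limit is left as-is
    resource.setrlimit(resource.RLIMIT_NOFILE, (required, hard))
    return required


current_soft, current_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", current_soft, "hard limit:", current_hard)
```

Some operating systems impose an additional ceiling on the soft limit, in which case the setrlimit call can still fail even when the hard limit is reported as unlimited.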

Network namespace per node mode

The --netns-per-node or -n command-line option causes config_generator.py to run in “network namespace per node” mode:

(env) $ tools/config_generator.py --netns-per-node meta_topology/2c_8x8.yaml generated_2c_8x8_dir

In this mode, the second argument (which specifies the output destination) is mandatory and specifies the name of an output directory rather than an output file.
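If you invoke the generator from a Python script, the rule that the output argument is mandatory in this mode can be checked before the command is run. The sketch below only builds the argument list; the function name build_generator_command is hypothetical, and the script path assumes that you run from the top-level directory as in the examples above.

```python
import shlex


def build_generator_command(meta_file, output=None, netns_per_node=False):
    """Build the config_generator.py argument list (illustrative only)."""
    if netns_per_node and output is None:
        raise ValueError(
            "an output directory is mandatory in network namespace per node mode")
    command = ["tools/config_generator.py"]
    if netns_per_node:
        command.append("--netns-per-node")
    command.append(meta_file)
    if output is not None:
        command.append(output)
    return command


command = build_generator_command(
    "meta_topology/2c_8x8.yaml", "generated_2c_8x8_dir", netns_per_node=True)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run if you want to execute the generator from the script.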

The following files are generated in the output directory:

(env) $ ls -1 generated_2c_8x8_dir/
allocations.txt
chaos.sh
check.sh
connect-leaf-1.sh
connect-leaf-2.sh
connect-leaf-3.sh
connect-leaf-4.sh
connect-leaf-5.sh
connect-leaf-6.sh
connect-leaf-7.sh
connect-leaf-8.sh
connect-spine-1.sh
connect-spine-2.sh
connect-spine-3.sh
connect-spine-4.sh
connect-spine-5.sh
connect-spine-6.sh
connect-spine-7.sh
connect-spine-8.sh
leaf-1.yaml
leaf-2.yaml
leaf-3.yaml
leaf-4.yaml
leaf-5.yaml
leaf-6.yaml
leaf-7.yaml
leaf-8.yaml
spine-1.yaml
spine-2.yaml
spine-3.yaml
spine-4.yaml
spine-5.yaml
spine-6.yaml
spine-7.yaml
spine-8.yaml
start.sh
stop.sh

The purpose of the generated files is as follows:

The generated start.sh script creates all the network namespaces, creates all the veth pairs, assigns all IP addresses, and starts the RIFT-Python engine for each node. The output looks as follows (the number at the beginning of each line is the percentage complete):

(env) $ ./generated_2c_8x8_dir/start.sh
Create veth pair veth-1001a-101a and veth-101a-1001a for link from leaf-1:if-1001a to spine-1:if-101a
...
Create veth pair veth-1008g-107h and veth-107h-1008g for link from leaf-8:if-1008g to spine-7:if-107h
Create veth pair veth-1008h-108h and veth-108h-1008h for link from leaf-8:if-1008h to spine-8:if-108h
Create network namespace netns-1001 for node leaf-1
...
Create network namespace netns-107 for node spine-7
Create network namespace netns-108 for node spine-8
Start RIFT-Python engine for node leaf-1
...
Start RIFT-Python engine for node spine-7
Start RIFT-Python engine for node spine-8
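The veth names in the output above follow a visible pattern. The Python sketch below reproduces that pattern for a single-POD 2-level topology. It is inferred purely from the example output (leaf system IDs starting at 1001, spine system IDs starting at 101, one letter per neighbor) and is an assumption, not a description of how config_generator.py derives the names. The function names are hypothetical.

```python
import string


def interface_letter(index):
    """Map a 1-based neighbor index to a letter: 1 -> 'a', 2 -> 'b', ..."""
    return string.ascii_lowercase[index - 1]


def veth_pair(leaf, spine):
    """Return the (leaf side, spine side) veth names for one link.

    Pattern inferred from the example output; valid for up to 26 neighbors.
    """
    leaf_end = f"{1000 + leaf}{interface_letter(spine)}"
    spine_end = f"{100 + spine}{interface_letter(leaf)}"
    return (f"veth-{leaf_end}-{spine_end}", f"veth-{spine_end}-{leaf_end}")


print(veth_pair(8, 7))  # ('veth-1008g-107h', 'veth-107h-1008g')
```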

Note: the start.sh script can only run on Linux. If you use macOS, you must first start a Docker container and run both config_generator.py and start.sh in there:

(env) $ cd docker
(env) $ ./docker-shell
root@d22f9e82f9b0:/# cd /host
root@d22f9e82f9b0:/host# ./generated_2c_8x8_dir/start.sh
Create veth pair veth-1001a-101a and veth-101a-1001a for link from leaf-1:if-1001a to spine-1:if-101a
...
Create veth pair veth-1008g-107h and veth-107h-1008g for link from leaf-8:if-1008g to spine-7:if-107h
Create veth pair veth-1008h-108h and veth-108h-1008h for link from leaf-8:if-1008h to spine-8:if-108h
Create network namespace netns-1001 for node leaf-1
...
Create network namespace netns-107 for node spine-7
Create network namespace netns-108 for node spine-8
Start RIFT-Python engine for node leaf-1
...
Start RIFT-Python engine for node spine-7
Start RIFT-Python engine for node spine-8

See the Docker chapter for details on Docker usage in RIFT-Python.

Once you have run start.sh to start the topology, you can use connect-*.sh to start a Telnet session to any of the running nodes. For example:

root@d22f9e82f9b0:/host# ./generated_2c_8x8_dir/connect-spine-3.sh
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
spine3> show interfaces
+-----------------+------------------------+-----------+-----------+-------------------+-------+
| Interface       | Neighbor               | Neighbor  | Neighbor  | Time in           | Flaps |
| Name            | Name                   | System ID | State     | State             |       |
+-----------------+------------------------+-----------+-----------+-------------------+-------+
| veth-103a-1001c | leaf-1:veth-1001c-103a | 1001      | THREE_WAY | 0d 00h:00m:07.85s | 0     |
+-----------------+------------------------+-----------+-----------+-------------------+-------+
| veth-103b-1002c | leaf-2:veth-1002c-103b | 1002      | THREE_WAY | 0d 00h:00m:07.97s | 0     |
+-----------------+------------------------+-----------+-----------+-------------------+-------+
| veth-103c-1003c | leaf-3:veth-1003c-103c | 1003      | THREE_WAY | 0d 00h:00m:08.06s | 0     |
+-----------------+------------------------+-----------+-----------+-------------------+-------+
| veth-103d-1004c | leaf-4:veth-1004c-103d | 1004      | THREE_WAY | 0d 00h:00m:07.90s | 0     |
+-----------------+------------------------+-----------+-----------+-------------------+-------+
| veth-103e-1005c | leaf-5:veth-1005c-103e | 1005      | THREE_WAY | 0d 00h:00m:07.74s | 0     |
+-----------------+------------------------+-----------+-----------+-------------------+-------+
| veth-103f-1006c | leaf-6:veth-1006c-103f | 1006      | THREE_WAY | 0d 00h:00m:07.94s | 0     |
+-----------------+------------------------+-----------+-----------+-------------------+-------+
| veth-103g-1007c | leaf-7:veth-1007c-103g | 1007      | THREE_WAY | 0d 00h:00m:07.82s | 0     |
+-----------------+------------------------+-----------+-----------+-------------------+-------+
| veth-103h-1008c | leaf-8:veth-1008c-103h | 1008      | THREE_WAY | 0d 00h:00m:07.80s | 0     |
+-----------------+------------------------+-----------+-----------+-------------------+-------+

spine3> 

Meta-Configuration File Syntax

Just like the configuration file, the meta-configuration file is a YAML file.

The syntax of the meta-configuration YAML file is as follows (! indicates an element is mandatory):

  chaos: {
    event-interval: <float>,
    max-concurrent-events: <integer>,
    nr-link-events: <integer>,
    nr-node-events: <integer>
  }
  inter-plane-east-west-links: <boolean>
  leafs: {
    nr-ipv4-loopbacks: <integer>
  } 
! nr-leaf-nodes-per-pod: <integer>
  nr-planes: <integer>
  nr-pods: <integer>
! nr-spine-nodes-per-pod: <integer>
  nr-superspine-nodes: <integer>
  spines: {
    nr-ipv4-loopbacks: <integer>
  } 
  superspines: {
    nr-ipv4-loopbacks: <integer>
  } 
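Most elements are optional and have default values, which are documented per element below. As a rough sketch of how those defaults combine, the following Python code fills in the documented default values for a meta-configuration that has already been loaded into a dictionary. The function name apply_defaults and the constant names are hypothetical, and the sketch is not taken from config_generator.py.

```python
import copy

# Default values as documented in the element descriptions in this chapter
TOP_LEVEL_DEFAULTS = {
    "nr-pods": 1,
    "nr-planes": 1,
    "inter-plane-east-west-links": True,
}
CHAOS_DEFAULTS = {
    "event-interval": 3.0,
    "max-concurrent-events": 5,
    "nr-link-events": 20,
    "nr-node-events": 5,
}
NODE_KINDS = ("leafs", "spines", "superspines")


def apply_defaults(meta_config):
    """Return a copy of meta_config with documented defaults filled in."""
    result = copy.deepcopy(meta_config)
    for key, value in TOP_LEVEL_DEFAULTS.items():
        result.setdefault(key, value)
    # chaos is optional; only fill in its sub-elements when it is present
    if "chaos" in result:
        for key, value in CHAOS_DEFAULTS.items():
            result["chaos"].setdefault(key, value)
    for kind in NODE_KINDS:
        result.setdefault(kind, {}).setdefault("nr-ipv4-loopbacks", 1)
    return result


minimal = {"nr-leaf-nodes-per-pod": 8, "nr-spine-nodes-per-pod": 8}
print(apply_defaults(minimal))
```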

chaos

Element chaos
Value Dictionary with sub-elements: event-interval, max-concurrent-events, nr-link-events, nr-node-events
Level Top-level
Presence Optional
Meaning Defines the parameters for chaos testing

If the chaos element is present, then the config_generator also outputs the chaos.sh script for chaos testing.

The chaos element is only supported in network namespace per node mode (command-line option --netns-per-node).

See the IETF 104 Hackathon Presentation: Chaos Monkey Testing (PDF) for more details on chaos testing.

Example:

nr-pods: 2
nr-leaf-nodes-per-pod: 2
nr-spine-nodes-per-pod: 2
nr-superspine-nodes: 2
chaos: {
  event-interval: 5.0,
  max-concurrent-events: 3,
  nr-link-events: 10,
  nr-node-events: 5
}

event-interval

Element event-interval
Value Float, minimum value 0.0
Level chaos
Presence Optional, default value 3.0
Meaning The interval, in seconds, between events in the chaos testing script

Example:

nr-pods: 2
nr-leaf-nodes-per-pod: 2
nr-spine-nodes-per-pod: 2
nr-superspine-nodes: 2
chaos: {
  event-interval: 5.0,
  max-concurrent-events: 3,
  nr-link-events: 10,
  nr-node-events: 5
}

inter-plane-east-west-links

Element inter-plane-east-west-links
Value Boolean
Level Top-level
Presence Optional, default value True. Only relevant if nr-planes > 1; ignored if nr-planes = 1
Meaning True if inter-plane east-west connections are present in the superspine; false if not.

Example:

nr-pods: 3
nr-leaf-nodes-per-pod: 4
nr-spine-nodes-per-pod: 4
nr-superspine-nodes: 4
nr-planes: 2
inter-plane-east-west-links: false

leafs

Element leafs
Value Dictionary with sub-elements: nr-ipv4-loopbacks
Level Top-level
Presence Optional
Meaning Defines the characteristics of each leaf node in the topology

Example:

nr-pods: 2
nr-leaf-nodes-per-pod: 2
leafs: {
  nr-ipv4-loopbacks: 2
}
nr-spine-nodes-per-pod: 2
nr-superspine-nodes: 2

max-concurrent-events

Element max-concurrent-events
Value Integer, minimum value 1
Level chaos
Presence Optional, default value 5
Meaning The maximum number of concurrent events in the chaos testing script

The chaos testing script chaos.sh generates a sequence of random events. Each event breaks something (e.g. a link or a node) and some time later repairs it. Between the time that something is broken and the time that it is repaired, the failure is deemed to be “active”. The max-concurrent-events element specifies the maximum number of concurrently active failures. For example, if you set max-concurrent-events to 1, then you are testing that the network continues to function correctly in the face of any single failure.

Example:

nr-pods: 2
nr-leaf-nodes-per-pod: 2
nr-spine-nodes-per-pod: 2
nr-superspine-nodes: 2
chaos: {
  event-interval: 5.0,
  max-concurrent-events: 3,
  nr-link-events: 10,
  nr-node-events: 5
}
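The effect of the limit on concurrently active failures can be illustrated with a small Python sketch that plans a random sequence of break and repair events without ever exceeding the limit. This is a minimal sketch under stated assumptions: the function name plan_events, the 50% choice between breaking and repairing, and the seed parameter are all hypothetical, and the code is not the algorithm used to generate chaos.sh.

```python
import random


def plan_events(nr_events, max_concurrent, seed=0):
    """Plan break/repair events with at most max_concurrent active failures.

    Returns a list of ("break", n) and ("repair", n) tuples. Illustrative only.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    rng = random.Random(seed)
    active = []   # failures that are broken but not yet repaired
    plan = []
    started = 0   # number of failures introduced so far
    while started < nr_events or active:
        can_break = started < nr_events and len(active) < max_concurrent
        if can_break and (not active or rng.random() < 0.5):
            started += 1
            active.append(started)
            plan.append(("break", started))
        else:
            # Repair a randomly chosen active failure
            repaired = active.pop(rng.randrange(len(active)))
            plan.append(("repair", repaired))
    return plan


for action, number in plan_events(nr_events=4, max_concurrent=2):
    print(action, number)
```

With max_concurrent set to 1, every break is immediately followed by its repair, which corresponds to the single-failure test described above.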

nr-ipv4-loopbacks

Element nr-ipv4-loopbacks
Value Integer, minimum value 0
Level leafs, spines, superspines
Presence Optional, default value 1
Meaning The number of IPv4 loopback interfaces per leaf / spine / superspine node

Example:

nr-pods: 2
nr-leaf-nodes-per-pod: 2
nr-spine-nodes-per-pod: 2
spines: {
  nr-ipv4-loopbacks: 2
}
nr-superspine-nodes: 2

nr-leaf-nodes-per-pod

Element nr-leaf-nodes-per-pod
Value Integer, minimum value 1
Level Top-level
Presence Mandatory
Meaning The number of leaf nodes per POD

Example:

nr-leaf-nodes-per-pod: 8
nr-spine-nodes-per-pod: 8

nr-planes

Element nr-planes
Value Integer
Level Top-level
Presence Optional, default value 1
Meaning The number of planes in the topology

If nr-planes is set to 1 (the default value), the topology is a single-plane topology in which each spine node is connected to each superspine node.

If nr-planes > 1, the topology is a multi-plane topology and the following restrictions apply:

Example:

nr-pods: 3
nr-superspine-nodes: 8
nr-leaf-nodes-per-pod: 4
nr-spine-nodes-per-pod: 4
nr-planes: 2

nr-link-events

Element nr-link-events
Value Integer, minimum value 0
Level chaos
Presence Optional, default value 20
Meaning The number of link events in the chaos testing script

A link event can be one of the following (each event has a corresponding repair event):

Example:

nr-pods: 2
nr-leaf-nodes-per-pod: 2
nr-spine-nodes-per-pod: 2
nr-superspine-nodes: 2
chaos: {
  event-interval: 5.0,
  max-concurrent-events: 3,
  nr-link-events: 10,
  nr-node-events: 5
}

nr-node-events

Element nr-node-events
Value Integer, minimum value 0
Level chaos
Presence Optional, default value 5
Meaning The number of node events in the chaos testing script

A node event can be one of the following (each event has a corresponding repair event):

Example:

nr-pods: 2
nr-leaf-nodes-per-pod: 2
nr-spine-nodes-per-pod: 2
nr-superspine-nodes: 2
chaos: {
  event-interval: 5.0,
  max-concurrent-events: 3,
  nr-link-events: 10,
  nr-node-events: 5
}

nr-pods

Element nr-pods
Value Integer, minimum value 1
Level Top-level
Presence Optional, default value 1
Meaning The number of PODs

Example:

nr-pods: 2
nr-leaf-nodes-per-pod: 8
nr-spine-nodes-per-pod: 8

nr-spine-nodes-per-pod

Element nr-spine-nodes-per-pod
Value Integer, minimum value 1
Level Top-level
Presence Mandatory
Meaning The number of spine nodes per POD

Example:

nr-leaf-nodes-per-pod: 8
nr-spine-nodes-per-pod: 8

nr-superspine-nodes

Element nr-superspine-nodes
Value Integer, minimum value 1
Level Top-level
Presence If nr-pods is greater than 1, then nr-superspine-nodes is mandatory. If nr-pods equals 1, then nr-superspine-nodes must not be present.
Meaning The number of superspine nodes

Example:

nr-leaf-nodes-per-pod: 3
nr-spine-nodes-per-pod: 3
nr-pods: 2
nr-superspine-nodes: 4
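The presence rules for the top-level elements can be summarized in a short validation sketch. The Python code below checks only the rules stated in this chapter (mandatory elements, minimum values, and the dependency between nr-pods and nr-superspine-nodes). The function name validate_meta_config is hypothetical, and the checks performed by config_generator.py itself may differ.

```python
def validate_meta_config(meta_config):
    """Return a list of error strings for a meta-configuration dictionary.

    Covers only the presence and minimum-value rules documented in this
    chapter. Illustrative sketch only.
    """
    errors = []
    for key in ("nr-leaf-nodes-per-pod", "nr-spine-nodes-per-pod"):
        if key not in meta_config:
            errors.append(f"{key} is mandatory")
    # All of these elements are integers with a minimum value of 1
    for key in ("nr-leaf-nodes-per-pod", "nr-spine-nodes-per-pod",
                "nr-pods", "nr-superspine-nodes"):
        if key in meta_config:
            value = meta_config[key]
            if not isinstance(value, int) or value < 1:
                errors.append(f"{key} must be an integer with minimum value 1")
    nr_pods = meta_config.get("nr-pods", 1)
    has_superspines = "nr-superspine-nodes" in meta_config
    if isinstance(nr_pods, int):
        if nr_pods > 1 and not has_superspines:
            errors.append("nr-superspine-nodes is mandatory if nr-pods is greater than 1")
        if nr_pods == 1 and has_superspines:
            errors.append("nr-superspine-nodes must not be present if nr-pods equals 1")
    return errors


print(validate_meta_config({"nr-leaf-nodes-per-pod": 3,
                            "nr-spine-nodes-per-pod": 3,
                            "nr-pods": 2}))
```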

spines

Element spines
Value Dictionary with sub-elements: nr-ipv4-loopbacks
Level Top-level
Presence Optional
Meaning Defines the characteristics of each spine node in the topology

Example:

nr-pods: 2
nr-leaf-nodes-per-pod: 2
nr-spine-nodes-per-pod: 2
spines: {
  nr-ipv4-loopbacks: 2
}
nr-superspine-nodes: 2

superspines

Element superspines
Value Dictionary with sub-elements: nr-ipv4-loopbacks
Level Top-level
Presence Optional
Meaning Defines the characteristics of each superspine node in the topology

Example:

nr-pods: 2
nr-leaf-nodes-per-pod: 2
nr-spine-nodes-per-pod: 2
nr-superspine-nodes: 2
superspines: {
  nr-ipv4-loopbacks: 2
}