Decoding IPFIX options using Go

The IPFIX (IP Flow Information Export) protocol provides an extensible standard for transmitting network flow data.

A key difference compared to the likes of sFlow is the template-based nature of the data.

While very similar to NetFlow version 9, IPFIX enables variable length fields and vendor extensions. This makes the protocol suitable for different types of performance data, as desired by any vendor.

A recent project required some processing of IPFIX flow data, which this post will focus on.

TL;DR The full implementation can be found on GitHub

Parsing options

A number of IPFIX decoder implementations exist in Go; most are embedded in flow decoder implementations rather than being standalone libraries.

The best stand-alone library I could find doesn’t support decoding options.

To better understand the scope of implementation and overall structure of IPFIX, a stand-alone decoder was implemented.

Data structure

To begin implementing our own decoder, we first need to understand the format of packets used in IPFIX.

We can use both the IANA field assignments and RFC to construct our base expectations.

At a high level, there is 1 common header followed by 3 different payload types:

  • Data template
  • Options template
  • Data set

We are interested in the options template and data set (where we have a matching template ID, more on this later).

Decoding the header

As described in the RFC, we can expect 5 fields.

Following the message header we have a set identifier, which describes the message contents; for our purposes, we will treat this as part of the header.

header := IpfixHeader{
	Version:        binary.BigEndian.Uint16(payload[0:2]),
	MessageLength:  binary.BigEndian.Uint16(payload[2:4]),
	ExportTime:     binary.BigEndian.Uint32(payload[4:8]),
	SequenceNumber: binary.BigEndian.Uint32(payload[8:12]),
	DomainId:       binary.BigEndian.Uint32(payload[12:16]),
	// The set ID is the first 2 bytes after the 16-byte message header
	SetId:          binary.BigEndian.Uint16(payload[16:18]),
}

We are interested in any SetId that is 3 (options template) or >= 256 (data set).
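As a minimal sketch of that filtering step, the set ID can be read with a length check and then classified. The names `SetKind`, `classifySet` and `firstSetID` are hypothetical helpers for illustration, not part of the implementation described here.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// SetKind is a hypothetical label for the payload type a set ID announces.
type SetKind string

const (
	KindDataTemplate    SetKind = "data template"
	KindOptionsTemplate SetKind = "options template"
	KindDataSet         SetKind = "data set"
	KindOther           SetKind = "reserved or unused"
)

// classifySet maps a set ID to the payload type it announces.
func classifySet(setID uint16) SetKind {
	switch {
	case setID == 2:
		return KindDataTemplate
	case setID == 3:
		return KindOptionsTemplate
	case setID >= 256:
		return KindDataSet
	default:
		return KindOther
	}
}

// firstSetID reads the set ID that follows the 16-byte message header,
// returning an error instead of panicking on a short payload.
func firstSetID(payload []byte) (uint16, error) {
	if len(payload) < 18 {
		return 0, errors.New("payload too short for message header and set ID")
	}
	return binary.BigEndian.Uint16(payload[16:18]), nil
}

func main() {
	payload := make([]byte, 20)
	binary.BigEndian.PutUint16(payload[0:2], 10)  // IPFIX version
	binary.BigEndian.PutUint16(payload[16:18], 3) // set ID
	id, err := firstSetID(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id, classifySet(id))
}
```

Returning an error for short payloads keeps a malformed packet from crashing the collector.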

Decoding the template

Before any data can be decoded, we must have a matching template.

Without the template, there is no way to know how the fields are mapped inside the data set.

Each template payload (SetId 2 or 3) has a header containing the ID and field counts.

template := OptionsTemplate{
  TemplateId:      binary.BigEndian.Uint16(payload[0:2]),
  FieldCount:      binary.BigEndian.Uint16(payload[2:4]),
  ScopeFieldCount: binary.BigEndian.Uint16(payload[4:6]),
}

Once again, using the RFC, we can determine that the payload is a sequence of field specifiers.

The number of specifiers corresponds to the values in the header we just decoded.

The ordering of these fields is critical for us to maintain.

Note: Unlike a data template, the options template has a set of scope fields.

Decode the fields

Both scope fields and fields have the same structure, thus can be decoded using the same logic.

func decodeSingleTemplateField(payload []byte) (TemplateField, int) {
  tf := TemplateField{
    ElementId: binary.BigEndian.Uint16(payload[0:2]),
    Length:    binary.BigEndian.Uint16(payload[2:4]),
  }

  // The top bit of the element ID flags an enterprise-specific field,
  // in which case a 4-byte enterprise number follows the length
  if tf.ElementId&0x8000 != 0 {
    tf.ElementId = tf.ElementId & 0x7fff
    tf.EnterpriseNumber = binary.BigEndian.Uint32(payload[4:8])
    return tf, 8
  }

  return tf, 4
}

It’s then simply a case of decoding each field in sequence and storing them for later

// Get all scope entries
for i := template.ScopeFieldCount; i > 0; i-- {
  tf, cut := decodeSingleTemplateField(byteSlice)
  template.ScopeField = append(template.ScopeField, tf)

  if len(byteSlice) < cut {
    // Truncated template: stop decoding
    break
  }
  byteSlice = byteSlice[cut:]
}

// Get all field entries
for i := template.FieldCount - template.ScopeFieldCount; i > 0; i-- {
  tf, cut := decodeSingleTemplateField(byteSlice)
  template.Field = append(template.Field, tf)

  if len(byteSlice) < cut {
    // Truncated template: stop decoding
    break
  }
  byteSlice = byteSlice[cut:]
}
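The same idea can be sketched with the length checks done before any bytes are read, so a truncated template returns an error rather than panicking. This is an illustrative variant under stated assumptions, not the post’s implementation: `FieldSpec`, `decodeFieldSpec` and `decodeFieldSpecs` are hypothetical names, and the enterprise number 9999 in the sample bytes is arbitrary.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// FieldSpec mirrors the TemplateField struct; the name is local to this sketch.
type FieldSpec struct {
	ElementID        uint16
	Length           uint16
	EnterpriseNumber uint32
}

var errShort = errors.New("truncated field specifier")

// decodeFieldSpec decodes one field specifier and reports how many bytes it used.
func decodeFieldSpec(b []byte) (FieldSpec, int, error) {
	if len(b) < 4 {
		return FieldSpec{}, 0, errShort
	}
	fs := FieldSpec{
		ElementID: binary.BigEndian.Uint16(b[0:2]),
		Length:    binary.BigEndian.Uint16(b[2:4]),
	}
	if fs.ElementID&0x8000 == 0 {
		return fs, 4, nil
	}
	// Enterprise bit set: a 4-byte enterprise number follows the length.
	if len(b) < 8 {
		return FieldSpec{}, 0, errShort
	}
	fs.ElementID &= 0x7fff
	fs.EnterpriseNumber = binary.BigEndian.Uint32(b[4:8])
	return fs, 8, nil
}

// decodeFieldSpecs decodes count specifiers in order and returns the unread bytes.
func decodeFieldSpecs(b []byte, count int) ([]FieldSpec, []byte, error) {
	specs := make([]FieldSpec, 0, count)
	for i := 0; i < count; i++ {
		fs, n, err := decodeFieldSpec(b)
		if err != nil {
			return nil, nil, fmt.Errorf("field %d: %w", i, err)
		}
		specs = append(specs, fs)
		b = b[n:]
	}
	return specs, b, nil
}

func main() {
	// One IANA field (element 34, 4 bytes) followed by one enterprise
	// field (element 101, 2 bytes, enterprise number 9999).
	raw := []byte{0x00, 0x22, 0x00, 0x04, 0x80, 0x65, 0x00, 0x02, 0x00, 0x00, 0x27, 0x0f}
	specs, rest, err := decodeFieldSpecs(raw, 2)
	fmt.Println(specs, len(rest), err)
}
```

Because the specifiers are appended in the order they are read, the slice preserves the ordering that the data set relies on.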

Cache the template

Now we have the template decoded, it is important to store it. The fields described in the template need to be used when decoding the data set, which we will look at next.

A simple way to store this is using the LRU cache implementation from Hashicorp.

All future lookups will be via the ID, so using this as the key makes sense.

templateCache, err := lru.New(10240)
if err != nil {
  log.Fatalf("Failed to setup options template cache: %v", err)
}

templateCache.Add(template.TemplateId, template)
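One thing worth considering when several exporters or observation domains are involved: RFC 7011 only guarantees that template IDs are unique per transport session and observation domain, so two exporters may reuse the same ID. A minimal sketch of a cache keyed on both values follows. It swaps the LRU for a plain mutex-guarded map purely to stay self-contained, and `templateKey` and `templateStore` are hypothetical names; a fuller version would probably also include the exporter address in the key.

```go
package main

import (
	"fmt"
	"sync"
)

// templateKey scopes a template ID to the observation domain that sent it.
type templateKey struct {
	DomainID   uint32
	TemplateID uint16
}

// templateStore is a stand-in for the LRU cache: an unbounded map guarded
// by a mutex. The value is simplified to a list of element IDs.
type templateStore struct {
	mu        sync.RWMutex
	templates map[templateKey][]uint16
}

func newTemplateStore() *templateStore {
	return &templateStore{templates: make(map[templateKey][]uint16)}
}

// Add stores (or replaces) the template for a domain and template ID.
func (s *templateStore) Add(domainID uint32, templateID uint16, elementIDs []uint16) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.templates[templateKey{domainID, templateID}] = elementIDs
}

// Get looks up the template for a domain and template ID.
func (s *templateStore) Get(domainID uint32, templateID uint16) ([]uint16, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	t, ok := s.templates[templateKey{domainID, templateID}]
	return t, ok
}

func main() {
	store := newTemplateStore()
	store.Add(1, 256, []uint16{34, 36, 37})
	_, sameDomain := store.Get(1, 256)
	_, otherDomain := store.Get(2, 256)
	fmt.Println(sameDomain, otherDomain)
}
```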

Decoding the data set

Any set ID over 255 represents a data set; the set ID refers to the template we need to use when decoding it.

First, we need to ensure we have a matching template for this payload.

cacheEntry, ok := templateCache.Get(header.SetId)
if !ok {
  return nil, true
}
template := cacheEntry.(OptionsTemplate)

Once we have the template, it’s a case of decoding each option in sequence.

Again, both scope fields and fields can be decoded using the same logic.

Field decoding

The option decoding logic has 3 main tasks:

  • Read the correct length of bytes off the payload
  • Lookup the associated name of the field from the identifier
  • Cast the byte array into the correct data type for the identifier

The IANA field assignments accurately describe the field data we need to complete these tasks.

func decodeSingleOption(byteSlice []byte, field TemplateField, options Options) {
	// Check we have enough data
	if len(byteSlice) < int(field.Length) {
		return
	}

	// Handle each enterprise
	switch field.EnterpriseNumber {
	case 0:
		// Handle elements for enterprise 0
		switch field.ElementId {
		case 34:
			// samplingInterval
			options["samplingInterval"] = binary.BigEndian.Uint32(byteSlice[:int(field.Length)])
		case 36:
			// flowActiveTimeout
			options["flowActiveTimeout"] = binary.BigEndian.Uint16(byteSlice[:int(field.Length)])
		case 37:
			// flowIdleTimeout
			options["flowIdleTimeout"] = binary.BigEndian.Uint16(byteSlice[:int(field.Length)])
		case 41:
			// exportedMessageTotalCount
			options["exportedMessageTotalCount"] = binary.BigEndian.Uint64(byteSlice[:int(field.Length)])
		case 42:
			// exportedFlowRecordTotalCount
			options["exportedFlowRecordTotalCount"] = binary.BigEndian.Uint64(byteSlice[:int(field.Length)])
		case 130:
			// exporterIPv4Address
			options["exporterIPv4Address"] = net.IP(byteSlice[:int(field.Length)])
		case 131:
			// exporterIPv6Address
			options["exporterIPv6Address"] = net.IP(byteSlice[:int(field.Length)])
		case 144:
			// exportingProcessId
			options["exportingProcessId"] = binary.BigEndian.Uint32(byteSlice[:int(field.Length)])
		case 160:
			// systemInitTimeMilliseconds
			options["systemInitTimeMilliseconds"] = int64(binary.BigEndian.Uint64(byteSlice[:int(field.Length)]))
		case 214:
			// exportProtocolVersion
			options["exportProtocolVersion"] = uint8(byteSlice[0])
		case 215:
			// exportTransportProtocol
			options["exportTransportProtocol"] = uint8(byteSlice[0])
		}
	}
}
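One caveat with fixed-width reads: RFC 7011 permits reduced-size encoding, where an exporter sends an unsigned value in fewer bytes than its IANA type (for example an unsigned64 counter in 4 bytes), and `binary.BigEndian.Uint64` panics on a slice shorter than 8 bytes. A minimal sketch of a width-agnostic read is below; `decodeUnsigned` is a hypothetical helper, not part of the implementation above.

```go
package main

import "fmt"

// decodeUnsigned reads a big-endian unsigned integer of 1 to 8 bytes,
// using whatever length the template declared for the field.
func decodeUnsigned(b []byte) (uint64, error) {
	if len(b) == 0 || len(b) > 8 {
		return 0, fmt.Errorf("unsupported unsigned length %d", len(b))
	}
	var v uint64
	for _, octet := range b {
		v = v<<8 | uint64(octet)
	}
	return v, nil
}

func main() {
	full, _ := decodeUnsigned([]byte{0, 0, 0, 0, 0, 0, 0, 0xfa})
	reduced, _ := decodeUnsigned([]byte{0, 0, 0, 0xfa})
	fmt.Println(full, reduced)
}
```

The caller would pass `byteSlice[:field.Length]` and convert the result to the width it wants to store.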

The order of fields in the data set is identical to the order in the template, so once again it’s just a case of looping over them.

// Decode all scope fields
for i := 0; i < len(template.ScopeField); i++ {
  decodeSingleOption(byteSlice, template.ScopeField[i], options)

  if len(byteSlice) < int(template.ScopeField[i].Length) {
    // Truncated data set: stop decoding
    break
  }
  byteSlice = byteSlice[int(template.ScopeField[i].Length):]
}

// Decode all fields
for i := 0; i < len(template.Field); i++ {
  decodeSingleOption(byteSlice, template.Field[i], options)

  if len(byteSlice) < int(template.Field[i].Length) {
    // Truncated data set: stop decoding
    break
  }
  byteSlice = byteSlice[int(template.Field[i].Length):]
}


We now have a subset of the IANA fields supported in our decoder.

Given a correct template and data payload, the result is a map of received options.

  exportedMessageTotalCount: 250
  exportedFlowRecordTotalCount: 10
  samplingInterval: 10
  flowIdleTimeout: 15
  exportingProcessId: 72
  exporterIPv6Address: ::
  flowActiveTimeout: 60
  exportProtocolVersion: 10
  exportTransportProtocol: 17


IPFIX is a highly flexible protocol with a relatively simple data format, allowing parsing to be easily implemented.

While the implementation’s boundary checking could be improved, I would recommend the exercise of building a working implementation from the specification to everyone.

You may also find that some vendors have interesting assumptions within their options handling, with many configuration knobs missing compared to data templates.

Having this functionality separated from upstream code proved to be fruitful, allowing certain options to be stored and distributed outside of normal flow collection.

The full implementation can be found on GitHub, with basic test cases added for each implemented field.

Decoding Arista EtherType headers with gopacket

As we previously discussed, using cheap switches to aggregate multiple tap sources gives you a lot of power.

However, given the multiple feeds, how can you measure timing information accurately 1 hop away?

Using hardware time stamping provides a highly accurate record of when packets were processed by devices, making it perfect for TAP aggregation.

Revisiting the 7150 platform

On the 7150 series hardware, time stamping is supported at line-rate using PTP.

You have two options for timestamp placement:

  • Replacement of the FCS (mac timestamp replace-fcs):
[Figure: 7150 timestamp replace FCS. Frame layout: Ethernet Header | IP Header | Payload | Timestamp]
  • Appending of the timestamp (mac timestamp before-fcs):
[Figure: 7150 timestamp append. Frame layout: Ethernet Header | IP Header | Payload | Timestamp | FCS]

The implementation of this is a little ‘quirky’.

Looking at the Timestamp value alone will not help you as it’s an internal ASIC counter on the switch, essentially providing the lower half of the timestamp.

To calculate the actual (Unix based) timestamp, a separate keyframe packet has to be processed and tracked (every ~6 seconds), providing the upper half of the timestamp.

While possible to implement, the imposed state and skew calculations are a little unappealing.

A look into the 7500{E,R}/7280{E,R} series

On the newer platforms, Arista has moved away from using the keyframe setup and introduced a custom EtherType, again using hardware time stamping at line-rate and supporting PTP.

There are 3 possible time stamping modes on the 7500{E,R} & 7280{E,R} series switches:

  • 64-bit header timestamp; i.e., encapsulated in the L2 header
  • 48-bit header timestamp; i.e., encapsulated in the L2 header
  • 48-bit timestamp that replaces the Source MAC

We will focus on the first 2 options, which use a custom EtherType inside the layer 2 header.

Note: All timestamps are captured upon packet ingress and stamped on packet egress.

A look into the packet format

Let’s compare a normal ethernet header:

[Figure: ethernet header. Frame layout: Dst Address | Src Address | Length/Type | Payload | FCS]

To one with the custom EtherType inserted:

[Figure: ethernet header extended. Frame layout: Dst Address | Src Address | EtherType | Sub-Type | Version | Timestamp | Length/Type | Payload | FCS]

Note: .1q payloads are also supported, with the EtherType coming after the Source Address

As you can see, an extra 4 fields have been inserted into the header:

  • EtherType - 0xD28B - An identifier for AristaEtherType
  • Protocol sub-type - 0x1 - A sub-identifier for the AristaEtherType
  • Version - 0x10 or 0x20 - An identifier for either 64bit or 48bit
  • Timestamp - An IEEE 1588 time of day format

The timestamp is either 32 bits (seconds) followed by 32 bits (nanoseconds) or 16 bits (seconds) followed by 32 bits (nanoseconds) depending on the 64 or 48bit mode.
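To make the two layouts concrete, here is a minimal sketch that decodes the bytes following the 0xD28B EtherType. It assumes version 0x10 means the 64-bit layout and 0x20 the 48-bit layout, matching the order in which the list above names them; `aristaTimestamp` and `parseAristaTimestamp` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	version64 = 0x10 // assumed: 32-bit seconds + 32-bit nanoseconds
	version48 = 0x20 // assumed: 16-bit seconds + 32-bit nanoseconds
)

type aristaTimestamp struct {
	SubType     uint16
	Version     uint16
	Seconds     uint32 // only the low 16 bits are populated in 48-bit mode
	Nanoseconds uint32
	HeaderLen   int // bytes consumed after the EtherType
}

var errShortHeader = errors.New("short Arista timestamp header")

// parseAristaTimestamp decodes the bytes that follow the 0xD28B EtherType.
func parseAristaTimestamp(b []byte) (aristaTimestamp, error) {
	if len(b) < 4 {
		return aristaTimestamp{}, errShortHeader
	}
	ts := aristaTimestamp{
		SubType: binary.BigEndian.Uint16(b[0:2]),
		Version: binary.BigEndian.Uint16(b[2:4]),
	}
	switch ts.Version {
	case version64:
		if len(b) < 12 {
			return aristaTimestamp{}, errShortHeader
		}
		ts.Seconds = binary.BigEndian.Uint32(b[4:8])
		ts.Nanoseconds = binary.BigEndian.Uint32(b[8:12])
		ts.HeaderLen = 12
	case version48:
		if len(b) < 10 {
			return aristaTimestamp{}, errShortHeader
		}
		ts.Seconds = uint32(binary.BigEndian.Uint16(b[4:6]))
		ts.Nanoseconds = binary.BigEndian.Uint32(b[6:10])
		ts.HeaderLen = 10
	default:
		return aristaTimestamp{}, fmt.Errorf("unknown version 0x%x", ts.Version)
	}
	return ts, nil
}

func main() {
	// Sub-type 0x1, version 0x10, 5 seconds, 7 nanoseconds.
	raw := []byte{0x00, 0x01, 0x00, 0x10, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x07}
	ts, err := parseAristaTimestamp(raw)
	fmt.Println(ts, err)
}
```

In 48-bit mode the seconds value wraps roughly every 18 hours (65536 seconds), so a consumer would need an external reference to recover the full time of day.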


Enabling hardware timestamping on the platform is rather simple:

  • mac timestamp header enables timestamping on tool ports
  • mac timestamp header format <64bit | 48bit> sets the format of the timestamp
  • mac timestamp replace source-mac enables replacing the source mac address with the timestamp

There are some limitations to the time stamping support, notably:

  • Timestamping is done after packet processing, resulting in ~10ns of delay
  • 64bit timestamps may roll over inconsistently every 4 seconds, causing jumps between packets

Decoding the packets

Now that we’ve changed the Ethernet header, a specific decoder is required to process it.

Without a specific decoder, it is no longer a valid Ethernet header, as the Length/Type field contains a meaningless value.

Arista provides a Lua extension for Wireshark for this purpose.

Decoding custom EtherTypes in gopacket

gopacket has a very useful pcap interface, making it very easy to process data collected from TAP infrastructure.

Investigating the structure, it made sense to implement a custom layer to handle our EtherType.

After some experimentation, this approach decoded the timestamp data but prevented further processing of the packets, leaving the IP layer inaccessible; the now-invalid Ethernet header complicated matters.

A simple solution of extending the built-in Ethernet layer was called for.

// Copyright 2012 Google, Inc. All rights reserved.
// Copyright 2009-2011 Andreas Krennmair. All rights reserved.
// Use of this source code is governed by a BSD-style license
// that can be found in the LICENSE file in the root of the source
// tree.
package decoder

import (
	"encoding/binary"
	"errors"
	"net"

	"github.com/google/gopacket"
	"github.com/google/gopacket/layers"
)

// AristaEtherType holds the fields that follow the 0xD28B EtherType:
// a two-byte protocol subtype of 0x1,
// a two-byte protocol version of 0x10 and
// an eight-byte UTC timestamp in IEEE 1588 time of day format.
// That is 12 bytes in total to strip off, after the two-byte EtherType that follows the src mac
type AristaEtherType struct {
	ProtocolSubType      uint16
	ProtocolVersion      uint16
	TimestampSeconds     uint32
	TimestampNanoSeconds uint32
}

// AristaExtendedEthernet is the layer for normal or Arista extended Ethernet frame headers.
// This is the same as layers.Ethernet, but may have AristaEtherType filled with data
type AristaExtendedEthernet struct {
	layers.Ethernet
	AristaEtherType AristaEtherType
}

func (eth *AristaExtendedEthernet) DecodeFromBytes(data []byte, df gopacket.DecodeFeedback) error {
	if len(data) < 14 {
		return errors.New("AristaExtendedEthernet packet too small")
	}
	eth.DstMAC = net.HardwareAddr(data[0:6])
	eth.SrcMAC = net.HardwareAddr(data[6:12])

	// Arista places its EtherType plus 12 bytes directly after the src mac, see AristaEtherType comments for structure
	// We handle both timestamped and non-timestamped frames here
	etherType := binary.BigEndian.Uint16(data[12:14])
	// 28 bytes are needed: the original EtherType is read from data[26:28]
	if len(data) >= 28 && etherType == 0xD28B {
		eth.AristaEtherType = AristaEtherType{
			ProtocolSubType:      binary.BigEndian.Uint16(data[14:16]),
			ProtocolVersion:      binary.BigEndian.Uint16(data[16:18]),
			TimestampSeconds:     binary.BigEndian.Uint32(data[18:22]),
			TimestampNanoSeconds: binary.BigEndian.Uint32(data[22:26]),
		}
		eth.EthernetType = layers.EthernetType(binary.BigEndian.Uint16(data[26:28]))
		eth.BaseLayer = layers.BaseLayer{Contents: data[:28], Payload: data[28:]}
	} else {
		eth.EthernetType = layers.EthernetType(etherType)
		eth.BaseLayer = layers.BaseLayer{Contents: data[:14], Payload: data[14:]}
	}

	// Logic from the upstream Ethernet code
	if eth.EthernetType < 0x0600 {
		eth.Length = uint16(eth.EthernetType)
		eth.EthernetType = layers.EthernetTypeLLC
		if cmp := len(eth.Payload) - int(eth.Length); cmp < 0 {
			df.SetTruncated()
		} else if cmp > 0 {
			// Strip off the excess bytes at the end
			eth.Payload = eth.Payload[:len(eth.Payload)-cmp]
		}
	}
	return nil
}

// Required methods to be a valid layer
func (e *AristaExtendedEthernet) LinkFlow() gopacket.Flow {
	return gopacket.NewFlow(layers.EndpointMAC, e.SrcMAC, e.DstMAC)
}

// LayerType reports the built-in Ethernet layer type (17), so existing lookups keep working
func (e *AristaExtendedEthernet) LayerType() gopacket.LayerType {
	return layers.LayerTypeEthernet
}

func (eth *AristaExtendedEthernet) NextLayerType() gopacket.LayerType {
	return eth.EthernetType.LayerType()
}

// Public function
func DecodeAristaExtendedEthernet(data []byte, p gopacket.PacketBuilder) error {
	eth := &AristaExtendedEthernet{}
	err := eth.DecodeFromBytes(data, p)
	if err != nil {
		return err
	}
	// As in the upstream Ethernet decoder, register the layer with the packet
	p.AddLayer(eth)
	p.SetLinkLayer(eth)
	return p.NextDecoder(eth.EthernetType)
}

Now we have the custom decoder, we just need to register it with gopacket. This makes gopacket use our decoder implementation rather than the built-in Ethernet one.

import (
	"github.com/google/gopacket"
	"github.com/google/gopacket/layers"
)

func init() {
  layers.LinkTypeMetadata[layers.LinkTypeEthernet] = layers.EnumMetadata{
    DecodeWith: gopacket.DecodeFunc(DecodeAristaExtendedEthernet),
    Name:       "AristaExtendedEthernet",
  }
}

The custom fields are now accessible on the Ethernet layer, alongside the other fields.

layer := packet.Layer(layers.LayerTypeEthernet)
if layer != nil {
  ethernetLayer := layer.(*decoder.AristaExtendedEthernet)
  if ethernetLayer.AristaEtherType.ProtocolSubType != 0 {
    timestamp, err := strconv.ParseFloat(fmt.Sprintf(
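As a minimal sketch of turning the two timestamp fields into something usable, the values can be combined into a `time.Time`. This assumes the seconds field counts from the Unix epoch, and `toTime` is a hypothetical helper rather than part of the decoder.

```go
package main

import (
	"fmt"
	"time"
)

// toTime combines the decoded seconds and nanoseconds into a time.Time,
// assuming the seconds field counts from the Unix epoch.
func toTime(seconds, nanoseconds uint32) time.Time {
	return time.Unix(int64(seconds), int64(nanoseconds)).UTC()
}

func main() {
	first := toTime(1500000000, 250)
	second := toTime(1500000000, 1250)
	// Subtracting two timestamps gives the gap between packets.
	fmt.Println(first.Format(time.RFC3339Nano), second.Sub(first))
}
```

Keeping the value as integers matters here: a float64 cannot hold current epoch seconds together with nanosecond precision, so deltas between closely spaced packets would lose accuracy.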

A similar decoder has been successfully tested at 500k packets per second.


Using standard protocols and cheap hardware we can build powerful performance analysis applications.

This work was inspired by Ruru, providing the foundations for performance monitoring and insights of heavily asymmetric & distributed traffic flows.

A look at traffic encryption options

Given a post highlighting the cost effectiveness of deploying network taps, it would make sense to look at the other side; encryption.


It is commonly accepted to use TLS when accessing services over the internet, whether they are based on HTTP, SMTP, IMAP, POP, FTP or any number of other protocols.

It is also commonly accepted to terminate those TLS connections on the edge, handling all internal communications in plain text. This is for a number of reasons around scalability, performance and trust.

As technology stacks have matured, a number of security standards have been created, including those for card handling (PCI DSS); many of these still contain phrasing such as ‘encrypt transmission of cardholder data across open, public networks’, with the definitions being open to interpretation.

Pause for a moment and consider if these scenarios are ‘across open, public networks’:

  • A point to point circuit provided over an external ‘dark fibre’ (DWDM or similar) network
  • A point to point wireless link between 2 buildings provided by a 3rd party
  • Cross connections between 2 suites within a datacenter, via the meet me room

I imagine most people would argue these are private:

  • All services are dedicated to you
  • Traffic is isolated from other customers
  • You control both ends of the connection

However, there are also risks, as they all pass through physical assets you don’t control:

  • Traffic interception; it has been widely published that data from the likes of Google has been intercepted and used for profiling activities
  • Network access; As with a physical intrusion to your rack/office, it may be possible to gain network access via a cross-connect, or inter-site link, depending on topology
  • Redundancy; A targeted attack on physical infrastructure could place your business operations at risk

Thankfully, most providers and many ISO standards have well-defined physical access controls, which limit the possibilities of the above. However, that isn’t very effective against a nation-state or a cyber-based attack on a provider.

Many businesses have a wealth of information useful to a nation-state, from habits and preferences to medical or travel data. It might be paranoia until they’re out to get you.

Ultimately this comes down to risk management and if you want to be ‘compliant’ or ‘secure’ in regards to your customer’s data.

Assuming we want to encrypt data end-to-end, let’s look at the technologies available.


As briefly noted above, the standard for encryption in the public network space is Transport Layer Security (TLS).

There are multiple versions of TLS, with 1.2 currently being the standard (1.3 is in draft). The version and associated cryptographic ciphers you choose are usually dictated by your support requirements; many older browsers and SSL libraries don’t support the most secure choices.

A good starting point is the excellent site, highlighting secure configs for most platforms.

The results should then be checked, either using openssl s_client, or similar.

Generally, user-facing TLS is simple to deploy, with the potential for small compatibility issues (including breaking certain browsers).

The direction for Google Chrome and others is to start displaying HTTP sites in the same manner as invalid SSL is currently shown (red bars or similar), so it’s highly advisable even if you don’t transmit any ‘sensitive info’ (sensitive here includes tracking data, such as cookies).

What about the cost? Platforms such as Let’s Encrypt provide free SSL certs, trusted by all major browsers. For dedicated extended validation certificates, a few hundred dollars is a small cost for most online businesses.

TLS internally

Historically, concerns about the performance of TLS have stunted its deployment internally. Modern versions of the libraries combined with the current generations of CPUs mean TLS is not slow (mostly).

The excellent goes into detail about the current state of TLS performance. The key takeaway is, when configured correctly, TLS at the scale of Facebook and Google performs fast enough with minimal CPU overhead.

Using the most secure cipher suites (ECDHE) is a little more costly, but with mitigations in place (HTTP keepalives, session resumption, etc.), the performance overhead is negligible.

Depending on your environment, you may purchase or use CA-signed certificates, as is the case with external traffic; however, at a certain scale an internal certificate authority makes sense.
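For an internal Go service, a minimal sketch of such a configuration might look like the following. The function name `internalTLSConfig` and the exact suite selection are illustrative assumptions, not a recommendation from any particular hardening guide.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// internalTLSConfig restricts TLS 1.2 connections to ECDHE key exchange
// with AEAD ciphers and leaves session tickets enabled for resumption.
func internalTLSConfig() *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS12,
		CipherSuites: []uint16{
			tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
			tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
			tls.TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
			tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
		},
		// false is the default; stated explicitly because resumption
		// is one of the mitigations for handshake cost.
		SessionTicketsDisabled: false,
	}
}

func main() {
	cfg := internalTLSConfig()
	for _, id := range cfg.CipherSuites {
		fmt.Println(tls.CipherSuiteName(id))
	}
}
```

In Go, the `CipherSuites` list only affects TLS 1.2 and below; TLS 1.3 suites are not configurable through this field.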

There is a certain level of complexity in deploying and maintaining a secure internal certificate authority and many tools exist to help with this.

Another advantage of an internal CA is being able to use certificate-based authentication for clients, enabling devices to prove their identity.
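As a rough sketch of what certificate-based client authentication looks like on the server side in Go, assuming the internal CA certificate has been loaded into a pool (the function name `serverConfigWithClientAuth` is hypothetical):

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
)

// serverConfigWithClientAuth returns a server-side config that rejects any
// client not presenting a certificate signed by a CA in pool.
func serverConfigWithClientAuth(pool *x509.CertPool) *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS12,
		ClientAuth: tls.RequireAndVerifyClientCert,
		ClientCAs:  pool,
	}
}

func main() {
	pool := x509.NewCertPool()
	// In practice the internal CA would be added here, for example with
	// pool.AppendCertsFromPEM(caPEM), where caPEM holds the CA certificate.
	cfg := serverConfigWithClientAuth(pool)
	fmt.Println(cfg.ClientAuth == tls.RequireAndVerifyClientCert)
}
```

The client side mirrors this by setting `Certificates` to its own key pair and `RootCAs` to the same internal CA.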


Depending on your environment, encrypting all inter-device traffic with TLS may be possible.

Given a small load balanced LAMP stack this should be relatively easy:

  • TLS from the user to the load balancer
  • TLS from the load balancer to the web server
  • TLS from PHP to MySQL

However, the number of services can easily spiral, given the example above we could easily have:

  • SMTP relays
  • Memcache/Redis caching
  • NFS based storage
  • Monitoring agents

You may also have services which don’t support TLS:

  • Windows Distributed File Shares
  • Access control / CCTV systems designed for closed networks

There are 2 areas to consider here:

  • Traffic passing over your network (Physical access restrictions in place)
  • Traffic passing over external infrastructure

For the first case:

  • I strongly suggest deploying TLS where possible; at worst, it is another layer of defence should a malicious device get into the network.
  • For protocols lacking encryption support, their risk is likely low
    • If their risk is not low, I’d suggest re-evaluating the technology choice
    • Inline encryption is possible but complicated to scale at this level

For the second, we have a number of options described below.

Layer 1 encryption

There are a number of ‘black box’ solutions, which sit in-line to the network.

The general principle is un-encrypted data comes in one end, encrypted data comes out the other; the reverse then happens to give you un-encrypted data on the other end.

[Figure: layer 1 encryption. Router A (Site A) -> Device A -> Device B (Encrypted Network) -> Router B (Site B)]

These are generally expensive appliances, licensed by port or bandwidth capability. They are also generally completely closed boxes, operating strictly at layer 1.

Deployed across DWDM or similar networks, these devices should ‘just work’ and provide full encryption (layer 1 to 7). They are limited to ‘point to point’ links.

A side effect of the point to point encryption is the prevention of unauthorised traffic.

IEEE 802.1AE (MacSec)

A more open and industry standard approach would be to deploy MacSec.

MacSec provides encryption of the layer 2 header and up, but leaves the src/dest mac addresses exposed.

It operates similarly to an Ethernet frame within a layer 2 network, with the frames containing the following fields:

[Figure: MacSec header. Frame layout: Destination MAC | Source MAC | Security TAG | Encrypted Data | ICV]

Any layer 2 data outside of the mac addresses (VLAN tag, LLDP etc) is contained within the encrypted data.

The security tag and ICV are used internally for MacSec, with the mac addresses being used for forwarding.

There is a hardware dependency associated with MacSec, as the encryption is done in hardware to achieve line rate speeds. This varies between vendors but can be in the form of dedicated line cards or whole products.

It is possible to offload the encryption to MacSec capable switches, allowing routers and line cards to remain, with the switch sitting inline.

[Figure: MacSec encryption. Router A (Site A) -> Switch A -> Switch B (Encrypted Network) -> Router B (Site B)]

It may not be desirable to put a layer 2 device in your path, though it is likely that the path will already be using BFD or similar to account for any provider interruptions, which don’t result in an interface flap.

There are also some implementation considerations:

  • Additional header size needs to be accounted for in downstream MTUs
  • Certain providers filter layer 2 traffic; they may filter the MacSec control messages!
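To put a number on the MTU consideration, here is a small worked calculation. The sizes are my understanding of 802.1AE (an 8-byte security tag, an optional 8-byte Secure Channel Identifier, and a 16-byte ICV with the default cipher suite) and should be checked against your platform’s documentation; the function names are hypothetical.

```go
package main

import "fmt"

const (
	secTagBytes = 8  // security tag without the optional SCI
	sciBytes    = 8  // optional Secure Channel Identifier
	icvBytes    = 16 // ICV length assumed for the default cipher suite
)

// macsecOverhead returns the extra bytes MacSec adds to each frame.
func macsecOverhead(withSCI bool) int {
	overhead := secTagBytes + icvBytes
	if withSCI {
		overhead += sciBytes
	}
	return overhead
}

// requiredMTU returns the MTU the provider path must carry so that frames
// of payloadMTU bytes still fit once MacSec has been applied.
func requiredMTU(payloadMTU int, withSCI bool) int {
	return payloadMTU + macsecOverhead(withSCI)
}

func main() {
	fmt.Println(requiredMTU(1500, false), requiredMTU(1500, true))
}
```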

As with Layer 1 encryption, this prevents unauthorised traffic entering the network, as well as protecting against interception.


It may be desirable in some cases to form a software-based VPN mesh over your existing network, providing encryption between 2 or more points.

This could be in the form of a single IPsec tunnel, or a complex hub-spoke DMVPN network. These could be deployed on dedicated devices or end-user devices.

For high traffic applications, these approaches are likely not applicable, due to line rate speeds being desirable, but in branch or remote worker applications they can be powerful options in your toolbox.

[Figure: VPN mesh encryption. Full mesh between Device A, B, C, D, E and F; every device connects to every other device]


There is no one-size-fits-all solution. It is very dependent upon your environment and, more specifically, the applications within it.

My advice is to look at the risk in each area, design an appropriate solution and test it.

A targeted approach keeps things simple to start, but personally, I look to provide end-to-end encryption everywhere… trusting no one.

If you have no clear areas of risk, start with everywhere you touch the outside world, either via a public interface or provider managed services.

A look at low-cost tap aggregation

A while ago, I had a project that required capturing traffic from a number of sources, thus an adventure into possible solutions was born.

Why do we want to capture traffic?

There are numerous reasons to capture traffic including:

  • Network troubleshooting
  • Traffic monitoring (including intrusion detection)
  • Legal requests/requirements

Ultimately it’s about getting visibility, either for security or operations.

Our goal is, given any path, to be able to replicate the traffic with minimal impact.

How can we capture traffic?

There are generally 3 different methods for capturing traffic, each with their own complexities.

From a target device

At a basic level, a device can capture traffic in 2 ways:

  • ‘CPU bound’; traffic that has been brought up the network stack and is ‘destined’ for this device
  • Raw sockets; ‘raw’ network traffic that the device receives on a network interface.

CPU bound traffic can be efficient to capture if filtered appropriately. When dealing with high traffic volumes, processing traffic can take critical resources away from your business applications.

Raw sockets are generally very limited: hub-based topologies are not widely used, limiting any traffic to broadcast for the server’s subnet or targeted (CPU bound) traffic (multicast/unicast).

Generally, to be usable by an analyser the traffic would need to be encapsulated and transmitted, using further resources on the device.

Span a switch

This is similar to inline, but I’ve separated it here due to implementation differences.

SPANs generally come in 3 forms:

  • SPAN - take traffic from port A, mirror to port B, on the same device
  • RSPAN - take traffic from port A, mirror to port B, on a remote device (over layer 2)
  • ERSPAN - take traffic from port A, mirror to port B, on a remote device (over layer 3)

There are a number of downsides:

  • Vendors have varying levels of support;
    • RSPAN on a Juniper EX series switch doesn’t work over an AE
    • Tricks can be used, for example spanning into a GRE tunnel to accomplish ERSPAN, but this becomes hardware dependent
  • CPU generated (ICMP/ARP/LLDP/BPDU etc) packets generally do not get mirrored
  • Invalid packets will not be seen (those dropped due to checksum errors for example)
  • CPU usage can increase drastically due to extra processing requirements
  • As we’re spanning in an L2 domain, ARP table churn can happen if you’re not careful

However, if you need insight into a switched domain and don’t want to inline-tap every cable, it might work for you.


Inline tap

Inline taps come in many different flavours, supporting different media types. At a basic level, there are 2 main types.

Passive tap

  • Require no power, pass the original signal onto the output
  • Reduce the output power by a known ratio (important when dealing with fibre)
  • No monitoring data or other insights possible
  • Very failure resistant

Logical operation

digraph "passive tap" {
    subgraph cluster_Input { label = "Input Port"; color = black; "Input TX"; "Input RX"; }
    subgraph cluster_Output { label = "Output Port"; color = black; "Output TX"; "Output RX"; }
    subgraph cluster_Monitor { label = "Monitor Port"; color = black; "Monitor TX2"; "Monitor TX1"; }
    "Input TX" -> "Output RX"
    "Input TX" -> "Monitor TX1"
    "Output TX" -> "Input RX"
    "Output TX" -> "Monitor TX2"
}

There are different technologies for mirroring the payload. For copper, these are generally resistor based; for fibre, they’re either thin film or fused biconical taper based.

Note: Thin film is generally preferred for 40G+ links, due to its lower loss rate caused by more even light distribution.

The concept for all of them is the same:

  • Given an input of 100%
  • Bleed off x% of the signal to the mirror port
  • Pass the remaining 100-x% through to the output port

A key consideration when using fibre is the ‘split ratio’, i.e. how much light to bleed off: both the monitor and the output interfaces need enough light for the optics on the other end.
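
To make the split ratio concrete, here is a minimal Python sketch that converts a split percentage into the ideal optical loss seen by each port. It only models the split itself; real taps add excess and connector loss on top, so treat the figures as a lower bound. The function names are illustrative.

```python
import math
from typing import Tuple


def split_loss_db(fraction: float) -> float:
    """Ideal loss in dB for a port that receives `fraction` of the input light."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return -10.0 * math.log10(fraction)


def tap_losses(monitor_percent: float) -> Tuple[float, float]:
    """Return (output_loss_db, monitor_loss_db) when x% is bled to the monitor port."""
    monitor = monitor_percent / 100.0
    return split_loss_db(1.0 - monitor), split_loss_db(monitor)


if __name__ == "__main__":
    # Compare a few common split ratios (output/monitor).
    for x in (50, 30, 10):
        out_db, mon_db = tap_losses(x)
        print(f"{100 - x}/{x} split: output loses {out_db:.2f} dB, monitor loses {mon_db:.2f} dB")
```

Both figures then need to fit inside the power budget of the optics at the far end of the output and monitor links.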

Active tap

  • They require power and re-generate the output signals.
  • Monitoring data can be provided (light levels etc)
  • When using fibre there are no split ratio considerations

Logical operation

digraph "active tap" {
    subgraph cluster_Input { label = "Input Port"; color = black; "Input TX"; "Input RX"; }
    subgraph cluster_Output { label = "Output Port"; color = black; "Output TX"; "Output RX"; }
    subgraph cluster_Monitor { label = "Monitor Port"; color = black; "Monitor TX2"; "Monitor TX1"; }
    subgraph cluster_Processor { color = black; shape = square; "Processor"; }
    "Processor" -> "Input TX"
    "Input RX" -> "Processor"
    "Processor" -> "Output TX"
    "Output RX" -> "Processor"
    "Processor" -> "Monitor TX1"
    "Processor" -> "Monitor TX2"
}

The internal complexity varies, but the principle is:

  • Given an input of X
  • Generate the payload of X into Y and Z
  • Transmit Y to the monitor interface
  • Transmit Z to the output interface

It is possible to buy active taps with the capability to ‘fail open’, meaning that in the event of a power failure traffic will continue to flow.

I still prefer to use passive taps, which should be as resilient as a fibre patch panel.

Why do we want to aggregate it?

In the simplest deployment, we can send traffic directly from the source to the destination:

digraph "one to one tap" {
    rankdir=LR;
    "TAP" -> "Analyser"
}

However, there are a number of reasons to have an aggregation step in the middle:

  • Number of ingress points
    • Aggregation of smaller capture points into larger interfaces
    • Reduction in rack space required for analysers, servers etc
    • Strategic aggregation to reduce physical requirements (fibre, rack space)
  • Multiple destinations
    • Apply filtering logic to save on analyser licensing
    • Send traffic to security appliances and network monitoring devices
  • Apply software logic to capture rules
  • Support multiple media types
    • Provide longer reach for copper-based taps
    • SMF, MMF, XFP, QSFP, Copper support
  • Single view of traffic
    • Certain DPI/IDS appliances require full flows, difficult in ECMP networks
    • I don’t recommend this due to the associated scaling issues

Downsides to having an aggregation step:

  • Potential congestion issues (we’re taking raw traffic, limited by the sender)
  • Single point of failure
    • This could be mitigated using layer 1 fibre switches or similar
    • It’s for monitoring traffic, so uptime is likely not as critical
  • Cost

Historically TAP aggregation has been a very expensive game, around 2-4k a port!

Thankfully Arista changed that with their 7150 series switch, which supports a tap mode as well as a ‘DANZ’ software suite for loss/latency monitoring, packet filtering and mirroring.

The DANZ suite is now available on the 7150, 7280E and 7500E series switches from Arista, opening up options from 1/2U fixed form to 7/11U chassis-based deployments.

The 7150 series gives some nice features:

  • Port density; up to 64 10G ports
  • LANZ+ features for micro-burst analysis
  • PTP support for packet time stamping
  • ~350ns latency for all packets
  • Multi-port mirroring
  • Hitless (ISSU) upgrades
  • A ‘normal’ Linux environment and Python based toolset
  • You configure it like any other switch!

Let’s implement a solution!

We’ll keep it simple with 1 aggregation point, 6 inputs and 2 outputs. In reality, this could be multiple levels of aggregation.


Inputs:

  • 2 x Passive taps
  • 2 x SPANs
  • 2 x Active taps

Logical Groups:

  • Group 1 - Internet
  • Group 2 - Corporate
  • Group 3 - Regional


Outputs:

  • Bro Network Security Monitor (Internet + Regional only)
  • Server (Corporate only)

Logical Diagram

digraph "active tap" {
    subgraph cluster_PassiveTap { label = "Passive Optical Tap"; color = black; "Transit Provider 1"; "Transit Provider 2"; }
    subgraph cluster_Corporate1 { label = "Corporate DMZ"; color = black; "Corporate Spine 1"; }
    subgraph cluster_Corporate2 { label = "Corporate DMZ"; color = black; "Corporate Spine 2"; }
    subgraph cluster_Regional { label = "Active TAP"; color = black; "Wan Provider 1"; "Wan Provider 2"; }
    subgraph cluster_7150 { color = black; shape = square; "Tap Switch"; }
    subgraph cluster_Bro { color = black; shape = square; "Bro Network Security Monitor"; }
    subgraph cluster_Server { color = black; shape = square; "Secret Server"; }
    "Transit Provider 1" -> "Tap Switch"
    "Transit Provider 2" -> "Tap Switch"
    "Corporate Spine 1" -> "Tap Switch"
    "Corporate Spine 2" -> "Tap Switch"
    "Wan Provider 1" -> "Tap Switch"
    "Wan Provider 2" -> "Tap Switch"
    "Tap Switch" -> "Bro Network Security Monitor"
    "Tap Switch" -> "Secret Server"
}

Switch configuration

The switch config is where we wire everything together; there are 3 key concepts:

  • Tap - input
  • Tool - output
  • Tap Group - logical grouping of TAP ports

The configuration is quite simple and well documented.

First, let’s put the switch into tap mode

switch#configure terminal
switch(config)#tap aggregation
switch(config-tap-agg)#mode exclusive

This will place any ports still in switching mode into the error-disabled state and enable the tap/tool ports.

Next, define our tap ports

switch(config)#interface ethernet 1
switch(config-if-Et1)#description Transit Provider 1
switch(config-if-Et1)#switchport mode tap
switch(config-if-Et1)#switchport tap default group INTERNET

switch(config)#interface ethernet 2
switch(config-if-Et2)#description Transit Provider 2
switch(config-if-Et2)#switchport mode tap
switch(config-if-Et2)#switchport tap default group INTERNET

switch(config)#interface ethernet 3
switch(config-if-Et3)#description Corporate Spine 1
switch(config-if-Et3)#switchport mode tap
switch(config-if-Et3)#switchport tap default group CORPORATE

switch(config)#interface ethernet 4
switch(config-if-Et4)#description Corporate Spine 2
switch(config-if-Et4)#switchport mode tap
switch(config-if-Et4)#switchport tap default group CORPORATE

switch(config)#interface ethernet 5
switch(config-if-Et5)#description Wan Provider 1
switch(config-if-Et5)#switchport mode tap
switch(config-if-Et5)#switchport tap default group WAN

switch(config)#interface ethernet 6
switch(config-if-Et6)#description Wan Provider 2
switch(config-if-Et6)#switchport mode tap
switch(config-if-Et6)#switchport tap default group WAN

We now have 3 groups with 2 interfaces in each.

Finally, map those groups onto our outputs.

switch(config)#interface ethernet 10
switch(config-if-Et10)#description Bro Network Security Monitor
switch(config-if-Et10)#switchport mode tool
switch(config-if-Et10)#switchport tool group set INTERNET WAN

switch(config)#interface ethernet 11
switch(config-if-Et11)#description Secret Server
switch(config-if-Et11)#switchport mode tool
switch(config-if-Et11)#switchport tool group set CORPORATE

Et10 + Et11 will now receive any traffic sent to their relevant groups.
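
The resulting mapping can be sanity checked offline. The Python sketch below models the tap/tool group assignments from the example and reports which tap ports feed each tool port, along with a worst-case oversubscription ratio. It assumes every port runs at 10G and that every tap could send at line rate at the same time; the dictionary and function names are illustrative, not anything the switch exposes.

```python
# Tap port -> tap group, mirroring the example configuration.
TAP_GROUPS = {
    "Et1": "INTERNET",
    "Et2": "INTERNET",
    "Et3": "CORPORATE",
    "Et4": "CORPORATE",
    "Et5": "WAN",
    "Et6": "WAN",
}

# Tool port -> set of tap groups it receives.
TOOL_GROUPS = {
    "Et10": {"INTERNET", "WAN"},
    "Et11": {"CORPORATE"},
}

# Assumption: every tap and tool port is a 10G interface.
PORT_SPEED_GBPS = 10


def taps_for_tool(tool):
    """Return the sorted tap ports whose group is delivered to `tool`."""
    groups = TOOL_GROUPS[tool]
    return sorted(tap for tap, group in TAP_GROUPS.items() if group in groups)


def worst_case_oversubscription(tool):
    """Maximum possible ingress divided by the tool port's egress capacity."""
    ingress_gbps = len(taps_for_tool(tool)) * PORT_SPEED_GBPS
    return ingress_gbps / PORT_SPEED_GBPS


if __name__ == "__main__":
    for tool in sorted(TOOL_GROUPS):
        ratio = worst_case_oversubscription(tool)
        print(f"{tool}: fed by {taps_for_tool(tool)}, worst case {ratio:.1f}:1")
```

A ratio above 1:1 is not automatically a problem, since tapped links are rarely all at line rate, but it shows where the congestion risk mentioned earlier sits.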

Advanced features

In the demo, we have a static input -> output allocation; in real life this can be software controlled or filter based.

We could also truncate packets to look at only their headers, potentially useful for encrypted traffic.

A simple traffic steering example is as below:

  • For traffic coming into Eth1
  • Match traffic targeting
  • Send to tap group GOOGLE_DNS
  • Send tap group GOOGLE_DNS to Eth20

switch(config)#ip access-list ACL_GOOGLE_DNS
switch(config-acl-ACL_GOOGLE_DNS)#permit ip any

switch(config)#class-map type tapagg match-any TAP_CLASS_MAP
switch(config-cmap-TAP_CLASS_MAP)#match ip access-group ACL_GOOGLE_DNS

switch(config)#policy-map type tapagg TAP_POLICY
switch(config-pmap-TAP_POLICY)#class TAP_CLASS_MAP
switch(config-pmap-TAP_POLICY-TAP_CLASS_MAP)#set aggregation-group GOOGLE_DNS

switch(config)#interface ethernet 20
switch(config-if-Et20)#description Magic Box
switch(config-if-Et20)#switchport mode tool
switch(config-if-Et20)#switchport tool group set GOOGLE_DNS

switch(config)#interface ethernet 1
switch(config-if-Et1)#service-policy type tapagg input TAP_POLICY

For software-based control, Arista provides a powerful HTTP API as well as XMPP client support and ‘on-device’ APIs. The Python eAPI client can be found on GitHub, with some examples.
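
As a rough illustration of software control, the sketch below builds (but does not send) an eAPI request using only the Python standard library rather than the eAPI client. eAPI accepts JSON-RPC `runCmds` calls on the switch's `/command-api` endpoint. The hostname, credentials and the choice to add the WAN group to Et11 are made-up placeholders, and the helper names are my own.

```python
import base64
import json
import urllib.request


def build_run_cmds(cmds, request_id="tap-1"):
    """Build the JSON-RPC 'runCmds' body that eAPI expects."""
    return {
        "jsonrpc": "2.0",
        "method": "runCmds",
        "params": {"version": 1, "cmds": list(cmds), "format": "json"},
        "id": request_id,
    }


def build_request(host, username, password, cmds):
    """Prepare (without sending) an HTTPS POST to the /command-api endpoint."""
    body = json.dumps(build_run_cmds(cmds)).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"https://{host}/command-api",
        data=body,
        headers={"Content-Type": "application/json", "Authorization": f"Basic {token}"},
        method="POST",
    )


# Hypothetical change: also deliver the WAN group to the Secret Server tool port.
RETARGET_ET11 = [
    "enable",
    "configure",
    "interface ethernet 11",
    "switchport tool group set CORPORATE WAN",
]

if __name__ == "__main__":
    # Placeholder host and credentials; nothing is sent over the network here.
    req = build_request("tap-switch.example.net", "admin", "not-a-real-password", RETARGET_ET11)
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request (for example with `urllib.request.urlopen`) needs eAPI enabled on the switch and a certificate the client trusts, which is left out of this sketch.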


It is cost effective to deploy tap infrastructure where required. The offering from Arista is very powerful, allowing flexibility to meet any number of requirements.

With a number of integration options (JSON API, Python API, XMPP client), having dynamic tapping capabilities is a nice spanner to have in your toolbox.

And the cost? Basically, about the same as a 10G switch with routing functionality.

Naturally, the cost varies depending on physical requirements (fibre/copper runs), port speeds (1/10/25/40/100G) and port count; this is applicable to any switch or other tap based deployment.


I would advise that any tap deployment is carefully planned. Certain topologies, such as multi-stage Clos networks, do not make good tap targets, both due to the number of links involved and the resulting bandwidth requirements for the TAP infrastructure.

Depending on your goals, tapping at natural points of congestion, such as the ingress/egress points for your high performance/bandwidth network segments (transit links, firewalls etc), will likely provide highly useful information with vastly reduced capture complexity.

ClueBot Wikimedia Labs Setup

A few years ago (was it really that long?!), ClueBot III & ClueBot NG were moved into Wikimedia Labs from personal/community servers on Cluenet.

A while later, they were migrated again from Wikimedia Labs into Wikimedia Tool Labs, providing more resources via the Open Grid Manager cluster as well as web services and other shared community things.

Recently, due to Ubuntu Precise LTS hitting EOL, the Tool Labs containers were migrated to run under Ubuntu Trusty LTS.

During the latest migration, the tool accounts were re-created from scratch. This post will outline how things are configured, as a point of reference for the future.



Accounts:

  • tools.cluebot - Legacy account - only a web service redirect is running
  • tools.cluebot3 - Dedicated account for ClueBot III
  • tools.cluebotng - Account for all things related to ClueBot NG


Databases:

  • s51109__cb - Legacy database, no longer updated
  • s52585__cb - Migrated database, now master

Key Directories/Files:



Hosted externally

  • / - Scripts to check last edit time + email alerts

ClueBot III


This is a very simple bot, relying on a config file only.

Assuming a clean account, the server-side setup can be done by running the following locally:

  • pip install fabric
  • git clone
  • cd cluebot3
  • fab init

Manual setup is required to create the config file.

This should be created as ~/cluebot3/cluebot3.config.php under the tool account, containing the below:

	<?php
	$owner = 'Cobi';
	$user = 'ClueBot III';
	$pass = 'Clearly this is not the actual password';
	$status = 'rw';
	$maxlag = 2;
	$maxlagkeepgoing = true;

In your local git clone, you can now do a full deploy (this starts the bot):

  • fab deploy

Things should be running, check the job status/logs under the tool account to confirm this.

tools.cluebot3@tools-bastion-03:~$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
xxxxxxx 0.30169 cluebot3   tools.cluebo r     03/04/2017 10:23:33 continuous@tools-exec-xxxx.eqi     1

tools.cluebot3@tools-bastion-03:~$ tail -f ~/logs/cluebot3-2017-03-05.log
cluebot3.INFO: doarchive(xxxxxxxxxxxx) [] []
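
The job check can also be scripted. This is a minimal Python sketch that parses `qstat` text output in the column layout shown above and reports which expected jobs are not in the running (`r`) state. The sample text is the redacted output from this post, and the helper names are illustrative.

```python
SAMPLE_QSTAT = """\
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
xxxxxxx 0.30169 cluebot3   tools.cluebo r     03/04/2017 10:23:33 continuous@tools-exec-xxxx.eqi     1
"""


def running_jobs(qstat_output):
    """Return the names of jobs whose state column is 'r' (running)."""
    names = set()
    for line in qstat_output.splitlines():
        fields = line.split()
        # Skip blank lines, separator lines and the column header.
        if len(fields) < 5 or fields[0] == "job-ID":
            continue
        name, state = fields[2], fields[4]
        if state == "r":
            names.add(name)
    return names


def missing_jobs(qstat_output, expected):
    """Return the expected job names that are not currently running."""
    return set(expected) - running_jobs(qstat_output)


if __name__ == "__main__":
    missing = missing_jobs(SAMPLE_QSTAT, ["cluebot3"])
    print("all jobs running" if not missing else f"not running: {sorted(missing)}")
```

On the bastion, the sample text could be replaced with the live output, for example via `subprocess.run(["qstat"], capture_output=True, text=True).stdout`. Note that `qstat` truncates job names to 10 characters, so the expected names need to match the truncated form.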

You should also see edits from the bot after a while (a number of index pages need to be checked first).


The main bot log is normally somewhat insightful. Sometimes the bot will die due to memory usage when processing large pages. This generally isn’t seen in the logs and is hard to replicate when running manually, so looking for restarts is the best indicator.

ClueBot NG

This bot is slightly more complicated and has numerous parts. The main part is fully scripted, but a number of (non-obvious) configs need to be in place.


This is not a depiction of request flow, but service dependencies.

digraph "layer 1 encryption" {
    "Wikipedia IRC RC Feed" -> "Main BOT" [dir=back];
    "Main BOT" -> {"Wikimedia Labs Tools Database", "Wikipedia Database Replica", "ANN Core", "Wikimedia API", "IRC Relay"}
    "Review Interface" -> {"Wikimedia Labs Tools Database", "Wikimedia API", "IRC Relay"}
    "IRC Relay" -> "ClueNet IRC";
    "ClueNet IRC" -> "Huggle / External Tools" [dir=back];
}

Critical services for basic bot functionality include:

  • Wikipedia API (For downloading changes + reverts)
  • Tools DB (For creating vandalism IDs + recording action)
  • Wikipedia DB Replicas (up to date) (For fetching extra metadata)
  • Wikipedia IRC RC Feed (For the change feed)
  • Main Bot (Processor)
  • Core (For edit scoring)


Assuming a clean account, the server-side setup can be done by running the following locally:

  • pip install fabric
  • git clone
  • cd cluebotng
  • fab deploy

Config Files

A number of config files need to be created manually.


The Wikipedia ClueBot NG user password


<?php
namespace CluebotNG;

class Config
{
    public static $user = 'ClueBot NG';
    public static $pass = null;
    public static $status = 'auto';
    public static $angry = false;
    public static $owner = 'Cobi';
    public static $friends = 'ClueBot,DASHBotAV';
    public static $mw_mysql_host = 'enwiki.labsdb';
    public static $mw_mysql_port = 3306;
    public static $mw_mysql_user = 's52585';
    public static $mw_mysql_pass = 'a password that is actually real';
    public static $mw_mysql_db = 'enwiki_p';
    public static $legacy_mysql_host = 'tools-db';
    public static $legacy_mysql_port = 3306;
    public static $legacy_mysql_user = 's52585';
    public static $legacy_mysql_pass = 'a password that is actually real';
    public static $legacy_mysql_db = 's52585__cb';
    public static $cb_mysql_host = 'tools-db';
    public static $cb_mysql_port = 3306;
    public static $cb_mysql_user = 's52585';
    public static $cb_mysql_pass = 'a password that is actually real';
    public static $cb_mysql_db = 's52585__cb';
    public static $udpport = 3334;
    public static $coreport = 3565;
    public static $fork = true;
    public static $dry = false;
    public static $sentry_url = null;
}


	$dbHost = 'tools-db';
	$dbUser = 's52585';
	$dbPass = 'a password that is actually real';
	$dbSchema = 's52585__cb';
	$rcport = 3333;
	$recaptcha_pubkey = "something here";
	$recaptcha_privkey = "something here too";


exports.nick = 'CBNGRelay';
exports.server = '';
exports.extra = [
    'OPER antiflood This Is Not The One You Are Looking For',
];


Now that the config files are in place, the bot should actually work.

In your local git clone, complete another deploy, to restart everything:

  • fab deploy

Bot Checks

First, check the job status:

tools.cluebotng@tools-bastion-03:~$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
xxxxxxx 0.30159 lighttpd-c tools.cluebo r     03/04/2017 12:48:57 webgrid-lighttpd@tools-webgrid     1
xxxxxxx 0.30154 cbng_relay tools.cluebo r     03/04/2017 13:32:09 continuous@tools-exec-xxxx.too     1
xxxxxxx 0.30154 cbng_core  tools.cluebo r     03/04/2017 13:32:11 continuous@tools-exec-xxxx.too     1
xxxxxxx 0.30092 cbng_bot   tools.cluebo r     03/04/2017 23:56:38 continuous@tools-exec-xxxx.too     1

Next, check the bot logs:

tools.cluebotng@tools-bastion-03:~$ tail -f ~/logs/cluebotng-2017-03-05.log
cluebotng.INFO: Processing: [[Something]] [...]

Then check the IRC feeds:

  • #cluebotng-spam - Should have constant messages
  • #wikipedia-VAN - Should have a message within 10min

Confirm the bot is also making edits in line with the messages reported in #wikipedia-VAN.

Finally, check the review interface

General Checks

After a couple of hours, check:

  • / are successfully running externally
  • ~/mysql_backups/ contains valid database dumps
  • No ‘not running’ emails have been received
  • ~/bigbrother.log for restarts
  • ClueBot Commons for user problems


The main bot log provides a good indicator as to the source of problems but has limited data due to the logging volume.

It is common to see ‘Failed to get edit data for xxx’; this is only a problem if it’s happening for a large number of changes. Normally it is due to delayed replicas, which cause the user/page metadata for new users/pages to be missing.
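
One way to judge whether the failures are "a large number" is to compare them against the number of processed changes. The Python sketch below does that over a list of log lines. The `Processing:` line matches the log excerpt shown earlier, while the log level on the failure line and the 20% threshold are assumptions for illustration only.

```python
SAMPLE_LINES = [
    "cluebotng.INFO: Processing: [[Something]] [...]",
    # Assumed shape of the failure line; only the quoted message is from the real log.
    "cluebotng.ERROR: Failed to get edit data for xxx",
    "cluebotng.INFO: Processing: [[Something else]] [...]",
]

# Hypothetical threshold for "a large number of changes".
ALERT_RATIO = 0.2


def failure_ratio(log_lines):
    """Return failed edit-data lookups as a fraction of processed changes."""
    processed = failed = 0
    for line in log_lines:
        if "Processing:" in line:
            processed += 1
        elif "Failed to get edit data" in line:
            failed += 1
    return failed / processed if processed else 0.0


if __name__ == "__main__":
    ratio = failure_ratio(SAMPLE_LINES)
    status = "investigate replica lag" if ratio > ALERT_RATIO else "looks normal"
    print(f"failure ratio {ratio:.2f}: {status}")
```

Passing an open log file instead of the sample list works the same way, since the function only iterates over lines.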

The relays generally don’t break but may have incorrect entries in the database. The simplest fix is to kill the job and let it re-spawn.

The report interface will likely break when PHP is updated and will need fixing from time to time; there is a motivation to rebuild the interface to include the review functionality as well as OAuth-based authentication (T135323).