blue

Defending R3ACE: Defending Replicable and Reproducible Real-World Autonomous Cyber Environments

blue paves the way to applied online learning on real-world, continuous-time, concurrently-running systems by bridging the gap between real systems and MDP-assuming online learning agents.

Take me to the docs

Blue is factored into two packages:

What is R3ACE?

The R3ACE project makes publicly accessible a real-world cyber environment that models a minimum viable cyber system.

R3ACE is a real computer network (cyber infrastructure) with a cyber defence software program, the blue program, running on one of the machines.

What does the blue program do?

  1. Fetches information from the cyber system (the surrounding computer network).
  2. Uses a policy to decide what (if any) action to take.
  3. Executes this action, causing a side effect in the cyber system (e.g. an IP address is added to a block list).
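
The three steps above can be sketched as a small observe-decide-act loop. This is only an illustration: the types `observation` and `action`, the stubbed `observe` function, the threshold `limit` and the in-memory `block_list` are all hypothetical and are not taken from the blue source.

```ocaml
(* Hypothetical types for illustration; not the types used by blue. *)
type observation = { failed_logins : (string * int) list }
type action = Block of string | NoOp

(* Step 1: fetch information from the cyber system.
   Stubbed here with fixed data instead of a real network query. *)
let observe () : observation =
  { failed_logins = [ ("10.0.0.7", 12); ("10.0.0.9", 1) ] }

(* Assumed threshold of failed logins above which an address is blocked. *)
let limit = 5

(* Step 2: a simple threshold policy that picks the first offending address,
   or does nothing if there is none. *)
let policy (obs : observation) : action =
  match List.find_opt (fun (_, n) -> n > limit) obs.failed_logins with
  | Some (ip, _) -> Block ip
  | None -> NoOp

(* Step 3: execute the action. The side effect is modelled as an
   in-memory block list rather than a real firewall rule. *)
let block_list : string list ref = ref []

let execute (a : action) : unit =
  match a with
  | Block ip ->
      if not (List.mem ip !block_list) then block_list := ip :: !block_list
  | NoOp -> ()

(* One iteration of the loop. *)
let step () : unit = execute (policy (observe ()))
```

Calling `step ()` repeatedly leaves a single entry for `10.0.0.7` in `block_list`, because `execute` skips addresses that are already blocked.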

The above program design should in fact be useful for applying Autonomous Cyber Defence (ACD) to many realistic, or indeed real-world, cyber systems. As such, we have designed and documented a reusable abstract software interface that generalises over different cyber systems and policies: the markov library.

What is a reusable software interface?

The markov library defines several module types, which can be thought of as blueprints or specifications for software modules. It is up to the implementer of the interface to write software that conforms to these type specifications.

For example, the specification (or interface) for a Reward module, defining rewards for online learning agents:

(** A reward function, which is a map from a [state] to a [Reward.t]. *)
module type RewardType = sig
  type t
  type state

  val fn : state -> t
  (** [Reward.fn s] is the [Reward.t] associated with the MDP state [s]. *)
end

There are many possible implementations for this module - what's fixed is how this module interacts with other modules: the types exchanged at the boundary (the interface) are defined explicitly, and can be depended on.
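
As a minimal sketch of one such implementation, the module below conforms to `RewardType`. The `net_state` record, its fields and the penalty weights are assumptions made for this example; they do not come from blue or markov.

```ocaml
(* The signature from above, repeated so the example is self-contained. *)
module type RewardType = sig
  type t
  type state

  val fn : state -> t
end

(* Hypothetical MDP state: counts of compromised hosts and of benign
   addresses that were blocked by mistake. *)
type net_state = { compromised_hosts : int; blocked_benign : int }

(* One possible implementation: a negative reward that grows with the
   number of compromised hosts, plus a smaller penalty for false positives.
   The [with type] constraints expose the concrete types at the boundary. *)
module PenaltyReward :
  RewardType with type t = float and type state = net_state = struct
  type t = float
  type state = net_state

  let fn s =
    -.float_of_int s.compromised_hosts
    -. (0.1 *. float_of_int s.blocked_benign)
end
```

Another implementation could choose `type t = int`, or a completely different `state`, without changing how other modules depend on the `RewardType` signature.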

The markov library defines how several modules (once implemented) may be combined (via a functor) to compose a useful piece of software that behaves in a specific way: in the case of the markov library, software that observes and affects a surrounding cyber system.
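
To make the idea of combining modules via a functor concrete, here is a simplified sketch. Only `RewardType` is taken from the text above; `PolicyType`, `MakeAgent`, `CountReward` and `ThresholdPolicy` are hypothetical names invented for this example, and the actual functor in markov may take different arguments.

```ocaml
module type RewardType = sig
  type t
  type state

  val fn : state -> t
end

(* Hypothetical signature: a policy maps a state to an action. *)
module type PolicyType = sig
  type state
  type action

  val decide : state -> action
end

(* Hypothetical functor: builds an agent from a reward and a policy.
   The sharing constraint forces both modules to agree on [state]. *)
module MakeAgent
    (R : RewardType)
    (P : PolicyType with type state = R.state) =
struct
  (* For a given state, return the chosen action and the state's reward. *)
  let step (s : R.state) : P.action * R.t = (P.decide s, R.fn s)
end

(* Toy implementations: the state is a count of suspicious connections. *)
module CountReward = struct
  type t = int
  type state = int

  let fn s = -s
end

module ThresholdPolicy = struct
  type state = int
  type action = Block | NoOp

  let decide s = if s > 3 then Block else NoOp
end

module Agent = MakeAgent (CountReward) (ThresholdPolicy)
```

With these toy modules, `Agent.step 5` evaluates to `(ThresholdPolicy.Block, -5)`. Swapping in a different reward or policy module requires no change to `MakeAgent`, provided the new modules satisfy the same signatures.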

How could markov be used in a different project?

The blue program implements the Agent interface exposed by the markov library, so looking at this implementation in more detail is a good place to start. A different implementation could serve a different purpose, for example an agent that plays a video game.