blue
blue paves the way to applied online learning on real-world, continuous-time, concurrently running systems by bridging the gap between real systems and MDP-assuming online learning agents.
blue is factored into two packages:
- the blue program, with a library of helper modules
- the markov library (blue is an implementer of this interface)
The R3ACE project makes publicly accessible a real-world cyber environment that models a minimum viable cyber system.
R3ACE is a real computer network (cyber infrastructure) with a cyber defence software program, the blue program, running on one of the machines.
What does the blue program do?
The above design for a program should in fact be useful for applying Autonomous Cyber Defence (ACD) to many realistic, or indeed real-world, cyber systems. As such, we have designed and documented a reusable abstract software interface that generalises over different cyber systems and policies: the markov library.
The markov library defines several module types, which can be thought of as blueprints or specifications for software modules. It is up to the implementer of the interface to write software that conforms to these type specifications.
For example, the specification (or interface) for a Reward module, defining rewards for online learning agents:
    (** A reward function, which is a map from a [state] to a [Reward.t]. *)
    module type RewardType = sig
      type t
      type state

      val fn : state -> t
      (** [Reward.fn s] is the [Reward.t] associated with the MDP state [s]. *)
    end
There are many possible implementations for this module - what's fixed is how this module interacts with other modules: the types exchanged at the boundary (the interface) are defined explicitly, and can be depended on.
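As an illustration of one such implementation, here is a minimal sketch of a module that conforms to RewardType. The module name Compromised_hosts_reward, the choice of int for the state (a count of compromised hosts) and float for the reward are all assumptions made for this example; they are not taken from blue or the markov library.

```ocaml
(* The interface from the text, repeated so this sketch compiles on its own. *)
module type RewardType = sig
  type t
  type state

  val fn : state -> t
end

(* Hypothetical implementation: the state is the number of hosts currently
   flagged as compromised, and the reward is a float. *)
module Compromised_hosts_reward :
  RewardType with type t = float and type state = int = struct
  type t = float
  type state = int

  (* Penalise each compromised host by one unit of reward. *)
  let fn compromised = -. (float_of_int compromised)
end

let () =
  Printf.printf "reward for 3 compromised hosts: %.1f\n"
    (Compromised_hosts_reward.fn 3)
```

The signature constraint exposes t and state as concrete types, so other modules can consume the reward as a float while the body of fn stays free to change.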
The markov library defines how several modules (once implemented) may be combined (via a functor) to compose a useful piece of software that behaves in a specific way. In the case of the markov library, that behaviour is to observe and affect a surrounding cyber system.
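To show what combining modules via a functor can look like, here is a small sketch. Every name in it (PolicyType, Make_step, Alert_reward, Threshold_policy) is hypothetical and chosen for illustration; the actual functor and module types in the markov library may differ.

```ocaml
(* All names below are hypothetical; they are not the markov library's API. *)
module type RewardType = sig
  type t
  type state

  val fn : state -> t
end

module type PolicyType = sig
  type state
  type action

  val choose : state -> action
end

(* The functor: given a reward and a policy that share a state type,
   build a module exposing a single decision step. *)
module Make_step
    (R : RewardType)
    (P : PolicyType with type state = R.state) =
struct
  let step (s : R.state) : R.t * P.action = (R.fn s, P.choose s)
end

(* Toy reward: the state is the number of open alerts. *)
module Alert_reward = struct
  type t = int
  type state = int

  let fn alerts = -alerts
end

(* Toy policy: isolate a host once more than two alerts are open. *)
module Threshold_policy = struct
  type state = int
  type action = Wait | Isolate_host

  let choose alerts = if alerts > 2 then Isolate_host else Wait
end

module Step = Make_step (Alert_reward) (Threshold_policy)

let () =
  let reward, action = Step.step 5 in
  Printf.printf "reward=%d isolate=%b\n" reward
    (action = Threshold_policy.Isolate_host)
```

The sharing constraint `with type state = R.state` is what lets the compiler check that both modules agree on the state type at the boundary before they are composed.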
The blue program implements the Agent interface exposed by the markov library, so looking at this implementation in more detail is a good place to start. A different implementation could serve a different purpose, for example an agent that plays a video game.