Module Markov.Agent

Exposes the functor Agent.Make, which returns a module of type Agent.S parameterised by the supplied implementations of Agent.MarkovCompressorType, Agent.RewardType and Agent.RLPolicyType.

Agent.S.act initial_policy starts an infinite loop that uses the policy to take actions and produce observers (functions that return a state). Each time an observer resolves to a state, the loop repeats.
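The loop described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the types state, action, observer and policy, the function act_bounded and the toy policy toward_zero are all hypothetical, since the real signatures are elided on this page. The sketch also assumes an explicit starting state and adds a step bound so that it terminates, whereas the documented loop runs forever.

```ocaml
(* Hypothetical stand-ins; the real types live in the elided signatures. *)
type state = int
type action = Up | Down
type observer = unit -> state

(* Assumed shape: a policy maps a state to an action plus an observer
   that yields the next state once it resolves. *)
type policy = state -> action * observer

(* Bounded variant of the act loop. The [steps] argument is an addition
   made purely so that the sketch terminates and can be tested. *)
let rec act_bounded (policy : policy) (s : state) (steps : int) : state =
  if steps = 0 then s
  else
    let _action, observe = policy s in
    (* Wait for the observer to resolve to a state, then repeat. *)
    let s' = observe () in
    act_bounded policy s' (steps - 1)

(* Toy policy: step towards 0; the observer reports the resulting state. *)
let toward_zero : policy = fun s ->
  if s > 0 then (Down, fun () -> s - 1)
  else if s < 0 then (Up, fun () -> s + 1)
  else (Down, fun () -> 0)
```

For instance, act_bounded toward_zero 3 5 walks the state from 3 down to 0 and then stays there.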

module type MarkovCompressorType = sig ... end

Handles the continuous-time stream of information from a system and compresses it into a Markovian state representation, such that the sequence of states returned by successive calls to observe has the Markov property.
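As a rough illustration of what compressing into a Markovian state can mean, consider a stream of position samples from an object moving at roughly constant velocity. A single position does not determine the next one, but a position paired with a velocity does. The sketch below assumes that setting; the state record and the compress function are hypothetical and are not part of MarkovCompressorType.

```ocaml
(* Hypothetical state: position alone would not be Markov for a moving
   object, so the velocity estimate is carried along as well. *)
type state = { position : float; velocity : float }

(* Fold one new position sample into the previous state. The velocity is
   estimated as the difference between consecutive samples, assuming one
   sample per unit of time. *)
let compress (prev : state) (sample : float) : state =
  { position = sample; velocity = sample -. prev.position }

(* Under the constant-velocity assumption, the next state can be
   predicted from the current state alone. *)
let predict (s : state) : state =
  { s with position = s.position +. s.velocity }
```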

module type RewardType = sig ... end

A reward function which is a map from a state to a Reward.t option.
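A reward function of this shape might look like the sketch below. The Reward module here is a local stand-in with t = float, and the state record is hypothetical; the sketch also assumes that None means no reward is available for the given state, which this page does not specify.

```ocaml
(* Local stand-in for the library's Reward module; float is an assumption. *)
module Reward = struct
  type t = float
end

(* Hypothetical state for a one-dimensional task with a goal position. *)
type state = { position : int; goal : int }

(* Map a state to an optional reward. *)
let reward (s : state) : Reward.t option =
  if s.position < 0 then None          (* assumed: no reward signal here *)
  else if s.position = s.goal then Some 1.0
  else Some 0.0
```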

module type RLPolicyType = sig ... end

A policy for inferring an action and an observer given a state and, optionally, a reward.

module type S = sig ... end

An MDP agent. This is the output signature of the functor Make.

A functor. Make MarkovCompressor Reward Policy returns an Agent module. Each argument must satisfy its corresponding module type. For example, Policy must be a module whose signature includes RLPolicyType: it may match RLPolicyType exactly or have a richer signature that extends it.
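The signature-inclusion rule can be seen in a reduced example. The sketch below uses a simplified, hypothetical RLPolicyType and a one-argument Make, not the three-argument functor documented here, and every member name (init, choose, describe, first_action) is invented for illustration. It shows that a module with extra members is still accepted as a functor argument.

```ocaml
(* Simplified, hypothetical signature; the real RLPolicyType is elided. *)
module type RLPolicyType = sig
  type t
  val init : t
  val choose : t -> int -> int
end

(* One-argument stand-in for the functor, for illustration only. *)
module Make (Policy : RLPolicyType) = struct
  let first_action (s : int) : int = Policy.choose Policy.init s
end

(* This module offers more than RLPolicyType requires: [describe] is an
   extra member. OCaml accepts it because its signature includes
   RLPolicyType. *)
module VerbosePolicy = struct
  type t = int
  let init = 1
  let choose bias s = s + bias
  let describe bias = "bias=" ^ string_of_int bias
end

module A = Make (VerbosePolicy)
```

Inside Make only the members listed in RLPolicyType are visible, so describe remains usable through VerbosePolicy but not through the functor body.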