Markov.Agent
Exposes the functor Agent.Make, which returns an Agent.S module that is parameterised by the provided implementations of Agent.MarkovCompressorType, Agent.RewardType and Agent.RLPolicyType.
Agent.S.act initial_policy starts an infinite loop that uses the policy to take actions and produce observers (functions returning a state). When an observer resolves to a state, the loop repeats.
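The loop described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the types state, action and observer, the function choose, and the steps bound are all hypothetical. The real loop is described as infinite and is driven by a policy value, while this sketch takes an initial state and stops after a fixed number of iterations so that it terminates.

```ocaml
(* Hypothetical stand-ins; none of these names come from the library. *)
type state = int
type action = Up | Down
type observer = unit -> state (* a function returning a state *)

(* A toy policy: from a state, pick an action and an observer that
   yields the state reached after that action. *)
let choose (s : state) : action * observer =
  let a = if s < 2 then Up else Down in
  let next = match a with Up -> s + 1 | Down -> s - 1 in
  (a, fun () -> next)

(* Bounded version of the loop: [steps] exists only so the sketch ends. *)
let act_bounded ~steps (s0 : state) : state list =
  let rec loop n s acc =
    if n = 0 then List.rev acc
    else
      let _action, observe = choose s in
      let s' = observe () in (* the observer resolves to a state *)
      loop (n - 1) s' (s' :: acc) (* then the loop repeats *)
  in
  loop steps s0 [ s0 ]

let () =
  act_bounded ~steps:4 0
  |> List.map string_of_int
  |> String.concat " "
  |> print_endline
```

In this sketch the effect of the action is folded into the observer; in a real system the action would be sent to the environment and the observer would wait for the resulting state.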
MarkovCompressorType handles the continuous-time stream of information from a system and compresses it into a Markovian state representation, such that the sequence of states returned by successive calls to observe has the Markov property.
RewardType is a reward function: a map from a state to a Reward.t option.
RLPolicyType is a policy for inferring an action and an observer, given a state and optionally a reward.
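To make the three interfaces more concrete, here is one possible shape they could take. Only the type names state, t and reward and the function name observe are suggested by this page; the members fn, decide and action, and the toy modules Counter and SparseReward, are assumptions made for illustration. The observer is modelled as a plain thunk, although the real one may be asynchronous.

```ocaml
(* Hypothetical signatures; the library's actual members may differ. *)
module type MarkovCompressorType = sig
  type state
  val observe : unit -> state
end

module type RewardType = sig
  type state
  type t
  val fn : state -> t option (* assumed name *)
end

module type RLPolicyType = sig
  type t
  type state
  type reward
  type action (* assumed *)
  val decide : t -> state -> reward option -> action * (unit -> state)
end

(* Toy compressor: each call to [observe] returns the next tick count. *)
module Counter : MarkovCompressorType with type state = int = struct
  type state = int
  let ticks = ref 0
  let observe () = incr ticks; !ticks
end

(* Toy reward: only even states are rewarded, so the result is optional. *)
module SparseReward : RewardType with type state = int and type t = float =
struct
  type state = int
  type t = float
  let fn s = if s mod 2 = 0 then Some 1.0 else None
end
```

The option in the reward's return type lets a state carry no reward at all, which matches the policy receiving a reward only optionally.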
module Make
    (MarkovCompressor : MarkovCompressorType)
    (Reward : RewardType with type state = MarkovCompressor.state)
    (Policy : RLPolicyType
                with type state = MarkovCompressor.state
                with type reward = Reward.t) :
  S with type policy = Policy.t
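A functor. Make MarkovCompressor Reward Policy returns an Agent module. Each argument must be a module whose signature includes the corresponding interface. For example, Policy must be a module that includes the interface RLPolicyType (its signature may be exactly RLPolicyType, or a richer signature that adds further members to it).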
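The point about signature inclusion can be illustrated with a deliberately reduced stand-in. The RLPolicyType below has only two members and the Make below takes a single argument; both are hypothetical simplifications, not the library's definitions, and GreedyPolicy, initial, describe and start are invented for the example.

```ocaml
(* Reduced, hypothetical stand-in for the real interface. *)
module type RLPolicyType = sig
  type t
  val initial : t
end

(* Reduced, hypothetical stand-in for the real three-argument functor. *)
module Make (Policy : RLPolicyType) = struct
  type policy = Policy.t
  let start : policy = Policy.initial
end

(* This module offers more than RLPolicyType requires: [describe] is extra.
   It is still accepted, because its signature includes the interface. *)
module GreedyPolicy = struct
  type t = float
  let initial = 0.5
  let describe p = Printf.sprintf "greedy(%.1f)" p
end

module Agent = Make (GreedyPolicy)

(* Agent.policy is known to equal GreedyPolicy.t, so the extra function
   can be applied to a value produced by the functor. *)
let () = print_endline (GreedyPolicy.describe Agent.start)
```

Inside the functor body only the members listed in RLPolicyType are visible, so describe cannot be called there; it remains available on GreedyPolicy itself.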