Layers that convert policy or Q-value vectors into action ids

agentnet.resolver.BaseResolver(incoming, name='BaseResolver', output_dtype='int32')

A special Lasagne layer that determines which actions the agent takes given its policy (e.g. Q-values).

agentnet.resolver.EpsilonGreedyResolver(incoming, epsilon=None, seed=1234, name='EpsilonGreedyResolver', **kwargs)
Epsilon-greedy resolver:
  • determines which action should be taken given the agent’s policy
  • takes the action with the maximum policy value with probability 1 - epsilon
  • takes a random action with probability epsilon
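
The selection rule above can be sketched in plain Python. This is only an illustration of the epsilon-greedy logic for a single policy vector, not AgentNet's implementation (which operates on Theano tensors inside a Lasagne layer). The helper name epsilon_greedy_action and the sample values are hypothetical, and the sketch assumes the random action is drawn uniformly.

```python
import random


def epsilon_greedy_action(q_values, epsilon, rng):
    """Return an action id for one vector of policy or Q-values.

    Hypothetical helper for illustration; not part of agentnet.
    """
    if rng.random() < epsilon:
        # Explore: draw an action id uniformly at random (assumption).
        return rng.randrange(len(q_values))
    # Exploit: index of the largest value (first index wins on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])


rng = random.Random(1234)
batch = [[0.1, 0.9, 0.0], [0.5, 0.2, 0.3]]

# With epsilon = 0 the choice is purely greedy.
greedy = [epsilon_greedy_action(q, 0.0, rng) for q in batch]
print(greedy)  # [1, 0]
```

With epsilon = 1.0 the same helper ignores the values entirely and always returns a random action id, which is the other extreme of the trade-off.
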

agentnet.resolver.ProbabilisticResolver(incoming, assume_normalized=False, seed=1234, output_dtype='int32', name='ProbabilisticResolver')
Probabilistic resolver:
  • determines which action should be taken given policy
  • samples actions with probabilities given by input layer
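
Sampling an action id from a probability vector can be sketched as follows. This is a plain-Python illustration of the idea, not AgentNet's implementation. The helper name sample_action is hypothetical, and the sketch assumes that assume_normalized=False means the input is rescaled to sum to 1 before sampling; the documentation above does not spell that out.

```python
import random


def sample_action(probs, rng, assume_normalized=False):
    """Sample one action id with probability proportional to probs.

    Hypothetical helper for illustration; not part of agentnet.
    """
    if not assume_normalized:
        # Assumption: rescale so the weights sum to 1.
        total = sum(probs)
        probs = [p / total for p in probs]
    # Inverse-CDF sampling: return the first action whose cumulative
    # probability exceeds a uniform draw from [0, 1).
    u = rng.random()
    cumulative = 0.0
    for action, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return action
    # Floating-point rounding can leave the cumulative sum just below 1.
    return len(probs) - 1


rng = random.Random(1234)
samples = [sample_action([0.2, 0.8], rng) for _ in range(10000)]
share_of_action_1 = samples.count(1) / len(samples)  # close to 0.8
```

Unlike the epsilon-greedy rule, every action with nonzero probability can be chosen here, so exploration is controlled by the policy itself rather than by a separate epsilon.
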