project / blinker

See project at github.com.

The blinker wrapper can wrap any Gym environment, adding an extra observe action that is available in parallel with the environment's own actions. Choosing observe costs a configurable amount of reward (i.e. it produces a negative reward), but it is required to obtain a fresh observation. If observe is not chosen, the observation remains stale. This forces agents to choose the best times to observe, and to skip observing when they can predict the relevant world state.
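The behaviour described above can be illustrated with a minimal sketch. This is not the project's actual code: the class names `BlinkerSketch` and `CounterEnv`, the `observe_cost` parameter, and the `(env_action, observe)` action tuple are all hypothetical, and the sketch assumes the classic Gym `step` signature that returns `(obs, reward, done, info)`. It also assumes that `reset` always yields a fresh observation.

```python
class CounterEnv:
    """Hypothetical toy environment: the observation is a step counter."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        # The action is ignored; each step is worth a reward of 1.0.
        self.t += 1
        done = self.t >= 5
        return self.t, 1.0, done, {}


class BlinkerSketch:
    """Illustrative wrapper (assumed design, not the blinker API)."""

    def __init__(self, env, observe_cost=0.1):
        self.env = env
        self.observe_cost = observe_cost  # reward deducted per observation
        self._last_obs = None             # most recent observation the agent paid for

    def reset(self):
        # Assumption: the first observation of an episode is free.
        self._last_obs = self.env.reset()
        return self._last_obs

    def step(self, action):
        # Assumption: the action is a pair (env_action, observe flag).
        env_action, observe = action
        obs, reward, done, info = self.env.step(env_action)
        if observe:
            # Pay the cost and refresh the cached observation.
            self._last_obs = obs
            reward -= self.observe_cost
        # Without observe, the stale cached observation is returned.
        return self._last_obs, reward, done, info


env = BlinkerSketch(CounterEnv(), observe_cost=0.25)
first = env.reset()
stale = env.step((0, False))  # no observe: observation stays at the reset value
fresh = env.step((0, True))   # observe: observation is refreshed, reward reduced
```

In this sketch the underlying environment always advances; only what the agent sees, and what it pays, depends on the observe flag.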

Additionally, the render(human=true) method shows a visual indicator whenever an observation is being made.