A reinforcement learning algorithm for exploring large Markov decision processes (MDP) implemented in Clojure.