Started on March 06, 2018


Using computer vision to automatically control the agent in Mirror's Edge Catalyst. Mirror's Edge Catalyst is a video game focused on parkour, an extreme sport in which the runner has to pass through complex obstacles with skill.

03/06/2018 04:57


03/06/2018 05:07

Log5 (Happy BDay)

  • It can jump smoothly when crossing low obstacles.
  • It has learned when to do the higher jump.
  • It learns from a dataset collected at night and still works when deployed in the day scene.
  • Sometimes it jumps even when there isn't an obstacle.
  • The model only controls jumping.
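The jump-only controller above can be sketched roughly as follows. The log does not show the actual code, so the threshold, cooldown, and class names here are all my assumptions; the idea is just that the vision model's confidence triggers a jump key-press, with a short cooldown so one obstacle does not cause repeated jumps.

```python
from dataclasses import dataclass

@dataclass
class JumpController:
    """Hypothetical sketch of the jump-only controller: press jump when
    the obstacle classifier's confidence crosses a threshold, then wait
    a few frames before allowing another jump."""
    threshold: float = 0.8   # assumed confidence cutoff
    cooldown: int = 10       # assumed frames to wait after a jump
    _wait: int = 0

    def step(self, confidence: float) -> bool:
        """Return True if the agent should press jump this frame."""
        if self._wait > 0:
            self._wait -= 1
            return False
        if confidence > self.threshold:
            self._wait = self.cooldown
            return True
        return False
```

The cooldown only suppresses double jumps; it does not fix the false-positive jumps mentioned above, which come from the classifier itself.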

03/06/2018 18:14


  • Visualize the captured image in real time.
  • Display the confidence in a scrolling style rather than in one line.
  • Attention model.
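For the "scrolling style" confidence display on the TODO list above, a minimal sketch could look like this. The layout (bar width, newest value at the bottom) is my assumption; the point is just to keep a rolling history instead of overwriting one line.

```python
from collections import deque

def scrolling_confidence(history_len=20):
    """Sketch of a scrolling confidence display: keep the last N
    confidences and render each as a bar line, newest at the bottom."""
    buf = deque(maxlen=history_len)

    def push_and_render(conf: float) -> str:
        buf.append(conf)
        width = 30  # assumed bar width in characters
        lines = []
        for c in buf:
            bar = "#" * int(c * width)
            lines.append(f"{c:5.2f} |{bar}")
        return "\n".join(lines)

    return push_and_render
```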

03/15/2018 03:43

C# position

03/15/2018 06:23

Python position

Hmm, I've been clumsy these days. It took me half a day to hack the position out of the game's memory, but now I can access the position from Python.
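On Windows this kind of read usually goes through `ReadProcessMemory` via `ctypes`; the log does not reproduce the pointer chain, so the offset below is a placeholder. Once the raw bytes are in hand, decoding the xyz position is just unpacking three little-endian float32 values:

```python
import struct

# The offset of the xyz position within the read buffer is a
# placeholder; the actual pointer chain found for the game is not
# reproduced in the log.
POS_OFFSET = 0x0

def decode_position(buf: bytes, offset: int = POS_OFFSET):
    """Decode three little-endian float32 values (x, y, z) from a raw
    memory buffer, as one would after a ReadProcessMemory call."""
    return struct.unpack_from("<3f", buf, offset)
```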


  • (optional) Write memory (to make the reset more convenient).
  • (Plan 1) Record the trace and learn to infer the action from the visual input.
  • (Plan 2) Pure RL. Only record the trace and use it as a dense, distance-based reward.
  • (Plan 3) Given a trace, sample a few keypoints from it. The agent has to infer the entire trace from them, and can use the visual input (received during exploration) to infer that path.
  • (Ultimate goal) Given the start and the end position, the agent has to figure out the path from A to B (with or without the runner vision).
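The dense reward in Plan 2 is only described as "distance based", so the following is one plausible reading, not the author's actual formula: reward the agent for each step by how much it reduced its distance-to-goal measured along the recorded trace.

```python
import math

def dense_trace_reward(prev_pos, pos, trace):
    """Hypothetical dense reward for Plan 2: the decrease in
    distance-to-goal between two consecutive positions, where the
    distance-to-goal is measured via the recorded trace (distance to
    the closest trace point plus the remaining arc length to the end)."""
    def dist_to_goal(p):
        best = None
        for i in range(len(trace)):
            d = math.dist(p, trace[i])
            rest = sum(math.dist(trace[j], trace[j + 1])
                       for j in range(i, len(trace) - 1))
            total = d + rest
            if best is None or total < best:
                best = total
        return best
    return dist_to_goal(prev_pos) - dist_to_goal(pos)
```

Moving along the trace toward the goal yields positive reward; wandering away yields negative reward, which is what makes the signal dense compared with a sparse goal-reached bonus.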

03/15/2018 06:36


  • Use Python to show the info above the game window.
  • (optional) Write memory
  • Jevois control.

03/15/2018 20:14

Random thoughts

  • Incremental learning
    • Increase the distance between the start and the end.
  • Place embedding / orientation embedding
    • Embed with a visual input
    • Close your eyes and do an imaginary navigation
    • People seldom discretize the world and use coordinates to represent their position when taking actions. We only do that on a map.
    • Relative position is more important and easier to generalize than absolute position.
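The relative-position point can be made concrete with a small example (my illustration, not from the log): expressing a target in the agent's own forward/right frame makes the representation invariant to where the agent happens to stand in world coordinates.

```python
import math

def to_relative(agent_pos, agent_heading, target_pos):
    """Express a target (2D ground-plane positions assumed) in the
    agent's own frame. Rotating the world-frame offset by the agent's
    heading yields forward/right components that do not depend on the
    agent's absolute world position."""
    dx = target_pos[0] - agent_pos[0]
    dz = target_pos[1] - agent_pos[1]
    cos_h, sin_h = math.cos(agent_heading), math.sin(agent_heading)
    forward = cos_h * dx + sin_h * dz
    right = -sin_h * dx + cos_h * dz
    return forward, right
```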

03/17/2018 14:31


  • Built a DDQL environment and trained it for the entire night.
  • Used the speed as the reward, weighted [1, 1e-2, 1] on the xyz axes.
  • It can move pretty far (after 7 hours).
  • Faith tends to commit suicide, since dropping gives a large velocity.
  • Jumping and moving have greater Q-values since they provide acceleration.
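The speed reward with its [1, 1e-2, 1] axis weights can be sketched as below. Whether the original uses a weighted sum of absolute components or a weighted norm is not stated, so the exact form here is an assumption; the key property is that the vertical (y) component is down-weighted so horizontal progress dominates.

```python
def speed_reward(velocity, weights=(1.0, 1e-2, 1.0)):
    """Weighted speed reward: per-axis weights (1, 1e-2, 1) down-weight
    the vertical component. Shown as a weighted sum of absolute
    per-axis speeds (the exact aggregation is my assumption)."""
    return sum(w * abs(v) for w, v in zip(weights, velocity))
```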


  • Cannot move continuously.
  • Tends to jump from the top of the building.
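One hypothetical fix for the suicide-by-dropping exploit (not something the log implements): keep the weighted speed reward but subtract a penalty when vertical velocity falls below a free-fall threshold, so that jumping off buildings stops being the highest-value action. The threshold and penalty values below are assumptions.

```python
def safe_speed_reward(velocity, weights=(1.0, 1e-2, 1.0),
                      max_fall_speed=-8.0, fall_penalty=5.0):
    """Weighted speed reward with an assumed free-fall penalty: if the
    vertical (y) velocity drops below max_fall_speed, subtract
    fall_penalty so large drops are no longer rewarded."""
    r = sum(w * abs(v) for w, v in zip(weights, velocity))
    if velocity[1] < max_fall_speed:  # y-axis assumed vertical
        r -= fall_penalty
    return r
```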