Robotics Study Notes
Problem Framework
Markov Decision Process (MDP)
- Discrete time steps; the state and action spaces may be continuous
- The exact outcome of an action is not known in advance (transitions are stochastic)
- Once an action is performed, we know exactly what happened
- The agent's state is known (fully observed) – observation and state are the same here
Formally defined as a 4-tuple (S, A, T, R):
- State Space
- Action Space
- Transition Function
- Reward Function
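The (S, A, T, R) tuple above can be sketched concretely. The following is a minimal toy example, not from the original notes: all state, action, and reward values here are made up purely for illustration. It shows the two MDP assumptions from the list: the outcome of an action is stochastic, but the resulting state is fully observed after acting.

```python
import random

# Toy MDP with two states and two actions (illustrative names).
S = ["s0", "s1"]
A = ["stay", "go"]

# Transition function T(s, a) -> distribution over next states.
# "go" is stochastic: it succeeds only 80% of the time.
T = {
    ("s0", "stay"): {"s0": 1.0},
    ("s0", "go"):   {"s1": 0.8, "s0": 0.2},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "go"):   {"s0": 0.8, "s1": 0.2},
}

# Reward function R(s, a); unlisted pairs give reward 0.
R = {("s0", "go"): 1.0}

def step(s, a, rng=random.Random(0)):
    """Sample a transition: the outcome is not known before acting,
    but the next state is fully observed afterwards."""
    dist = T[(s, a)]
    next_s = rng.choices(list(dist), weights=list(dist.values()))[0]
    return next_s, R.get((s, a), 0.0)

next_s, reward = step("s0", "go")
print(next_s, reward)
```

Note that the agent receives the sampled next state directly; in the POMDP below, this is exactly the part that changes.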
Partially Observable Markov Decision Process (POMDP)
Almost the same as an MDP, except the state is no longer fully observed: after acting, the agent receives only an observation that partially reveals the state, so it must maintain a belief (a probability distribution) over possible states. Formally, the tuple is extended with an observation space and an observation function.
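The belief maintenance this implies is a Bayes filter: predict the belief through the transition model, then correct it with the observation model. Below is a minimal sketch on a hypothetical two-state problem; the transition and sensor probabilities are invented for illustration and are not from the notes.

```python
# Bayesian belief update for a toy POMDP: the agent never sees the
# state directly, only a noisy observation z. All numbers illustrative.

S = ["s0", "s1"]

# Transition model P(s' | s) for a single fixed action.
T = {"s0": {"s1": 0.8, "s0": 0.2},
     "s1": {"s0": 0.8, "s1": 0.2}}

# Observation model O(z | s'): the sensor reports the true state 90%
# of the time ("z0" tends to mean s0, "z1" tends to mean s1).
O = {"s0": {"z0": 0.9, "z1": 0.1},
     "s1": {"z0": 0.1, "z1": 0.9}}

def belief_update(belief, z):
    """One Bayes-filter step: predict through T, correct with z."""
    predicted = {s2: sum(belief[s] * T[s].get(s2, 0.0) for s in S)
                 for s2 in S}
    unnorm = {s2: O[s2][z] * predicted[s2] for s2 in S}
    total = sum(unnorm.values())
    return {s2: p / total for s2, p in unnorm.items()}

# Start maximally uncertain, then observe "z1".
b = belief_update({"s0": 0.5, "s1": 0.5}, "z1")
print(b)  # belief shifts toward s1: {'s0': 0.1, 's1': 0.9}
```

This is the key practical difference from the MDP case: instead of a known state, every decision must be made from the belief, which is why POMDP policies map beliefs (not states) to actions.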