Robotics Study Notes
Problem Framework
Markov Decision Process (MDP)
- Discrete time steps; the state and action spaces can be continuous
- We don't know the exact outcome of an action before it is performed (transitions are stochastic)
- Once the action is performed, we know exactly what happened
- The agent's state is known (fully observed): observation and state are the same here
Formally, an MDP is defined as a 4-tuple (S, A, T, R); a minimal code sketch follows the list:
- State space S
- Action space A
- Transition function T(s' | s, a), the probability of reaching s' by taking action a in state s
- Reward function R(s, a)
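To make the tuple concrete, here is a toy two-state MDP together with value iteration, written as a minimal Python sketch. It is my own illustrative example rather than anything from the notes, and the discount factor GAMMA is an added assumption (it is not part of the 4-tuple above).

```python
# A minimal sketch of the (S, A, T, R) tuple on a toy two-state MDP.
S = [0, 1]            # state space
A = ["stay", "move"]  # action space

# Transition function T: (s, a) -> list of (s', probability).
# The stochasticity captures "we don't know the exact outcome in advance".
T = {
    (0, "stay"): [(0, 0.9), (1, 0.1)],
    (0, "move"): [(1, 0.8), (0, 0.2)],
    (1, "stay"): [(1, 0.9), (0, 0.1)],
    (1, "move"): [(0, 0.8), (1, 0.2)],
}

# Reward function R: (s, a) -> immediate reward.
R = {(0, "stay"): 0.0, (0, "move"): 1.0, (1, "stay"): 2.0, (1, "move"): 0.0}

GAMMA = 0.9  # discount factor (assumed; not part of the 4-tuple above)

def value_iteration(n_iters=100):
    """V(s) <- max_a [ R(s, a) + GAMMA * sum_s' T(s' | s, a) * V(s') ]."""
    V = {s: 0.0 for s in S}
    for _ in range(n_iters):
        # Synchronous update: the comprehension reads the old V throughout.
        V = {
            s: max(
                R[s, a] + GAMMA * sum(p * V[s2] for s2, p in T[s, a])
                for a in A
            )
            for s in S
        }
    return V

print(value_iteration())  # approximately optimal state values
```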
Partially Observable Markov Decision Process (POMDP)
Almost the same as an MDP, except that the effects of an action are not known exactly even after the action is performed: the state is only partially observed, so the agent receives observations instead of the true state, and observation and state are no longer the same.
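Because the true state is hidden, a POMDP agent typically maintains a belief b(s), a probability distribution over states, and updates it with a Bayes filter after each action a and observation o. Below is a minimal sketch of that update; it is my own illustration, and the observation model O (plus the toy numbers in the usage example) are assumptions, since a POMDP extends the 4-tuple with an observation space and observation function.

```python
def belief_update(b, a, o, S, T, O):
    """One Bayes-filter step: b'(s') ∝ O(o | s', a) * sum_s T(s' | s, a) * b(s).

    b: dict mapping state -> probability (current belief)
    T: dict mapping (s, a, s') -> transition probability
    O: dict mapping (s', a, o) -> observation probability (assumed model)
    """
    unnormalized = {
        s2: O[s2, a, o] * sum(T[s, a, s2] * b[s] for s in S)
        for s2 in S
    }
    z = sum(unnormalized.values())  # normalizer, equals P(o | b, a)
    return {s: p / z for s, p in unnormalized.items()}

# Toy usage: two states, one action, noisy observations of the state.
S = [0, 1]
T = {(s, "move", s2): (0.8 if s2 != s else 0.2) for s in S for s2 in S}
O = {(s2, "move", o): (0.7 if o == s2 else 0.3) for s2 in S for o in S}
print(belief_update({0: 0.5, 1: 0.5}, "move", o=1, S=S, T=T, O=O))
# -> {0: 0.3, 1: 0.7}: the belief shifts toward state 1 after observing 1
```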
Troubleshooting High Softirq Load on Azure B-Series VMs
Azure Linux Server Automatic Reboot Issue
Troubleshooting a Neural Network Whose Accuracy Starts High and Then Gradually Drops
Symptom
During training, the network's accuracy starts out very high and then gradually drops, as shown below:
| Epoch | Time | Train Loss | Train ACC | Val Loss | Val ACC | Test Loss | Test ACC | LR |
| ----- | -------- | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| 1 | 197.8234 | 0.0053 | 0.8645 | 0.0412 | 0.1443 | 0.0412 | 0.1443 | 0.0100 |
| 2 | 108.6638 | 0.0084 | 0.7311 | 0.0272 | 0.1443 | 0.0272 | 0.1443 | 0.0100 |
| 3 | 108.4892 | 0.0095 | 0.6777 | 0.0267 | 0.1443 | 0.0267 | 0.1443 | 0.0100 |
| 4 | 108.8819 | 0.0087 | 0.7102 | 0.0269 | 0.1443 | 0.0269 | 0.1443 | 0.0100 |
| 5 | 108.8337 | 0.0065 | 0.7712 | 0.0504 | 0.1443 | 0.0504 | 0.1443 | 0.0100 |
| 6 | 109.4179 | 0.0061 | 0.8071 | 0.0624 | 0.1443 | 0.0624 | 0.1443 | 0.0100 |
| 7 | 109.2300 | 0.0057 | 0.8349 | 0.0762 | 0.1443 | 0.0762 | 0.1443 | 0.0075 |
| 8 | 109.2820 | 0.0101 | 0.6432 | 0.0245 | 0.1443 | 0.0245 | 0.1443 | 0.0075 |
Concretely, Train ACC is very high at the start (0.8645 at epoch 1) while Val ACC is very low; as the epochs go on, Train ACC drifts downward while Val ACC barely changes from 0.1443.
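A validation accuracy pinned at a single value across epochs often means the model predicts one class for (almost) every validation example, so the reported ACC is just that class's frequency in the validation labels. One quick check is to print the distribution of predicted classes on the validation set; the sketch below is my own suggestion rather than anything from the original post, it assumes a PyTorch setup, and `model` and `val_loader` are placeholders for the actual training objects.

```python
import collections

import torch

@torch.no_grad()
def predicted_class_counts(model, val_loader, device="cpu"):
    """Histogram of argmax predictions over the validation set.

    If a single class dominates the counts, the flat Val ACC is just
    that class's share of the validation labels, and the bug lies
    upstream of the metric (model output, labels, or preprocessing),
    not in the training dynamics themselves.
    """
    model.eval()
    counts = collections.Counter()
    for inputs, _labels in val_loader:
        logits = model(inputs.to(device))
        counts.update(logits.argmax(dim=1).cpu().tolist())
    return counts

# Usage (placeholders for the real objects in the training script):
# print(predicted_class_counts(model, val_loader))
```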