Short-answer question

Dynamic programming offers two methods for computing an optimal policy: one is ______________, and the other is policy iteration.

[Reference answer]

Value iteration
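
For illustration, here is a minimal sketch of value iteration on a tiny, made-up MDP. The two-state transition table `P`, reward table `R`, discount `gamma`, and the function names are all hypothetical and are not part of the original question. The sketch repeatedly applies the Bellman optimality backup until the values stop changing, then reads off a greedy policy.

```python
def q_values(P, R, V, s, gamma):
    """Return the one-step lookahead value of every action in state s."""
    return [
        R[s][a] + gamma * sum(prob * V[s_next] for prob, s_next in P[s][a])
        for a in range(len(P[s]))
    ]


def value_iteration(P, R, gamma=0.5, tol=1e-8, max_iters=10_000):
    """Value iteration for a small tabular MDP.

    P[s][a] is a list of (probability, next_state) pairs.
    R[s][a] is the immediate reward for taking action a in state s.
    """
    V = [0.0] * len(P)
    for _ in range(max_iters):
        delta = 0.0
        for s in range(len(P)):
            best = max(q_values(P, R, V, s, gamma))  # Bellman optimality backup
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = []
    for s in range(len(P)):
        q = q_values(P, R, V, s, gamma)
        policy.append(q.index(max(q)))
    return V, policy


# Hypothetical toy MDP: 2 states, 2 deterministic actions each.
# State 0: action 0 stays (reward 0), action 1 moves to state 1 (reward 1).
# State 1: action 0 stays (reward 2), action 1 moves to state 0 (reward 0).
P = [
    [[(1.0, 0)], [(1.0, 1)]],
    [[(1.0, 1)], [(1.0, 0)]],
]
R = [
    [0.0, 1.0],
    [2.0, 0.0],
]

V, policy = value_iteration(P, R, gamma=0.5)
print(V, policy)
```

Policy iteration, by contrast, alternates a full policy evaluation step with a greedy policy improvement step, whereas value iteration folds the improvement into every backup through the `max` over actions.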