
[Review] Temporal Difference Learning and TD-Gammon

Posted at 2018-07-16

Paper Detail

Author: Gerald Tesauro
Published Year: 1995
Link: https://courses.cs.washington.edu/courses/cse590hk/01sp/Readings/tesauro95cacm.pdf

What is TD-Gammon?

TD-Gammon is a backgammon program that combines a neural network (multi-layer perceptron) with TD-Learning. It reached a level of play close to that of the world's best human players.
The model uses one hidden layer and is trained with the TD($\lambda$) algorithm.
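
For reference, the TD($\lambda$) weight update described in Tesauro's paper can be written as below. Here $w$ is the weight vector, $\alpha$ is the learning rate, $Y_t$ is the network output at time step $t$, and $\nabla_w Y_k$ is the gradient of the output with respect to the weights.

$$
w_{t+1} - w_t = \alpha \, (Y_{t+1} - Y_t) \sum_{k=1}^{t} \lambda^{t-k} \nabla_w Y_k
$$

The parameter $\lambda \in [0, 1]$ controls how far back in time the current prediction error is credited. At the end of a game, the final reward signal takes the place of $Y_{t+1}$.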

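What is his claim?

Tesauro points out two obstacles that had limited reinforcement learning in practice:

  1. Delayed reward: the reward only arrives at the end of a game, so it is hard to tell which moves deserve the credit
  2. Limits on the scale of the learning algorithm: earlier methods relied on either look-up tables or linear function approximation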
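
To make the delayed-reward point concrete, here is a minimal sketch of tabular TD($\lambda$) on a hypothetical five-state random walk (a toy problem, not backgammon). The reward arrives only when the walk ends, and eligibility traces pass it back to earlier states. The function name, the step size, and the value of $\lambda$ are all assumptions chosen for illustration. It uses a look-up table, which is exactly the kind of small-scale setting that does not extend to a game as large as backgammon.

```python
import random

def td_lambda_random_walk(episodes=5000, alpha=0.02, lam=0.7, seed=0):
    """Tabular TD(lambda) on a 5-state random walk with a reward only at the end."""
    rng = random.Random(seed)
    n_states = 5
    values = [0.5] * n_states            # look-up table: one estimate per state
    for _ in range(episodes):
        traces = [0.0] * n_states        # eligibility traces, reset each episode
        state = n_states // 2            # every episode starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:           # left exit: episode ends with reward 0
                target, done = 0.0, True
            elif next_state >= n_states: # right exit: episode ends with reward 1
                target, done = 1.0, True
            else:                        # no reward yet: bootstrap from next estimate
                target, done = values[next_state], False
            td_error = target - values[state]
            traces[state] += 1.0         # mark the current state as eligible
            for s in range(n_states):
                values[s] += alpha * td_error * traces[s]
                traces[s] *= lam         # decay traces (discount factor is 1 here)
            if done:
                break
            state = next_state
    return values

values = td_lambda_random_walk()
print([round(v, 2) for v in values])
```

For this toy problem the true values are 1/6, 2/6, ..., 5/6 from left to right, so the printed estimates should rise from left to right and stay roughly close to those numbers.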

What is the context of this research?

  1. Development of supervised learning (SL) algorithms (decision trees, SVM, and so on)
  2. Invention of TD-Learning

Thanks to these two developments, TD-Gammon could overcome the issues mentioned above.
Tesauro also revisited his earlier program, Neurogammon, and carried its handcrafted features over to TD-Gammon.

Model

[Figure: Screen Shot 2018-07-16 at 15.01.20.png]

Input: raw feature set of the backgammon board
Output: estimated probabilities of the four possible game outcomes (a normal win or a gammon, for either side)
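
The shape of such a model can be sketched as a forward pass through one sigmoid hidden layer into four sigmoid output units. This is only an illustration under assumptions: the layer sizes (198 inputs, 40 hidden units), the absence of bias terms, the random initial weights, and the all-zero placeholder board encoding are not taken from this article, and the helper names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_network(n_inputs, n_hidden, n_outputs=4, seed=0):
    """Create small random weights for a one-hidden-layer network (no biases)."""
    rng = random.Random(seed)

    def layer(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {"hidden": layer(n_hidden, n_inputs), "output": layer(n_outputs, n_hidden)}

def forward(network, features):
    """Map a board feature vector to four values, each in (0, 1)."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, features)))
              for row in network["hidden"]]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden)))
            for row in network["output"]]

N_INPUTS, N_HIDDEN = 198, 40            # assumed sizes, for illustration only
network = init_network(N_INPUTS, N_HIDDEN)
board_features = [0.0] * N_INPUTS       # placeholder, not a real board encoding
outputs = forward(network, board_features)
print(outputs)
```

Each output is an independent sigmoid, so the four numbers are separate estimates and do not have to sum to 1.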

Any further discussion?

Why did TD-Gammon work?
https://pdfs.semanticscholar.org/c1ec/5116d71176aaca80b1df944c01e82cc35212.pdf

According to this paper by Pollack and Blair, the authors were able to reproduce some of TD-Gammon's success using a neural network trained with a simple hill-climbing approach. They did not use TD-learning, or any reinforcement learning method at all.
They therefore argued that the success of Tesauro's TD-Gammon owed much to the dynamics of backgammon itself: the players take turns rolling the dice and moving their checkers, and this built-in randomness helps the agent learn.
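
A hill-climbing loop of this kind can be sketched as follows: mutate the current champion with Gaussian noise, let the two play, and nudge the champion toward the challenger when the challenger wins. This is a rough sketch and not the authors' code. The noise level and the blend factor are assumptions, and `toy_match` is a hypothetical, deterministic stand-in for actually playing backgammon games with dice.

```python
import random

def hill_climb(challenger_wins, n_weights, steps=500, noise=0.3, blend=0.05, seed=0):
    """Hill climbing by champion vs. mutated challenger (no gradients, no TD)."""
    rng = random.Random(seed)
    champion = [0.0] * n_weights                     # start from all-zero weights
    for _ in range(steps):
        # Mutation: the challenger is the champion plus Gaussian noise.
        challenger = [w + rng.gauss(0.0, noise) for w in champion]
        if challenger_wins(champion, challenger):
            # Move the champion a small step toward the winning challenger.
            champion = [(1 - blend) * w + blend * c
                        for w, c in zip(champion, challenger)]
    return champion

# Hypothetical stand-in for a match: the weights closer to a hidden target "win".
TARGET = [1.0, -0.5, 0.25]

def distance(weights):
    return sum((w - t) ** 2 for w, t in zip(weights, TARGET)) ** 0.5

def toy_match(champion, challenger):
    return distance(challenger) < distance(champion)

champion = hill_climb(toy_match, n_weights=len(TARGET))
print(distance([0.0] * len(TARGET)), distance(champion))
```

In a real setting `challenger_wins` would be decided by games between the two networks, so the outcome would be noisy because of the dice. The stand-in here is deterministic only to keep the example short.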
