- Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA.
- Bertsekas, D. P. and Tsitsiklis, J. N. (1996). Neuro-Dynamic Programming. Athena Scientific, Belmont, MA.
- Gardner (1981). Samuel's checkers player. In Barr, A. and Feigenbaum, E. A., editors, The Handbook of Artificial Intelligence, I, pages 84--108. William Kaufmann, Los Altos, CA.
- Samuel, A. L. (1967). Some studies in machine learning using the game of checkers. II---Recent progress. IBM Journal of Research and Development, 11(6):601--617.
- Tesauro, G. J. (1994). TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 6(2):215--219.
- Tesauro, G. J. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38:58--68.
- Tsitsiklis, J. N. and Van Roy, B. (1996). Feature-based methods for large scale dynamic programming. Machine Learning, 22:59--94.