Abstract
It is customary to expect that learning systems will frequently underperform or even fail in their initial stages of maturity. In fact, heuristic systems are somewhat encouraged to “fail” in order to expose the algorithm to the entire problem space, in the belief that this turbulent, knowledge-building process is an essential precursor to better performance later. This is clearly illustrated in learning a new game, where initial, erratic play is essential if all the rules are to be learned. Good and bad “moves” must be identified and cataloged so that strategies for avoiding, or at least postponing, losses (failures) can be formulated. Intelligent systems that use game-playing paradigms pose profound questions that must be addressed before any level of confidence can be placed in their ability to perform in the real world. This paper describes a game-playing methodology that has been used to control short-duration tasks and discusses its adaptation to continuous (and possibly ill-defined) processes. The paper postulates that system failures can be turned into opportunities for positive statistical reinforcement.
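The abstract's central idea, cataloging good and bad moves and turning losses ("failures") into reinforcement signals, can be sketched in miniature. The class and toy game below are hypothetical illustrations under assumed update rules (weight increments on wins, floored decrements on losses), not the paper's actual methodology:

```python
import random

class FailureReinforcedPlayer:
    """Toy sketch (not the paper's algorithm): a player that catalogs
    moves and reinforces their weights from game outcomes. Losses feed
    back as negative reinforcement, so early erratic play gradually
    converges on a loss-avoiding strategy."""

    def __init__(self, moves, seed=0):
        self.weights = {m: 1.0 for m in moves}  # start with no preference
        self.rng = random.Random(seed)

    def choose(self):
        # Sample a move with probability proportional to its learned weight.
        total = sum(self.weights.values())
        r = self.rng.uniform(0, total)
        acc = 0.0
        for m, w in self.weights.items():
            acc += w
            if r <= acc:
                return m
        return next(reversed(self.weights))

    def reinforce(self, move, won):
        # A win strengthens the move; a failure becomes a learning
        # signal rather than a dead loss (assumed +1.0 / -0.5 updates,
        # floored so no move is ever ruled out entirely).
        if won:
            self.weights[move] += 1.0
        else:
            self.weights[move] = max(0.1, self.weights[move] - 0.5)

# A trivial stand-in "game": move "b" wins 80% of the time, "a" only 20%.
def play(move, rng):
    return rng.random() < (0.8 if move == "b" else 0.2)

player = FailureReinforcedPlayer(["a", "b"], seed=42)
game_rng = random.Random(7)
for _ in range(500):
    m = player.choose()
    player.reinforce(m, play(m, game_rng))

# After many games, the better move's weight dominates.
print(player.weights["b"] > player.weights["a"])
```

Early play is nearly uniform (the knowledge-building phase the abstract describes), and the statistics of wins and losses, rather than any explicit rule, shift later play toward the stronger move.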
| Field | Value |
|---|---|
| Original language | English (US) |
| Pages (from-to) | 555-566 |
| Number of pages | 12 |
| Journal | Cybernetics and Systems |
| Volume | 25 |
| Issue number | 4 |
| State | Published - 1994 |
All Science Journal Classification (ASJC) codes
- Software
- Information Systems
- Artificial Intelligence