Learning for Decisions

The explosion of the use of machine learning (ML) algorithms on unstructured high dimensional data has brought renewed attention to Hidden Markov Models (HMMs) as the prototypical description of a stochastic process with latent variables. One caveat of ML approaches is that they result in models with large dimensions (number of discrete latent variables). This problem is even more severe in the context of artificial intelligence (AI) as high dimensional Markov Decision Processes (MDPs) are learned on the fly to enable the design of controllers. This research focuses on developing a foundational theory for model reduction of HMMs and MDPs and on connecting model reduction to statistical learning theory. Ultimately, this will have a large impact on AI with particular benefits to reinforcement learning. 

In order to develop a foundational theory for model reduction of HMMs, we have to address multiple questions:

1) what metrics should be used for this reduction and for what situations,

2) can we derive lower bounds on the error for the best reduced model of a fixed order using a set of underlying invariants of the original model,

3) can we derive model reduction algorithms that can guarantee an error within a factor from the lower bound,

4) can this understanding provide a direct penalty to address model structure selection using statistical learning theory?

Similar questions need to be addressed for MDPs with the additional complexity of modeling the input-output behavior. Apart from the utility of such development in many applications, a model reduction theory will impact many fundamental aspects related to these stochastic models including simulation, prediction, coding, decision design and reinforcement learning. The latter is emerging as a critical approach for many applications involving social behavior where simple mechanistic models do not exist. Examples of such problems are critical infrastructures and smart services where high dimensional unstructured data is available in real time. Finally, this research will develop new insights to address a similar question for other stochastic models including graphical models.

< Back to Research Interests