As part of my work, I have been reading about machine learning, deep learning, and convolutional (CNN) and recurrent (RNN) neural networks. In general, these fields aim to implement machine perception of the real world, whether through image recognition, handwriting recognition, image classification, audio and behavioral analysis, emotional gestures, and more. Rules-based machine learning is the first step in making machines understand the data sent to them, whether text, images, or audio/video. The next step, self-learning, generally requires some form of deep learning and neural networks. Neural networks are a biologically inspired form of computation. Traditional methods still rely on hand-engineered features, while newer methods such as RNNs are preferred for handling sophisticated learning with minimal manual feature engineering.
RNNs and long short-term memory (LSTM) networks are compute intensive. However, storage has become more affordable, datasets have grown larger, and parallel computing has advanced significantly in the last two decades. “Deep learning methods, in particular, those based on deep belief networks (DBNs), which are greedily built by stacking restricted Boltzmann machines, and convolutional neural networks, which exploit the local dependency of visual information, have demonstrated record-setting results on many important applications” (Lipton & Berkowitz, 2015, p. 2).
Standard DNNs are limited by the assumption that training and test data points are independent of one another. Recurrent neural networks (RNNs) are connectionist models with the ability to pass selected information across sequence steps while processing sequential data one element at a time. Thus RNNs can model input and/or output consisting of sequences of elements that are not independent (Lipton & Berkowitz, 2015, p. 2). Because RNNs persist selected information across time steps, they have the significant advantage of not depending on assumptions of temporal or structural independence.
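To make the recurrence concrete, here is a minimal sketch of a vanilla RNN forward pass in NumPy (the layer sizes and random input are my own illustrative assumptions, not taken from the paper). At each step the hidden state is updated as h_t = tanh(W_xh * x_t + W_hh * h_(t-1) + b), so information from earlier elements can influence later ones.

    import numpy as np

    rng = np.random.default_rng(0)
    input_size, hidden_size, seq_len = 8, 16, 5

    # Weights: input-to-hidden, hidden-to-hidden, and a bias term
    # (sizes are arbitrary choices for this sketch).
    W_xh = rng.standard_normal((hidden_size, input_size)) * 0.1
    W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.1
    b_h = np.zeros(hidden_size)

    def rnn_forward(inputs):
        """Process a sequence one element at a time, carrying a hidden state."""
        h = np.zeros(hidden_size)      # hidden state persists across steps
        states = []
        for x_t in inputs:             # one sequence element per step
            h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
            states.append(h)
        return np.stack(states)

    # Example: a random sequence of 5 vectors, each of dimension 8.
    sequence = rng.standard_normal((seq_len, input_size))
    hidden_states = rnn_forward(sequence)
    print(hidden_states.shape)         # (5, 16)

The same hidden-state mechanism, extended with gating, is what LSTMs use to decide which information to keep or discard across long sequences.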
Dialog systems need to converse with humans as naturally as possible. Standard DNNs, which cannot carry information across temporal boundaries, are not suitable for this; sequential models that do not assume independence across time are needed, which is why RNNs are preferred. RNNs are also important for applications such as robotic surgery and self-driving cars.
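As an illustration of how such a sequence model is typically assembled, the following is a minimal, hypothetical sketch of an LSTM-based classifier in Keras (the vocabulary size, layer widths, and binary intent-detection task are assumptions made for the example, not a dialog-system implementation). The LSTM layer carries its state across time steps instead of treating each token independently.

    import numpy as np
    import tensorflow as tf

    vocab_size, embed_dim, hidden_units = 10000, 64, 128

    model = tf.keras.Sequential([
        # Map integer token ids to dense vectors.
        tf.keras.layers.Embedding(vocab_size, embed_dim),
        # The LSTM reads the sequence one token at a time, carrying its
        # internal state across time steps rather than treating tokens
        # as independent inputs.
        tf.keras.layers.LSTM(hidden_units),
        # A single sigmoid unit for a binary decision, e.g. whether a
        # (hypothetical) intent is present in the utterance.
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])

    # Illustrative only: a batch of 2 padded sequences of 20 token ids.
    dummy_batch = np.random.randint(0, vocab_size, size=(2, 20))
    print(model.predict(dummy_batch).shape)   # (2, 1)

Trained on labeled utterances, a model along these lines could, for example, flag the intent of a user turn before a response is generated.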
With this introduction, I recommend the following references to study: