An Entropy Metric for Regular Grammar Classification and Learning with Recurrent Neural Networks

Entropy (Basel). 2021 Jan 19;23(1):127. doi: 10.3390/e23010127.

Abstract

Recently, there has been a resurgence of formal language theory in deep learning research. However, most research focused on the more practical problems of attempting to represent symbolic knowledge by machine learning. In contrast, there has been limited research on exploring the fundamental connection between them. To obtain a better understanding of the internal structures of regular grammars and their corresponding complexity, we focus on categorizing regular grammars by using both theoretical analysis and empirical evidence. Specifically, motivated by the concentric ring representation, we relaxed the original order information and introduced an entropy metric for describing the complexity of different regular grammars. Based on the entropy metric, we categorized regular grammars into three disjoint subclasses: the polynomial, exponential and proportional classes. In addition, several classification theorems are provided for different representations of regular grammars. Our analysis was validated by examining the process of learning grammars with multiple recurrent neural networks. Our results show that as expected more complex grammars are generally more difficult to learn.

Keywords: complexity analysis; entropy; recurrent neural network; regular grammar classification.