If you would like to get on the Math Linguistics/Natural Language Processing email list or present a talk, please contact Ed Stabler at email@example.com.
On Formalizing Syntax
Since at least the mid-20th century, formalizing a theory of syntax has generally meant providing a generative grammar or abstract automaton that either generates or recognizes a set of mathematical structures which properly analyze the expressions of the language under study. In practice, this approach is quite difficult and often involves specifying a great deal of seemingly arbitrary detail. This has led to at least two sorts of skepticism about the usefulness of formalization:

--- It is too early in the development of the theory to formalize it.
--- The properties of the usual classes of formal languages do not seem to coincide with the regularities of natural language, hence formalization may actually mislead the process of theory formation.

But formalization, in general, has well-known benefits. By expressing the theory in precise mathematical terms, one obtains an unambiguous statement of the claims the theory is making along with, in principle, the means to explore their consistency and consequences and to evaluate them against empirical evidence.

In this talk we will discuss a much broader approach to the formalization of syntax, in which the set of intended analyses is defined as a set of mathematical structures using constraints formalized in the language of mathematical logic, that is, using model theory. This provides a very general language for expressing hypothesized constraints, one which supports the successive refinement of the theory: as new constraints are added, the set of licensed structures becomes a successively better approximation of the intended set of analyses. Hence, the benefits of formalization are available at all stages of theory development.

Of particular interest to us is the model theory of finite structures (finite model theory). In this realm, the abstract properties of the definable sets can often be determined by characterizing them in automata-theoretic terms.
The sets of strings that are definable using Monadic Second-Order Logic (MSO), for instance, turn out to be all and only those that are recognizable by finite-state automata; the sets of trees that are definable in MSO turn out to yield exactly the context-free languages; and so on. This answers the question of why formal languages should be relevant to theories of syntax. A theory can be formalized model-theoretically if the set of structures it licenses can be picked out by a very general class of logical constraints. Formal language theory then serves as a tool for determining the abstract properties of the sets that are definable in this way. Any theory that can be formalized in this way will exhibit these abstract properties, not as some sort of a priori assumption but simply as a consequence of its definability.

We will lay out some of the foundations of this approach and will survey a range of results characterizing logical languages of varying expressiveness in terms of formal systems with correspondingly varying generative capacity.
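
As a small illustration of the correspondence between logically definable string sets and finite-state automata, the sketch below takes one constraint, "every a is immediately followed by a b", and states it twice: once as a quantified condition over string positions and once as a hand-built three-state automaton. The constraint, the alphabet, and all names in the code are assumptions chosen for this example and are not taken from the talk. The constraint is first-order, and so falls within MSO. Checking that the two definitions agree on all short strings shows a single instance of the equivalence and is not a proof of it.

```python
from itertools import product

ALPHABET = "ab"  # assumed two-letter alphabet for the example


def satisfies_constraint(w: str) -> bool:
    """Logical view: for every position x labelled 'a',
    position x + 1 exists and is labelled 'b'."""
    return all(
        w[x] != "a" or (x + 1 < len(w) and w[x + 1] == "b")
        for x in range(len(w))
    )


# Automaton view (hypothetical, hand-built for this constraint):
#   state 0: no pending obligation (accepting)
#   state 1: just read 'a', the next symbol must be 'b'
#   state 2: constraint already violated (dead state)
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 2, (1, "b"): 0,
    (2, "a"): 2, (2, "b"): 2,
}
START = 0
ACCEPTING = {0}


def dfa_accepts(w: str) -> bool:
    """Run the deterministic automaton over w, one symbol at a time."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING


def agree_up_to(n: int) -> bool:
    """Compare the two definitions on every string of length <= n."""
    return all(
        satisfies_constraint("".join(t)) == dfa_accepts("".join(t))
        for k in range(n + 1)
        for t in product(ALPHABET, repeat=k)
    )
```

Calling `agree_up_to(8)` compares the logical and automaton definitions on all 511 strings over the alphabet of length at most 8. Adding a further constraint on the logical side corresponds to intersecting with another automaton, which mirrors the successive refinement described in the abstract.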
2122 Campbell Hall