Abstract: Modeling Productivity with the Gradual Learning Algorithm: The Problem of Accidentally Exceptionless Generalizations

Adam Albright
Massachusetts Institute of Technology

Bruce Hayes
University of California, Los Angeles

To appear in Gradience in Grammar: Generative Perspectives, edited by Gisbert Fanselow, Caroline Féry, Matthias Schlesewsky, and Ralf Vogel. Oxford: Oxford University Press.

Stochastic Optimality Theory and its affiliated Gradual Learning Algorithm (GLA; Boersma 1997, Boersma and Hayes 2001) have been used to model free variation, match corpus frequencies, and analyze gradient well-formedness. Here we report a novel application: distinguishing linguistically significant generalizations from accidentally true ones. The latter arise as a by-product of a separate algorithm we have devised, which discovers long-distance rule environments inductively from a corpus of input data. In developing this algorithm, we encountered a problem: along with the correct environments, it discovered complex "junk" environments, which hold true only because they happen to match the forms in the training set. These junk environments produced grammars that failed to derive correct novel forms.
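For concreteness, the following is a minimal sketch (ours, not the paper's implementation) of output selection in Stochastic OT: each constraint carries a real-valued ranking value, Gaussian evaluation noise is added each time the grammar is used, and the resulting strict ranking is applied in the ordinary OT fashion. The data structures and the noise standard deviation of 2.0 (the convention in Boersma and Hayes 2001) are illustrative assumptions.

    import random

    def evaluate(candidates, ranking_values, noise_sd=2.0):
        """Pick a winning candidate under Stochastic OT.

        candidates:     dict mapping each candidate form to a dict of
                        {constraint: violation count}
        ranking_values: dict mapping each constraint to its ranking value
        """
        # Perturb each ranking value with Gaussian evaluation noise,
        # yielding this evaluation's "selection point" per constraint.
        points = {c: v + random.gauss(0.0, noise_sd)
                  for c, v in ranking_values.items()}
        # Sort constraints from highest to lowest selection point to
        # obtain a strict ranking for this evaluation.
        order = sorted(points, key=points.get, reverse=True)
        # Ordinary OT evaluation: the winner has the lexicographically
        # smallest violation profile under the sampled ranking.
        return min(candidates,
                   key=lambda cand: [candidates[cand].get(c, 0) for c in order])

Because constraints whose ranking values lie far apart almost never swap order under the noise, a constraint ranked low enough is effectively never active; this is the property exploited by the solution below.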

To solve the problem, we propose a quantitative criterion for the generality of phonological environments, which is used to assign initial ranking values to the constraints. Starting from these values, the GLA constructs a ranking in which only the valid constraints are ranked high; the "junk" constraints are ranked low enough that they are never active and are thus rendered vacuous.
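The abstract leaves the generality criterion itself unspecified; purely for illustration, suppose it yields a numeric score generality[c] for each constraint c. The hypothetical initial_ranking_values function below scales starting values by that score, and gla_update outlines the standard error-driven GLA step of Boersma and Hayes (2001); the function names and the base, scale, and plasticity parameters are our assumptions.

    def initial_ranking_values(constraints, generality, base=100.0, scale=10.0):
        """Hypothetical initialization: start each constraint at a ranking
        value scaled by its generality score, so constraints with complex,
        accidentally true environments begin low on the scale."""
        return {c: base + scale * generality[c] for c in constraints}

    def gla_update(ranking_values, datum_viols, learner_viols, plasticity=0.1):
        """One error-driven GLA step, in outline: when the grammar's output
        mismatches a learning datum, demote constraints that penalize the
        datum more and promote constraints that penalize the erroneous
        output more."""
        for c in ranking_values:
            d = datum_viols.get(c, 0)    # violations of the observed form
            e = learner_viols.get(c, 0)  # violations of the learner's output
            if d > e:
                ranking_values[c] -= plasticity  # militates against the datum
            elif e > d:
                ranking_values[c] += plasticity  # favors the datum

The intended effect is the one described above: junk constraints begin low on the scale and, absent learning errors that promote them, remain too low to affect output selection.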

In support of our proposals, we show that they can be used to learn a version of Navajo Sibilant Harmony that is obligatory in local environments and optional in distal ones.

We conclude by mentioning another case of accidentally exceptionless constraints, those governing small-scale generalizations, and by speculating on how such constraints can best be ranked in Stochastic OT.


Last modified June 15, 2004