Mathematical Structures in Language (LING 218)
Topic: "Counting Techniques and Statistics"
Winter 2005
Ed Keenan and Marcus Kracht
 Time: MW 11:00 S 1:00
 Place:
 Prerequisites: Ling 180/208 or equivalent
 Instructors: Ed Keenan & Marcus Kracht
 Work required: weekly homework problems
Short Description of the Course
This course shall cover two areas: combinatorics and statistics. The
content will be posted soon. There will be a syllabus available in
due course (see below for a download).

Weeks 1  3:
Basic counting techniques:

cardinalities of disjoint unions (addition),

cross products multiplication),

function spaces (exponentiation),

permutations (with fixed points),

partitions (equivalence relations)

and the "n choose k" notation.
Practice defining bijections since a major way of counting a set is
to show that it is bijectively related to a set we already know how
to count. We end with some recurrence relations and an introduction
to elementary probability. Linguistic applications: measuring the
strength of constraints. For example extensional adjectives are
restricting: Adj+N is a subset of N. How many maps from sets to sets
are restricting? How many Det denotations satisfy the conservativity
condition? How many are intersective?
 Weeks 4  6:
 Probability spaces: definitions and basic laws;
 Bayesian statistics;
 Basic probability distributions
 Random variables; expectation, variance.
This will enable us to first of all set up the tools for
statistical investigation in the first place: the probability spaces,
and how they are related to a particular problem at hand. Once this
is set up we can ask: what is the likelihood of a random letter (or
a random sound) being equal to "e" or "r"? What is the likelihood that
"e" follows "p"? And so on. Since the probability of e is obtained
by dividing the number of overall cases divided by number of cases
in which e takes place, the counting techniques developed in Weeks
1  3 will be put to immediate use.
 Week 7:
Parameter estimation (e.g. maximum likelihood). Once we have decided
on particular type of model there are various parameters to fix.
For example, a Markov model for the letter sequences (or other
applications) require fixing the transition probabilities. But
how do we get them in the first place? This is a very important
issue, as there are theoretically reliable methods which often are
impractical, so compromises have to be found (whose validity is
once again dependent on statistical facts).
 Week 8:
Hypothesis testing (ttest; chisquare test). Also, once everything is
in place and we have two competing hypotheses A and B concerning
our data, which one fits better? Again, various methods of testing
have been proposed, all with their own merits and drawbacks.
 Weeks 9 & 10:
Markov and hidden Markov models; the ergodic theorem (behavior in the
limit). Markov models play an important role eg in speech recognition
and tagging. They rely on predicting the future from the present,
without any memory of the past. The ergodic theorem states that in
the limit in almost all cases whatever the initial probability
distribution is, in the limit the probability distribution is that
of the single ergodic state. This allows to extract from the model
the probability of occurrence of individual states.
Course Material