K-ToBI (Korean ToBI) Labelling Conventions
(version 3.1, November 2000)
Sun-Ah Jun
Dept. of Linguistics, UCLA
1. Background
K-ToBI (Korean TOnes and Break Indices) is a prosodic transcription convention for standard (Seoul) Korean. It is based on the design principles of the original English ToBI (see Silverman et al., 1992; Beckman & Hirschberg, 1994; Pitrelli et al., 1994) and of the Japanese ToBI system (J_ToBI), devised by Jennifer Venditti (see Venditti, 1995; Campbell & Venditti, 1995). Like the other ToBI systems, therefore, K-ToBI assumes an intonational phonology closely tied to a hierarchical model of prosodic constituents, as proposed by Pierrehumbert and her colleagues (e.g., Pierrehumbert 1980, Beckman & Pierrehumbert 1986, Pierrehumbert & Beckman 1988). The intonational analysis and attendant prosodic model of Seoul Korean adopted for K-ToBI are based on Jun (1990, 1993, 1996, 1998; see also Lee 1989 and de Jong 1989 for earlier studies). The first version of K-ToBI was developed at ATR Interpreting Telecommunication Systems in Japan in late 1994 by Mary Beckman and Sun-Ah Jun, as part of a Korean synthesis development project. The second version (Beckman & Jun 1996) was an update made in November 1996 by the same authors, following the discussion of the Japanese/Korean working group at the Prosody Transcription Workshop held just before ICPhS (International Congress of Phonetic Sciences) in Stockholm, August 1995. The current version was revised from the second version by Sun-Ah Jun after the Korean ToBI Workshop in Korea, August 1998. This version was presented at the workshop “Intonation: Models and ToBI Labelling”, a satellite meeting of ICPhS in San Francisco in August 1999. Before introducing the revised K-ToBI labelling conventions, a brief description of the intonational structure of Seoul Korean proposed in Jun (1993, 1998) is in order.
1.1 Intonational structure of Seoul Korean
The intonational structure of the standard (Seoul) dialect of Korean has two intonationally defined prosodic units: the Intonation Phrase (IP) and the Accentual Phrase (AP). An AP is smaller than an IP and larger than a phonological word, which is a lexical item plus a case marker or postposition. An IP is marked by a boundary tone (%) and final lengthening. An AP is marked by a phrasal tone, THLH (T=H if the AP initial segment is aspirated or tense, T=L otherwise), but not by final lengthening. The intonational structure of Seoul Korean is schematically represented in Figure 1.
Figure 1. Intonational Structure of Seoul Korean
IP: Intonation Phrase, AP: Accentual Phrase
w: phonological word, s: syllable
T = H when the syllable initial segment is aspirated/tense; otherwise T = L
%: Intonation phrase boundary tone
2. Structure of K-ToBI
The original ToBI system (i.e., English ToBI) has four parallel tiers (word, tone, break-index, and miscellaneous), but allows the free proliferation of site-specific extra tiers. Sites with an aligner for English, for example, have generally added a phones tier for phonetic segmentation, and J_ToBI users have agreed to add an obligatory “finality” tier where intonational phrases that sound “final” with respect to discourse turn taking are minimally marked as such (until they can develop a more complete model of discourse finality to govern a hierarchy of labels for this tier). In accordance with this general design principle, the current version of K-ToBI expands the tone tier into two tiers, a phonological tone tier and a phonetic tone tier, in order to describe surface tonal patterns which are not predictable from the underlying tones. Therefore, a K-ToBI transcription for an utterance consists minimally of a recording of the speech, an associated record of the fundamental frequency contour, and the transcription-proper symbolic labels for events on the following five parallel tiers: a word tier, a phonological tone tier, a phonetic tone tier, a break-index tier, and a miscellaneous tier.
The expansion of the tone tier was devised to label the surface tonal pattern of an accentual phrase (= AP) separately from the underlying tones marking the AP boundary. There are four reasons for this. First, the ToBI labeling system assumes that tones are labeled only when they are distinctive (Beckman & Ayers 1994, http://ling.ohio-state.edu/~tobi/). Non-distinctive pitch events that are automatically extractable from the signal should not be labeled. This is true for English ToBI. However, in Korean, distinctive pitch events come not from an individual phrasal tone but from the set of tones forming an AP. Furthermore, though the most common tone pattern of an AP is LHLH or HHLH when the AP is longer than three syllables, an AP in Seoul Korean can be realized in at least fourteen different tonal patterns, with more variation when the AP has fewer than three syllables (i.e., LH, LHH, LLH, LHLH, HH, HLH, HHLH, LL, HL, LHL, HHL, HLL, LHLL, HHLL). Though these various patterns do not seem to differ in meaning among themselves, and though they do not seem to be predictable, it is not yet known whether all these variations are indeed neither distinctive nor predictable. By labelling surface tonal patterns, we will be able to investigate whether there is any meaning difference among these patterns.
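As a small illustration of the pattern inventory listed above, the following sketch stores the fourteen surface patterns and splits them by their final tone. The names `AP_PATTERNS` and `group_by_final_tone` are hypothetical and chosen only for this example; the grouping is a bookkeeping aid, not part of the labelling conventions.

```python
# The fourteen surface AP tonal patterns listed in the text, one letter per tone.
AP_PATTERNS = [
    "LH", "LHH", "LLH", "LHLH", "HH", "HLH", "HHLH",
    "LL", "HL", "LHL", "HHL", "HLL", "LHLL", "HHLL",
]


def group_by_final_tone(patterns):
    """Split patterns into those ending in H and those ending in L."""
    groups = {"H": [], "L": []}
    for pattern in patterns:
        groups[pattern[-1]].append(pattern)
    return groups


groups = group_by_final_tone(AP_PATTERNS)
print(len(AP_PATTERNS), len(groups["H"]), len(groups["L"]))  # prints: 14 7 7
```

Seven of the listed patterns end in a high tone and seven end in a low tone.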
Second, the earlier version of K-ToBI labelled only two types of tones for an AP: ‘H-’ marking an AP initial H tone, when realized, and ‘LHa’ marking the end of an AP. When there was no initial H in an AP, H- was not labelled, conforming to the surface realization. However, in the rare event that an AP-like phrase ended in an L tone, that tone was labelled ‘L%’ instead of ‘La’, since a phrase final L tone was found at IP final position most of the time and we did not want to increase the tonal inventory of the AP without enough evidence. Then, in order to indicate that the AP-like boundary juncture did not match the tone pattern, a break index ‘2m’ was placed on the break index tier: the degree of juncture is the same as that of the usual AP boundary, i.e., ‘2’, but the tonal mark, L%, shows the boundary of an Intonation Phrase. Sometimes this was indeed the case. However, observation of more natural data revealed that some AP boundaries are realized with an L tone due to the tonal interaction of adjacent tones and to stylistic variation. At the moment, the detailed conditions for an AP final L tone and its pragmatic meaning are not known. We hope to get answers to these issues by labelling a falling AP boundary as ‘La’ on the phonetic tone tier.
By allowing ‘La’ to mark an AP boundary, this revised version now has a different definition of the break index ‘2m’. Before, it was used for a mismatch between tone and break index covering two cases: “2-like break but not AP-like tone” and “AP-like tone but not 2-like break”. In the current version, the break index ‘2m’ refers only to the former: “2-like break but not AP-like tone”. “AP-like tone but not 2-like break” will be labelled in one of two ways depending on the degree of perceived juncture: either 1m (1-like break with AP-like tone) or 3m (3-like break with AP-like tone).
Third, the AP initial tone in Seoul Korean is in general either L or H depending on the initial segment of the AP: H when the segment is aspirated or tense, and L otherwise. Regardless of this tonal difference on the first syllable of an AP, the second syllable of an AP is H when the AP has more than three syllables. As a result, an AP can have H on the first syllable, on the second syllable, or on both. In the earlier version of K-ToBI, we labeled ‘H-’ at the first occurrence of a high pitched syllable, either the first or second syllable or, rarely, the third syllable, without considering the origin of the H tone or the alignment of the peak to syllables. However, quantitative data show that the phonetic realization of these H tones differs depending on their origins and locations. F0 is significantly higher for the H tone on the first syllable of an AP (i.e., HHLH) than for the H tone after the AP initial L tone (i.e., LHLH). In addition, this extra-high f0 value at the beginning of the HHLH pattern raises the f0 values of the following syllables, if there are any, compared to those in the LHLH pattern, up to the penultimate syllable of an AP (see Lee (1999) for more detail). Assuming that the initial L in LHLH or the second H in HHLH is predictable, we did not label these tones in the earlier version. But it turns out that these are not always predictable, and furthermore, as mentioned earlier, the individual tones forming an AP do not seem to be meaningful in themselves. That is, the surface tonal variations that deviate from the underlying tonal sequence do not seem to differ in meaning. What is meaningful in Korean intonational phonology is the phrasing, marked by the boundary tone of an AP and an IP. For example, wh-questions and yes/no-questions are distinguished only by intonational phrasing (Jun & Oh 1996), and ambiguous sentences are disambiguated by phrasing differences (Schafer & Jun, submitted). Therefore, in this revised version, we will label the AP and IP boundaries on a phonological tone tier, and the individual AP tones on a phonetic tone tier aligned with the corresponding surface f0 event. Labelling surface tonal events on a phonetic tone tier will provide data by which we can determine what the pragmatic meaning of these tones is, if there is any, and will yield information about the timing and magnitude of the f0 realization of these tones. This will provide valuable information to researchers working on speech synthesis and recognition.
Fourth, by separating the tone tier into phonological and phonetic tone tiers, we can easily accommodate tonal transcriptions of other dialects. For example, unlike Seoul Korean, the tonal pattern of an AP in the Chonnam dialect (the Southwestern dialect of Korean) is LHL or HHL (Jun 1989, 1993, 1996, 1998), with the alternation of the AP initial tone governed by the same principles as in Seoul. Though the tonal patterns differ between the two dialects, the accentual phrasing is the same. Thus, the boundaries marked on the phonological tone tier for Seoul Korean will remain the same for the Chonnam dialect, while the phonetic tone tiers of the two dialects will differ, conforming to the surface realization of each dialect. I assume this will be true for other dialects of Korean which do not have a lexical pitch accent.
In the following sections, each of the five tiers is defined, and the labels and symbols proper to each tier are introduced. In addition, example sentences illustrate in a text format how to label information on each tier, and pitch tracks of all sentences are shown in Appendix B.
3. Tiers
3.1 The word tier
The word tier in K-ToBI corresponds to the “orthographic tier” in English ToBI. In this tier, words may be labeled using either Hangul orthography or some conventional romanization, depending on what is more convenient for the users’ labeling platform or on what is most appropriate for exporting to relevant applications. In the current K-ToBI, words are transcribed following the Romanization convention originally used at KAIST, Korea, and adopted by ATR, Japan. A table showing the mapping among Korean letters, IPA symbols, and Roman letters is given in Appendix A.
What constitutes a “word” in Korean is controversial, and we anticipate that different sites may find that the intended applications pose specific needs as to how finely an utterance should be broken up into words. For example, the intended applications at one site might require that a word label be placed for each morpheme string that has its own separate entry in some on-line dictionary. Another site may want to label a word as often as there are spaces in a standard Hangul transcript of the text. In this version, we consider a ‘word’ to be a sequence of segments delimited by spaces in a written Hangul text. That is, a word will be labelled at the end of each Hangul item separated by a space.
If the labeling platform is xwaves and xlabel (or any similar labeling platform, such as PitchWorks, that works in terms of time flags), the word label should be placed at the end of the final segment in the word, as determined by the labeler from the waveform or spectrogram record. That is, each word should be marked at its right edge. Filled pauses and the like should also be labeled using some site-specific convention for the Hangul or romanized spelling.
3.2 A phonological tone tier
A phonological tone tier will be used to mark the boundary tone of an Intonation Phrase (IP) and the boundary tone of an IP-medial Accentual Phrase (AP). Since an AP boundary tone in an IP-final position is overridden by an IP-final boundary tone, only the IP final boundary tone (%) will be labeled at the end of an IP.
To mark the end of an AP, we will use ‘LHa’ as shorthand for LHLHa or HHLHa. This implies that the most common AP final tone in Seoul Korean is a rising tone (LH). To mark the end of an IP, we will use one of nine different boundary tones, i.e. H%, L%, HL%, LH%, HLH%, LHL%, HLHL%, LHLH%, LHLHL%. Instructions on where to put phonological tone labels are given below. To simplify the description of IP boundary tones, ‘T’ is used below as a variable over the IP boundary tones. The meaning of each boundary tone and sentence examples labelled with phonological tones are given in the next section.
| LHa | marks the end of an IP-medial AP, aligned with the end of AP final segment determined from the waveform. The LHa tone should be placed at or just before the corresponding break index marker regardless of the actual location of the peak. |
| T% | marks the end of an IP, aligned with the end of IP final segment determined from the waveform. ‘T’ can be H, L, HL, LH, HLH, LHL, HLHL, LHLH or LHLHL. A T% tone at a phonological tone tier should be placed at or just before the corresponding break index marker regardless of the actual location of the peak. When a word is final to an AP and final to an IP, only the IP boundary tone is written at the end of the word. |
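The label inventory of the phonological tone tier is small enough to check mechanically. The sketch below, under the assumption that labels are stored as plain strings, classifies a label as an AP boundary (‘LHa’) or an IP boundary (one of the nine T% tones). The function name `classify_phonological_label` is hypothetical.

```python
# The nine values of 'T' in T%, as listed in the text.
IP_BOUNDARY_TONES = {"H", "L", "HL", "LH", "HLH", "LHL", "HLHL", "LHLH", "LHLHL"}


def classify_phonological_label(label):
    """Return 'AP' for LHa, 'IP' for a valid T% label, and None otherwise."""
    if label == "LHa":
        return "AP"
    if label.endswith("%") and label[:-1] in IP_BOUNDARY_TONES:
        return "IP"
    return None


for label in ["LHa", "HLH%", "HH%"]:
    print(label, classify_phonological_label(label))
```

A check of this kind could be run over a finished transcription to catch typing errors on this tier; it does not cover the underspecified or uncertain labels used on the phonetic tone tier.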
3.3 A phonetic tone tier
A phonetic tone tier will be used to mark the surface realization of AP tones and IP tones. As for AP tones, we will have three initial tones (i.e. L, H, and +H) and three final tones (i.e. La, Ha, and L+). Among the initial tones, L and H are for the tone on the first syllable of an AP, and +H is for the tone on the second syllable (and sometimes the third when the AP is long and focused) of an AP. Among the final tones, La and Ha are for the tone on the final syllable of an AP, and L+ is for the tone on the penult of an AP. Therefore, the ‘+’ sign in Korean ToBI refers to a syllable boundary and implies a grouping of tones; +H is part of the AP initial tone realized on the second syllable of an AP, and L+ is part of the AP final tone realized on the penult of an AP. This is different from the ‘+’ in English bitonal pitch accents such as L+H* or L*+H, where the starred tone is associated with a stressed syllable and the unstarred tone is realized either before the starred tone (i.e., a leading L tone in L+H*) or after it (i.e., a trailing H tone in L*+H).
When an AP has three syllables, the tone on the second syllable can be either L (ex. LLH) or H (ex. LHH). In this case, we will consider the medial L as a part of the final AP tone and the medial H as a part of the initial AP tone, because we believe that both are derived from the underlying LHLH pattern. That is, LLH is parsed as L-LH with the undershoot of the first H of LHLH, and LHH is parsed as LH-H with the undershoot of the second L of LHLH. Therefore, LLH will be labelled as L, L+, and Ha, and LHH will be labelled as L, +H, and Ha, on each of the three syllables. The realizations and locations of the three AP final tones and three AP initial tones are described below.
AP final tones:
| Ha | : This is the most common AP final tone of an IP-medial AP. It can be either the end of a rising tone or a high flat tone. This label is placed aligned with an actual f0 peak on the AP final syllable. |
| La | : This is a less common AP final tone, sometimes seen when the following AP begins with a H tone. This label is placed aligned with an actual f0 valley on the AP final syllable. |
| L+ | : This tone is not for the final syllable of an AP, but to label the low toned penultimate syllable of an AP, either before the AP final H tone or before the IP final H boundary tone. Do not label this tone if it is predictable from adjacent tone labels. For example, when an AP is continuously falling from an initial H to final La, L+ should not be labelled. Also when an AP initial is L and final is La, L+ should not be labelled. When not predictable, this label is placed aligned with an actual f0 valley on the penult of an AP. When there is no valley but only a low plateau, place this label at the beginning of the low plateau when preceded by an initial H, or at the end of the plateau when followed by a final H. |
AP initial tones:
| L | : This tone marks an L tone on the first syllable of an AP. This label should be placed aligned with the f0 valley on the first syllable of an AP. |
| H | : This tone marks a H tone on the first syllable of an AP. This label should be placed aligned with the f0 peak on the first syllable of an AP (but avoid the first pitch point at the beginning of a vowel which is most likely due to the segmental perturbation). |
| +H | : This tone marks the H tone on the second syllable (or sometimes the third syllable when the AP is long or uttered fast or produced under focus) of an AP. This label should be placed aligned with the f0 peak on the second syllable. When the peak continues over the following syllable, place this label aligned with the latest f0 peak of the phrase initial peak. |
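The three-syllable rule described earlier in this section (LLH labelled as L, L+, Ha; LHH labelled as L, +H, Ha) can be sketched as a small function. This is a minimal sketch under stated assumptions: the function name `label_three_syllable_ap` is hypothetical, the input is one tone letter per syllable, and only patterns ending in H are handled, because the rule that L+ is omitted when predictable makes L-final patterns depend on a labeler's judgment.

```python
def label_three_syllable_ap(pattern):
    """Map a three-syllable surface pattern ending in H to phonetic tone labels.

    A medial H is grouped with the initial tone (+H); a medial L is grouped
    with the final tone (L+). Patterns ending in L are rejected because the
    text's predictability rule for L+ is not modelled here.
    """
    if len(pattern) != 3 or set(pattern) - {"L", "H"} or pattern[-1] != "H":
        raise ValueError("expected three tones drawn from L/H and ending in H")
    initial, medial, _final = pattern
    medial_label = "+H" if medial == "H" else "L+"
    return [initial, medial_label, "Ha"]


print(label_three_syllable_ap("LLH"))  # ['L', 'L+', 'Ha']
print(label_three_syllable_ap("LHH"))  # ['L', '+H', 'Ha']
```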
Schematic f0 contours of fourteen types of AP realizations and corresponding phonetic tone labels are shown in Figure 2. The first row shows AP patterns with a high boundary, Ha, and the second row shows AP patterns with a low boundary, La. The third row shows contours of a long AP where all four underlying tones are realized with either a Ha or La boundary. ‘T’ in the last contour is either H or L.

Figure 2. Schematic f0 contours of fourteen tonal patterns of AP.
For the IP boundary tones, the whole tone is placed toward the end of the IP final syllable, aligned with the f0 maximum for H ending boundary tones (i.e., H%/LH%/HLH%/LHLH%) and the f0 minimum for L ending tones (i.e., L%/HL%/LHL%/HLHL%/LHLHL%). For complex boundary tones which include an H before the last tone (e.g., HL%, HLH%, LHLH%, LHLHL%), the label ‘>’ should be placed at the f0 peak corresponding to each non-final H tone. Here, ‘>’ can mean an ‘early peak’ as in English ToBI (i.e. some examples of HL%; see next paragraph), but most of the time it simply indicates the location of H so that it provides information about pitch range. At the moment, it is not clear whether complex boundary tones with more than 3 tones (i.e., LHLH%, HLHL%, LHLHL%) have a distinct meaning of their own, beyond intensifying the meaning of the less complex tones with 2 or 3 tones (e.g., HLHL% intensifies the meaning of HL%). More K-ToBI labelled data would be needed to clarify this issue. Until then, we will label all boundary tones on the phonetic tone tier.
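The placement rule for IP boundary tones (H-ending tones at the f0 maximum, L-ending tones at the f0 minimum, one ‘>’ per non-final H) can be expressed compactly. The following is a sketch only; the function name `boundary_tone_placement` and the returned strings are hypothetical conventions for this example.

```python
def boundary_tone_placement(label):
    """Return (f0 anchor, number of '>' marks) for a label such as 'HLH%'."""
    if not label.endswith("%"):
        raise ValueError(f"not an IP boundary tone label: {label!r}")
    tones = label[:-1]
    if not tones or set(tones) - {"H", "L"}:
        raise ValueError(f"not an IP boundary tone label: {label!r}")
    # The label itself goes at the f0 extreme matching the last tone.
    anchor = "f0 maximum" if tones[-1] == "H" else "f0 minimum"
    # Each H before the last tone gets its own '>' at the corresponding peak.
    early_peaks = tones[:-1].count("H")
    return anchor, early_peaks


for label in ["H%", "HL%", "HLH%", "LHLHL%"]:
    print(label, boundary_tone_placement(label))
```

For example, HL% yields one ‘>’ and alignment with the f0 minimum, while LHLHL% yields two ‘>’ marks.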
Currently, the type of an IP boundary tone is determined by the f0 shapes realized on the IP final syllable. Though this is true most of the time, we found in news broadcasting that the H tone of HL% is sometimes realized on the penultimate syllable of an IP, possibly to keep the same rhythm across phrases. This style is also found in movies or dramas set in the period of Old or Middle Korean, especially in the dialogues of high class people. In addition, Park (2000) found examples where the H of HL% is realized earlier than the penult of an IP. This happened when an object was postposed after a verb whose boundary tone in the original sentence was HL%. This is one of the three possible ways of realizing an ‘afterthought’ in Korean: 1) both the verb final syllable and the postposed object final syllable carry the HL% tone, 2) the verb and the object form one IP, and the object final syllable carries the HL% tone, and 3) the verb and the object form one IP, but the HL% tone is split so that the H tone is realized on the verb final syllable and the L tone is realized on the object final syllable. The third possibility is the case in which part of a boundary tone is realized before the IP final syllable. In this case, the label ‘>’ should be placed at the f0 peak of the verb final syllable. So far, this type of split boundary tone has been found only for HL%, and more data are needed to see if it is possible for other boundary tones.
The following shows the surface realization rules of each boundary tone, and its location relative to words and f0 contours.
IP final boundary tones:
| L% | : A level ending, or a gently falling boundary tone spread over much of the IP-final AP from the f0 peak at the beginning of the AP. This tone should be placed at the end of the phrase aligned with the minimum f0 value. This tone is the most common in stating facts, and in declaratives in reading. |
| H% | : A rising boundary tone that begins to rise before the IP final syllable, and reaches its peak during the final syllable. Therefore, the rise is earlier than that in LH%. This tone should be placed at the end of the phrase aligned with the maximum f0 value. This tone is the most common in seeking information as in yes/no-questions. |
| LH% | : A rising boundary tone that is more localized than H%, rising sharply from a valley well within the final syllable. That is, by comparison to H%, this is a sharper, later rise, starting after the onset of the final syllable. This tone should be placed at the end of the phrase aligned with the maximum f0 value. This is commonly used for questions, continuation rises, and explanatory endings. It is also used to signal annoyance, displeasure, or disbelief (e.g., <gIrEtaniKa gIrEne!> ‘I have already told you so. (Why do you keep asking me?)’ or <bEryESE!> ‘(Did you) throw it out? (I can’t believe that!)’). |
| HL% | : A falling boundary tone that rises to a peak before the last syllable, and then falls during the last syllable. Though it seems to be a combination of H% and L%, the H part of this boundary tone is not as high as a simple H% and the L is not as low as a simple L%. This tone should be placed at the end of the phrase aligned with the minimum f0 value, and the location of H is marked by ‘>’ aligned with the f0 peak. This tone is most common in declaratives and wh-questions. It is also commonly used in news broadcasting. |
| LHL% | : A rising-falling boundary tone that, unlike HL%, rises within the IP final syllable -- essentially a combination tone consisting of LH% followed by L%, but the f0 peak is not as high as that of LH%. This tone should be placed at the end of the phrase aligned with the minimum f0 value, and the location of H is marked by ‘>’ above the f0 peak. It sometimes intensifies the meaning of HL%, but like LH%, it also delivers the meanings of ‘being persuasive, insisting, and confirmative’. It is also used to show annoyance or irritation. (e.g., <hazima>! ‘Don’t do it (I told you before)’) |
| HLH% | : A falling-rising boundary tone -- a combination of HL% and H%. That is, the timing of the rise is the same as HL% but followed by a shallow dip and then another rise. This tone should be placed at the end of the phrase aligned with the maximum f0 value. The location of the first H is marked by ‘>’ above the f0 peak. The tone is not as common as the other types mentioned so far, and some speakers use this type more often than others. This tone is used when a speaker is confident and expecting listeners’ agreement. |
| LHLH% | : A rising-falling-rising boundary tone. The timing of the rise is like that of LH%. This tone should be placed at the end of the phrase aligned with the maximum f0 value. The location of the first H is marked by ‘>’ above the f0 peak. This tone is less common than others, and it intensifies some of the meanings of LH%, i.e., ‘annoyance, irritation or disbelief’. |
| HLHL% | : A falling-rising-falling boundary tone. The timing of the rise is like that of HL%. This tone should be placed at the end of the phrase aligned with the minimum f0 value. The locations of the two Hs are marked by ‘>’ above the f0 peaks. This tone is more common than LHLH%, but not as common as single, bi- or tritonal boundary tones. It sometimes intensifies the meaning of HL%, confirming and insisting on one’s opinion, and sometimes, like LHL%, it delivers nagging or persuading meanings. |
| LHLHL% | : A rising-falling-rising-falling boundary tone. The timing of the rise is like that of LH% followed by LHL%. This tone should be placed at the end of the phrase aligned with the minimum f0 value. The locations of the two Hs are marked by ‘>’ above the f0 peaks. This tone is rare and its meaning is similar to that of LHL%, but has a more intense meaning of being annoyed. |
Schematic f0 contours of eight types of IP boundary tone realizations are shown in Figure 3. The first row shows IP boundary tones ending in L% and the second row shows those ending in H%. The vertical line shown in each contour marks the beginning of the IP final syllable. The f0 scale is not normalized.
Figure 3. Schematic f0 contours of eight boundary tones of IP.
Finally, for uncertain or underspecified tonal events, for both AP and IP, use the following labels on the phonetic tone tier. Underspecified tone labels are used when a labeler knows there is a tone but has not yet assigned a label.
| X | : Underspecified tonal event of non-AP-final boundary tone. (A tone is there, but its tonal value has yet to be assigned) |
| a | : Underspecified AP-final tone |
| % | : Underspecified IP-final tone |
| X? | : Uncertain of the type of a tone which is neither an AP-final nor an IP-final boundary tone. (The labeler is not sure of the tone type) |
| X?a | : Uncertain of the type of an AP-final boundary tone |
| X?% | : Uncertain of the type of an IP-final boundary tone |
Example sentences labelled with a phonological tone and a phonetic tone are shown below. File names are in “<< >>” and example sentences are given in the romanization of the Korean alphabet (see Appendix A). F0 tracks of each example with corresponding labels are shown in Appendix B. “-early”, “-middle”, or “-late” indicates a region of the sound file.
Examples of tone labelling on both the phonological tone tier and the phonetic tone tier:
| Ex.1. | << 4boundary-H% >> | gIrASEjo | ‘Is that so?’ |
| phonological tone tier | H% | ||
| phonetic tone tier | +H L+H% | ||
| Ex.2. | << 4boundary-LH% >> | gIrASEjo | ‘Is that so?’ |
| phonological tone tier | LH% | ||
| phonetic tone tier | +H LH% | ||
| Ex.3. | << 4boundary-HL% >> | gIrASEjo | ‘Is that so?’ |
| phonological tone tier | HL% | ||
| phonetic tone tier | L+H L+>HL% | ||
| Ex.4. | << 4boundary-LHL% >> | gIrASEjo | ‘Is that so?’ |
| phonological tone tier | LHL% | ||
| phonetic tone tier | L+H > LHL% |
| Ex.5. | << J3A2-HLH% >> | onIR | zEnyEge | nuga | mEgEyo |
| phonological tone tier | LHa | HLH% | |||
| phonetic tone tier | L | L+Ha | L+H | L+ >HLH% | |
| ‘Today | night | who | eat?’ | ||
| -> | ‘Who | is eating | tonight?’ |
| Ex.6. | << IPboundary-HL% >> | baraMgwa | hANnimi |
| phonological tone tier | LHa | HL% | |
| phonetic tone tier | L Ha | H L+ > HL% | |
| ‘The North Wind and | the Sun-NOM ’ | ||
| ‘The North Wind and | the Sun .....’ |
| Ex.7. | << IPboundary-LH% >> | dubENCA, |
| phonological tone tier | LH% | |
| phonetic tone tier | L +H LH% | |
| ‘Second,’ |
| Ex.8. | << 2syllAP-LHa >> | nanIN | yEQarIR | miwEhAyo |
| phonological tone tier | LHa | LHa | L% | |
| phonetic tone tier | L Ha | L L+Ha | L+H L+ L% | |
| ‘I-TOP | Younga-ACC | hate’ | ||
| -> ‘I hate | Younga’ |
| Ex.9. | << 5syllAP-LHLHa >> | yEQmaNinenIN | yEQarIR | miwEhAyo |
| phonological tone tier | LHa | LHa | L% | |
| phonetic tone tier | L +H L+Ha | L L+ Ha | L +H L% | |
| ‘Youngman’s | family-TOP | Younga-ACC hate’ | ||
| -> ‘Youngman’s | family | hates Younga’ |
| Ex.10. | << 6syllAP-LHLHa >> | yEQi | EmEninIN | yEQarIR | miwEhAyo |
| phonological tone tier | LHa | LHa | L% | ||
| phonetic tone tier | L+H | L+ Ha | L L+ Ha | L +H L+ L% | |
| ‘Youngi’s | mom-TOP | Younga-ACC | hate’ | ||
| ‘Youngi’s | mom hates | Younga’ |
| Ex.11. | << 5syllAP-HHLHa >> | hyEQmininenIN | yEQarIR | miwEhAyo |
| phonological tone tier | LHa | LHa | L% | |
| phonetic tone tier | H +H L+ Ha | L Ha | L +H L% | |
| ‘Hyungmin’s family- TOP | Younga-ACC | hate' | ||
| -> ‘Hyungmin’s family | hates Younga’ |
| Ex.12. | << t1p1s2 >>-early | doQgi | bujEU | du | hjEQtA | zuQesE ... |
| phonological tone tier | LHa | LHa | LHa | L% | ||
| phonetic tone tier | L Ha | L L+Ha | L Ha | H +H | L+ L% | |
| ‘motivation | providing- POSS | two | types | among ...’ | ||
| -> ‘Among | the two types | which | provide | motivation,’ |
| Ex. 13. | << t1p2s8-1m >> | sEQzaQhago | iNnIN | gEsi | saraiNnIN | gEsida |
| phonological tone tier | LHa | LHa | L% | |||
| phonetic tone tier | H Ha | L | L+Ha | H+H | L+ L% | |
| ‘to grow-prog. | rel.cl. | thing- NOM | to live-prog. | thing-be’ | ||
| -> ‘Being growing | means | that it is | alive’ |
| Ex. 14. | << gazEQgyosa >> | nanIN | siRryEGiNnIN | zibaNU | gazEQgyosarIR | maNnaDTa. |
| phonological tone tier | LHa | LHa | LHa | L% | ||
| phonetic tone tier | L Ha | H Ha | L+ Ha | L+H L+Ha | L L% | |
| ‘I-TOP | powerful | family’s | tutor-ACC . | met’ | ||
| -> ‘I | met the tutor | of | a powerful family’ |
3.4 The break index tier
Break indices represent the degree of juncture perceived between each pair of words and between the final word and the silence at the end of the utterance. They are to be marked after all words that have been transcribed in the word tier. All junctures -- including those after fragments and filled pauses -- must be assigned an explicit break index value; there is no default juncture type.
Break indices:
| 0 | : For cases of clear phonetic marks of “clitic” groups; e.g. application of vowel coalescence rules. Also for cases of ‘incomplete nouns’, monosyllabic nouns which are, though separated by spaces, not used by themselves but need a modifier (e.g. <su> ‘way’, <de> ‘place’, <gED> ‘thing’). |
| 1 | : For phrase-internal “word” boundaries which are not marked by such cliticization phenomena and where the word can be pronounced by itself. |
| 2 | : For cases of a minimal phrasal disjuncture, with no strong subjective sense of pause -- that is, a sense of phrase edge of the type that is typically associated with the tonal pattern at the right edge of the Accentual Phrase. |
| 3 | : For cases of a strong phrasal disjuncture, with a strong subjective sense of pause (whether it be an objective visible pause or only the “virtual pause” cued by final lengthening) -- that is, a sense of phrase break of the type that is typically associated with the tonal pattern at the right edge of an Intonation Phrase. |
Note that while the Accentual Phrase and Intonation Phrase are defined in the prosodic model by tonal markings, the break index value indicates the labeler’s subjective sense of disjuncture and not simply the juncture that typifies the apparent tones. Thus, the break index tier markings are not made completely redundant by the tone tier markings for break index levels 2 and 3. In cases of mismatch, the break index number should follow the perceived juncture rather than the tones, and it should be flagged with the diacritic “m”, as in:
| 1m | : A disjuncture that typically would correspond to a phrase medial word boundary, but is marked by the tonal pattern of an AP. |
| 2m | : A medium strength disjuncture that typically would be marked by the tonal pattern of the AP, but without any tonal markings, or with the tonal markings of an IP edge. |
| 3m | : A highest strength disjuncture that typically would be marked by the tonal pattern of the IP, but with the tonal markings of an AP. |
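The mismatch rule above can be sketched as a lookup from the perceived juncture and the tonal marking at the word's right edge to a break index label. This is an illustrative simplification: the function name `break_index_label` and the three values used for the tonal marking ('none', 'AP', 'IP') are assumptions made for this example, break index 0 is not covered, and combinations the text does not describe raise an error rather than guess.

```python
def break_index_label(perceived, tonal_marking):
    """Combine perceived juncture (1, 2, or 3) with the observed tonal marking.

    tonal_marking is 'none', 'AP', or 'IP' (hypothetical values for this sketch).
    The number always follows the perceived juncture; 'm' flags a mismatch.
    """
    expected = {1: "none", 2: "AP", 3: "IP"}
    if perceived not in expected or tonal_marking not in {"none", "AP", "IP"}:
        raise ValueError("unsupported juncture level or tonal marking")
    if tonal_marking == expected[perceived]:
        return str(perceived)
    # Mismatch cases defined in the text: 1m, 2m (two sub-cases), and 3m.
    described = {(1, "AP"), (2, "none"), (2, "IP"), (3, "AP")}
    if (perceived, tonal_marking) in described:
        return f"{perceived}m"
    raise ValueError("combination not described in the conventions")


print(break_index_label(2, "AP"))  # 2
print(break_index_label(1, "AP"))  # 1m
print(break_index_label(2, "IP"))  # 2m
print(break_index_label(3, "AP"))  # 3m
```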
In an xwaves/xlabel type system or any system which allows time-aligned labels, the break index label should be aligned with a point in time at the end of each word, as indicated in the word tier. It should be located exactly at, or slightly to the right of, this word marker, so that break indices can be unambiguously associated with other tiers. Transcriber uncertainty about break-index strength is to be indicated with a minus (“-”) diacritic affixed directly to the right of the higher break index -- e.g. “1-” to indicate uncertainty between “0” and “1”; “2-” to indicate uncertainty between “1” and “2”; and so on. Note that since the “m” diacritic suggests certainty about the break index analysis in the face of conflicting tonal evidence, the “-” diacritic should not be used together with “m”.
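The alignment rule makes it possible to associate break indices with words automatically. The following is a minimal sketch under assumptions that are not part of K-ToBI: each tier is a list of (time in seconds, label) pairs, the function name `attach_breaks` is hypothetical, and "slightly to the right" is approximated by an arbitrary tolerance of 20 ms.

```python
def attach_breaks(word_ends, breaks, tolerance=0.02):
    """Pair each word with the break index at, or just after, its end marker.

    word_ends: list of (time, word) taken from the word tier.
    breaks:    list of (time, label) taken from the break index tier.
    tolerance: assumed maximum lag (seconds) of a break label after a word end.
    """
    paired = []
    for word_time, word in word_ends:
        found = None
        for break_time, label in breaks:
            # Accept a label exactly at the word marker or slightly to its right.
            if 0.0 <= break_time - word_time <= tolerance:
                found = label
                break
        paired.append((word, found))
    return paired
```

A word that comes back paired with `None` has no break index close enough to its end marker, which usually points to a misplaced label.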
For uncertain or underspecified break indices, use the following labels on the break index tier.
| x | : Underspecified break index |
| #- | : Break uncertain between level # and level #-1 (e.g., 2-: uncertain between 2 and 1) |
| #p | : Pause or disfluency after this level of juncture; 1p for abrupt cutoffs after or in the middle of a word; 2p for prolongation of an AP-final syllable that is not meant to be IP-final. |
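The label inventory of the break index tier is small enough to check mechanically. The sketch below is one possible validator, not an official tool: the names `BreakIndex` and `parse_break_index` are hypothetical, and two choices are assumptions beyond the text (diacritics may appear in any order, and "0-" is rejected because there is no level below 0).

```python
import re
from typing import NamedTuple, Optional


class BreakIndex(NamedTuple):
    level: Optional[int]  # 0-3, or None for the underspecified label "x"
    mismatch: bool        # "m": perceived juncture and tones disagree
    uncertain: bool       # "-": unsure between this level and the one below
    pause: bool           # "p": pause or disfluency after the juncture


def parse_break_index(label):
    """Parse one break index label such as "2", "1m", "3-", "2p" or "x"."""
    label = label.strip()
    if label == "x":
        return BreakIndex(None, False, False, False)
    match = re.fullmatch(r"([0-3])([mp-]*)", label)
    if match is None:
        raise ValueError(f"not a K-ToBI break index label: {label!r}")
    level, marks = int(match.group(1)), match.group(2)
    if len(set(marks)) != len(marks):
        raise ValueError(f"repeated diacritic in {label!r}")
    if "m" in marks and "-" in marks:
        # "m" implies certainty about the juncture, so "-" cannot co-occur.
        raise ValueError('"-" should not be used together with "m"')
    if "-" in marks and level == 0:
        # Assumption: no level below 0 exists to be uncertain about.
        raise ValueError('"0-" has no lower level')
    return BreakIndex(level, "m" in marks, "-" in marks, "p" in marks)
```

Running such a check over a whole break index tier catches typing slips before the labels are compared across transcribers.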
Example sentences with break indices:
| Ex.12. | << t1p1s2 >>-early | doQgi | bujEU | du | hjEQtA | zuQesE ... |
| break index tier | 2 | 2 | 2- | 1 | 3- | |
| ‘motivation | providing- POSS | two | types | among ...’ | ||
| -> ‘Among | the two types | which | provide | motivation,’ |
| Ex.13. | << t1p2s8-1m >> | sEQzaQhago | iNnIN | gEsi | saraiNnIN | gEsida |
| break index tier | 1m | 0 | 2 | 1 | 3 | |
| ‘to grow-prog. | rel.cl. | thing-NOM | to live-prog | thing-be' | ||
| -> ‘Being growing | means | that it is alive’ |
| Ex.14. | << gazEQgyosa >> | nanIN | siRryEGiNnIN | zibaNU | gazEQgyosarIR | maNnaTTa. |
| break index tier | 2 | 1m | 2 | 2- | 3 | |
| ‘I-TOP | powerful | family’s | tutor- ACC | met’ | ||
| -> ‘I met | the tutor of a | powerful | family’ |
| Ex.15. | << t1p1s2 >>-late | iRbaNzEgiN | gEsIn | waNzEnhwa, | |
| break index tier | 1 | 3 | 3 | ||
| ‘general-rel | thing- TOP | completeness’ | |||
| ‘(Among the two types | which provide | motivation,) | what's in common | is completeness’ |
| Ex.16. | << break-L8C3 >> | azumEninga | ENze | maNdIrEjo? |
| break index tier | 2 | 1 | 3 | |
| ‘madam-NOM | when | make-Q’ | ||
| -> ‘When is | Madam | making (it)?’ |
| Ex.17. | << t1p2s6 >> | zIG, | saNhonIN | saraiSImjE | aMsEgIN | zugEiNnIn | gEsida |
| break index tier | 3 | 2 | 3 | 2- | 1 | 3 | |
| ‘That is, | coral-TOP | alive and | rock-TOP | dead-prog-rel. | thing-be’ | ||
| -> ‘That is, | coral is alive | and a rock is | dead’ |
| Ex.18. | << t1p2s10 >> | igEsIN | uridIR | maIMU | segyeedo | hAdaQdweNda. |
| break index tier | 3- | 2 | 2 | 2 | 3 | |
| ‘This our | our | mind's | world also | to apply to' | ||
| -> ‘This also | applies | to our | mind’ |
| Ex.19. | << t1p2s5 >>-early | gIrEna, | gatIN | hjENmigyEQe | sanho | zogagIR | noko | bomyEN |
| break index tier | 3- | 2 | 3 | 1 | 2- | 1 | 3 | |
| ‘but, | same | microscope-LOC | coral | piece-ACC | to put and | to see if’ | ||
| -> ‘But, | if you | see a piece of | coral | under the | same | microscope,...’ |
| Ex.20. | << t1p2s5 >>-late | saNhoga | sEQzaQhamyENsE | byENhwahago | iDTanIN | gEsIR | aR | Su | iDTa. |
| break index tier | 2 | 2 | 2 | 0 | 2 | 0 | 0 | 3 | |
| ‘coral-nom. | growing-while | changing | -prog.-rel | thing-ACC | to see | can' | |||
| ->‘We can | see that the coral is | changing | while | growing’ |
| Ex.21. | << coQgaG-HLH% >> | TaG | zikigo | iNniN | sarami | nuguNgohani | zERmIN | coQgaG | ANSoni | pakiNsINiMnida |
| break index tier | 3- | 1 | 2 | 2 | 2 | 2p | 2- | 3 | ||
| ‘firmly | guard | -PROG | man-NOM | who-is | young | bachelor | Anthony | Parkinson-be’ | ||
| -> | ‘The | man | who is | guarding | firmly is the | young | bachelor, | Anthony | Parkinson’ |
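To see what a break index tier implies for phrasing, the words of an utterance can be grouped at the perceived junctures: a level of 2 or 3 closes an AP-sized group, and a level of 3 also closes an IP-sized group. This is a sketch only. The function name `group_phrases` is hypothetical, diacritics are ignored, and the grouping reflects the labeler's perceived junctures, which may differ from the tonally defined phrases whenever an "m" label is present.

```python
def group_phrases(words, breaks):
    """Group words into IP-sized groups, each a list of AP-sized word groups.

    words:  romanized words in order.
    breaks: one break index label per word (the juncture after that word).
    """
    if len(words) != len(breaks):
        raise ValueError("need exactly one break index per word")
    ips, aps, current = [], [], []
    for word, label in zip(words, breaks):
        if not label or label[0] not in "0123":
            raise ValueError(f"cannot group on label {label!r}")
        level = int(label[0])  # diacritics such as "m", "-", "p" are ignored
        current.append(word)
        if level >= 2:         # perceived phrase edge: close the AP-sized group
            aps.append(current)
            current = []
        if level == 3:         # strong disjuncture: close the IP-sized group
            ips.append(aps)
            aps = []
    # Keep any trailing material that was not closed by a 2 or 3.
    if current:
        aps.append(current)
    if aps:
        ips.append(aps)
    return ips
```

With the words and break indices of Ex.16 (2, 1, 3), the result is a single IP-sized group made of two AP-sized groups, `[["azumEninga"], ["ENze", "maNdIrEjo?"]]`.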
3.5 The miscellaneous tier
The miscellaneous tier will be used for any comments or markings (e.g., silence, audible breaths, laughter, disfluencies, and so on) desired by particular transcription groups. The only conventions K-ToBI specifies for this tier are that events that cover some clearly specifiable interval (such as breaths, silence or laughter) be labeled by the < .... > pair, aligned with both their temporal beginnings and ends. The event label is written only at the end of the interval, immediately before ‘>’.
<              beginning of an interval (laughter)
laughter>      end of a period of laughter
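The paired marks can be turned back into labeled intervals with a short routine. This is a sketch under assumptions not stated in the conventions: the tier is given as time-sorted (time in seconds, label) pairs, intervals do not nest or overlap, and the function name `pair_intervals` is hypothetical.

```python
def pair_intervals(marks):
    """Convert "<" ... "label>" marks into (label, start, end) intervals.

    marks: time-sorted list of (time, text) from the miscellaneous tier,
           where text is either "<" or an event label ending in ">".
    """
    intervals = []
    start = None
    for time, text in marks:
        text = text.strip()
        if text == "<":
            if start is not None:
                # Assumption: a new interval cannot open inside an open one.
                raise ValueError(f"unclosed '<' before time {time}")
            start = time
        elif text.endswith(">") and len(text) > 1:
            if start is None:
                raise ValueError(f"{text!r} has no matching '<'")
            # The event name is everything before the closing '>'.
            intervals.append((text[:-1].strip(), start, time))
            start = None
        else:
            raise ValueError(f"not an interval mark: {text!r}")
    if start is not None:
        raise ValueError("tier ends with an unclosed '<'")
    return intervals
```

Point-like comments that a transcription group may add to this tier are outside the scope of the sketch and would be rejected by it.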
Examples showing all tiers are given below. PL refers to the phonological tone tier and PT to the phonetic tone tier. The break index tier is abbreviated as ‘BI’, and the miscellaneous tier as ‘misc’.
Ex.17. << t1p2s6 >>
| zIG, | saNhonIN | saraiSImjE | aMsEgIN | zugEiNnIn | gEsida | |
| PL | L% | LHa | L% | LHa | L% | |
| PT | H L% | H Ha | H+H L% | L Ha | +H | L+ L% |
| BI | 3 | 2 | 3 | 2- | 1 | 3 |
| misc | <Vdev> | |||||
| ‘That is, | coral-TOP | alive and | rock-TOP | dead-prog-rel | to be’ | |
| -> ‘That is, | coral is alive | and a rock is | dead’ |
Ex.21. << coQgaG-HLH% >>
| TaG | zikigo | iNniN | sarami | nuguNgohani | zERmIN | coQgaG | ANSoni | pakiNsINiMnida | ||
| PL | H% | LHa | LHa | LHa | LHa | HLH% | ||||
| PT | L H% | +H | L+Ha | L L+Ha | L Ha | L Ha | L+ H | L+ HLH% | ||
| BI | 3- | 1 | 2 | 2 | 2 | 2p | 2- | 3 | ||
| misc | <Vdev> | <sil> | ||||||||
| ‘firmly | guard | -PROG | man-NOM | who-is | young | bachelor | Anthony | Parkinson-be’ | ||
| -> | ‘The man | who is | guarding | firmly is the | young | bachelor, | Anthony | Parkinson’ |
Ex.22. << millennium >>-early
| yozIM | gIrEN | gyohwega | i- | icENnyENi | miRreniEmi | |
| PL | LHa | LHa | LHa | LHa | H% | |
| PT | L Ha | L Ha | L L+Ha | L+H L+ La | L+H L+ H% | |
| BI | 2 | 2 | 2 | 2 | 3- | |
| misc | <disfl> | |||||
| ‘These days | that | church-NOM , | eh, | Year 2000-NOM | millennium-NOM’ | |
| -> ‘These days, | that | kind of | church | eh, Year 2000, | millennium….’ |
Ex.23. << millennium >>-middle
| ize | nAnyENbutE | (ne) | sizagi | dwegu | |
| PL | LHa | LH% | LHa | HL% | |
| PT | H La | L+H LH% | H Ha | L>HL% | |
| BI | 1m | 3 | 2- | 3 | |
| misc | <other spkr> | ||||
| ‘now | next year-from | (yes) | beginning-NOM | become' | |
| -> | ‘Now, (it will) | start from next | year (Yes) …’ |
Ex.24. << millennium >>-final
| usEN | manIN | gyohwe(do) | ceiNziga | dweNda | gIreyo. | (ne) | |
| PL | LHa | LHa | LHa | HL% | L% | ||
| PT | L Ha | L Ha | L La | H+H | L+ HL% | H L% | |
| BI | 2- | 2- | 2 | 1 | 1 | 3 | 3 |
| misc | <other spkr> | ||||||
| ‘First of all | many | church (too) | change-NOM | become | they say’ | (yes) | |
| -> | ‘They | say, first of all | many churches | will | change | too (Yes)’ |
4. Online Data Files and Future Versions
All examples (sound file, f0 track, and labels) shown in this manual can be accessed on the Sun workstation in the Phonetics Lab of the UCLA Department of Linguistics. The directory holding them includes more examples, some labeled and some not, for labelers to practice transcribing with the K-ToBI system. As more speech data become available, these labeling guidelines may be further refined. To get the speech files and label files mentioned in this paper, contact jun@humnet.ucla.edu. This and earlier versions of the K-ToBI manual are available on the author’s web site (http://www.linguistics.ucla.edu/people/jun/sunah.htm), and also on the UCLA Phonetics Lab web site (http://www.linguistics.ucla.edu/faciliti/uclaplab.html).
References
Beckman, Mary & Hirschberg, Julia (1994) "The ToBI Annotation Conventions", Manuscript, Ohio State University.
Beckman, Mary & Jun, Sun-Ah (1996) "K-ToBI (Korean ToBI) Labelling Convention" Version 2. Manuscript. Ohio State University and UCLA. Manuscript is available at [http://www.linguistics.ucla.edu/people/jun/sunah.htm.]
Beckman, Mary & Pierrehumbert, Janet (1986) "Intonational Structure in Japanese and English", Phonology Yearbook 3:255-309.
Campbell, Nick & Venditti, Jennifer (1995) "J-ToBI: an intonational labeling system for Japanese," Paper presented at the Autumn meeting of the Acoustical Society of Japan.
de Jong, Kenneth (1989) "Initial tones and prominence in Seoul Korean," a paper presented at the 117th meeting of the Acoustical Society of America, Syracuse, N.Y.; A paper published in the Ohio State University Working Papers in Linguistics, No. 43, pp. 1-14 (1994).
Jun, Sun-Ah (1989) "The Accentual Pattern and Prosody of Chonnam Dialect of Korean," in S. Kuno et al. (eds.) Harvard Studies in Korean Linguistics III. pp. 89-100. Harvard Univ. Cambridge, Mass.
Jun, Sun-Ah (1990) "The prosodic structure of Korean -- in terms of voicing," In E-J. Baek, ed., Proceedings of the 7th International Conference on Korean Linguistics, pp. 87-104. University of Toronto Press.
Jun, Sun-Ah (1993) The Phonetics and Phonology of Korean Prosody. Ph.D. Dissertation, the Ohio State University. [Published in 1996 by Garland, New York]
Jun, Sun-Ah (1996) “Influence of microprosody on macroprosody: a case of phrase initial strengthening”, UCLA Working Papers in Phonetics 92: 97-116
Jun, Sun-Ah (1998) “The Accentual Phrase in the Korean prosodic hierarchy”, Phonology. 15.2: 189-226
Jun, Sun-Ah & Oh, Mira (1996) "A prosodic analysis of three types of Wh-phrases in Korean", Language and Speech 39(1):37-61.
Lee, Hyuck-Joon (1999) Tonal Realization and Implementation of the Accentual Phrase in Seoul Korean. MA thesis, UCLA.
Lee, Sook-hyang (1989) "Intonational domains of the Seoul dialect of Korean," a paper presented at the 117th meeting of the Acoustical Society of America, Syracuse, N.Y.; An abstract in Journal of the Acoustical Society of America, vol. 85, suppl. 1, p. S99.
Park, Mee-Jeong (2000) “Where prosody meets grammar: Taxonomy of Korean prosodic boundary tones”, ms. UCLA.
Pierrehumbert, Janet (1980) The Phonology and Phonetics of English Intonation, Ph.D. dissertation, MIT.
Pierrehumbert, Janet & Beckman, Mary (1988) Japanese Tone Structure, MIT Press.
Pitrelli, John; Beckman, Mary; & Hirschberg, Julia (1994) "Evaluation of prosodic transcription labeling reliability in the ToBI framework," Proceedings of the 1994 International Conference on Spoken Language Processing, vol. 1, pp. 123-126.
Schafer, Amy & Jun, Sun-Ah (submitted) “Effects of Accentual Phrasing on Adjective Interpretation in Korean”, in M. Nakayama (ed.), East Asian Language Processing, Stanford, CSLI. [Proceedings of the International East Asian Psycholinguistics Workshop], August, 1999, Columbus.
Silverman, Kim; Beckman, Mary; Pitrelli, John; Ostendorf, Mari; Wightman, Colin; Price, Patti; Pierrehumbert, Janet; & Hirschberg, Julia (1992) "ToBI: a standard for labeling English prosody," Proceedings of the 1992 International Conference on Spoken Language Processing, vol. 2, pp. 867-870.
Venditti, Jennifer (1995) Japanese ToBI Labeling Guidelines. Manuscript with examples, Ohio State University. [For information on obtaining by ftp, send e-mail to venditti@ling.ohio-state.edu.]
Appendix A: Romanization Convention
1. Consonants
|
2. Vowels
|
Appendix B: Pitch Tracks of Examples Given in the Paper
Pitch tracks and labels are made using PitchWorks (Scicon).
A word tier is labelled as 'words', a phonological tone tier as 'Utones', a phonetic tone tier as 'Stones', a break index tier as 'break', and a miscellaneous tier as 'misc'. The number given in each graph matches that in the main text.
In figures #1-4 below, the vertical line marking the beginning of the last syllable, '-yo' [jo], is drawn before the line marking the boundary tone or '>'. This is to show the difference in f0 rise timing between H% and LH% and between HL% and LHL%.
1. << boundary-H% >> 'Is that so?'
2. << boundary-LH% >> 'Is that so?'
3. << boundary-HL% >> 'Is that so?'
4. << boundary-LHL% >> 'Is that so?'
5. << J3A2-HLH% >> 'Who is eating tonight?'
6. << IPboundary-HL% >> 'Wind and the Sun'
7. << IPboundary-LH% >> 'Second,'
8. << 2syllAP-LHa >> 'I hate Younga'
9. << 5syllAP-LHLHa >> 'Youngmi's family hates Younga'
10. << 6syllAP-LHLHa >> 'Youngi's mom hates Younga'
11. << 5syllAP-HHLHa >> 'Hyungmin's family hates Younga'
12. << t1p1s2 >>-early 'Among the two types which provide motivation,'
13. << t1p2s8-1m >> 'Being growing means that it is alive'
14. << gazEQgyosa >> 'I met the tutor of a powerful family'
15. << t1p1s2 >>-late '(Among the two types which provide motivation,) what's in common is completeness'
16. << break-L8c3 >> 'When is Madam making (it)?'
17. << t1p2s6 >> 'That is, coral is alive and a rock is dead'
18. << t1p2s10 >> 'This also applies to our mind'
19. << t1p2s5 >>-early 'But, if you see a piece of coral under the same microscope,...'
20. << t1p2s5 >>-late 'We can see that coral is changing while growing'
21. << coQgaG-HLH% >> 'The man who is guarding firmly is the young bachelor, Anthony Parkinson'
22. << millennium >>-early 'These days, that kind of church, eh, Year 2000, millennium....'
23. << millennium >>-middle 'Now, (it will) start from next year ... (Yes)'
24. << millennium >>-final 'They say, first of all, many churches will change too (Yes)'
This version is published in UCLA Working Papers in Phonetics 99. A slightly earlier version (V. 3.0) is published in Speech Sciences, Vol.7, No.1 pp.143-170.