On the
topmost level, CROVALLEX 2.008 is divided into word entries. Each word entry
relates to one or more headword lemmas (Sec. 2). The word entry
consists of a sequence of frame entries (Sec. 5) relevant for the
lemma(s) in question (where each frame entry usually corresponds to one of the
lemma's meanings). Information about the aspect (Sec. 14) of the lemma(s) is
assigned to each word entry as a whole.
Most of
the word entries correspond to lemmas in a simple one-to-one manner, but the
following two non-trivial situations also occur in CROVALLEX 2.008:
The
content of a word entry roughly corresponds to the traditional notion of a lexeme.
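The structure described above can be pictured as a small data model. The following is only a minimal sketch: the class and field names (`WordEntry`, `FrameEntry`, `lemmas`, `aspect`, `frames`) are assumptions made for illustration and are not part of CROVALLEX 2.008 itself, and the number of frames in the example is made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FrameEntry:
    """One frame entry; usually one meaning of the lemma(s)."""
    number: int  # position of the frame within its word entry


@dataclass
class WordEntry:
    """A word entry: headword lemma(s), one aspect value, frame entries."""
    lemmas: List[str]  # one or more headword lemmas
    aspect: str        # assigned to the word entry as a whole
    frames: List[FrameEntry] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Each word entry relates to one or more headword lemmas.
        if not self.lemmas:
            raise ValueError("a word entry needs at least one headword lemma")


# Illustrative entry; the number of frames is invented for the example.
entry = WordEntry(
    lemmas=["analizirati"],
    aspect="dual.",
    frames=[FrameEntry(number=1), FrameEntry(number=2)],
)
print(entry.lemmas, entry.aspect, len(entry.frames))
```

Keeping the aspect on the word entry rather than on each frame mirrors the statement that aspect is assigned to the entry as a whole.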
The verb
lemma is the infinitive form of the verb; in the case of lexical
homonyms (Sec. 3) and
homographs (Sec. 4), it is followed
by a Roman numeral in superscript.
The reflexive
particle se is part of the infinitive only if the verb is a derived
reflexive (e.g. vratiti se) or a reflexivum tantum (e.g. penjati se).
Lexical
homonyms are groups of two lemmas which have the same spelling and word form
but differ considerably in their meanings (there is no obvious semantic
relation between them). They may also differ in their etymology (e.g. hȉtatiI - žuriti vs. hȉtatiII - bacati), aspect
(Sec. 14)
(e.g. matíratiI inf. - činiti da što ne dobije neželjen
sjaj vs. matíratiII fin. - poraziti), or conjugated forms (izvezem [first person
sg.] for izvestiI - okititi vezom vs. izvesti [first person sg.]
for izvestiII - izvoziti).
The term
'lexical homonyms' should not be confused with the term 'synonymy'.
Homographs
are groups of two lemmas which have the same word form but a different accent,
and which also differ considerably in their meanings (there is no obvious semantic
relation between them). They may also differ in their etymology (e.g. ìskapatiI - isteći kapljući
vs. iskápatiII - iskopavati),
aspect (Sec. 14)
(e.g. ìsplakatiI fin. - suzama, plačem izraziti vs. isplákatiII inf. - ispirati),
or conjugated forms (napadnem [first person sg.] for nàpastiI - ugroziti
tjelesnu sigurnost vs. napasem [first person sg.] for nàpāstiII - pasući
nahraniti).
The term
'homographs' should also not be confused with the term 'synonymy'.
Each word
entry (Sec. 1)
consists of a non-empty sequence of frame entries, typically corresponding to
the individual meanings (senses) of the headword lemma(s) (from this point of
view, CROVALLEX 2.008 can be classified as a Sense Enumerated Lexicon).
The frame
entries are numbered within each word entry; in the CROVALLEX 2.008 notation,
the frame numbers are attached to the lemmas as subscripts.
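As an illustration of this notation, the sketch below builds a plain-text label for a frame. The function name `frame_label` and the use of `^` for the superscript Roman numeral and `_` for the subscript frame number are assumptions made for this example only; the lexicon itself uses typographic superscripts and subscripts.

```python
from typing import Optional

# Roman numerals for the first few homonym/homograph indices.
_ROMAN = {1: "I", 2: "II", 3: "III"}


def frame_label(lemma: str, frame_number: int,
                homonym_index: Optional[int] = None) -> str:
    """Return a plain-text label such as 'izvesti^II_1'.

    '^' stands in for the superscript Roman numeral and '_' for the
    subscript frame number (both are conventions of this sketch).
    """
    label = lemma
    if homonym_index is not None:
        label += "^" + _ROMAN[homonym_index]
    return f"{label}_{frame_number}"


print(frame_label("analizirati", 2))
print(frame_label("izvesti", 1, homonym_index=2))
```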
The
ordering of frames is not completely random, but it is not perfectly systematic
either. So far it is based only on the following weak intuition: primary and/or
the most frequent meanings should go first, whereas rare and/or idiomatic
meanings should go last. (We do not guarantee that the ordering of meanings in
this version of CROVALLEX 2.008 exactly matches their frequency of
occurrence in the contemporary language.)
Each
frame entry contains a description of the valence frame itself (Sec. 6)
and of the frame attributes (Sec. 12).
The content
of a frame entry roughly corresponds to the notion of a lexical unit.
In
CROVALLEX 2.008, a valence frame is modeled as a sequence of frame slots. Each
frame slot corresponds to one (either required or specifically permitted)
complementation of the given verb.
The
following attributes are assigned to each slot:
In
CROVALLEX 2.008, functors (labels of 'deep roles'; similar to theta-roles) are
used for expressing types of relations between verbs and their
complementations. Functors are divided into inner participants (actants)
and free modifications (this division roughly corresponds to the
argument/adjunct dichotomy).
Functors
which occur in CROVALLEX 2.008 are listed in the following tables:
Inner
participants:
Functor | Example sentence
AGT (agent) | John reads the book.
PAT (patient) | John plays the piano.
REC (recipient) | My mother sent her the money.
RESL (result) | His hard work took him to the victory.
ORIG (origin) | We received the message from the dean.
Free
modifications:
Functor | Example sentence
ACMP (accompaniment) | My sister visited me with her husband.
AIM (aim) | He left the school to join the army.
BEN (benefactive) | My mother made a cake for me.
CAUS (cause) | My father got angry because I failed the exam.
CNCS (concession) | She still loves him although he lied.
COMPL (complement) | I was sailing the seas as a young researcher.
COND (condition) | I will give you my book if you promise not to lose it.
CONTR (contra) | Tomorrow he plays against the tennis player from
CPR (comparison) | You will have to study more than you did last time.
DIR1 (direction-from) | My mother just came from the theater.
DIR2 (direction-through) | She drove through the town.
DIR3 (direction-to) | My mother went to the shop.
EXT (extent) | The snow has risen over half a meter.
HER (heritage) | They named the boat after the great sailor.
LOC (locative) | My sister lives in
MANN (manner) | She lost interest in reading very quickly.
INST (instrument) | She sent her the news by email.
DIFF (difference) | The stock prices have risen by about 30%.
OBST (obstacle) | My granny tripped over her toys.
REG (regard) | Regardless of her beauty she still has no boyfriend.
RESTR (restriction) | She will make the lunch for all except John.
SUBS (substitution) | Your boy went to the playground instead of going to school.
TFRWH (temporal-from-when) | I remember her being smart from high school.
THL (temporal-how-long) | She stayed for her holidays in
THO (temporal-how-often) | She plays the guitar every Saturday.
TOWH (temporal-to-when) | The teacher postponed the exam to June 11.
TSIN (temporal-since-when) | She hasn't studied since the last semester.
TWHEN (temporal-when) | She visited us last summer.
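The division of functors into inner participants and free modifications can be expressed as a simple lookup. The sketch below copies the functor labels from the two tables above; the names `INNER_PARTICIPANTS`, `FREE_MODIFICATIONS` and `functor_kind` are hypothetical and chosen only for this example.

```python
# Functor labels as listed in the two tables of this section.
INNER_PARTICIPANTS = {"AGT", "PAT", "REC", "RESL", "ORIG"}

FREE_MODIFICATIONS = {
    "ACMP", "AIM", "BEN", "CAUS", "CNCS", "COMPL", "COND", "CONTR",
    "CPR", "DIR1", "DIR2", "DIR3", "EXT", "HER", "LOC", "MANN", "INST",
    "DIFF", "OBST", "REG", "RESTR", "SUBS", "TFRWH", "THL", "THO",
    "TOWH", "TSIN", "TWHEN",
}


def functor_kind(functor: str) -> str:
    """Classify a functor label as an inner participant or a free modification."""
    if functor in INNER_PARTICIPANTS:
        return "inner participant"
    if functor in FREE_MODIFICATIONS:
        return "free modification"
    raise ValueError(f"unknown functor: {functor!r}")


print(functor_kind("PAT"))
print(functor_kind("DIR3"))
```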
In a sentence,
each frame slot can be expressed by a limited set of morphemic means, which we
call forms. In CROVALLEX 2.008, the set of possible forms is defined either
explicitly (Sec. 9)
or implicitly (Sec. 10).
In the former case, the forms are enumerated in a list attached to the given
slot. In the latter case, no such list is specified, because the set of
possible forms is implied by the functor of the respective slot (in other
words, all forms possibly expressing the given functor may appear).
The list
of forms attached to a frame slot may contain values of the following types:
If no
forms are listed explicitly for a frame slot, then the list of possible forms
follows implicitly from the functor of the slot according to the following (as yet
incomplete) table:
Functor | Possible forms
LOC | blizu+2, kod+2, u+6, na+6 ...
MANN | adverb, sukladno+2, poput+2 ...
DIR3 | na+4, u+4 ...
DIR1 | s+2, od+2, iz+2 ...
DIR2 | adverb, 7, kroz+4, oko+2, po+6 ...
TWHEN | adverb, pred+4, za+4, oko+2, tijekom+2, u+4, nakon+2 ...
THL | adverb, 7, indeclinabilia+2 ...
EXT | adverb, 4, za+4, indeclinabilia+2 ...
REG | za+4, u+6, prema+6 ...
TFRWH | s+2 ...
AIM | za+4, na+4, da bi, da ...
TOWH | na+4, za+4 ...
TSIN | od+2, adverb ...
INST | 7, s+indeclinabilia+2 ...
CAUS | 7, od+2, zbog+2, jer ...
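The interplay of explicit and implicit forms can be sketched as a lookup with a fallback. The example below is only an illustration under stated assumptions: the names `IMPLICIT_FORMS` and `possible_forms` are hypothetical, only a few rows of the table are copied, and since the table is incomplete (each row ends in "..."), the lists should be read as open-ended. The explicit list `["4"]` for a PAT slot is invented for the example.

```python
from typing import List, Optional

# A few rows copied from the (incomplete) table of implicit forms.
IMPLICIT_FORMS = {
    "LOC": ["blizu+2", "kod+2", "u+6", "na+6"],
    "DIR1": ["s+2", "od+2", "iz+2"],
    "DIR3": ["na+4", "u+4"],
    "CAUS": ["7", "od+2", "zbog+2", "jer"],
}


def possible_forms(functor: str,
                   explicit_forms: Optional[List[str]] = None) -> List[str]:
    """Return the forms of a slot: the explicit list if given, else implicit ones."""
    if explicit_forms:
        # Explicit case: the forms are enumerated on the slot itself.
        return list(explicit_forms)
    if functor not in IMPLICIT_FORMS:
        raise KeyError(f"no implicit forms known for functor {functor!r}")
    # Implicit case: the forms are implied by the functor of the slot.
    return list(IMPLICIT_FORMS[functor])


print(possible_forms("LOC"))
print(possible_forms("PAT", ["4"]))  # hypothetical explicit list
```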
In
CROVALLEX 2.008, valence frames consist of inner participants and free
modifications that are either obligatory ('obl' for short) or non-obligatory
but typical ('typ' for short). Typical inner participants and free modifications
are those that are typically related to some verbs (or even to whole
classes of verbs) and not to others.
The
attribute 'type' is attached to each frame slot and takes one of two
values, 'obl' or 'typ', for both inner participants and free
modifications.
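A frame slot with its 'type' attribute might be modeled as follows. This is a minimal sketch: the class `FrameSlot`, the helper `obligatory_functors` and the example frame are assumptions for illustration and do not reproduce any actual CROVALLEX 2.008 entry.

```python
from dataclasses import dataclass
from typing import List

ALLOWED_TYPES = ("obl", "typ")


@dataclass(frozen=True)
class FrameSlot:
    functor: str  # e.g. 'AGT', 'PAT', 'DIR3'
    type: str     # 'obl' (obligatory) or 'typ' (typical)

    def __post_init__(self) -> None:
        if self.type not in ALLOWED_TYPES:
            raise ValueError(f"type must be 'obl' or 'typ', got {self.type!r}")


def obligatory_functors(frame: List[FrameSlot]) -> List[str]:
    """Return the functors of the obligatory slots, in frame order."""
    return [slot.functor for slot in frame if slot.type == "obl"]


# Hypothetical valence frame, modeled as a sequence of frame slots.
frame = [
    FrameSlot("AGT", "obl"),
    FrameSlot("PAT", "obl"),
    FrameSlot("DIR3", "typ"),
]
print(obligatory_functors(frame))
```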
In
CROVALLEX 2.008, frame attributes (more exactly, attribute-value pairs) are
either obligatory or optional. The obligatory attributes must be filled in
for every frame. The optional attributes may be empty, usually because they are
not applicable.
Obligatory
frame attributes:
Optional
frame attributes:
Some
frames are assigned semantic classes like 'motion', 'transport', 'push',
'meet', 'manner of expression', 'eat', etc. In CROVALLEX 2.008 there are 173
syntactic-semantic classes (more accurately, 72 classes with two further levels
of subdivision). These classes have been derived from VerbNet (a verb lexicon
based on Levin's verb classes which also provides selectional restrictions
attached to semantic roles) and specially refined and modified for the Croatian
language. This classification is still tentative and should not be regarded as a
properly defined ontology.
The
motivation for introducing such semantic classification in CROVALLEX 2.008 was
the fact that it simplifies systematic checking of consistency and allows for
making more general observations about the data.
Croatian
distinguishes perfective verbs (marked 'fin.' for short in CROVALLEX 2.008)
from imperfective verbs (marked 'inf.'); this characteristic
is called aspect. In CROVALLEX 2.008, the value of aspect is attached to each
word entry as a whole (i.e., it is the same for all its frames and it is shared
by the homographs, if any).
Some
verbs (e.g. analizirati, bombardirati) can be used in
different contexts as either perfective or imperfective (marked 'dual.'
for short in CROVALLEX 2.008).
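The three aspect markers can be captured in a small enumeration. The sketch below is an illustration only; the names `Aspect` and `can_be_perfective` are hypothetical, while the marker strings 'fin.', 'inf.' and 'dual.' are the ones used in CROVALLEX 2.008.

```python
from enum import Enum


class Aspect(Enum):
    PERFECTIVE = "fin."
    IMPERFECTIVE = "inf."
    BIASPECTUAL = "dual."  # usable as perfective or imperfective


def can_be_perfective(marker: str) -> bool:
    """Tell whether a verb with the given aspect marker has perfective uses."""
    # Aspect(marker) raises ValueError for an unknown marker.
    return Aspect(marker) in (Aspect.PERFECTIVE, Aspect.BIASPECTUAL)


print(can_be_perfective("dual."))
print(can_be_perfective("inf."))
```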
The focus
in CROVALLEX 2.008 is mainly on the primary or usual meanings of verbs. However, many
frames also correspond to peripheral usages of verbs; these are idiomatic
frames, labeled 'idiom'. An idiomatic frame is tentatively characterized
either by a substantial shift in meaning (with respect to the primary sense),
or by a small and strictly limited set of possible lexical values in one of its
complementations.