From the perspective of the corpus, each document is a sequence of words, and each word has three attributes: its form (the word as it appears in the text), its lemma and its grammar tag.
EXAMPLE
There are five words in the sentence “beidh freagra do cheiste agat” and these are the attributes they have:
- Form: `beidh`, Lemma: `bí`, Tag: `Vmif`
  - Explanation: verb (`Vm`), indicative (`i`), future tense (`f`)
- Form: `freagra`, Lemma: `freagra`, Tag: `Ncmsc`
  - Explanation: common noun (`Nc`), masculine (`m`), singular (`s`), in the nominative case (`c`)
- Form: `do`, Lemma: `do`, Tag: `Dp2-s`
  - Explanation: possessive determiner (`Dp`), second person (`2`), singular (`s`)
- Form: `cheiste`, Lemma: `ceist`, Tag: `Ncfsg`
  - Explanation: common noun (`Nc`), feminine (`f`), singular (`s`), in the genitive case (`g`)
- Form: `agat`, Lemma: `ag`, Tag: `Pr2-s`
  - Explanation: prepositional pronoun (`Pr`), second person (`2`), singular (`s`)
The attributes of each word in the corpus can be seen by clicking on it, which opens the word information box where the attributes are listed. To close the box, click on the same word again.
A word's lemma is one of the forms of the word which is thought to be the base form. In the case of nouns, for example, the nominative singular counts as the lemma, and in the case of verbs the imperative singular counts as the lemma.
The concept of the lemma helps us find words in the corpus regardless of their form. On this site, if you do a broad search for a particular lemma, for example [`crann`](/en/cng/?q=crann), you will find all words belonging under that lemma: `crainn`, `chrainn`, `gcrann` and, of course, `crann` itself.
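The idea behind a lemma search can be sketched in a few lines of Python. This is only an illustration, not how this site is implemented: the names `TOKENS`, `build_lemma_index` and `forms_for_lemma` are hypothetical, and the data is limited to forms mentioned in this section.

```python
from collections import defaultdict

# (form, lemma) pairs taken from the examples in this section.
TOKENS = [
    ("crann", "crann"),
    ("crainn", "crann"),
    ("chrainn", "crann"),
    ("gcrann", "crann"),
    ("cheiste", "ceist"),
]


def build_lemma_index(tokens):
    """Group every form under the lemma it has been assigned."""
    index = defaultdict(set)
    for form, lemma in tokens:
        index[lemma].add(form)
    return index


def forms_for_lemma(index, lemma):
    """Return all forms recorded under a lemma, in alphabetical order."""
    return sorted(index.get(lemma, set()))


index = build_lemma_index(TOKENS)
print(forms_for_lemma(index, "crann"))
```

Looking up `crann` returns every form filed under that lemma, whichever form actually occurred in the text.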
It is important to remember that lemmas have been assigned to the words in the corpus by an automatic computer program, which means that the result is not necessarily 100% accurate. Sometimes you will see a word that has been given the wrong lemma. However, the vast majority of lemmas in the corpus are thought to be correct.
The grammar tag is a string of characters that tells which part of speech the word is (noun, verb, adjective...) plus other grammatical properties.
Each tag starts with a capital letter that tells which part of speech the word is: `N` for a noun, `V` for a verb, `A` for an adjective, and so on. The letters after that give more information about the word: its number, tense, case, person and so on.
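As a rough illustration, the first-letter rule could be written as a simple lookup. The function name `part_of_speech` is hypothetical, and the table covers only the three letters named above; any other letter is reported as unknown rather than guessed.

```python
# Only the letters named in the text above; the full tag set has more.
PART_OF_SPEECH = {
    "N": "noun",
    "V": "verb",
    "A": "adjective",
}


def part_of_speech(tag):
    """Read the part of speech from the first character of a grammar tag."""
    if not tag:
        return "unknown"
    return PART_OF_SPEECH.get(tag[0], "unknown")


print(part_of_speech("Vmif"))
print(part_of_speech("Ncfsg"))
```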
EXAMPLE: NOUN
Here's how to read the grammar tags of nouns. A noun's tag contains up to six characters.
- In first place is the letter `N`, which indicates that it is a noun.
- Second is a letter that tells what kind of noun it is: `v` if it is a verbal noun, `p` if it is a proper noun, and so on. Most nouns have a `c` here, which stands for common noun.
- Third is the gender of the noun: feminine `f` or masculine `m`.
- Fourth is the number of the noun: singular `s` or plural `p`.
- In fifth place is the case of the noun: `c` if nominative, `g` if genitive, `v` if vocative, `d` if dative.
- Sixth, if the letter `e` is here, it tells that it is an emphatic noun (for example “mo theachsa”).

Now, as an exercise, what kind of noun is `Ncfsg`? Can you "read" the tag now?
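The position-by-position reading described above can be sketched as a small Python function. This is a minimal sketch, not the site's own Explainer: the name `explain_noun_tag` is hypothetical, and the lookup tables contain only the codes listed in this section, so any other character is reported as unrecognised.

```python
# Codes for positions two to five, as listed in this section.
NOUN_TYPE = {"c": "common noun", "p": "proper noun", "v": "verbal noun"}
GENDER = {"f": "feminine", "m": "masculine"}
NUMBER = {"s": "singular", "p": "plural"}
CASE = {"c": "nominative", "g": "genitive", "v": "vocative", "d": "dative"}


def explain_noun_tag(tag):
    """Translate a noun tag such as 'Ncfsg' into readable terms."""
    if not tag.startswith("N"):
        raise ValueError(f"not a noun tag: {tag!r}")
    if len(tag) > 6:
        raise ValueError(f"a noun tag has at most six characters: {tag!r}")

    parts = []
    tables = [NOUN_TYPE, GENDER, NUMBER, CASE]
    # Position 0 is 'N'; positions 1 to 4 are looked up in turn.
    for position, table in enumerate(tables, start=1):
        if position < len(tag):
            char = tag[position]
            parts.append(table.get(char, f"unrecognised '{char}'"))

    # An optional sixth character 'e' marks an emphatic noun.
    if len(tag) == 6 and tag[5] == "e":
        parts.append("emphatic")

    return ", ".join(parts)


print(explain_noun_tag("Ncmsc"))
```

Because each position has its own table, the same letter can mean different things in different places: `c` is "common noun" in second place but "nominative" in fifth.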
This system of grammar tags is based on an international schema developed in the 1990s for European languages under the PAROLE project. This schema is called the “PAROLE Tag Set”.
This is a very detailed system, making it difficult for humans to read and understand the tags. To make that easier, an application called Grammar tag explainer is available on this site which will let you "translate" any grammar tag into terms that would be more understandable to someone who has experience with the grammatical terminology of Irish.
Wherever a grammar tag appears on this site, including in the word information box, you will find a link to the Explainer beside the tag, as a quick way to find out what the tag means.