Text processing

Changes have been made to all the lexicographic works included in this corpus, especially the older ones, affecting both the Mapudungun and the Spanish texts. Three major changes were made to the Mapudungun texts. The first is the normalization of the orthography, for which the Unified Mapuche Alphabet (AMU) was mostly used, with two exceptions: the representation of the interdental phonemes <t>, <n> and <l>, and that of the dentoalveolar lateral (<l>) in some contexts. In the first case, Zúñiga's (2010) proposal was adopted, using a postposed apostrophe: <t'>, <n'> and <l'>. This choice was made in agreement with the view that "the predominant use of underlining in languages written with the Latin alphabet, as the Internet standards (where is highly frequent the underlining of links) are against the use of the graphemes <t>, <n> and <l> for the representation of the dental series as suggested by the AMU" (Zúñiga, 2010, p. 272). In the second case, a hyphen was used to separate two successive dentoalveolar laterals (<l-l>) in contexts where they could be mistaken for the alveopalatal lateral, which in some cases changes the meaning of the word; for example, elul-lafiñ 'I didn't give it to him for his own benefit' vs. elullafiñ 'I will give it to him and wait to see what happens' (Zúñiga, 2010, p. 269).
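The replacement of the AMU interdental graphemes by their apostrophe forms can be sketched as follows. This is only an illustration and not the procedure actually used for the corpus: it assumes that the underlined letters are encoded in Unicode as a base letter plus a combining mark below (U+0332 or U+0331), and the function name normalize_interdentals is hypothetical.

```python
import unicodedata

# Assumption: an underlined grapheme is a base letter followed by
# U+0332 (COMBINING LOW LINE) or U+0331 (COMBINING MACRON BELOW).
UNDERLINE_MARKS = {"\u0332", "\u0331"}
INTERDENTAL_BASES = {"t", "n", "l", "T", "N", "L"}


def normalize_interdentals(text: str) -> str:
    """Rewrite underlined t, n, l as t', n', l' (postposed apostrophe)."""
    # Decompose first so precomposed letters (e.g. U+1E6F) are split
    # into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for ch in decomposed:
        if ch in UNDERLINE_MARKS and out and out[-1] in INTERDENTAL_BASES:
            # Swap the underline mark for an apostrophe after the letter.
            out.append("'")
        else:
            out.append(ch)
    # Recompose so letters such as ñ and ü return to their usual form.
    return unicodedata.normalize("NFC", "".join(out))
```

Under these assumptions, a "t" followed by U+0332 becomes <t'>, while text without underlined letters, such as elullafiñ, is returned unchanged. The hyphenation of <l-l> depends on the meaning of each word and is not covered by this sketch.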

The table on the right gives an inventory of Mapudungun phonemes, together with their respective allophones and the graphemes used to represent them by Valdivia (1606), Febrés ([1765] 1882, 1765, 1846a, 1846b), Augusta (1916a, 1916b), and CORLEXIM (see note 1).

The second modification is the insertion of the sixth vowel <ü>, in square brackets, wherever the missionary lexicographers omitted it (see note 2). The third is the correction of the spelling errors found in the originals.

Three major modifications were also made to the Spanish texts: normalization of the orthography, correction of the spelling errors in the originals, and updating of archaic words. For this last task, the Diccionario de la Lengua Española (Real Academia Española, 2001) and the Corpus de Referencia del Español Actual (Real Academia Española) were consulted.

In addition, in our version of the Augusta (1916b) dictionary, the vowels marked with a tilde in the original have been replaced with bold characters.

All of these modifications are intended to make it easier for users to search this corpus.


1 The only modification that has not yet been carried out (because of the large amount of material and the impossibility of fully automating the changes) is the writing of the allophones [u̯] and [i̯] as /w/ and /y/, respectively, in Augusta (1916a and b).

2 So far, this modification has been made only in Augusta (1916b).

Tasks in progress

We are currently working on the original versions of the dictionaries contained in this corpus, with the intention of including them alongside the modernized versions. We are also still working on the writing of the allophones [u̯] and [i̯] as /w/ and /y/, respectively, in Augusta (1916a and b), and on the insertion of the sixth vowel in the works by Valdivia (1606), Febrés ([1765] 1882, 1765b, 1846a and b), and Augusta (1916a).

Search tips

We suggest that users of this corpus search for the roots of Mapudungun words rather than complete words, especially in Augusta (1916a and b).
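The following sketch illustrates why a shorter query tends to retrieve more forms than a complete word. It is not the search engine of this corpus: the search function is hypothetical, it assumes plain case-insensitive substring matching, and the two entries are the example forms cited above from Zúñiga (2010).

```python
def search(entries, query):
    """Return the entries that contain the query (case-insensitive)."""
    q = query.casefold()
    return [entry for entry in entries if q in entry.casefold()]


# Example forms taken from the text above; not a real corpus index.
entries = ["elul-lafiñ", "elullafiñ"]

# The shared initial string retrieves both forms.
root_hits = search(entries, "elu")

# A complete word retrieves only the form spelled exactly that way.
word_hits = search(entries, "elullafiñ")
```

With these two entries, the query elu matches both forms, whereas the complete word elullafiñ matches only itself, because the hyphenated form elul-lafiñ is spelled differently.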