Principles and Methods of Digital Lexicography
- Authors: Makarov Y.Y. (1, 2, 3)
- Affiliations:
  1. V.V. Vinogradov Russian Language Institute of the Russian Academy of Sciences
  2. Institute of Linguistics of the Russian Academy of Sciences
  3. National Research University “Higher School of Economics”
- Issue: Vol 83, No 4 (2024)
- Pages: 102-112
- Section: Articles
- URL: https://ruspoj.com/1605-7880/article/view/657015
- DOI: https://doi.org/10.31857/S1605788024040106
- ID: 657015
Abstract
The article describes the principles and methods of digital lexicography. It begins by defining the four main stages of the lexicographic process: 1) writing up the dictionary, 2) editing and developing the book layout, 3) publishing, and 4) the post-publication period. The following section focuses on stage 1, comparing the compilation of example corpora for dictionary preparation in the past (using millions of cardboard cards) with modern tools for lexical analysis provided by web corpora like the Russian National Corpus (ruscorpora.ru). The overview of the advancements in finding examples illustrating word usage is followed by an exploration of the ways dictionary writing methods have evolved.
The analysis of computer-based dictionary writing methods starts with a discussion of the two most popular approaches: file-based and tabular. The former involves composing dictionary files with thousands of entries using text editors like Microsoft Word, resulting in poorly structured entries with inconsistent markup. The latter, by contrast, represents each entry as a row with entry zones (forms, meanings, examples, etc.) arranged in separate columns. The section outlines the challenges of these methods, emphasizing their limited publishing options and their difficulty in handling complex linguistic data, which often involves many-to-one relationships. Alternatives such as Text Encoding Initiative (TEI) formats and database utilization are discussed, highlighting their capacity for structured data representation.
Subsequently, dictionary writing systems (DWS) are introduced, with the OnLex platform serving as the primary example of their functionality. The platform shows how online editing interfaces streamline the lexicographic process, from data input to publication and feedback collection. By analyzing DWS features, the article emphasizes their efficacy in simplifying the editorial workflow and enhancing user experience.
A critical appraisal of the advantages of online DWS is provided, highlighting their role in addressing key challenges faced by traditional publishing methods. Notable advantages include seamless integration of search functionalities, support for multiple languages, and real-time error reporting mechanisms after publication.
In conclusion, the article advocates for the wider adoption of digital lexicography methods, particularly within the Russian tradition, emphasizing their potential to facilitate every stage of the dictionary creation process.
About the authors
Yu. Yu. Makarov
V.V. Vinogradov Russian Language Institute of the Russian Academy of Sciences; Institute of Linguistics of the Russian Academy of Sciences; National Research University “Higher School of Economics”
Author for correspondence.
Email: yurmak@iling-ran.ru
Research Fellow at the V.V. Vinogradov Russian Language Institute of the Russian Academy of Sciences, Junior Researcher at the Institute of Linguistics of the Russian Academy of Sciences, Visiting Scholar at the National Research University “Higher School of Economics”
Russian Federation, Moscow; Moscow; Moscow