VI INTERNATIONAL CONFERENCE "LANGUAGE AND MODERN TECHNOLOGIES" | Tbilisi, Georgia | 2022 | 19-20 September | Ivane Javakhishvili Tbilisi State University, Arnold Chikobava Institute of Linguistics, State Language Department of Georgia, Internet Society – Georgia Chapter | Matching the morphological characteristics of Georgian and English in the grammatical dictionary | oral | The report concerns the compiler of the Georgian-English grammatical dictionary, which is being created within the framework of a grant project of the Shota Rustaveli Georgian National Science Foundation. The aim of the project is to bring the main components of the Georgian grammatical dictionary into compliance with international standards and to develop a Georgian-English grammatical dictionary compiler. The purpose of an electronic grammatical dictionary is to provide information on the morphological and syntactic characteristics of a dictionary unit, which are essential for constructing grammatically correct phrases. Dictionaries of this type are used as a tool for automated morphological analysis in text processing. Morphological characteristics are selected according to the EAGLES (Expert Advisory Group on Language Engineering Standards) standard. Morpho-syntactic parameters and their markers have been developed within the framework of the literary Georgian language corpus. The system software allows both characteristics and their markers to be added: when compiling a grammatical dictionary of any subsystem of the Georgian language, the linguist-user can add a grammatical characteristic with the appropriate marker. At present, English language characteristics are being added to the GEGDiCo system and aligned with the Georgian ones. 
The report will discuss the development and systematization of classification characteristics for inflectional and derivational forms of Georgian and English nouns, adjectives, numerals, pronouns, verbs and adverbs in the grammatical dictionary. | https://ice.ge/of/?page_id=5791 |
International Conference on Computational Linguistics and Intellectual Technologies | Protvino, Russia | 2002 | 6-11 June | International Seminar "Dialogue 2002" | Automatic Russian spelling dictionary | oral | A general scheme is proposed for a system that generates a word's paradigm from its basic (dictionary) form. The system is represented by morphological nets, which were earlier used successfully for bi-directional (synthesis/analysis) morphological processors and for a morphological tagger without a dictionary. Instead of a full-scale dictionary, the system uses a list of basic forms structured so that membership in one of its sub-lists is sufficient to determine all the characteristics needed to build the corresponding paradigm. The system has been tested on Russian verb paradigms. Once completed, it should serve as an automatic Russian spelling dictionary. | https://www.dialog-21.ru/digest/2002/articles/chikoidze/ |
Natural Language Processing The Georgian Language and Computer Technologies | Tbilisi, Georgia | 2004 | 21-23 June | Arnold Chikobava Institute of Linguistics | Electronic Version of the Georgian Language Morphological Glossary | oral | The electronic version of the glossary is based on electronic systems for the synthesis and analysis of the Georgian language. It unites the following components: a dictionary of roots; rules identifying root information (phonetic, morphological and, in a number of cases, semantic codes of a stem); and rules reflecting root changes. These components are integrated and function automatically. By means of a dedicated interactive dictionary system, dictionary symbols can be deciphered and displayed on the screen so that users can clearly see the parameters of a root. The rules form a set of productions that combine properly qualified markers with the conditions for their occurrence. The glossary shows the morphological categories of the verb and the noun and gives an interpretation of these categories (in a number of cases different from the traditional one); it explains the synonymy and homonymy of markers, and certain views are expressed about these phenomena. Namely, we believe that synonymy on the morphological level (unlike the syntactic level) is limited, and that this limitation is caused by the morphological environment. The electronic version of the morphological glossary provides full morphological information about a Georgian verb or noun. | http://www.ice.ge/conferenciebi/Bunebriv%20enata%20damushaveba.html |
Problems of Control and Power Engineering | Tbilisi, Georgia | 2004 | 27.09.2004 – 1.10.2004 | Institute of Control Systems of the Georgian Academy of Sciences | Generative System Bfp as a Basis for Representation of Grammatical Knowledge | oral | Bfp is a morphological generative system (Basic form→Paradigm) that generates, for each input basic form (Bf), its whole paradigm (P). So far it has been implemented and tested on Russian verb paradigms only. However, the system is already connected with the Printing Support System (PSS) and is intended to serve as its morphological component. In line with PSS restrictions, it displays on the screen, in consecutive order, subsets of verb paradigms containing fewer than 10 members. The hypothesis is put forward that the NL mechanism may function according to a somewhat similar scheme. | https://gtu.ge/msi/Pages/Conferences_2022.php |
Seventh International Tbilisi Symposium on Language, Logic and Computation | Tbilisi, Georgia | 2007 | October 1-5 | Institute for Logic, Language and Computation of the University of Amsterdam; Centre for Language, Logic and Speech at the Tbilisi State University; Georgian Academy of Sciences | Three Aspects of Language Modelling | oral | From a certain point of view, Language Modeling (LM) can be considered a kind of axis of linguistics. It is within its framework that the different basic components of language must be unified and, as a result, brought into accord and conformity. In a sense, it is precisely in their interaction in the course of LM activity that their true structure, significance and natural place in the system as a whole reveal themselves. Typical instances of such interaction and reciprocal influence are: dictionary and grammar; the different levels of language (from phonetic-phonological up to semantic-pragmatic); the analytic and synthetic directions of speech/text processing; etc. One more dimension of such relations, though a somewhat more global and general one, is considered here: the triple relation between the aspects of language knowledge, its use and its acquisition. This line of investigation has begun only recently, and so far only some sketches of the language knowledge/use relation are ready for demonstration; even they have not been sufficiently tested and do not guarantee fully correct functioning. | https://archive.illc.uva.nl/Tbilisi/Tbilisi2007/index.php%3Fpage=15.html |
Verbal Communication Technologies | Tbilisi, Georgia | 2008 | 16-19 October | Georgian Technical University | About Algorithms of Georgian Language Computer Models | oral | Computer-based language research serves as a means of creating the theoretical-conceptual and algorithmic-applied basis of the language. We think that our models satisfy these requirements. We have thoroughly elaborated morphological synthesis and analysis for the Georgian language, including a fairly exhaustive stem dictionary. Essential problems of semantics, syntax and lexicology are also investigated. It is worthwhile to consider some general and specific questions, such as the supposition that there are two kinds of synonymy: arbitrary (on the syntactic level) and restricted (on the morphological level). We think that arbitrary synonymy is a source of language enrichment (examples are paraphrases, style, poetry and other characteristics that confirm the flexibility of language); moreover, we think that homonymy performs a positive function, in that the language uses its building materials economically. The morphological synthesis of Georgian word forms is computerized together with its stem dictionary. | file:///C:/Users/user/Downloads/Verbaluri_Komunikaciuri_Teqnologiebi.pdf |
Georgian Language and Modern Technologies | Tbilisi, Georgia | 2009 | 20-21 October | TSU Arnold Chikobava Institute of Linguistics | Georgian Computer Prompter | oral | The Georgian computer prompter is software that can assist people with disabilities in writing Georgian on a computer. As far as is known, this problem cannot yet be solved by any current software. The system suggests correct word forms and makes it possible to use the keyboard with minimal effort. The research group at KTH (Dept. of Speech, Music and Hearing at the Royal Institute of Technology in Sweden) has long been working on an assistance system for selecting the desired and correct word forms. Programs that carry out this function are called word predictors; the KTH program is named “Prophet”. A word predictor suggests words while a person is writing, based either on the preceding word or on the first letter(s) of the current word. We have decided to create a Georgian system similar to Prophet. For the Georgian version of Prophet, which we call “Computer Prompter”, it is necessary to adjust the program code of Prophet to the Georgian language; to create a Georgian text corpus (no fewer than a million words); to enlarge the database of the Georgian dictionary; to develop a morphological processor of the Georgian language; and to create a dictionary of affixes and modify the dictionary database of Prophet. So far the Georgian text corpus has been filled with about one million words. To create the corpus we used Georgian internet sites. In addition, to ensure thematic variety, we collected and processed texts covering twelve themes: Georgian history, Religion, Culture, Medicine, Sport, Economics, Politics, Show-business, Family, Children, People and Society. The text corpus requires special processing; for example, the input file must contain only one word per line. As a result of the project, the Georgian version of Prophet will be created. 
The software will help users select grammatically correct words while keying in Georgian texts by specifying the sequence of words. The dictionary database will contain 100,000 basic words and all the rules for their derivation. At present 30,000 units have been added to the dictionary of basic forms and rules. | https://www.ice.ge/new/pages/news/konferencia.pdf |
Ninth International Tbilisi Symposium on Language, Logic and Computation | Kutaisi, Georgia | 2011 | 26-30 September | Institute for Logic, Language and Computation (ILLC) of the University of Amsterdam, the centre for Language, Logic and Speech at the Tbilisi State University, the Georgian Academy of Sciences, Akaki Tsereteli State University | Georgian “Ancestors” of the logical implication | oral | The lion's share of mental (philosophical, logical) concepts originates in natural language, the “body of thinking”. The most important of these concepts should have their origin in all languages, though with some possible additional nuances that make them more understandable and closer to ordinary speakers of a given language. In particular, a short review is given here of Georgian conjunctions whose semantics corresponds to logical implication. | https://archive.illc.uva.nl/Tbilisi/Tbilisi2011/Programme/Abstracts_General_Programme/index.html |
International Conference "Georgian Language and Modern Technologies" | Tbilisi, Georgia | 2011 | LEPL Arnold Chikobava Institute of Linguistics | TSU Arnold Chikobava Institute of Linguistics | Some Questions of the Formation of the Plural in the Georgian Morphological Processor | oral | The article discusses the peculiarities of the declension of some Georgian nouns in the plural. Linguistic phenomena are subject to certain regularities, but alongside the general rules there are exceptions. The focus is on the formation of the plural of certain nouns. According to the established rule, nouns of a certain group, called uncountable nouns, do not take plural forms. These are nouns denoting substances, abstract nouns and collective nouns, yet it is not uncommon for such nouns to be used in the plural when they carry a different semantic load. Some adjectives turn into nouns in their plural forms (reds, greens, rich, poor, etc.). For illustration, the article presents many word combinations (collocations) that are already well established in the Georgian language, such as „ქართული ღვინოები“ - Georgian wines, „მარილების დაგროვება“ - salt accumulation, „მინერალური წყლები“ - mineral waters (nouns of substances); „ფიქრები“ - Thoughts, „მოტივები“ - Motives, „არჩევნები“ - Elections (abstract nouns); „გუნდები“ - Teams, „კრებები“ - Congregations (collective nouns); and others. | http://www.ice.ge/symposium/symp2011_2/konferencia-2011.pdf |
International Scientific Conference HUMANITIES IN THE INFORMATION SOCIETY-2 | Batumi, Georgia | 2014 | 24-26 October | Batumi Shota Rustaveli State University, The Faculty of the Humanities | The syntactic structure of a Georgian sentence | oral | The paper considers the syntactic structure of a Georgian sentence in terms of binary relations of the linguistic structure, indicating the role of each word in a word connection. Syntactic relations between the words of a sentence correspond to a syntactic tree structure. The members of the sentence (the words) are presented as elements of noun phrases (NP) and verb phrases (VP). To maintain the integrity of the tree structure, the concept of a zero node (S, Sentence) is used: it is the parent of the VP in the case of an impersonal verb, and in other cases the parent of both the NP and the VP. All members of the sentence, both main and secondary, are described. The syntactic role of each is necessarily indicated, both as a parent and as a successor. Both a noun phrase and a verb phrase can act as a parent, and both can also act as a successor. It is also shown in which syntactic constructions each word is involved, and all possible roles and the corresponding grammatical features are ascribed to each of them. For example, a direct object is the successor of the verb phrase (VP = V + N). It can be expressed by a noun, adjective, numeral, pronoun or verbal noun, in the singular or plural. Its cases are nominative and dative. It can be closed by postpositions (-vit, -tan, -ze, -ši) and particles (-a, -γa, -ve), as well as by indirect speech particles (o, metki, tko). The knowledge accumulated during morphological analysis plays an important role in the syntactic annotation of a Georgian sentence. It presents comprehensive syntactic information. 
For example, a noun in the ergative case can only be a subject, etc. A syntactic annotation system with this structure and these grammatical characteristics allows a complete description of any Georgian sentence. | http://www.nplg.gov.ge/ec/ka/bibl/search.html?cmd=search&pft=biblio&qs=700%3A1%3A%E1%83%90%E1%83%9B%E1%83%98%E1%83%A0%E1%83%94%E1%83%96%E1%83%90%E1%83%A8%E1%83%95%E1%83%98%E1%83%9A%E1%83%98+%E1%83%90. |
International Scientific Conference Information and Computer Technologies. Modelling, Control Dedicated to 85th anniversary of Academician I.V. Prangishvili | Tbilisi, Georgia | 2015 | 3-5 November | Georgian Technical University; Georgian Engineering Academy; International Engineering Academy | Standards of WordNet Thesaurus Technology | oral | The report describes the methodology for developing the Georgian WordNet thesaurus, GeWordNet. It explains how traditional dictionaries and thesauri differ from the WordNet thesaurus and lists the basic principles used in Princeton's WordNet thesaurus. The groups of linguistic sources needed to present information about the language system are discussed. The WordNet thesaurus development standards are characterized: definitional, contextual and word-formation methods of meaning analysis. The types of semantic, paradigmatic and syntagmatic relations used in the thesaurus are described. | http://ict-mc.gtu.ge/conference.pdf |
11th Tbilisi Symposium on Language, Logic and Computation, TbiLLC 2015 | Tbilisi, Georgia | 2015 | 21-26 September | The Centre for Language, Logic and Speech at the Tbilisi State University, the Georgian Academy of Sciences and Institute for Logic, Language and Computation (ILLC) of the University of Amsterdam | Syntax Annotation of the Georgian Literary Corpus | oral | To solve theoretical and applied tasks of the Georgian language, it is very important to develop deeply annotated text corpora. While syntactically annotated corpora are now available for English, Czech, Russian and other languages, for Georgian they are rare. The environment developed by our research group offers several NLP applications, including modules for the morphological, syntactic and semantic levels, a Universal Networking Language interface and a natural language interface for accessing SQL-type databases. In this article, we examine an automatic syntactic parser for the Georgian language. It includes the syntactic as well as the morphological level of the Georgian language model. The linguistic model of Georgian text syntax annotation is based on dependency grammar. | https://archive.illc.uva.nl/Tbilisi/Tbilisi2015/Accepted-abstracts/index.html |
International Conference Language and Modern Technologies − 2015 | Tbilisi | 2015 | 10-15 September | Ivane Javakhishvili Tbilisi State University; Goethe University Frankfurt am Main; Arnold Chikobava Institute of Linguistics | Computer Models of the Georgian language | oral | Fundamental research in Computational Linguistics has been carried out at the Archil Eliashvili Institute of Control Systems, Georgian Technical University, for many years. Various combination methods (lexical functions, synonymous series, semantic roles and superparadigms) were developed for the Georgian language. A computer dictionary of the Georgian language has been created which at the same time functions as a morphological generator; in other words, it produces the full paradigm for each lexical unit. Within the framework of the fundamental issues of language modeling, a means of presenting linguistic algorithms has been developed that allows the formulation of a bi-directional processor combining analysis and synthesis. For some languages, the process of filling and extending dictionaries has been simplified with the help of a grammar compiler, which is the most modern tool for the automatic realization of a formal language model. It is possible to compile morphological processor libraries of individual languages for different variations of any language (according to time, space, origin, genre, etc.). Automatic machine translation can be considered the main achievement. To solve this task completely, it is necessary to create a lexical translator. This type of system is valued by ordinary users, as it makes it easier for them to study foreign-language texts intensively and is therefore very useful when composing text. The strategy of our team is to provide reliable support for future language technologies through the key theoretical and practical issues that have been worked out in separate projects over the past years. 
The computer products created by our team are used in various linguistic areas. It is a challenge for linguists to create a computational model of a language, taking into account its multi detections and changes. We have created the main components needed to compile a national corpus manager. | https://ice.ge/of/wp-content/uploads/symp2015/konferenciis%20masala.pdf |
VII International Scientific Conference in Semiotics, Chaos and Cosmos Semiotics | Batumi, Georgia | 2016 | 21-23 October | Research Center of Semiotics at the Ilia State University; Faculty of Humanities of the Batumi Shota Rustaveli State University | Automatic formation of a hyponym tree in Georgian WordNet | oral | WordNet is the most widely used lexical database in the field of modern information technology. It was based on the 1996 model of the human mental dictionary developed at the Princeton University Laboratory of Cognitive Sciences, which eventually became the most authoritative and widely used standard for building lexical-semantic databases. WordNet lexical-semantic thesaurus knowledge bases are used in tasks such as information retrieval, machine translation, word meaning determination, and dialog systems. EuroWordNet was created in 1999, combining WordNet dictionaries of European languages. Currently, the Department of Linguistic and Speech Systems of the Institute of Control Systems is working on the creation of Georgian WordNet (Shota Rustaveli National Science Foundation grant "Georgian Word Network Compiler GeWordNet"). The basic building block of GeWordNet is a set of synonyms (synset) that combines words with similar meanings. It is assumed that each synset in the dictionary is a lexicalized concept of a given language. To make the dictionary easier to use, each synset is accompanied by a definition and contextual examples of synset words. In GeWordNet, synsets are interconnected by semantic relations: hyponymy, meronymy, lexical causation, presupposition, and more. Georgian WordNet is developed in two stages. First, a WordNet dictionary is created for the Georgian language, and in the second stage, the Georgian dictionary is linked to EuroWordNet through the Georgian-English interlingual index (ILI). 
Different levels of equivalence with ILI synsets can be used to describe language-specific synsets: EQ_SYNONYM - complete match between the ILI synset and the language synset; EQ_NEAR_SYNONYM - the synset of the target language corresponds to several ILI synsets; HAS_EQ_HYPERONYM - the target language synset is more specific than the ILI synset; HAS_EQ_HYPONYM - the language synset can be linked only to a more specific ILI synset. Georgian WordNet synsets are automatically formed using a bilingual electronic dictionary. | https://iliauni.edu.ge/uploads/other/38/38129.pdf |
XII Tbilisi Symposium of Language, Logic and Computation | Lagodekhi, Georgia | 2017 | 18-22 September | The Centre for Language, Logic and Speech at the Tbilisi State University, the Georgian Academy of Sciences and Institute for Logic, Language and Computation (ILLC) of the University of Amsterdam | Derivation Models According to Otar Tchiladze Text Corpus | oral | The paper presents the peculiarities of derivation in the novels of the Georgian writer Otar Tchiladze. Some problems of word production are shown, as well as some ways and forms of word-formation. The elements that take part in word-formation differ in semantics and activeness. Therefore, it is more convenient to consider not separate derivational elements but the models that include these elements. Productive word-formation is a topical issue today. “Word-formation” has a broad meaning in Georgian. It also covers derivation, which in turn means creating new words not only by affixation but also by composition. Studying the productive means of word-formation helps to develop the word-formation process in a language. In the process of composition, the rules and means of word-formation differ in activeness. The derivational models are based on a text corpus of Otar Tchiladze's novels. It is well known that every writer has his own style of writing. Some elements of one writer's style may coincide with those of another writer; however, the structure and the order of speech will be different. The paper presents the full model of word-formation and statistical data according to the corpus of Otar Tchiladze's novels. An analysis is also given of the words that are characteristic of this writer only. | https://archive.illc.uva.nl/Tbilisi/Tbilisi2017/uploaded_files/inlineitem/TbiLLC2017_booklet_final.pdf#page=100 |
3rd International conference "Current issues in applied Linguistics" | Baku, Azerbaijan | 2018 | 25-26 October | Azerbaijan University of Languages | Georgian-English Bidirectional Automatic Translation of Derivational Forms | oral | Derived words (derivatives) complicate translation. The problem lies in derivational forms, since the addition of word-forming affixes causes linguistic changes in words. Some affixes are synonymous, others are homonymous, and it is very important to solve this problem when building computer models. Special attention must be paid to phonetic phenomena and root changes when creating translation algorithms and in subsequent programming. To solve this problem, we first created a database of word-building affixes of the Georgian language. It combines morphemes that are native to the modern literary Georgian language or borrowed from other languages. For the automatic formation of the corresponding Georgian-English derivational forms, models of word-formation for different semantic groups were created for both languages. A database of English derivational affixes was also created. The Georgian language is known for its abundance of morphological forms. For accurate automatic translation, a morphological processor of the Georgian language was connected to the system. Thus, automatic lemmatization of Georgian words is performed, after which algorithms for derivation model recognition and automatic translation are applied. The same process, only in reverse, is used for the English derivational form of a word. In both cases, we obtain one or several lemmas (in the case of homonymy in the original form and synonymy in the final form) of the derivational form of the word. | https://muhaz.org/to-azerbaijan.html?page=3 |
2nd International Conference on Research in Social Sciences | Vienna, Austria | 2019 | 3-5 December | Webster Vienna Private University | The Translation Model Based on Sentential Primitives | oral | The main problem of language modelling is the meaning of the utterance, which is the ultimate goal of one direction of language model functioning (analysis) and the starting point for the opposite direction (synthesis). The paper suggests one possible solution to this problem: dividing sentences into a set of (quasi-)synonymous "sentential primitives" and, on the basis of these primitives, building a structure that reflects the content of the set quite transparently. Determination of the estimated set of primitives should be based on a search for constituents according to their characteristics, in particular their syntactic properties. Verbs and their dominant role in sentences are considered according to "multilevel" syntax. In this theory, a primitive is the central component (core) of a layered structure, which is "surrounded" by the periphery, i.e. traditional adverbs. The central component itself comprises the verb and its actants. At the same time, the internal relations of primitives, i.e. the dependencies of actants on predicates and of peripheries on central structures, are defined by semantic roles. Such an approach to language content conveys the meaning of the utterance through sentential primitives, where the primitives are the language units. In the presentation, we will offer a translation model simplified to the level of primitives. A Georgian-English parallel corpus will be used to test the model. | |