Standardization of Dialectal Data and Information: Necessity and Solution (article)

Original title (Persian): استانداردسازی داده‌ها و اطلاعات گویشی: ضرورت و راهکار

Journal of Comparative Linguistic Research, Autumn and Winter 1402 (2023-2024), No. 26. Rank: B/ISC (19 pages, pp. 1-19)

Keywords: standardization; dialectology; computational dialectology; markup language; data organization; eXtensible Markup Language; data; information

Abstract:

Dialectology is the scientific study of dialects and their geographical distribution. Each dialect is a language in its own right, so studying a dialect requires several kinds of linguistic analysis, which makes the work lengthy. Collecting dialectal data is time-consuming and requires considerable effort. Raw data is of limited use in dialectology; linguistic analyses must be added to it within the framework of structural linguistic analysis. Using a computer as a research tool requires the data to be prepared in a specific structure. The main contribution of this paper is a proposed standard for organizing dialectal data and information. The standard covers the dialectal data, its metadata, and the linguistic information that results from analyzing the data. The metadata and linguistic information are organized in an XML tree structure. This data structure is highly portable and can easily be read into a database.
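
To make the last two sentences of the abstract concrete, the following is a minimal sketch of how one dialect record, held as an XML tree, might be read into a database. The element and attribute names (`record`, `metadata`, `dialect`, `token`, `form`, `pos`), the placeholder values, and the table layout are all illustrative assumptions; the abstract does not give the actual schema of the proposed standard.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical record layout: every element and attribute name here is an
# illustrative assumption, not the schema proposed in the paper.
SAMPLE = """\
<record id="r1">
  <metadata>
    <dialect>example-dialect</dialect>
    <location>example-village</location>
    <speaker age="60" gender="f"/>
  </metadata>
  <utterance>
    <token form="tok1" pos="NOUN"/>
    <token form="tok2" pos="VERB"/>
  </utterance>
</record>
"""


def load_record(xml_text, conn):
    """Parse one XML record and store its tokens as rows in SQLite."""
    root = ET.fromstring(xml_text)
    record_id = root.get("id")
    dialect = root.findtext("metadata/dialect")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tokens "
        "(record_id TEXT, dialect TEXT, position INTEGER, form TEXT, pos TEXT)"
    )
    # One row per token, carrying the record-level metadata alongside it.
    rows = [
        (record_id, dialect, i, tok.get("form"), tok.get("pos"))
        for i, tok in enumerate(root.iter("token"), start=1)
    ]
    conn.executemany("INSERT INTO tokens VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
inserted = load_record(SAMPLE, conn)
result = conn.execute(
    "SELECT position, form, pos FROM tokens ORDER BY position"
).fetchall()
print(inserted, result)
```

Because the metadata sits in the same tree as the tokens, a single pass over the record is enough to produce flat rows, which is one way to read the abstract's claim about portability.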

Machine-generated summary:

Journal of Comparative Linguistic Research, Thirteenth Year, Number 26, Autumn and Winter 2023-2024. Standardization of Dialectal Data and Information: Necessity and Solution. Masoud Ghayoomi. Research Article.

Dialectology deals with the scientific study of a dialect and its geographical distribution. This information is organized using a tree data structure and an extensible markup language. The invention of the computer led to a distinction between the concepts of data, information, and knowledge. In Natural Language Processing, input data must be structured according to a specific framework; in this regard, the 'Lexical Markup Framework', introduced by the International Organization for Standardization for the management of linguistic resources, has been accepted. He introduced a method of organizing data that uses eXtensible Markup Language and the structure of the 'Conference on Computational Natural Language Learning' (CoNLL) (Buchholz and Marsi, 2006). The data structure used in this series of workshops is based on eXtensible Markup Language, such that the meaning of the target word is defined as an attribute and a value in each tree node. This article presents a specific structure, in the form of a standard, for organizing the metadata and linguistic analyses of speech data at various levels, including phonology, morphology, syntax, semantics, and discourse analysis. The various types of linguistic information are defined with an attribute-value structure in each node, which will be explained below.
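
The attribute-value idea described above can be sketched as follows: each token node carries one attribute per level of analysis, and the tree can then be flattened into tab-separated columns in the general style of CoNLL data, with `_` marking a missing value. The attribute names in `LEVELS`, the helper functions, and the column order are hypothetical choices made for illustration; they are neither the paper's standard nor the full CoNLL-X column set.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute names, one per level of analysis.
# None of these names are taken from the paper.
LEVELS = ("phon", "lemma", "pos", "head", "deprel", "sense")


def make_token(parent, idx, form, **annotations):
    """Add a token node whose annotations are stored as attribute-value pairs."""
    tok = ET.SubElement(parent, "token", id=str(idx), form=form)
    for key, value in annotations.items():
        if key not in LEVELS:
            raise ValueError(f"unknown annotation level: {key}")
        tok.set(key, str(value))
    return tok


def to_columns(sentence):
    """Flatten the tree into tab-separated rows; '_' marks a missing value."""
    lines = []
    for tok in sentence.iter("token"):
        cells = [tok.get("id"), tok.get("form")]
        cells += [tok.get(level, "_") for level in LEVELS]
        lines.append("\t".join(cells))
    return "\n".join(lines)


sentence = ET.Element("sentence")
make_token(sentence, 1, "tok1", pos="NOUN", head=2, deprel="subj")
make_token(sentence, 2, "tok2", pos="VERB", head=0, deprel="root", sense="s1")
table = to_columns(sentence)
print(table)
```

Keeping every annotation as an attribute on the node means a new level of analysis can be added without changing the tree shape, at the cost of restricting each value to a flat string.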
