Wikidata talk:Lexicographical data

Lexicographical data
Place used to discuss any and all aspects of lexicographical data: the project itself, policy and proposals, individual lexicographical items, technical issues, etc.
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2024/08.


snowclones in scope?


A snowclone (Q2338287) is a phrasal template (source: https://snowclones.org/about) like "old X never die, they just Y". Using X, Y, Z, A, etc. seems to be the consensus way to notate the 'blanks' of the template, but it strikes me as rather ad hoc for a semantic database that might be able to represent that concept of replaceability with better fidelity. Arlo Barnes (talk) 06:28, 17 May 2024 (UTC)

@Arlo Barnes: good point, I'm not exactly sure how to deal with it, especially as lemmas can contain X as a letter in itself and not as a placeholder (as in "X marks the spot" or "fragile X syndrome"). One way could be to explicitly put a placeholder in combines lexemes (P5238). PS: this is not only about snowclones; many phrases are in a similar situation. Cheers, VIGNERON (talk) 14:19, 15 June 2024 (UTC)
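To make the placeholder-versus-letter distinction concrete, here is a minimal Python sketch. It is not how Wikidata models lexemes: the `Slot` type and the `fill` helper are hypothetical names introduced only for illustration. The point is that a blank becomes a distinct object, so a literal "X" in a lemma can never be mistaken for one.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Slot:
    """An explicit blank in a phrasal template (hypothetical type)."""
    name: str


# A template part is either literal text or an explicit slot.
Part = Union[str, Slot]


def fill(template: list[Part], values: dict[str, str]) -> str:
    """Replace every Slot with its value; literal strings pass through unchanged."""
    return "".join(values[p.name] if isinstance(p, Slot) else p for p in template)


# "old X never die, they just Y": X and Y are real slots here.
snowclone = ["old ", Slot("X"), " never die, they just ", Slot("Y")]

# "X marks the spot": X is a literal letter, so it is plain text.
literal = ["X marks the spot"]

print(fill(snowclone, {"X": "soldiers", "Y": "fade away"}))
print(fill(literal, {"X": "soldiers"}))
```

In the second call the value supplied for "X" is ignored, because the template contains no slot of that name, only the letter.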

Lexicodays, online event dedicated to Lexicographical Data, on June 28-30, 2024


- Indonesian version below -

Hello all,

Have you ever wondered how Wikidata stores and models words? How to create and improve Lexemes in your languages? Or even why it is useful and which projects could benefit from it?

The Lexicodays 2024 will answer these questions, and many more. During this online event, you will be able to learn more about Lexicographical Data on Wikidata, to discover how to model words in your languages, and to try out various tools that make it easier to work on Lexemes. It offers a space for editors involved in creating and maintaining Lexemes to discuss their ideas, challenges and best practices.

The online event will take place on June 28, 29 and 30, with sessions replicated in different languages and at different times across time zones. It is co-organized by Wikimedia Deutschland and the Software Collaboration Team in Indonesia, and we will focus on the languages of Indonesia and the Wikidata community in Indonesia. The event is open to everyone regardless of their knowledge of Lexemes. Most sessions will be recorded and published after the event.

On the main event page, you can discover the structure of the program, which will keep evolving in the upcoming weeks. We are also welcoming proposals for the program until June 20th - we are particularly interested in introductions to Lexicographical Data in different languages, and discussions run by community members on how to improve modelling and documentation in a specific language.

We will launch registration for the event in the upcoming days - if you’re interested, stay tuned by following the talk page or joining the Lexicographical Data Telegram group.

If you have any questions, feel free to write on the talk page of the event. See you soon, Léa (Lea Lacroix (WMDE)) and Raisha (Fexpr).

---

Hello, friends!

Have you ever wondered how Wikidata stores and models words? How to create and improve Lexemes in the languages you speak? Why Lexemes are useful? Which projects will benefit from them?

Lexicodays 2024 will answer these questions, and many more. During this online event, you will be able to learn more about Lexicographical Data on Wikidata, discover how to model words in your language, and try out various tools that make editing Lexemes easier. The event offers a space for editors involved in creating and maintaining Lexemes to discuss ideas, challenges and best practices with each other.

The online event will take place on June 28, 29 and 30, at times spread across several time zones and with similar sessions delivered in different languages. It is co-organized by Wikimedia Deutschland and the Software Collaboration Team in Indonesia. The event focuses on the languages spoken in Indonesia and on the Wikidata community in Indonesia. It is open to everyone, regardless of how familiar you are with Lexemes. We will record most sessions and publish them after the event.

You can find the schedule on the main event page, which we will keep updating over the coming weeks. We are also holding an open call for program proposals until June 20. We are particularly interested in introductions to Lexicographical Data in different languages, and discussions run by community members on how to improve modelling and documentation in a specific language.

We will open registration for the event in the coming days. If you are interested, please keep an eye on this talk page or join the Lexicographical Data Telegram group.

If you have any questions, feel free to write on the talk page of the Lexicodays 2024 event. See you soon, Léa (Lea Lacroix (WMDE)) and Raisha (Fexpr). Lea Lacroix (WMDE) (talk) 09:00, 3 June 2024 (UTC)

Hello all,
As a reminder, the Lexicodays 2024, online event dedicated to Lexicographical Data on Wikidata, will take place on June 28, 29 and 30, with sessions replicated in different languages and at different times across time zones.
The event will take place both on Zoom and Jitsi, and the access will be free without registration (the access links will be added to the program page). However, if you’re planning to join, we invite you to add your username to the Participants page.
We also remind you that you can contribute to the program until June 20th by adding a proposal to the talk page. You’ll find more information here.
We are particularly interested in introductions to Lexicographical Data in different languages, and discussions run by community members on how to improve modelling and documentation in a specific language. You can also present tools or Lexeme use cases.
If you have any questions, feel free to reach out to Léa (Lea Lacroix (WMDE)) or Raisha (Raisha (WSC)).
We’re looking forward to seeing you at the Lexicodays! Lea Lacroix (WMDE) (talk) 10:28, 18 June 2024 (UTC)

Hello all,
The Lexicodays 2024 will take place this week, on June 28, 29 and 30!
The event will take place both on Zoom and Jitsi, and the access will be free without registration (the access links will be added to the program page). However, if you’re planning to join, we invite you to add your username to the Participants page. The event will include sessions replicated in different languages and at different times across time zones.
Here are a few interesting sessions that you will find in the program:
  • Introduction to Lexicographical data and how to model words in Wikidata
  • Discussions about modelling proverbs, sayings, compound words and predicates
  • Presentation of some useful tools
  • Modelling sessions and editathons in various languages of Indonesia
  • Introduction to Abstract Wikipedia and how it will work together with Lexemes
  • Exploring how to generate sentences with Lexemes
Note that most sessions will be recorded and available after the event.
If you have any questions, feel free to reach out to Léa (Lea Lacroix (WMDE)) or Raisha (Raisha (WSC)).
We’re looking forward to seeing you at the Lexicodays! Lea Lacroix (WMDE) (talk) 10:20, 24 June 2024 (UTC)

Moving a statement for irregular verbs?


Hi,

I noticed that around 1600 lexemes for verbs use instance of (P31): irregular verb (Q70235) (https://w.wiki/AX5f), where I would rather have used the more specific conjugation class (P5186): irregular verb (Q70235). What do you think, should we move them or not? And if so, does someone have a bot to move them?

Cheers, VIGNERON (talk) 12:24, 29 June 2024 (UTC)
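For readers who want to reproduce the count, a query along these lines might work. This is a sketch, not the query behind the short link above (which is not shown here): the helper names `lexemes_with_statement` and `query_url` are made up for illustration, and the code only builds the SPARQL text and a request URL for the Wikidata Query Service without sending anything.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def lexemes_with_statement(prop: str, value: str, limit: int = 10) -> str:
    """Build a SPARQL query for lexemes carrying the statement prop -> value."""
    return (
        "SELECT ?lexeme ?lemma WHERE {\n"
        "  ?lexeme a ontolex:LexicalEntry ;\n"
        "          wikibase:lemma ?lemma ;\n"
        f"          wdt:{prop} wd:{value} .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def query_url(query: str) -> str:
    """Return a GET URL for the query; nothing is sent by this function."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


# Lexemes that currently use instance of (P31) -> irregular verb (Q70235).
query = lexemes_with_statement("P31", "Q70235")
print(query)
print(query_url(query))
```

Swapping `"P31"` for `"P5186"` gives the same query for the conjugation class property, which makes it easy to compare how often each modelling is used.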

irregular verb (Q70235) doesn't look like specific conjugation class (Q53996674). --Infovarius (talk) 20:05, 30 June 2024 (UTC)
I don't think conjugation class (P5186) would be right. An irregular verb is a verb whose conjugation behaves irregularly in some way:
- The conjugation class is a property of a verb, so I don't think its values should be subclasses of verb (Q24905) (like how the gender of a masculine noun is "masculine", not "masculine noun").
- "irregular" isn't a specific conjugation class. The verb might still have a conjugation class but be irregular because it has one or more irregular forms. It might be irregular because it follows one conjugation class for some forms and another for the rest.
- Nikki (talk) 06:27, 1 July 2024 (UTC)
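The distinction drawn above can be sketched as a small data model. This is only an illustration under assumptions: `VerbLexeme` and its fields are hypothetical names, and the example values are placeholders rather than real lexemes. The conjugation class is stored as one attribute, while irregularity is derived from whether any individual forms deviate.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VerbLexeme:
    """Toy model: the conjugation class and irregularity are kept separate."""
    lemma: str
    # A class label such as "class-1"; "irregular" is deliberately not a value.
    conjugation_class: Optional[str] = None
    # Names of the forms that do not follow the class pattern.
    irregular_forms: set[str] = field(default_factory=set)

    @property
    def is_irregular(self) -> bool:
        """A verb counts as irregular when at least one form deviates."""
        return bool(self.irregular_forms)


# Placeholder examples, not real lexemes.
regular = VerbLexeme("verb-a", conjugation_class="class-1")
mixed = VerbLexeme("verb-b", conjugation_class="class-1",
                   irregular_forms={"past participle"})

print(regular.is_irregular, regular.conjugation_class)
print(mixed.is_irregular, mixed.conjugation_class)
```

Here `verb-b` keeps its conjugation class and is still flagged as irregular, which is the case that a single "irregular verb" value for conjugation class (P5186) could not express.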
@Infovarius, Nikki: indeed, I'm not entirely convinced the current situation is right (especially as it's quite inconsistent), but my proposal is clearly not right either. I'll leave it (at least for now, until I have a better idea). Cheers, VIGNERON (talk) 15:37, 8 July 2024 (UTC)

Should pa'al (Q7265893) and Fa3aL (Q114419665) (etc.) be linked in some way? My knowledge of Hebrew is very rudimentary and I haven't looked into the details, but these kinds of pairs seem to be related. Disclosure: I have created the items like Fa3aL (Q114419665) and started to use them for Arabic varieties in statements like كَتَب (L1331764-F1): uses (P2283): Fa3aL (Q114419665). --Marsupium (talk) 21:54, 1 July 2024 (UTC)

Proposal to bridge the gap between items and lexicographical data


The idea is to link Item labels, aliases and monolingual text with a corresponding lexeme. You can read more about it and post your comments on the phabricator ticket. 5628785a (talk) 21:44, 3 July 2024 (UTC)

Interesting and good idea. This bridge is not easy to cross (for many reasons, including but not limited to homographs), but indeed more tools would make it easier. Cheers, VIGNERON (talk) 08:43, 5 July 2024 (UTC)
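The homograph problem mentioned above can be illustrated with a short Python sketch. It makes no claim about how the proposed feature would work: `build_index` and `candidates` are hypothetical helpers, and the lexeme IDs are placeholders, not real Wikidata identifiers. A label is looked up by language and spelling, and a homograph simply yields more than one candidate.

```python
from collections import defaultdict


def build_index(lexemes):
    """Map (language, lemma) to every lexeme ID sharing that lemma."""
    index = defaultdict(list)
    for lexeme_id, language, lemma in lexemes:
        index[(language, lemma)].append(lexeme_id)
    return index


def candidates(index, language, label):
    """Return all lexemes whose lemma matches the label exactly."""
    return index.get((language, label), [])


# Placeholder records: (lexeme ID, language code, lemma).
lexemes = [
    ("L-noun", "en", "bank"),
    ("L-verb", "en", "bank"),
    ("L-other", "en", "river"),
]
index = build_index(lexemes)

print(candidates(index, "en", "bank"))   # two candidates, so the match is ambiguous
print(candidates(index, "en", "river"))  # a single candidate
print(candidates(index, "en", "sky"))    # no candidate at all
```

Any tool built on such a lookup would need an extra step, human or automatic, to pick among several candidates, which is why exact string matching alone cannot cross the bridge.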

Oxford dictionaries


Hi y'all,

Soufiyouns made 6 property proposals for Oxford dictionaries. Right now, the proposals are strangely going in various directions, and I thought it might be useful to centralize the discussion here.

First, the proposals were oddly placed on Wikidata:Property_proposal/Authority_control instead of Wikidata:Property proposal/Lexemes (I fixed that, but some people may have missed some of the proposals, as these pages already see low participation in normal times).

Mahir256 had a very interesting question on the 3 English dictionaries: « That's enough with the English dictionaries, don't you think? » I'm wondering, how many is "enough"? I don't think it's just a question of number, but also of content. Sadly, the examples given are about very common words and the motivation is short, so it's indeed hard to see what the value of these identifiers is. For example, are there words that can only be found in these dictionaries, which would make them unique? Also, are they really a "Highly authoritative source"? (Just because they are published by the prestigious Oxford doesn't make them magically great; I had a quick look at the freely accessible part of https://www.oxfordreference.com/display/10.1093/acref/9780191739545.001.0001/acref-9780191739545 and I may have missed something, but it's not really impressive.)

Then, for me, the main problem is that these websites are not fully freely accessible. If the value were clear, maybe we could overlook this, but the two points together make me wonder... The main issue is with homographs: without seeing the content, how can I know which lexemes https://www.oxfordreference.com/display/10.1093/acref/9780191739545.001.0001/b-fr-en-00003-0000001 and https://www.oxfordreference.com/display/10.1093/acref/9780191739545.001.0001/b-fr-en-00003-0000002 (both "a") correspond to?

What do you all think?

Cheers, VIGNERON (talk) 15:55, 8 July 2024 (UTC)

@VIGNERON Note that while ordinarily I would not encourage the addition of entirely paywalled reference properties, Oxford Reference is freely accessible to Wikimedia users through the Wikipedia Library. I am not opposed to the addition of any of these properties, given that bilingual dictionary properties ultimately reduce the amount of time required for non-English-speaking contributors to add sourced information to lexemes. However, given the number of existing properties for English, German, and French lexemes in particular, those proposals are not a priority from my point of view. عُثمان (talk) 16:33, 8 July 2024 (UTC)


Hi @VIGNERON, thanks for opening the discussion. I agree with the various points made.
I supported the creation of the English-Italian dictionary property. The Italian language still needs references and identifiers to support it, especially for contemporary Italian.
Moreover, when you search for a word in Google, the 'dictionary box' at the top of the page presents results from Oxford Languages.
Regarding all the other properties of Oxford dictionaries, I too am not convinced by the quantity. It is easy for them to remain dormant properties.
Therefore, I will not vote for other properties that do not interest me.
Thanks, Luca.favorido (talk) 05:20, 9 July 2024 (UTC)