A representation of words derived from Latin (image: files.catbox.moe)
Posted by fxomt's on dbzer0@lemm.ee to Data is Beautiful@mander.xyz · 1 month ago (edited) · 46 comments
Cross-posted from Latin@lemm.ee: "lingua latina pater linguarum dimidum est" (roughly, "the Latin language is the father of half of all languages") 😎
I hope it's okay for me to crosspost here.
Hackworth@lemmy.world · 1 month ago
I wonder if something like the semantic tokenization method would benefit from using etymological data like this, particularly for a multilingual LLM.
gandalf_der_12te@discuss.tchncs.de · 1 month ago (edited)
I know that my NN internally uses a semantic tokenization method. I literally often seek out the word roots when talking to somebody. It helps me focus.
fxomt's on dbzer0@lemm.ee (OP) · 1 month ago
Interesting paper, thanks for sharing