- Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche
- Speakers of different languages remember visual scenes differently
- Syntactic Structures (1957): monograph by Noam Chomsky, and his first book on linguistics
- I used and cited this in my extended essay (submitted for finals in 2015)
- History of the International Phonetic Alphabet - Wikipedia
- Word-prosodic typology
- What is chuchotage? - French term for a type of interpreting, specifically an interpreting technique that involves whispering the translation into the listener's ear. Indeed, the term comes from the French word "chuchoter", meaning "to whisper".
- Purple prose - Wikipedia (mentioned as a criticism of modern-day large language models in "AI-Slop to AI-Polish Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation")
- Generative grammar - Wikipedia
Resources
- Subword Modeling course (link) by David R. Mortensen
Useful resources signposted in the supplementary materials of Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche
Table S1. The 17 languages used in this study. For each language we give its common name, its ISO 639-3 (https://iso639-3.sil.org), Glottolog (https://glottolog.org) and WALS (https://wals.info) codes, its language family and subfamily (as given by WALS), and its number of distinct syllables and syllable structure (classification and numeric code) as given by LAPSyD (http://www.lapsyd.ddl.cnrs.fr/lapsyd). The table cells are hyperlinks to the language entries in PHOIBLE (column Language names; https://phoible.org), the Ethnologue (column ISO 639-3 code; https://www.ethnologue.com), Glottolog (column Glottolog code; https://glottolog.org), WALS (column WALS code; https://wals.info) and LAPSyD (column Syllable structure; http://www.lapsyd.ddl.cnrs.fr/lapsyd); these links give quick access to a wealth of information concerning these languages including their genealogical classification, geographic location and structural properties ranging from phonetics to semantics. We use the ISO 639-3 codes throughout the analysis and results, and this table is sorted alphabetically by these codes. The numeric code for the complexity of syllable structure is based on (53) and is computed from the 3-way classification of language's maximal syllable structure in WALS: either (i) simple [i.e., (C)V], (ii) moderately complex [i.e., (C)(C)V(C)], or (iii) complex [i.e., (C)(C)(C)V(C)(C)(C)(C)]. The numeric code is the sum of the maximum number of consonants in onset and coda, and the number of vowels per syllable (for example, the maximal syllable structure of English is (C)(C)(C)V(C)(C)(C)(C), resulting in 3 + 4 + 1 = 8). For more details on methodology please see (45). Notes: * The data for Cantonese is currently not accessible through the online interface to LAPSyD. § Currently Serbian (or the closely related Croatian [used for PHOIBLE]) is not included in the LAPSyD database, but its complexity was estimated by YO following the same method as for the rest of the LAPSyD.
Source: Supplementary Materials - Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche
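The numeric code described in the Table S1 caption (maximum onset consonants + maximum coda consonants + vowels per syllable) can be sketched in a few lines of Python. This is only an illustration of the arithmetic as the caption states it, not the authors' code: the function name `syllable_complexity` and the choice to parse a template string such as `(C)(C)V(C)` are my own assumptions.

```python
import re


def syllable_complexity(template: str) -> int:
    """Return onset consonants + coda consonants + vowels for a maximal
    syllable template such as '(C)(C)V(C)'.

    Hypothetical helper: the name and the string-based input are assumptions
    made for illustration; only the sum itself comes from the caption.
    """
    # Drop whitespace and parentheses so optional and obligatory slots
    # are counted alike (the template describes the *maximal* syllable).
    slots = re.sub(r"[\s()]", "", template)

    # Only accept consonants, then one or more vowels, then consonants.
    if not re.fullmatch(r"C*V+C*", slots):
        raise ValueError(f"unsupported syllable template: {template!r}")

    first_v = slots.index("V")
    last_v = slots.rindex("V")

    onset = first_v                   # consonants before the vowel nucleus
    coda = len(slots) - last_v - 1    # consonants after the vowel nucleus
    vowels = last_v - first_v + 1     # size of the vowel nucleus

    return onset + coda + vowels


# English example from the caption: 3 + 4 + 1 = 8
print(syllable_complexity("(C)(C)(C)V(C)(C)(C)(C)"))
```

Applying the same sum to the WALS "simple" template `(C)V` gives 2 and to the "moderately complex" template `(C)(C)V(C)` gives 4; those two values follow from the stated formula and are not quoted from the supplementary materials.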
Concepts
- anaphora
- selection
- Morphological typology - Wikipedia
- Pitch-accent language - Wikipedia
- Grammelot - Wikipedia
- nice example (some actual English creeps in)
- Transcreation - Wikipedia