U+1160..U+11FF and U+D7B0..U+D7FF should have 0 width.
Korean Hangul is a writing system that uses syllable blocks built from alphabetic components (jamo). A syllable consists of one or more leading consonants, one or more vowels, and zero or more trailing consonants.
Unicode has 11172 precomposed syllable blocks at U+AC00..U+D7A3.
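The count of 11172 can be checked with the arithmetic composition rule. The sketch below is not part of the original report: the constants (19 leading consonants, 21 vowels, 27 trailing consonants plus "no trailing consonant") and the formula are taken from the Hangul syllable composition algorithm in the Unicode Standard, and `compose` is a hypothetical helper name. It covers only the modern jamo that have precomposed forms.

```python
# Constants from the Unicode Standard's Hangul composition algorithm
# (an assumption added for illustration, not stated in this issue).
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28  # T_COUNT includes "no trailing consonant"


def compose(l, v, t=None):
    """Map a modern L+V or L+V+T jamo sequence to its precomposed syllable."""
    l_index = ord(l) - L_BASE
    v_index = ord(v) - V_BASE
    t_index = ord(t) - T_BASE if t is not None else 0
    in_range = (
        0 <= l_index < L_COUNT
        and 0 <= v_index < V_COUNT
        and 0 <= t_index < T_COUNT
        and (t is None or t_index >= 1)
    )
    if not in_range:
        raise ValueError("only modern jamo have precomposed syllables")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


if __name__ == "__main__":
    print(L_COUNT * V_COUNT * T_COUNT)  # size of U+AC00..U+D7A3
    print(hex(ord(compose("\u1112", "\u1161", "\u11ab"))))
```

Old-Hangul jamo outside these index ranges have no precomposed form, which is why their display width has to come from the conjoining sequence itself.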
There are also component Jamos:
- Hangul Jamo (U+1100..U+11FF).
  - U+1100..U+115F Choseong (initial, leading consonants) have East_Asian_Width=Wide and Hangul_Syllable_Type=Leading_Jamo
  - U+1160..U+11A7 Jungseong (medial, vowels) have East_Asian_Width=Neutral and Hangul_Syllable_Type=Vowel_Jamo
  - U+11A8..U+11FF Jongseong (final, trailing consonants) have East_Asian_Width=Neutral and Hangul_Syllable_Type=Trailing_Jamo
- U+A960..U+A97F Hangul Jamo Extended-A (choseong) have East_Asian_Width=Wide
- U+D7B0..U+D7FF Hangul Jamo Extended-B (jungseong and jongseong) have East_Asian_Width=Neutral
- U+3130..U+318F Hangul Compatibility Jamo have no conjoining behavior
- U+FFA0..U+FFDF half-width forms have no conjoining behavior.
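The East_Asian_Width values listed above can be spot-checked with Python's standard `unicodedata` module. This is only a sketch: the sample code points (one per conjoining range) and the helper name `east_asian_widths` are choices made here for illustration, and the result reflects whatever Unicode version the running Python ships. `unicodedata` does not expose Hangul_Syllable_Type, so that property is not checked.

```python
import unicodedata

# One representative code point per conjoining range listed above
# (the choice of samples is illustrative, not from the original report).
SAMPLES = {
    "choseong (Hangul Jamo)": 0x1100,
    "jungseong (Hangul Jamo)": 0x1160,
    "jongseong (Hangul Jamo)": 0x11A8,
    "choseong (Extended-A)": 0xA960,
    "jungseong (Extended-B)": 0xD7B0,
    "jongseong (Extended-B)": 0xD7CB,
}


def east_asian_widths(samples=SAMPLES):
    """Return the East_Asian_Width property value for each sample code point."""
    return {
        label: unicodedata.east_asian_width(chr(cp))
        for label, cp in samples.items()
    }


if __name__ == "__main__":
    for label, eaw in east_asian_widths().items():
        print(f"U+{SAMPLES[label]:04X} {label}: {eaw}")
```

The point of the check is that East_Asian_Width alone does not give width 0 to the medial and final jamo: they are Neutral, so a width table has to special-case them.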
U+1100..U+11FF, U+A960..U+A97F, and U+D7B0..U+D7FF have conjoining behavior: a sequence of L+V+T* gets rendered as a single syllable block. wcwidth() implementations tend to give U+1100..U+115F width 2 and U+1160..U+11FF width 0, so the resulting syllable block has the correct total width.
U+D7B0..U+D7FF should also have width 0.
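A minimal sketch of the width rule being requested, to show why it yields the right total. The names `hangul_width` and `text_width` are hypothetical and this is not any real `wcwidth()` implementation: it only knows about the Hangul ranges discussed here and falls back to 1 for everything else. Width 2 for U+A960..U+A97F is an assumption based on its East_Asian_Width=Wide value.

```python
# Ranges taken from the discussion above; everything else is out of scope
# for this sketch and simply gets width 1.
ZERO_WIDTH = [(0x1160, 0x11FF), (0xD7B0, 0xD7FF)]  # conjoining V and T jamo
WIDE = [
    (0x1100, 0x115F),  # choseong
    (0xA960, 0xA97F),  # Extended-A choseong (assumed width 2, EAW=Wide)
    (0xAC00, 0xD7A3),  # precomposed syllables
]


def hangul_width(cp):
    """Column width of one code point under the proposed rule."""
    if any(lo <= cp <= hi for lo, hi in ZERO_WIDTH):
        return 0
    if any(lo <= cp <= hi for lo, hi in WIDE):
        return 2
    return 1


def text_width(text):
    """Sum of per-code-point widths, as a wcswidth()-style total."""
    return sum(hangul_width(ord(ch)) for ch in text)


if __name__ == "__main__":
    print(text_width("\u1112\u1161\u11ab"))  # L+V+T from Hangul Jamo
    print(text_width("\ud55c"))              # the precomposed equivalent
    print(text_width("\u1112\ud7b0\ud7cb"))  # L + Extended-B V and T
```

With width 0 on both medial/final ranges, each of the three strings occupies two columns, matching the single syllable block that gets rendered. If U+D7B0..U+D7FF kept width 1, the last string would be counted as four columns.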
glibc gave width 0 to conjoining jungseong and jongseong in these commits:
commit 7a79e321c6f85b204036c33d85f6b2aa794e7c76
Author: Thorsten Glaser <[email protected]>
Date: Fri Jul 14 14:02:50 2017 +0200
Refresh generated charmap data and ChangeLog
[BZ #21750]
* charmaps/UTF-8: Refresh.
diff --git a/localedata/ChangeLog b/localedata/ChangeLog
index 04ef5ad071..9e05b4a652 100644
--- a/localedata/ChangeLog
+++ b/localedata/ChangeLog
@@ -1,3 +1,17 @@
+2017-07-14 Thorsten Glaser <[email protected]>
+
+ [BZ #21750]
+ * charmaps/UTF-8: Refresh.
+ * unicode-gen/utf8_gen.py (U+00AD): Set width to 1.
+ * unicode-gen/utf8_gen.py (U+1160..U+11FF): Set width to 0.
+ * unicode-gen/utf8_gen.py (U+3248..U+324F): Set width to 2.
+ * unicode-gen/utf8_gen.py (U+4DC0..U+4DFF): Likewise.
+ * unicode-gen/utf8_gen.py: Treat category Me and Mn as combining.
+ [BZ #19852]
+ * unicode-gen/utf8_gen.py: Process EastAsianWidth lines before
+ UnicodeData lines so the latter have precedence; remove hack
+ to group output by EastAsianWidth ranges.
+
[ ... snip ...]
commit 6e540caa21616d5ec5511fafb22819204525138e
Author: Mike FABIAN <[email protected]>
Date: Tue Jun 16 08:29:40 2020 +0200
Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to UD7FB to 0 [BZ #26120]
Reviewed-by: Carlos O'Donell <[email protected]>
diff --git a/localedata/charmaps/UTF-8 b/localedata/charmaps/UTF-8
index 14c5d4fa33..8cce47cd97 100644
--- a/localedata/charmaps/UTF-8
+++ b/localedata/charmaps/UTF-8
@@ -48920,6 +48920,8 @@ WIDTH
<UABE8> 0
<UABED> 0
<UAC00>...<UD7A3> 2
+<UD7B0>...<UD7C6> 0
+<UD7CB>...<UD7FB> 0
<UF900>...<UFA6D> 2
<UFA70>...<UFAD9> 2
<UFB1E> 0
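The glibc charmap lists two sub-ranges rather than the whole U+D7B0..U+D7FF block. A quick way to see where such sub-ranges could come from is to group the assigned code points of the block into contiguous runs. The sketch below uses Python's `unicodedata` (so the result depends on the Unicode version bundled with the interpreter), and `assigned_runs` is a hypothetical helper name.

```python
import unicodedata


def assigned_runs(start, end):
    """Group assigned code points in [start, end] into contiguous (lo, hi) runs."""
    runs = []
    for cp in range(start, end + 1):
        if unicodedata.category(chr(cp)) == "Cn":  # Cn = unassigned
            continue
        if runs and runs[-1][1] == cp - 1:
            runs[-1][1] = cp  # extend the current run
        else:
            runs.append([cp, cp])  # start a new run
    return [(lo, hi) for lo, hi in runs]


if __name__ == "__main__":
    for lo, hi in assigned_runs(0xD7B0, 0xD7FF):
        print(f"<U{lo:04X}>...<U{hi:04X}> 0")
```

Whether a width table should cover only the assigned code points, as the glibc charmap does, or the whole block, as this report words it, is a choice for the implementation; the sketch only shows the assigned runs.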