Shidinn PUA Encoding

This article was translated automatically using Google Translate (or similar) and may have been manually improved. Unless it has been reviewed by someone who speaks the page's original language, it may contain errors, including incorrect information not present in the original version of the page.


Shidinn PUA encoding is an encoding system developed by the Shidinn Standardization Committee. It covers the Hanzi used in Shidinn, the Shidinn letters, and the Infinite Shidinn letters. The Shidinn Wiki currently uses it as the internal encoding for displaying and storing Shidinn text. Its characters are assigned to code points in Unicode's Private Use Area (PUA).

It replaces the Shidinn coordinated character set that is still in full (???) and code area 0 of the new encoding (which includes the basic Shidinn letters, the Hanzi used in Shidinn, and other characters), and it will remain compatible with the original encoding system. An automatic converter will be provided for characters outside it.

Standardization Plan

What is Shown Characters Included in the PUA Code
Character Undefined
Chat Alphabet %1% %2% %3% %4% %5% %6% %7% %8% %9% %0% %T% %.% N_ iu_ ui_
Codepoint E001 E002 E003 E004 E005 E006 E007 E008 E009 E00A E00B E00C E00D E00E E00F
Character Variation Selector
Chat Alphabet aho aho aho tkY afa afa atiY atiY lhu8 5Lu uNV
Codepoint E015 E016 E017 E018 E019 E01A E01B E01C E01D E01E E01F
Character
Chat Alphabet ⇧!_ ⇧b ⇧p ⇧m ⇧w ⇧j ⇧q ⇧x ⇧y ⇧n ⇧z ⇧D ⇧s ⇧r ⇧H ⇧N
Codepoint E020 E021 E022 E023 E024 E025 E026 E027 E028 E029 E02A E02B E02C E02D E02E E02F
Character
Chat Alphabet !_ b p m w j q x y n z D s r H N
Codepoint E030 E031 E032 E033 E034 E035 E036 E037 E038 E039 E03A E03B E03C E03D E03E E03F
Character
Chat Alphabet ~!_ ~b ~p ~m ~w ~j ~q ~x ~y ~n ~z ~D ~s ~r ~H ~N
Codepoint E040 E041 E042 E043 E044 E045 E046 E047 E048 E049 E04A E04B E04C E04D E04E E04F
Character
Chat Alphabet ⇧l ⇧d ⇧t ⇧g ⇧k ⇧h ⇧4 ⇧5 ⇧v ⇧F ⇧7 ⇧B ⇧c ⇧f ⇧u ⇧a
Codepoint E050 E051 E052 E053 E054 E055 E056 E057 E058 E059 E05A E05B E05C E05D E05E E05F
Character
Chat Alphabet l d t g k h 4 5 v F 7 B c f u a
Codepoint E060 E061 E062 E063 E064 E065 E066 E067 E068 E069 E06A E06B E06C E06D E06E E06F
Character
Chat Alphabet ~l ~d ~t ~g ~k ~h ~4 ~5 ~v ~F ~7 ~B ~c ~f ~u ~a
Codepoint E070 E071 E072 E073 E074 E075 E076 E077 E078 E079 E07A E07B E07C E07D E07E E07F
Character Extended Shidinn (希扩) script letters, Infinite Shidinn letters, and more... Codepoints for these letters.
Chat Alphabet ⇧o ⇧e ⇧E ⇧A ⇧Y ⇧L ⇧6 ⇧2 ⇧T ⇧8 ⇧3 ⇧V ⇧1 ⇧i
Codepoint E080 E081 E082 E083 E084 E085 E086 E087 E088 E089 E08A E08B E08C E08D
Character
Chat Alphabet o e E A Y L 6 2 T 8 3 V 1 i
Codepoint E090 E091 E092 E093 E094 E095 E096 E097 E098 E099 E09A E09B E09C E09D
Character
Chat Alphabet ~o ~e ~E ~A ~Y ~L ~6 ~2 ~T ~8 ~3 ~V ~1 ~i
Codepoint E0A0 E0A1 E0A2 E0A3 E0A4 E0A5 E0A6 E0A7 E0A8 E0A9 E0AA E0AB E0AC E0AD

Notes:

  • The letter codes follow the formula "letter code = letter number + ((letter number >> 4) << 5) + (writing type << 4) + 0xE020". For letters not included in the table, you can calculate the code point yourself from the formula.
  • Undefined entries in the table may be assigned in the future. How to handle code points that would fall beyond the first Private Use Area has also not been decided.
  • The character ⇧ is U+21E7 (UPWARDS WHITE ARROW) in Unicode. The font(s) used on the Shidinn Wiki give it a special rendering: it looks like a small, lowered version of the Hanzi meaning "big".
    • Because it is inconvenient to enter, some people use ^ instead.
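
The letter formula in the notes can be checked against the table with a short Python sketch. The function name letter_codepoint is hypothetical, and two things are assumptions inferred from the table rows rather than stated by the Shidinn Standardization Committee: letter numbers start at 0 in the first column (E020), and writing types are numbered 0 for the ⇧ forms, 1 for the unprefixed forms, and 2 for the ~ forms.

```python
def letter_codepoint(letter_number: int, writing_type: int) -> int:
    """Apply the letter formula from the notes, with explicit parentheses."""
    # Each full block of 16 letters is followed by two more 16-slot rows
    # for the other writing types, so every completed block skips 32 slots.
    block_offset = (letter_number >> 4) << 5
    return letter_number + block_offset + (writing_type << 4) + 0xE020


# Spot checks against the table: b is letter 1, l is letter 16, i is letter 45.
print(hex(letter_codepoint(1, 0)))   # the ⇧b entry
print(hex(letter_codepoint(16, 1)))  # the l entry
print(hex(letter_codepoint(45, 2)))  # the ~i entry
```

The three calls print 0xe021, 0xe060, and 0xe0ad, which match the Codepoint rows for ⇧b, l, and ~i. The parentheses matter: in Python and C, + binds more tightly than >> and <<, so the formula evaluated exactly as originally written would give a different result.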

Input Methods

  • In the new wiki, you can use the {{X}} template to automatically convert chat alphabet to Shidinn PUA encoding.
  • You can use the conversion function provided by Matling's translator.
  • Use a general-purpose Unicode character picker such as unicodepad to enter the corresponding PUA characters.
  • Write your own input tool.
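
As a starting point for a self-written input tool, here is a minimal Python sketch that converts chat alphabet text to PUA characters using the letter order in the codepoint table. It is not the implementation of the {{X}} template or of Matling's translator. The names LETTERS, LETTER_NUMBER, PREFIX_TYPE, and chat_to_pua are hypothetical, and the writing-type numbers are inferred from the table rows. Only single-character letters with an optional ⇧, ^, or ~ prefix are handled; multi-character spellings such as !_ or the %1% series are out of scope and would not be converted correctly.

```python
# Letters 1..45 in table order (letter 0, spelled "!_", is left out here).
LETTERS = "bpmwjqxynzDsrHN" "ldtgkh45vF7Bcfua" "oeEAYL62T83V1i"
LETTER_NUMBER = {ch: n for n, ch in enumerate(LETTERS, start=1)}

# Writing-type numbers inferred from the table rows (an assumption):
# 0 = prefixed with U+21E7 (or "^"), 1 = no prefix, 2 = prefixed with "~".
PREFIX_TYPE = {"\u21e7": 0, "^": 0, "~": 2}


def chat_to_pua(text: str) -> str:
    """Convert single-letter chat alphabet spellings to PUA characters."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        writing_type = 1
        # A prefix only counts when a known letter follows it.
        if ch in PREFIX_TYPE and i + 1 < len(text) and text[i + 1] in LETTER_NUMBER:
            writing_type = PREFIX_TYPE[ch]
            i += 1
            ch = text[i]
        if ch in LETTER_NUMBER:
            n = LETTER_NUMBER[ch]
            out.append(chr(n + ((n >> 4) << 5) + (writing_type << 4) + 0xE020))
        else:
            out.append(ch)  # pass everything else through unchanged
        i += 1
    return "".join(out)
```

For example, chat_to_pua("^l~i") returns the two characters U+E050 and U+E0AD, while spaces and punctuation pass through unchanged.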

Supported Fonts

Currently, there are many Shidinn fonts based on Shidinn PUA encoding.

Problems

  • Some community members are more accustomed to entering Shidinn text using fonts based on chat alphabet.
  • Loading is prone to problems, and some software (based on ANSI encoding) cannot process PUA characters.
  • An automated input method is temporarily missing. (Pengpeng Miao has developed such an input method.)
  • Few fonts fully support the PUA encoding.
  • Reading the text requires a dedicated rendering tool and font. In general-purpose software such as Discord, or in system interfaces (probably depending on whether the operating system has a compatible font installed), tofu blocks are displayed and the text cannot be read.
  • Conflicts occur when the encoding is used alongside other fonts or software that also use the 0xE000-0xE19F range of Private Use Area characters.
  • Huáng Quèfēi is not very used to this solution, and installing the corresponding font support is not easy.
