| Vocabulary Broker - Proof of Concept

Unicode

prefLabel	Unicode
definition	Text in multi-byte Unicode format.
inScheme	SPASE Format SPASE Encoding
broader	Text
Abstract from DBPedia	Unicode, formally The Unicode Standard, is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, which is maintained by the Unicode Consortium, defines as of the current version (15.0) 149,186 characters covering 161 modern and historic scripts, as well as symbols, emoji (including in colors), and non-visual control and formatting codes. Unicode's success at unifying character sets has led to its widespread and predominant use in the internationalization and localization of computer software. The standard has been implemented in many recent technologies, including modern operating systems, XML, and most modern programming languages. The Unicode character repertoire is synchronized with ISO/IEC 10646, each being code-for-code identical with the other. The Unicode Standard, however, includes more than just the base code. Alongside the character encodings, the Consortium's official publication includes a wide variety of details about the scripts and how to display them: normalization rules, decomposition, collation, rendering, and bidirectional text display order for multilingual texts, and so on. The Standard also includes reference data files and visual charts to help developers and designers correctly implement the repertoire. Unicode can be stored using several different encodings, which translate the character codes into sequences of bytes. The Unicode standard defines three and several other encodings exist, all in practice variable-length encodings. The most common encodings are the ASCII-compatible UTF-8, the ASCII-incompatible UTF-16 (compatible with the obsolete UCS-2), and the Chinese Unicode encoding standard GB18030 which is not an official Unicode standard but is used in China and implements Unicode fully. Unicode（ユニコード）は、符号化文字集合や文字符号化方式などを定めた、文字コードの業界規格。文字集合（文字セット）が単一の大規模文字セットであること（「Uni」という名はそれに由来する）などが特徴である。従来、国あるいは各メーカーで独自に開発されていた文字コードには互換性がなかった。複数の文字コードを共存させる方法には文字が重複する短所があるため、微細な差異はあっても本質的に同じ文字であれば一つの番号を当てる方針で各国・各社の文字コードの統合を図ったものである。1980年代に、Starワークステーションの日本語化（J-Star）などを行ったゼロックスが提唱し、マイクロソフト、Apple、IBM、サン・マイクロシステムズ、ヒューレット・パッカード、ジャストシステムなどが参加するユニコードコンソーシアムにより作られた。国際規格のISO/IEC 10646とUnicode規格は同じ文字コード表になるように協調して策定されている。 (Source: http://dbpedia.org/resource/Unicode)