Swift Notes (1): Extended Grapheme Clusters

Preface

CSDN address: http://blog.csdn.net/game3108/article/details/52957669
I have recently been reading Apple's official Swift documentation, The Swift Programming Language, and am recording some notes here.

Extended Grapheme Clusters

Swift uses extended grapheme clusters as the representation of Character. The documentation says:

Every instance of Swift’s Character type represents a single extended grapheme cluster. An extended grapheme cluster is a sequence of one or more Unicode scalars that (when combined) produce a single human-readable character.
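
As a minimal sketch of that definition (the variable names are illustrative and not taken from the documentation), the letter é can be spelled either as one scalar or as a base letter plus a combining mark. Each spelling is accepted as a single Character:

```swift
// Two spellings of "é": one precomposed scalar, or "e" plus COMBINING ACUTE ACCENT.
let eAcute: Character = "\u{E9}"
let combinedEAcute: Character = "\u{65}\u{301}"

// Each value is one Character, but the number of Unicode scalars differs.
let singleScalarCount = String(eAcute).unicodeScalars.count
let combinedScalarCount = String(combinedEAcute).unicodeScalars.count
print(singleScalarCount, combinedScalarCount)   // prints "1 2"
```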

For the precise definition, see the Unicode standard document Grapheme Cluster Boundaries.

The Unicode standard provides an algorithm for determining grapheme cluster boundaries, in two variants: legacy grapheme clusters and extended grapheme clusters.

A legacy grapheme cluster is defined as a base (such as A or カ) followed by zero or more continuing characters. One way to think of this is as a sequence of characters that form a “stack”.

An extended grapheme cluster is the same as a legacy grapheme cluster, with the addition of some other characters. The continuing characters are extended to include all spacing combining marks, such as the spacing (but dependent) vowel signs in Indic scripts.

The detailed boundary rules can be found in that document.
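
To make the "spacing combining marks" case concrete, here is a small sketch that assumes a Swift 4 or later toolchain, where String.count counts Characters. The sample is the Tamil syllable "ni", a base consonant followed by a spacing vowel sign. Because Swift segments text by extended grapheme clusters, the pair should be reported as a single Character. The variable names are illustrative.

```swift
// "ந" (U+0BA8 TAMIL LETTER NA) followed by "ி" (U+0BBF TAMIL VOWEL SIGN I),
// a spacing combining mark.
let tamilNi = "\u{0BA8}\u{0BBF}"

let niScalarCount = tamilNi.unicodeScalars.count   // number of Unicode scalars
let niCharacterCount = tamilNi.count               // number of Characters (Swift 4+)
print(niScalarCount, niCharacterCount)
```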

Example

Take an example from Apple's documentation:

let precomposed: Character = "\u{D55C}"                  // 한
let decomposed: Character = "\u{1112}\u{1161}\u{11AB}"   // ᄒ, ᅡ, ᆫ
// precomposed is 한, decomposed is 한

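Hangul syllables can be decomposed into jamo and composed again. Each of the two values above is a single Character, and Swift treats them as equal.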
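
The equality can be checked directly. The sketch below reuses the two spellings from Apple's example under illustrative names and also shows that the underlying scalar sequences differ in length:

```swift
let precomposedHan: Character = "\u{D55C}"
let decomposedHan: Character = "\u{1112}\u{1161}\u{11AB}"

// Character comparison uses canonical equivalence, so the two spellings match.
let spellingsAreEqual = (precomposedHan == decomposedHan)

// The scalar sequences behind them are still different.
let precomposedScalars = String(precomposedHan).unicodeScalars.count
let decomposedScalars = String(decomposedHan).unicodeScalars.count
print(spellingsAreEqual, precomposedScalars, decomposedScalars)   // prints "true 1 3"
```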

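Because of this representation, getting the number of characters in a String requires "".characters.count: first obtain the Character view, then count its elements. (Since Swift 4, String is itself a collection of Character, so "".count works directly.)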

Swift's String is modeled on 21-bit Unicode scalar values (comparable to UTF-32), whereas NSString in Objective-C is based on UTF-16 code units.
So the same string can report a different length once it is converted to NSString:

import Foundation

let str = "Hello 😀" // the last character is an emoji outside the BMP
str.characters.count // returns 7 (Swift 4+: str.count)
(str as NSString).length // returns 8, counted in UTF-16 code units

What you see is no longer what you get, so when converting between a Swift String and an NSString, pay attention to the Unicode encoding and to how length is counted.
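
The different lengths can also be seen without bridging to NSString, since String exposes one view per encoding. This is a sketch that assumes Swift 4 or later. The emoji U+1F600 is chosen only for illustration, and the variable names are not from the original text. NSString's length corresponds to the UTF-16 code unit count.

```swift
let greeting = "Hello \u{1F600}"   // "Hello " plus one emoji outside the BMP

let characterCount = greeting.count               // extended grapheme clusters
let scalarCount = greeting.unicodeScalars.count   // Unicode scalars
let utf16Count = greeting.utf16.count             // UTF-16 code units, as NSString counts
let utf8Count = greeting.utf8.count               // UTF-8 code units
print(characterCount, scalarCount, utf16Count, utf8Count)   // prints "7 7 8 10"
```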

References

1. The Swift Programming Language
2. Why is Swift counting this Grapheme Cluster as two characters instead of one?
3. Grapheme Cluster Boundaries
