Revisions
-
luikore revised this gist
Jul 18, 2009. 1 changed file with 11 additions and 11 deletions.
@@ -16,18 +16,18 @@ class String
  # 27k chars
  CHINESE_GB_2000 = /^(?:
    [\xB0-\xF7][\xA1-\xFE]
    |[\x81-\xA0][\x40-\xFE]
    |[\xAA-\xFE][\x40-\xA0]
    |[\x81-\x82][\x30-\x39][\x81-\xFE][\x30-\x39]
  )+$/xn

  # 70k chars (including minorities chars)
  CHINESE_GB_2005 = /^(?:
    [\xB0-\xF7][\xA1-\xFE]
    |[\x81-\xA0][\x40-\xFE]
    |[\xAA-\xFE][\x40-\xA0]
    |[\x81-\x82][\x30-\x39][\x81-\xFE][\x30-\x39]
    |[\x95-\x98][\x30-\x39][\x81-\xFE][\x30-\x39]
  )+$/xn
end
-
luikore revised this gist
Jul 18, 2009. 1 changed file with 1 addition and 1 deletion.
@@ -15,7 +15,7 @@ class String
  )+$/xn

  # 27k chars
  CHINESE_GB_2000 = /^(?:
    [\xB0-\xF7][\xA1-\xFE]
    |[\x81-\xA0][\x40-\xFE]
    |[\xAA-\xFE][\x40-\xA0]
-
luikore created this gist
Jul 18, 2009.
@@ -0,0 +1,33 @@
# regexps to check if a string is pure chinese
class String
  # 20k chars
  CHINESE_UCS2 = /^(?:
    [\x4e-\x9e][\x00-\xff]
    |\x9f[\x00-\xa5]
  )+$/xn

  # 20k chars
  CHINESE_UTF8 = /^(?:
    \xe4[\xb8-\xbf][\x80-\xbf]
    |[\xe5-\xe8][\x80-\xbf][\x80-\xbf]
    |\xe9[\x80-\xbd][\x80-\xbf]
    |\xe9\xbe[\x80-\xa5]
  )+$/xn

  # 27k chars
  CHINESE_GB_2005 = /^(?:
    [\xB0-\xF7][\xA1-\xFE]
    |[\x81-\xA0][\x40-\xFE]
    |[\xAA-\xFE][\x40-\xA0]
    |[\x81-\x82][\x30-\x39][\x81-\xFE][\x30-\x39]
  )+$/xn

  # 70k chars (including minorities chars)
  CHINESE_GB_2005 = /^(?:
    [\xB0-\xF7][\xA1-\xFE]
    |[\x81-\xA0][\x40-\xFE]
    |[\xAA-\xFE][\x40-\xA0]
    |[\x81-\x82][\x30-\x39][\x81-\xFE][\x30-\x39]
    |[\x95-\x98][\x30-\x39][\x81-\xFE][\x30-\x39]
  )+$/xn
end
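The CHINESE_UTF8 pattern from the initial revision could be exercised roughly as follows. This is a minimal sketch, not part of the gist: the constant name PURE_CHINESE_UTF8 and the helper pure_chinese_utf8? are hypothetical, and it assumes Ruby 2.4 or later (for Regexp#match? and String#b). It also swaps \A and \z in for the gist's ^ and $, because in Ruby ^ and $ anchor at line boundaries rather than at the ends of the string.

```ruby
# Hypothetical names; not part of the original gist.
# Byte ranges are copied from CHINESE_UTF8 (UTF-8 encodings of U+4E00..U+9FA5).
# \A and \z replace ^ and $ so that a multi-line string cannot pass
# just because one of its lines is pure Chinese.
PURE_CHINESE_UTF8 = /\A(?:
    \xe4[\xb8-\xbf][\x80-\xbf]
   |[\xe5-\xe8][\x80-\xbf][\x80-\xbf]
   |\xe9[\x80-\xbd][\x80-\xbf]
   |\xe9\xbe[\x80-\xa5]
)+\z/xn

def pure_chinese_utf8?(str)
  # String#b returns a copy tagged ASCII-8BIT, which is the encoding
  # the /n flag gives this byte-oriented regexp.
  PURE_CHINESE_UTF8.match?(str.b)
end

p pure_chinese_utf8?("\u4e2d\u6587")        # "中文"            => true
p pure_chinese_utf8?("\u4e2d\u6587abc")     # mixed with ASCII  => false
p pure_chinese_utf8?("\u4e2d\u6587\nabc")   # second line ASCII => false
```

Matching against str.b is a choice made for this sketch: the /n flag together with the high-byte escapes fixes the regexp's encoding to ASCII-8BIT, and on Ruby 1.9 and later matching such a regexp directly against a UTF-8 string that contains these characters raises Encoding::CompatibilityError.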