@carlosmcevilly
Forked from luikore/gist:149493
Created May 9, 2012 03:35

Revisions

  1. @luikore luikore revised this gist Jul 18, 2009. 1 changed file with 11 additions and 11 deletions.
    22 changes: 11 additions & 11 deletions gistfile1.rb
    @@ -16,18 +16,18 @@ class String

    # 27k chars
    CHINESE_GB_2000 = /^(?:
    -[\xB0-\xF7][\xA1-\xFE]
    -|[\x81-\xA0][\x40-\xFE]
    -|[\xAA-\xFE][\x40-\xA0]
    -|[\x81-\x82][\x30-\x39][\x81-\xFE][\x30-\x39]
    -)+$/xn
    +[\xB0-\xF7][\xA1-\xFE]
    +|[\x81-\xA0][\x40-\xFE]
    +|[\xAA-\xFE][\x40-\xA0]
    +|[\x81-\x82][\x30-\x39][\x81-\xFE][\x30-\x39]
    +)+$/xn

    # 70k chars (including minorities chars)
    CHINESE_GB_2005 = /^(?:
    -[\xB0-\xF7][\xA1-\xFE]
    -|[\x81-\xA0][\x40-\xFE]
    -|[\xAA-\xFE][\x40-\xA0]
    -|[\x81-\x82][\x30-\x39][\x81-\xFE][\x30-\x39]
    -|[\x95-\x98][\x30-\x39][\x81-\xFE][\x30-\x39]
    -)+$/xn
    +[\xB0-\xF7][\xA1-\xFE]
    +|[\x81-\xA0][\x40-\xFE]
    +|[\xAA-\xFE][\x40-\xA0]
    +|[\x81-\x82][\x30-\x39][\x81-\xFE][\x30-\x39]
    +|[\x95-\x98][\x30-\x39][\x81-\xFE][\x30-\x39]
    +)+$/xn
    end
  2. @luikore luikore revised this gist Jul 18, 2009. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion gistfile1.rb
    @@ -15,7 +15,7 @@ class String
    )+$/xn

    # 27k chars
    -CHINESE_GB_2005 = /^(?:
    +CHINESE_GB_2000 = /^(?:
    [\xB0-\xF7][\xA1-\xFE]
    |[\x81-\xA0][\x40-\xFE]
    |[\xAA-\xFE][\x40-\xA0]
  3. @luikore luikore created this gist Jul 18, 2009.
    33 changes: 33 additions & 0 deletions gistfile1.rb
    @@ -0,0 +1,33 @@
    +# regexps to check if a string is pure chinese
    +class String
    +# 20k chars
    +CHINESE_UCS2 = /^(?:
    +[\x4e-\x9e][\x00-\xff]
    +|\x9f[\x00-\xa5]
    +)+$/xn
    +
    +# 20k chars
    +CHINESE_UTF8 = /^(?:
    +\xe4[\xb8-\xbf][\x80-\xbf]
    +|[\xe5-\xe8][\x80-\xbf][\x80-\xbf]
    +|\xe9[\x80-\xbd][\x80-\xbf]
    +|\xe9\xbe[\x80-\xa5]
    +)+$/xn
    +
    +# 27k chars
    +CHINESE_GB_2005 = /^(?:
    +[\xB0-\xF7][\xA1-\xFE]
    +|[\x81-\xA0][\x40-\xFE]
    +|[\xAA-\xFE][\x40-\xA0]
    +|[\x81-\x82][\x30-\x39][\x81-\xFE][\x30-\x39]
    +)+$/xn
    +
    +# 70k chars (including minorities chars)
    +CHINESE_GB_2005 = /^(?:
    +[\xB0-\xF7][\xA1-\xFE]
    +|[\x81-\xA0][\x40-\xFE]
    +|[\xAA-\xFE][\x40-\xA0]
    +|[\x81-\x82][\x30-\x39][\x81-\xFE][\x30-\x39]
    +|[\x95-\x98][\x30-\x39][\x81-\xFE][\x30-\x39]
    +)+$/xn
    +end
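
The byte ranges in CHINESE_UTF8 span the encodings of U+4E00 (\xe4\xb8\x80) through U+9FA5 (\xe9\xbe\xa5), which is the same range CHINESE_UCS2 covers. A minimal sketch of the same check written with code points, assuming the input is already a valid UTF-8 String on Ruby 2.4 or later. The names CJK_BASIC and pure_chinese? are hypothetical and not part of the gist. It also swaps the gist's ^ and $ for \A and \z, because ^ and $ are line anchors in Ruby and would accept a multi-line string in which only one line is pure Chinese.

```ruby
# Hypothetical names: CJK_BASIC and pure_chinese? do not appear in the gist.
# Same range as CHINESE_UTF8 (U+4E00..U+9FA5), expressed as code points.
# \A and \z anchor the whole string, not just one line of it.
CJK_BASIC = /\A[\u4e00-\u9fa5]+\z/

# Returns true when the string is non-empty and every character
# falls inside U+4E00..U+9FA5. Assumes str is valid UTF-8.
def pure_chinese?(str)
  CJK_BASIC.match?(str)
end

puts pure_chinese?("\u4e2d\u6587")        # => true
puts pure_chinese?("\u4e2d\u6587abc")     # => false
puts pure_chinese?("\u4e2d\u6587\nabc")   # => false
```

The GB patterns have no equally short code point form, since they describe GB18030 byte sequences directly; this sketch covers only the UTF-8 case.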