@zfz
Last active June 2, 2018 10:55
Filter out characters that are not Chinese (Simplified or Traditional), Japanese, or English
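#!/usr/bin/env python2
# -*- coding: utf-8 -*-
# Python 2 only: the ur"" prefix is a syntax error in Python 3.
import re

# U+2E80-U+9FFF spans the CJK radicals, CJK punctuation, kana and the CJK
# Unified Ideographs (among other blocks); ASCII letters and digits are allowed too.
regexp = ur"[\u2E80-\u9FFFa-zA-Z0-9]+"

# re.match anchors only at the start of the string, so each check looks at the
# leading characters; anything after the first disallowed character is ignored.
assert re.match(regexp, u"愛美國愛臺灣") is not None
assert re.match(regexp, u"打倒土共") is not None
assert re.match(regexp, u"fuck GFW 打倒方校长!") is not None
# A byte string (no u prefix) that starts with ASCII still matches.
assert re.match(regexp, "fuck GFW 打倒方校长!") is not None
# A byte string that starts with UTF-8 bytes does not match: decode it first.
assert re.match(regexp, "打倒方校长!") is None
assert re.match(regexp, u"fuck GFW") is not None
assert re.match(regexp, "fuck GFW") is not None
assert re.match(regexp, "🐸🐸🐸🐸") is None
assert re.match(regexp, u"日本を愛し") is not None
assert re.match(regexp, u"!!!") is None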
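The assertions above only test whether a string starts with an allowed character. A minimal sketch of the filtering step itself could look like the following, assuming Python 3 (where `str` is Unicode and `re` understands `\uXXXX` escapes). The helper names `keep_cjk_alnum` and `is_all_allowed` are hypothetical and not part of the original gist; the character class is the same one used above.

```python
import re

# Same ranges as the gist's pattern. Assumption: Python 3.4+ (re.fullmatch).
_ALLOWED = r"\u2E80-\u9FFFa-zA-Z0-9"
_DISALLOWED_RUN = re.compile("[^" + _ALLOWED + "]+")
_ALLOWED_ONLY = re.compile("[" + _ALLOWED + "]+")


def keep_cjk_alnum(text, replacement=""):
    """Replace every run of characters outside the allowed ranges (hypothetical helper)."""
    return _DISALLOWED_RUN.sub(replacement, text)


def is_all_allowed(text):
    """Return True only if the whole, non-empty string is made of allowed characters."""
    return _ALLOWED_ONLY.fullmatch(text) is not None


if __name__ == "__main__":
    print(keep_cjk_alnum("Hello, 世界! 123"))       # Hello世界123
    print(keep_cjk_alnum("Hello, 世界! 123", " "))  # Hello 世界 123
    print(keep_cjk_alnum("日本を愛し🐸"))            # 日本を愛し
    print(is_all_allowed("愛臺灣"))                  # True
    print(is_all_allowed("abc!"))                   # False
```

Substituting the negated class removes spaces and punctuation along with emoji; passing a `replacement` such as `" "` keeps word boundaries visible instead.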