Created October 9, 2012 10:56
String#normalize_encoding! for working with differently encoded strings in Ruby 1.9.3
# -*- encoding : utf-8 -*-
class String
  # Always returns the string with a valid encoding equal to Encoding.default_internal.
  # Handy for strings whose encoding may differ (all kinds of imports and free-text inputs).
  def normalize_encoding!
    return self unless defined?(Encoding) # Ruby 1.9+ only; a no-op on Ruby 1.8, which has no Encoding class
    # Encoding.default_internal is nil unless it has been configured, and force_encoding(nil) raises.
    # Falling back to UTF-8 here is an assumption, not part of the original gist.
    target = Encoding.default_internal || Encoding::UTF_8
    encoding_equal_to_target = (encoding == target)
    # Return unchanged if the encoding is valid and already equal to the target.
    return self if valid_encoding? && encoding_equal_to_target
    # Try relabeling the bytes as the target encoding and keep that if the result is valid.
    return force_encoding(target) if dup.force_encoding(target).valid_encoding?
    if encoding_equal_to_target
      # The encoding can match the target and still be invalid. Relabel the string as binary
      # so that String#encode! really transcodes; without the :replace options it would raise
      # on the very bytes that made the string invalid. Every non-ASCII byte is replaced here.
      force_encoding(Encoding::BINARY).encode!(target, :undef => :replace, :invalid => :replace)
    else
      encode!(target, encoding, :undef => :replace, :invalid => :replace)
    end
  end
end
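
The steps in `normalize_encoding!` can also be tried without reopening `String`. The sketch below is only an illustration: `normalized_copy` is a hypothetical helper that is not part of the gist, it takes the target encoding as an argument (defaulting to UTF-8, which is an assumption) instead of reading `Encoding.default_internal`, and it returns a new string rather than modifying the receiver.

```ruby
# Hypothetical helper, not part of the gist: follows the same three steps as
# normalize_encoding!, but works on a copy and takes the target encoding explicitly.
def normalized_copy(str, target = Encoding::UTF_8)
  copy = str.dup
  # Step 1: already valid in the target encoding, nothing to do.
  return copy if copy.encoding == target && copy.valid_encoding?
  # Step 2: the bytes are valid in the target encoding, so relabel without transcoding.
  relabeled = str.dup.force_encoding(target)
  return relabeled if relabeled.valid_encoding?
  # Step 3: transcode, replacing bytes that cannot be converted. When the string
  # already claims the target encoding, treat it as binary so a conversion happens.
  source = copy.encoding == target ? Encoding::BINARY : copy.encoding
  copy.force_encoding(source).encode(target, :invalid => :replace, :undef => :replace)
end

# "café" as Latin-1 bytes; \xE9 on its own is not valid UTF-8, so step 3 transcodes.
latin1 = "caf\xE9".dup.force_encoding(Encoding::ISO_8859_1)
utf8 = normalized_copy(latin1)
puts utf8.encoding
puts utf8 == "caf\u00E9"

# A string labeled UTF-8 that contains an invalid byte; the byte gets replaced.
broken = "abc\xFFdef".dup
repaired = normalized_copy(broken)
puts repaired.valid_encoding?
```

Passing the target explicitly keeps the behaviour independent of global encoding settings, which can make the three branches easier to exercise in a test.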