<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Gray Soft / Character Encodings / Encoding Conversion With iconv</title>
  <id>tag:graysoftinc.com,2014-03-20:/posts/72</id>
  <updated>2014-04-17T16:04:36Z</updated>
  <link rel="self" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv/feed.xml"/>
  <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv"/>
  <author>
    <name>James Edward Gray II</name>
  </author>
  <entry>
    <title>The 22nd Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_496"/>
    <id>tag:graysoftinc.com,2012-11-14:/comments/496</id>
    <updated>2014-04-17T16:04:36Z</updated>
    <summary>This is a good question.

`String#encode()` doesn't really provide an equivalent option to `iconv`'s `//TRANSLIT`.  You can specify fallbacks, but you can't ask Ruby to intelligently handle the replacements for you.

In this instance, `iconv` ...</summary>
    <content type="html">&lt;p&gt;This is a good question.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;String#encode()&lt;/code&gt; doesn't really provide an equivalent option to &lt;code&gt;iconv&lt;/code&gt;'s &lt;code&gt;//TRANSLIT&lt;/code&gt;.  You can specify fallbacks, but you can't ask Ruby to intelligently handle the replacements for you.&lt;/p&gt;

&lt;p&gt;In this instance, &lt;code&gt;iconv&lt;/code&gt; still feels superior to me.  It will be interesting to see how that is addressed as this feature is retired.&lt;/p&gt;

&lt;p&gt;Of course, they can't really take &lt;code&gt;iconv&lt;/code&gt; away from us.  We can always just shell out to the command-line program.  So, if you want to stick with &lt;code&gt;iconv&lt;/code&gt;, you can.&lt;/p&gt;</content>
    <author>
      <name>James Edward Gray II</name>
    </author>
  </entry>
  <entry>
    <title>The 21st Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_495"/>
    <id>tag:graysoftinc.com,2012-11-14:/comments/495</id>
    <updated>2014-04-17T16:04:36Z</updated>
    <summary>Help: 

I am using `Iconv.iconv('ascii//TRANSLIT', 'utf8', value)` and I get the warning message:

```
/usr/lib/ruby/gems/1.9.1/gems/activesupport-3.2.3/lib/active_support/dependencies.rb:251:in `block in require': iconv will be deprecated in...</summary>
    <content type="html">&lt;p&gt;Help: &lt;/p&gt;

&lt;p&gt;I am using &lt;code&gt;Iconv.iconv('ascii//TRANSLIT', 'utf8', value)&lt;/code&gt; and I get the warning message:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;/usr/lib/ruby/gems/1.9.1/gems/activesupport-3.2.3/lib/active_support/dependencies.rb:251:in `block in require': iconv will be deprecated in the future, use String#encode instead
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;How do I have to use the method &lt;code&gt;String#encode&lt;/code&gt; to do the same?&lt;/p&gt;</content>
    <author>
      <name>Paul Emico</name>
    </author>
  </entry>
  <entry>
    <title>The 20th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_487"/>
    <id>tag:graysoftinc.com,2012-09-04:/comments/487</id>
    <updated>2014-03-27T01:38:28Z</updated>
    <summary>Indeed very helpful and was also exactly what I needed. Thanks a lot.</summary>
    <content type="html">&lt;p&gt;Indeed very helpful and was also exactly what I needed. Thanks a lot.&lt;/p&gt;</content>
    <author>
      <name>Andreas Fischlin</name>
    </author>
  </entry>
  <entry>
    <title>The 19th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_476"/>
    <id>tag:graysoftinc.com,2012-06-16:/comments/476</id>
    <updated>2014-04-17T16:02:38Z</updated>
    <summary>Thanks for this very informative post.  Best summary of `iconv` I could find!</summary>
    <content type="html">&lt;p&gt;Thanks for this very informative post.  Best summary of &lt;code&gt;iconv&lt;/code&gt; I could find!&lt;/p&gt;</content>
    <author>
      <name>Sean</name>
    </author>
  </entry>
  <entry>
    <title>The 18th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_467"/>
    <id>tag:graysoftinc.com,2011-12-14:/comments/467</id>
    <updated>2014-04-17T16:02:21Z</updated>
    <summary>I'm not sure exactly why you are seeing these oddities.  My suggestion is to try listing the available encodings with:

```
iconv -l
```

Make sure both of the encodings you are using are in that list.</summary>
    <content type="html">&lt;p&gt;I'm not sure exactly why you are seeing these oddities.  My suggestion is to try listing the available encodings with:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;iconv -l
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Make sure both of the encodings you are using are in that list.&lt;/p&gt;</content>
    <author>
      <name>James Edward Gray II</name>
    </author>
  </entry>
  <entry>
    <title>The 17th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_466"/>
    <id>tag:graysoftinc.com,2011-12-14:/comments/466</id>
    <updated>2014-04-17T16:02:21Z</updated>
    <summary>I tried the `UTF8` on the iconv1.14 and it did not work.  Only `UTF-8` worked.  And even then it ignored the `e'`.

```
iconv -t LATIN1//TRANSLIT -f UTF-8 &lt; utf8.txt &gt; latin1_wtranslit.txt
iconv: (stdin):1:1: cannot convert
```

Is there so...</summary>
    <content type="html">&lt;p&gt;I tried the &lt;code&gt;UTF8&lt;/code&gt; on the iconv1.14 and it did not work.  Only &lt;code&gt;UTF-8&lt;/code&gt; worked.  And even then it ignored the &lt;code&gt;e'&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;iconv -t LATIN1//TRANSLIT -f UTF-8 &amp;lt; utf8.txt &amp;gt; latin1_wtranslit.txt
iconv: (stdin):1:1: cannot convert
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Is there something wrong with my &lt;code&gt;iconv&lt;/code&gt; (compiled on SUN x86 using &lt;code&gt;gcc&lt;/code&gt;)?&lt;/p&gt;</content>
    <author>
      <name>Roger Paul</name>
    </author>
  </entry>
  <entry>
    <title>The 16th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_451"/>
    <id>tag:graysoftinc.com,2011-07-17:/comments/451</id>
    <updated>2014-04-17T16:00:21Z</updated>
    <summary>Just wanted to say "Thank you!" for writing this piece and going through `iconv` in such depth. I'm doing some conversions, and this was exactly what I needed. Thanks!!</summary>
    <content type="html">&lt;p&gt;Just wanted to say "Thank you!" for writing this piece and going through &lt;code&gt;iconv&lt;/code&gt; in such depth. I'm doing some conversions, and this was exactly what I needed. Thanks!!&lt;/p&gt;</content>
    <author>
      <name>Patryk J</name>
    </author>
  </entry>
  <entry>
    <title>The 15th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_440"/>
    <id>tag:graysoftinc.com,2011-04-02:/comments/440</id>
    <updated>2014-04-17T15:59:58Z</updated>
    <summary>Thank you for this well-written post.  I had been using `iconv` for a project of mine, but didn't know about the `//TRANSLIT` and `//IGNORE` options and needed to deal with a small handful of cases where the conversion I wanted was failing.  Your ...</summary>
    <content type="html">&lt;p&gt;Thank you for this well-written post.  I had been using &lt;code&gt;iconv&lt;/code&gt; for a project of mine, but didn't know about the &lt;code&gt;//TRANSLIT&lt;/code&gt; and &lt;code&gt;//IGNORE&lt;/code&gt; options and needed to deal with a small handful of cases where the conversion I wanted was failing.  Your post very quickly taught me what I had been looking for.  Kind regards and thank you.&lt;/p&gt;</content>
    <author>
      <name>steve s</name>
    </author>
  </entry>
  <entry>
    <title>The 14th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_419"/>
    <id>tag:graysoftinc.com,2011-01-06:/comments/419</id>
    <updated>2014-04-17T15:59:23Z</updated>
    <summary>Thanks for filling me in on the Windows solution.

I don't think many people would agree with you on the current level of Windows support, but I am glad to hear that it is improving.</summary>
    <content type="html">&lt;p&gt;Thanks for filling me in on the Windows solution.&lt;/p&gt;

&lt;p&gt;I don't think many people would agree with you on the current level of Windows support, but I am glad to hear that it is improving.&lt;/p&gt;</content>
    <author>
      <name>James Edward Gray II</name>
    </author>
  </entry>
  <entry>
    <title>The 13th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_418"/>
    <id>tag:graysoftinc.com,2011-01-06:/comments/418</id>
    <updated>2014-04-17T15:59:23Z</updated>
    <summary>Thanks for the reply. I found the Windows installer:

[http://gnuwin32.sourceforge.net/packages/libiconv.htm](http://gnuwin32.sourceforge.net/packages/libiconv.htm)

I am also not a Windows guy ;) but I want our Ruby Software to run on Windows...</summary>
    <content type="html">&lt;p&gt;Thanks for the reply. I found the Windows installer:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://gnuwin32.sourceforge.net/packages/libiconv.htm"&gt;http://gnuwin32.sourceforge.net/packages/libiconv.htm&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I am also not a Windows guy ;) but I want our Ruby Software to run on Windows as well. Ruby on Windows is actually better supported than Ruby on OS X as far as I can tell. I find that fact interesting as I have been working on Linux (and Mac) for 10 years now.&lt;/p&gt;</content>
    <author>
      <name>Zeno Davatz</name>
    </author>
  </entry>
  <entry>
    <title>The 12th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_416"/>
    <id>tag:graysoftinc.com,2011-01-06:/comments/416</id>
    <updated>2014-04-17T15:59:23Z</updated>
    <summary>I'm sorry, but I'm not a Windows guy and thus not the right person to answer that question.</summary>
    <content type="html">&lt;p&gt;I'm sorry, but I'm not a Windows guy and thus not the right person to answer that question.&lt;/p&gt;</content>
    <author>
      <name>James Edward Gray II</name>
    </author>
  </entry>
  <entry>
    <title>The 11th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_415"/>
    <id>tag:graysoftinc.com,2011-01-06:/comments/415</id>
    <updated>2014-04-17T15:59:23Z</updated>
    <summary>Interesting, thank you for the post.

How would I use `iconv` on Windows with Ruby 1.8.6 and RubyGems on Windows Vista?

Best
</summary>
    <content type="html">&lt;p&gt;Interesting thank you for the post.&lt;/p&gt;

&lt;p&gt;How would I use &lt;code&gt;iconv&lt;/code&gt; on Windows with Ruby 1.8.6 and RubyGems on Windows Vista?&lt;/p&gt;

&lt;p&gt;Best&lt;/p&gt;</content>
    <author>
      <name>Zeno Davatzu</name>
    </author>
  </entry>
  <entry>
    <title>The 10th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_392"/>
    <id>tag:graysoftinc.com,2010-07-30:/comments/392</id>
    <updated>2014-04-17T15:55:50Z</updated>
    <summary>US-ASCII is a valid subset of UTF-8, so I'm guessing your data just doesn't have any characters with higher-order bits set.  If that's true, the file is valid US-ASCII, Latin-1, UTF-8, and more.  The `file` program just went with the simplest answer.</summary>
    <content type="html">&lt;p&gt;US-ASCII is a valid subset of UTF-8, so I'm guessing your data just doesn't have any characters with higher-order bits set.  If that's true, the file is valid US-ASCII, Latin-1, UTF-8, and more.  The &lt;code&gt;file&lt;/code&gt; program just went with the simplest answer.&lt;/p&gt;</content>
    <author>
      <name>James Edward Gray II</name>
    </author>
  </entry>
  <entry>
    <title>The 9th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_391"/>
    <id>tag:graysoftinc.com,2010-07-30:/comments/391</id>
    <updated>2014-04-17T15:55:50Z</updated>
    <summary>Hi.

I have the following code:

```ruby
file = File.new("paymul1", "w")
data = Iconv.iconv("utf-8", "us-ascii", ic.to_s).join
file.print data
```

I expected that the file charset was utf-8. But:

```ruby
$ file -i paymul1 
paymul1:...</summary>
    <content type="html">&lt;p&gt;Hi.&lt;/p&gt;

&lt;p&gt;I have the following code:&lt;/p&gt;

&lt;div class="highlight highlight-ruby"&gt;&lt;pre&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"paymul1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"w"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Iconv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iconv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"utf-8"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"us-ascii"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ic&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;
&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;print&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;I expected that the file charset was utf-8. But:&lt;/p&gt;

&lt;div class="highlight highlight-ruby"&gt;&lt;pre&gt;&lt;span class="err"&gt;$&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="n"&gt;paymul1&lt;/span&gt; 
&lt;span class="ss"&gt;paymul1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;plain&lt;/span&gt; &lt;span class="n"&gt;charset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;us&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ascii&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The file charset was us-ascii.&lt;/p&gt;</content>
    <author>
      <name>Carlos</name>
    </author>
  </entry>
  <entry>
    <title>The 8th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_389"/>
    <id>tag:graysoftinc.com,2010-07-27:/comments/389</id>
    <updated>2014-04-17T15:53:25Z</updated>
    <summary>`Iconv` is smart and will accept either:

```
$ iconv --list | grep UTF8
UTF-8 UTF8
UTF-8-MAC UTF8-MAC
```</summary>
    <content type="html">&lt;p&gt;&lt;code&gt;Iconv&lt;/code&gt; is smart and will accept either:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ iconv --list | grep UTF8
UTF-8 UTF8
UTF-8-MAC UTF8-MAC
&lt;/code&gt;&lt;/pre&gt;</content>
    <author>
      <name>James Edward Gray II</name>
    </author>
  </entry>
  <entry>
    <title>The 7th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_388"/>
    <id>tag:graysoftinc.com,2010-07-27:/comments/388</id>
    <updated>2014-04-17T15:53:25Z</updated>
    <summary>I believe in your examples you mean `UTF-8` instead of `UTF8`

```ruby
utf8_to_latin1 = Iconv.new("LATIN1//TRANSLIT//IGNORE", "UTF8")
```

should be

```ruby
utf8_to_latin1 = Iconv.new("LATIN1//TRANSLIT//IGNORE", "UTF-8")
```

It might...</summary>
    <content type="html">&lt;p&gt;I believe in your examples you mean &lt;code&gt;UTF-8&lt;/code&gt; instead of &lt;code&gt;UTF8&lt;/code&gt;&lt;/p&gt;

&lt;div class="highlight highlight-ruby"&gt;&lt;pre&gt;&lt;span class="n"&gt;utf8_to_latin1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Iconv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"LATIN1//TRANSLIT//IGNORE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"UTF8"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;should be&lt;/p&gt;

&lt;div class="highlight highlight-ruby"&gt;&lt;pre&gt;&lt;span class="n"&gt;utf8_to_latin1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Iconv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"LATIN1//TRANSLIT//IGNORE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"UTF-8"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;It might be that more recent versions of Ruby are okay with &lt;code&gt;UTF8&lt;/code&gt;, but I believe 1.8.x only accepted the &lt;code&gt;UTF-8&lt;/code&gt; form.&lt;/p&gt;</content>
    <author>
      <name>Todd</name>
    </author>
  </entry>
  <entry>
    <title>The 6th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_373"/>
    <id>tag:graysoftinc.com,2010-04-15:/comments/373</id>
    <updated>2014-04-17T15:50:48Z</updated>
    <summary>The `iconv` library is required yes.  You will want to Google some instructions for installing it on your platform since I don't know what that is and I'm not an expert for all of them.

However, I don't think that's the issue you are seeing her...</summary>
    <content type="html">&lt;p&gt;The &lt;code&gt;iconv&lt;/code&gt; library is required yes.  You will want to Google some instructions for installing it on your platform since I don't know what that is and I'm not an expert for all of them.&lt;/p&gt;

&lt;p&gt;However, I don't think that's the issue you are seeing here.  Different encodings are supported on different platforms.  Try listing the supported encodings as shown in my post and make sure the one you want is in the list.&lt;/p&gt;</content>
    <author>
      <name>James Edward Gray II</name>
    </author>
  </entry>
  <entry>
    <title>The 5th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_372"/>
    <id>tag:graysoftinc.com,2010-04-15:/comments/372</id>
    <updated>2014-04-17T15:50:48Z</updated>
    <summary>Do I need to install `iconv` onto my PC to work ?
how to do it??

I added this:

```ruby
class String #in environment.rb (last lines)
  require 'iconv' #this line is not needed in rails !
  def to_utf8(encoding) 
    Iconv.conv( 'UTF-8',"#...</summary>
    <content type="html">&lt;p&gt;Do I need to install &lt;code&gt;iconv&lt;/code&gt; onto my PC to work ?&lt;br&gt;
how to do it??&lt;/p&gt;

&lt;p&gt;I added this:&lt;/p&gt;

&lt;div class="highlight highlight-ruby"&gt;&lt;pre&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="c1"&gt;#in enviroment.rb (last lines)&lt;/span&gt;
  &lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s1"&gt;'iconv'&lt;/span&gt; &lt;span class="c1"&gt;#this line is not needed in rails !&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;to_utf8&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;span class="no"&gt;Iconv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="s1"&gt;'UTF-8'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;but when I use it with &lt;code&gt;encoding=ISO8859-7&lt;/code&gt; it returns blanks!!!&lt;br&gt;
any idea??&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;</content>
    <author>
      <name>SunnySan</name>
    </author>
  </entry>
  <entry>
    <title>The 4th Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_371"/>
    <id>tag:graysoftinc.com,2010-04-12:/comments/371</id>
    <updated>2014-03-27T01:38:27Z</updated>
    <summary>cool site.</summary>
    <content type="html">&lt;p&gt;cool site.&lt;/p&gt;</content>
    <author>
      <name>Rob Miller</name>
    </author>
  </entry>
  <entry>
    <title>The 3rd Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_270"/>
    <id>tag:graysoftinc.com,2009-04-12:/comments/270</id>
    <updated>2014-03-27T01:38:26Z</updated>
    <summary>Couldn't agree more - thanks for a superb series so far.  I can't believe I'm up on a Saturday night at 2:50am reading this entire series.  This is surprisingly engrossing stuff, really helps to connect the dots.  Thanks!</summary>
    <content type="html">&lt;p&gt;Couldn't agree more - thanks for a superb series so far.  I can't believe I'm up on a Saturday night at 2:50am reading this entire series.  This is surprisingly engrossing stuff, really helps to connect the dots.  Thanks!&lt;/p&gt;</content>
    <author>
      <name>Jim Tran</name>
    </author>
  </entry>
  <entry>
    <title>The 2nd Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_269"/>
    <id>tag:graysoftinc.com,2009-04-08:/comments/269</id>
    <updated>2014-04-17T15:48:22Z</updated>
    <summary>These posts are the most comprehensive treatment on `String`s for Ruby. Thank you very much James; you've just saved my Capstone project from doom. :)</summary>
    <content type="html">&lt;p&gt;These posts are the most comprehensive treatment on &lt;code&gt;String&lt;/code&gt;s for Ruby. Thank you very much James; you've just saved my Capstone project from doom. :)&lt;/p&gt;</content>
    <author>
      <name>Xabriel J. Collazo-Mojica</name>
    </author>
  </entry>
  <entry>
    <title>The 1st Comment on "Encoding Conversion With iconv"</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv#comment_235"/>
    <id>tag:graysoftinc.com,2008-12-11:/comments/235</id>
    <updated>2014-04-17T15:48:03Z</updated>
    <summary>Thanks for this. I have seen `iconv` on Unixy systems for a long time and never knew what it was all about. I really appreciate this series. Keep up the good work.</summary>
    <content type="html">&lt;p&gt;Thanks for this. I have seen &lt;code&gt;iconv&lt;/code&gt; on Unixy systems for a long time and never knew what it was all about. I really appreciate this series. Keep up the good work.&lt;/p&gt;</content>
    <author>
      <name>Tim Morgan</name>
    </author>
  </entry>
  <entry>
    <title>Encoding Conversion With iconv</title>
    <link rel="alternate" href="http://graysoftinc.com/character-encodings/encoding-conversion-with-iconv"/>
    <id>tag:graysoftinc.com,2008-12-08:/posts/72</id>
    <updated>2014-04-17T19:14:31Z</updated>
    <summary>This article covers the Ruby 1.8 system of converting between character encodings.</summary>
    <content type="html">&lt;p&gt;There's one last standard library we need to discuss for us to have completely covered Ruby 1.8's support for character encodings.  The &lt;code&gt;iconv&lt;/code&gt; library ships with Ruby and it can handle an impressive set of character encoding conversions.&lt;/p&gt;

&lt;p&gt;This is an important piece of the puzzle.  You may have accepted my advice that it's OK to just work with UTF-8 data whenever you have the choice, but the fact is that there's a lot of non-UTF-8 data in the world.  Legacy systems may have produced data before UTF-8 was popular, some services may work in different encodings for any number of reasons, and not quite everyone has embraced Unicode fully yet.  If you run into data like this, you will need a way to convert it to UTF-8 as you import it and possibly a way to convert it back when you export it. That's exactly what &lt;code&gt;iconv&lt;/code&gt; does.&lt;/p&gt;

&lt;p&gt;Instead of jumping right into Ruby's &lt;code&gt;iconv&lt;/code&gt; library, let's come at it with a slightly different approach.  &lt;code&gt;iconv&lt;/code&gt; is actually a C library that performs these conversions and on most systems where it is installed you will have a command-line interface for it.&lt;/p&gt;

&lt;p&gt;It's very easy to use the &lt;code&gt;iconv&lt;/code&gt; program.  Just always follow these three steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Tell &lt;code&gt;iconv&lt;/code&gt; the encoding you want it to write data out in, including any special translation instructions&lt;/li&gt;
&lt;li&gt;Tell &lt;code&gt;iconv&lt;/code&gt; the encoding data will be passed to it in&lt;/li&gt;
&lt;li&gt;Send the input into &lt;code&gt;iconv&lt;/code&gt; on &lt;code&gt;STDIN&lt;/code&gt; (or just list the files as arguments, if you prefer) and redirect &lt;code&gt;iconv&lt;/code&gt;'s &lt;code&gt;STDOUT&lt;/code&gt; to where you want output to be written&lt;/li&gt;
&lt;/ol&gt;&lt;p&gt;For example, let's say I have some UTF-8 data:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ echo "Résumé" &amp;gt; utf8.txt
$ wc -c utf8.txt 
       9 utf8.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;My terminal works in UTF-8, so that's the data &lt;code&gt;echo&lt;/code&gt; wrote into the file.  You can see that it's encoded now because we have nine bytes in the file (one each for &lt;code&gt;"R"&lt;/code&gt;, &lt;code&gt;"s"&lt;/code&gt;, &lt;code&gt;"u"&lt;/code&gt;, &lt;code&gt;"m"&lt;/code&gt;, and &lt;code&gt;"\n"&lt;/code&gt; plus two for each &lt;code&gt;"é"&lt;/code&gt;).&lt;/p&gt;
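
&lt;p&gt;As a rough cross-check of that arithmetic, here is a minimal sketch.  It assumes Ruby 1.9 or later, where strings carry their own encoding, rather than the 1.8 interpreter this series covers.  The &lt;code&gt;\u00E9&lt;/code&gt; escape stands in for &lt;code&gt;"é"&lt;/code&gt; so the snippet does not depend on the encoding of the source file:&lt;/p&gt;

```ruby
# Assumes Ruby 1.9+; "\u00E9" is the escape for "é" and yields a UTF-8 string.
text = "R\u00E9sum\u00E9\n"

# Print how many bytes each character occupies in UTF-8.
text.each_char do |char|
  puts "#{char.inspect}: #{char.bytesize} byte(s)"
end

puts "total: #{text.bytesize} bytes for #{text.length} characters"
```

&lt;p&gt;The total should match what &lt;code&gt;wc -c&lt;/code&gt; reported for the file.&lt;/p&gt;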

&lt;p&gt;Here's how we would convert that data to Latin-1 using &lt;code&gt;iconv&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ iconv -t LATIN1 -f UTF8 &amp;lt; utf8.txt &amp;gt; latin1.txt
$ wc -c latin1.txt 
       7 latin1.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can see the conversion worked, because an &lt;code&gt;"é"&lt;/code&gt; is only one byte in Latin-1 and we dropped two bytes.&lt;/p&gt;
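
&lt;p&gt;The same byte counts can be reproduced in Ruby.  This sketch is only an illustration and assumes Ruby 1.9 or later with its built-in &lt;code&gt;String#encode&lt;/code&gt;, not the &lt;code&gt;iconv&lt;/code&gt; program or library discussed here; &lt;code&gt;ISO-8859-1&lt;/code&gt; is Ruby's name for Latin-1:&lt;/p&gt;

```ruby
# Assumes Ruby 1.9+ and String#encode rather than iconv.
utf8   = "R\u00E9sum\u00E9\n"        # "\u00E9" is "é"
latin1 = utf8.encode("ISO-8859-1")   # ISO-8859-1 is Latin-1

puts utf8.bytesize    # 9
puts latin1.bytesize  # 7
```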

&lt;p&gt;Note my use of all three steps here:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;I used &lt;code&gt;-t LATIN1&lt;/code&gt; to set the &lt;em&gt;to&lt;/em&gt; encoding without any special translations&lt;/li&gt;
&lt;li&gt;I used &lt;code&gt;-f UTF8&lt;/code&gt; to set the &lt;em&gt;from&lt;/em&gt; encoding&lt;/li&gt;
&lt;li&gt;I used &lt;code&gt;&amp;lt; utf8.txt&lt;/code&gt; to redirect data in and &lt;code&gt;&amp;gt; latin1.txt&lt;/code&gt; to redirect data out of the program&lt;/li&gt;
&lt;/ol&gt;&lt;p&gt;Those are always the steps as I said before.&lt;/p&gt;

&lt;p&gt;You only need to know two more things about &lt;code&gt;iconv&lt;/code&gt;.  First, &lt;code&gt;iconv&lt;/code&gt; supports a truckload of encodings, including all of the common encodings I've been talking about in this series.  The list varies somewhat between platforms though, so you will need to check what is available to you:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ iconv --list
ANSI_X3.4-1968 ANSI_X3.4-1986 ASCII CP367 IBM367 ISO-IR-6 ISO646-US
  ISO_646.IRV:1991 US US-ASCII CSASCII
UTF-8 UTF8
UTF-8-MAC UTF8-MAC
ISO-10646-UCS-2 UCS-2 CSUNICODE
UCS-2BE UNICODE-1-1 UNICODEBIG CSUNICODE11
UCS-2LE UNICODELITTLE
ISO-10646-UCS-4 UCS-4 CSUCS4
UCS-4BE
UCS-4LE
UTF-16
…
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each line of that listing shows a single encoding.  The space-separated names on each line are all aliases for that encoding that &lt;code&gt;iconv&lt;/code&gt; will accept.  Thus that first long line, which I had to break into two, provides a bunch of aliases for US-ASCII.  We can also see by reading down a bit that &lt;code&gt;iconv&lt;/code&gt; will accept UTF8 or UTF-8.&lt;/p&gt;

&lt;p&gt;The last thing to know about &lt;code&gt;iconv&lt;/code&gt; is that it has some special translation modes.  To see those in action, let's work with a different piece of data:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ echo "On and on… and on…" &amp;gt; utf8.txt
$ cat utf8.txt 
On and on… and on…
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;That last character is an ellipsis or three dots all in one character.  Unicode has that character, but Latin-1 does not.  Let's see what happens if we try to convert the data now:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ iconv -t LATIN1 -f UTF8 &amp;lt; utf8.txt &amp;gt; latin1.txt

iconv: (stdin):1:9: cannot convert
$ cat latin1.txt 
On and on
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As you can see, I got an error when it reached the first occurrence of the problem character.  The &lt;code&gt;cat&lt;/code&gt; command also shows that it completely quit working there.&lt;/p&gt;
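
&lt;p&gt;For comparison, a strict conversion fails in a similar way in Ruby.  This is a sketch that assumes Ruby 1.9 or later and &lt;code&gt;String#encode&lt;/code&gt; instead of &lt;code&gt;iconv&lt;/code&gt;; the variable names are only illustrative:&lt;/p&gt;

```ruby
# Assumes Ruby 1.9+; "\u2026" is the single-character ellipsis.
text = "On and on\u2026 and on\u2026\n"

begin
  text.encode("ISO-8859-1")
  outcome = "converted"
rescue Encoding::UndefinedConversionError => error
  # Latin-1 has no ellipsis character, so the conversion stops with an error.
  outcome = "cannot convert: #{error.message}"
end

puts outcome
```

&lt;p&gt;Unlike the command-line program, which had already written the leading text to the output file, the exception here leaves you with no converted string at all.&lt;/p&gt;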

&lt;p&gt;That may be what you need, so you can tell a user you can't work with their data.  I often find though that I just need to do the best I can with the data that I have.  &lt;code&gt;iconv&lt;/code&gt;'s translation modes can help with that.&lt;/p&gt;

&lt;p&gt;First, you can ask &lt;code&gt;iconv&lt;/code&gt; to &lt;em&gt;ignore&lt;/em&gt; any characters that cannot be converted to the new encoding:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ iconv -t LATIN1//IGNORE -f UTF8 &amp;lt; utf8.txt &amp;gt; latin1_wignore.txt
$ cat latin1_wignore.txt 
On and on and on
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As you can see, we completed the entire translation that time, only losing the problematic characters.  The &lt;code&gt;//IGNORE&lt;/code&gt; sequence adds the translation mode.  Modes are always specified after the output encoding.  That's an improvement for sure, but it's possible to do even better in this case.&lt;/p&gt;
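
&lt;p&gt;A rough Ruby counterpart to &lt;code&gt;//IGNORE&lt;/code&gt;, assuming Ruby 1.9 or later and &lt;code&gt;String#encode&lt;/code&gt; rather than &lt;code&gt;iconv&lt;/code&gt;, is to replace every unconvertible character with an empty string:&lt;/p&gt;

```ruby
# Assumes Ruby 1.9+; "\u2026" is the single-character ellipsis.
text = "On and on\u2026 and on\u2026\n"

# undef: :replace swaps in the replace: string for characters that the
# target encoding lacks; an empty string simply drops them.
ignored = text.encode("ISO-8859-1", undef: :replace, replace: "")

puts ignored  # On and on and on
```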

&lt;p&gt;&lt;code&gt;iconv&lt;/code&gt; has another translation mode where it will try to &lt;em&gt;transliterate&lt;/em&gt; characters into an equivalent representation in the target encoding:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ iconv -t LATIN1//TRANSLIT -f UTF8 &amp;lt; utf8.txt &amp;gt; latin1_wtranslit.txt
$ cat latin1_wtranslit.txt 
On and on... and on...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This time, instead of dropping the ellipsis characters, &lt;code&gt;iconv&lt;/code&gt; replaced them with three full stops each.  It's not as fancy as the Unicode character, but it gets the job done and we do a good job of keeping the meaning of the data.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;//TRANSLIT&lt;/code&gt; can't convert absolutely everything you will see in the wild, so it's still possible to get errors when using it.  You can combine the modes though by specifying &lt;code&gt;//TRANSLIT//IGNORE&lt;/code&gt;.  That will ask &lt;code&gt;iconv&lt;/code&gt; to transliterate what it can and drop the rest.  Note that order does matter there:  you need to be sure it tries transliteration before ignoring the character.&lt;/p&gt;

&lt;p&gt;You can also give &lt;code&gt;iconv&lt;/code&gt; specific translations for bytes it has trouble with.  I've never needed that level of control though and find the translation modes help me do more with less effort.  Have a quick browse through &lt;code&gt;man iconv&lt;/code&gt;, if you are curious.&lt;/p&gt;

&lt;p&gt;That's all you need to know about &lt;code&gt;iconv&lt;/code&gt;.  You are now a character conversion expert.  Congratulations.&lt;/p&gt;

&lt;p&gt;Of course, it would be nice to talk about how this affects Ruby.  Let's do that.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;iconv&lt;/code&gt; library in Ruby's standard library works just like the program we've been playing with.  It simply provides a method interface to the underlying C code.  To show that, here's the same conversion we started with:&lt;/p&gt;

&lt;div class="highlight highlight-ruby"&gt;&lt;pre&gt;&lt;span class="c1"&gt;#!/usr/bin/env ruby -wKU&lt;/span&gt;

&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s2"&gt;"iconv"&lt;/span&gt;

&lt;span class="n"&gt;utf8&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Résumé"&lt;/span&gt;
&lt;span class="n"&gt;utf8&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;  &lt;span class="c1"&gt;# =&amp;gt; 8&lt;/span&gt;

&lt;span class="n"&gt;latin1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Iconv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"LATIN1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"UTF8"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;utf8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;latin1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;  &lt;span class="c1"&gt;# =&amp;gt; 6&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can see that the steps are exactly the same.  The first parameter is your target encoding and the second is the encoding your data is currently in.  You pass the data to convert in the last parameter and the return value of the call is the result.&lt;/p&gt;
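
&lt;p&gt;If you are on a newer Ruby, where the &lt;code&gt;iconv&lt;/code&gt; library is no longer bundled (it was removed from the standard library in Ruby 2.0), a rough equivalent can be sketched with the built-in &lt;code&gt;String#encode&lt;/code&gt;.  Treat this as an approximation that swaps in a different API, not as the &lt;code&gt;iconv&lt;/code&gt; interface itself.  I'm assuming Ruby 1.9 or later and using &lt;code&gt;bytesize&lt;/code&gt; to get the byte counts that &lt;code&gt;size&lt;/code&gt; reports on Ruby 1.8:&lt;/p&gt;

```ruby
# Sketch only: assumes Ruby 1.9 or later and uses String#encode,
# not the iconv library shown in the rest of this post.

utf8 = "R\u00E9sum\u00E9"           # "Resume" with accents, as UTF-8
utf8.bytesize                        # => 8 (each accented e takes two bytes)

# The receiver already knows its own encoding, so only the target is named.
latin1 = utf8.encode("ISO-8859-1")
latin1.bytesize                      # => 6 (one byte per character)
```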

&lt;p&gt;If you are going to do several conversions in a row, it's slightly easier to create an &lt;code&gt;Iconv&lt;/code&gt; instance and just reuse that:&lt;/p&gt;

&lt;div class="highlight highlight-ruby"&gt;&lt;pre&gt;&lt;span class="c1"&gt;#!/usr/bin/env ruby -wKU&lt;/span&gt;

&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s2"&gt;"iconv"&lt;/span&gt;

&lt;span class="n"&gt;utf8_to_latin1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Iconv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"LATIN1//TRANSLIT//IGNORE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"UTF8"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;resume&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Résumé"&lt;/span&gt;
&lt;span class="n"&gt;utf8_to_latin1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iconv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resume&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;  &lt;span class="c1"&gt;# =&amp;gt; 6&lt;/span&gt;

&lt;span class="n"&gt;on_and_on&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"On and on… and on…"&lt;/span&gt;
&lt;span class="n"&gt;utf8_to_latin1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iconv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;on_and_on&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# =&amp;gt; "On and on... and on..."&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;That's all there is to it.  The &lt;code&gt;new()&lt;/code&gt; method builds an object that remembers the encodings you are converting and then you can call &lt;code&gt;iconv()&lt;/code&gt; (instead of the &lt;code&gt;conv()&lt;/code&gt; class method we used earlier) to convert data.&lt;/p&gt;
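
&lt;p&gt;For comparison, here is a sketch of how the two translation modes might be approximated without the &lt;code&gt;iconv&lt;/code&gt; library, using &lt;code&gt;String#encode&lt;/code&gt; on Ruby 1.9 or later.  This is an assumption-laden stand-in, not the same feature:  &lt;code&gt;String#encode&lt;/code&gt; has no built-in transliteration, so the &lt;code&gt;substitutes&lt;/code&gt; table below is a hypothetical, hand-written one that only knows about the ellipsis:&lt;/p&gt;

```ruby
# Sketch only: assumes Ruby 1.9 or later and uses String#encode,
# not the iconv library.

on_and_on = "On and on\u2026 and on\u2026"   # \u2026 is the ellipsis character

# Roughly //IGNORE: replace unconvertible characters with nothing.
ignored = on_and_on.encode("ISO-8859-1", undef: :replace, replace: "")
ignored    # => "On and on and on"

# Roughly //TRANSLIT, limited to what this hypothetical table lists.
substitutes = { "\u2026" => "..." }
translit = on_and_on.encode("ISO-8859-1", fallback: substitutes)
translit   # => "On and on... and on..."
```

&lt;p&gt;A character missing from the table would still raise an error in the second call, which is the main way this differs from letting &lt;code&gt;iconv&lt;/code&gt; pick the replacements for you.&lt;/p&gt;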

&lt;p&gt;When things go wrong, the Ruby interface will raise exceptions like &lt;code&gt;Iconv::InvalidEncoding&lt;/code&gt; or &lt;code&gt;Iconv::InvalidCharacter&lt;/code&gt;.  See &lt;a href="http://www.ruby-doc.org/stdlib-1.8.6/libdoc/iconv/rdoc/Iconv.html"&gt;the documentation&lt;/a&gt; for details.&lt;/p&gt;
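
&lt;p&gt;The same two kinds of failure show up if you convert with &lt;code&gt;String#encode&lt;/code&gt; on Ruby 1.9 or later instead of the &lt;code&gt;iconv&lt;/code&gt; library, though the exception classes are different ones.  The following is only a sketch of that alternative API, with variable names of my own choosing:&lt;/p&gt;

```ruby
# Sketch only: assumes Ruby 1.9 or later and uses String#encode,
# not the iconv library or its exception classes.

# Case 1: a valid character that the target encoding does not have.
begin
  "On and on\u2026".encode("ISO-8859-1")
rescue Encoding::UndefinedConversionError => error
  undefined_char = error.error_char    # the character that had no mapping
end

# Case 2: bytes that are not valid in the claimed source encoding
# (here, Latin-1 bytes wrongly labeled as UTF-8).
begin
  "R\xE9sum\xE9".dup.force_encoding("UTF-8").encode("ISO-8859-1")
rescue Encoding::InvalidByteSequenceError => error
  invalid_bytes = error.error_bytes    # the offending byte(s)
end
```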

&lt;p&gt;The Ruby 1.8 library does not provide a way to programmatically list the supported encodings, which is one of the big reasons I started off showing you the command-line program instead.  You will need to check them there.  However, Ruby 1.9 adds a method for this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ ruby_dev -r iconv -r pp -ve 'pp Iconv.list'
ruby 1.9.0 (2008-10-10 revision 0) [i386-darwin9.5.0]
[["ANSI_X3.4-1968",
  "ANSI_X3.4-1986",
  "ASCII",
  "CP367",
  "IBM367",
  "ISO-IR-6",
  "ISO646-US",
  "ISO_646.IRV:1991",
  "US",
  "US-ASCII",
  "CSASCII"],
 ["UTF-8", "UTF8"],
…
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This concludes our tour of character encoding tools for Ruby 1.8.  In later posts, we will take a step back from all of this and examine what the problems with this system are.  That will pave the way for us to discuss the new m17n (multilingualization) code in Ruby 1.9.&lt;/p&gt;</content>
    <author>
      <name>James Edward Gray II</name>
    </author>
  </entry>
</feed>
