In a recent project at work, we came across a quite dastardly problem that had to do with the bane of every programmer’s existence: incorrect character set translation. Some background here: We’re reading Excel sheets using the JExcelAPI, a Java library for parsing and writing Excel files (which, apart from this, has been pretty darn good). We really need characters like ü, ã, ô, etc. because we’re hoity-toity and like to be correct when we spell Zürich, São Paulo, Düsseldorf, and Côte d’Ivoire. The last piece of the puzzle is that we’re using Excel for Macintosh.
So here’s what’s going wrong: Excel seems to always store its strings in Latin1, regardless of operating system. JExcel, however, thinks that if it’s running on a Mac, it should convert strings from Mac Roman. The result is that when it converts to Unicode (which, as you well know, is the encoding Java naturally uses for its strings), it gets things horribly mangled.
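To make the mismatch concrete, here is a minimal sketch that uses only the JDK, not JExcelAPI itself. It takes the Latin1 bytes for "Zürich" and decodes them once as Latin1 and once as Mac Roman. The class and method names are made up for illustration, and it assumes your JDK ships the extended charset "x-MacRoman" (standard JDK builds do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of JExcelAPI.
class EncodingMismatchDemo {

    // Encode the text as Latin1 bytes (how Excel seems to store it),
    // then decode those bytes with whatever charset the reader assumes.
    static String decodeLatin1BytesAs(String text, Charset assumed) {
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1Bytes, assumed);
    }

    public static void main(String[] args) {
        String original = "Z\u00fcrich"; // "Zürich"

        // Assumption: the JDK provides Mac Roman under the name "x-MacRoman".
        Charset macRoman = Charset.forName("x-MacRoman");

        // Correct assumption: the string survives the round trip.
        System.out.println(decodeLatin1BytesAs(original, StandardCharsets.ISO_8859_1));

        // Wrong assumption: the byte for "ü" maps to a different character.
        System.out.println(decodeLatin1BytesAs(original, macRoman));
    }
}
```

Only the non-ASCII characters are affected, which is why plain English text looks fine and the bug only shows up on names like Zürich.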
BTW, this is a good resource for seeing how all of the Unicode characters can be represented.
The solution is to set the jxl.encoding system property to
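For reference, a system property can be set from code like this. This is only a sketch: the property name comes from the post, but the value "ISO-8859-1" (Java's name for Latin1) is my assumption, chosen because Excel appears to store Latin1. The class name is hypothetical too. Presumably the property has to be set before the library opens any workbook.

```java
// Hypothetical helper class; not part of JExcelAPI.
class JxlEncodingProperty {

    // Property name as given in the post.
    static final String PROPERTY = "jxl.encoding";

    // ASSUMPTION: Java's canonical name for Latin1. Substitute the value
    // that actually matches how your spreadsheets are encoded.
    static final String ASSUMED_VALUE = "ISO-8859-1";

    // Set the property programmatically, before any spreadsheet is read.
    static void configure() {
        System.setProperty(PROPERTY, ASSUMED_VALUE);
    }

    public static void main(String[] args) {
        configure();
        System.out.println(PROPERTY + "=" + System.getProperty(PROPERTY));
    }
}
```

The same thing can be done without touching code by passing the standard JVM flag -Djxl.encoding=ISO-8859-1 on the java command line (again, with the value being an assumption).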
So I just deployed a Rails application that uses RJB (the Ruby Java Bridge) for the first time. Yeesh.
Here’s the thing: you of course need to have the environment variables set up in order to gem install rjb. Unfortunately, Capistrano doesn’t use these environment variables in its commands. Without them, you get a notice like this:
can't create Java VM
Bummer. I ended up following these instructions. There’s probably a better way, but this seems to work.