Alex's blog

Storing Unicode in an old Oracle database

I have a Java application that stores data in an old Oracle database. The database’s charset was originally set to Win-1252 and I don’t have the option of changing it.

It turns out Oracle has an alternative set of column types (NCHAR, NVARCHAR2, NCLOB) which can be used to store unicode in a database with a different encoding.

Converting columns

To convert a column we first add a new temporarily named column with the new type (e.g. NVARCHAR2, NCLOB). Delete the old column. Rename the temp column to the correct name.

ALTER TABLE items ADD (nametmp NVARCHAR2(256));
UPDATE items SET nametmp = TO_NCHAR(name);
ALTER TABLE items RENAME COLUMN nametmp TO name;

JDBC issues

To enable the thin JDBC driver to send NCHAR strings by default set connection property defaultNChar=true or system property oracle.jdbc.defaultNChar=true.

Even with this set I found Unicode characters could only be passed via bound parameters. String literals in queries, even with the N string flag like SET name = N'example' silently had their characters replaced.