Sql-server – When must we use NVARCHAR/NCHAR instead of VARCHAR/CHAR in SQL Server


Is there a rule when we must use the Unicode types?

I have seen that most of the European languages (German, Italian, English, …) are fine in the same database in VARCHAR columns.

I am looking for something like:

  1. If you have Chinese –> use NVARCHAR
  2. If you have German and Arabic –> use NVARCHAR

What about the collation of the server/database?

I don't want to use always NVARCHAR like suggested here
What are the main performance differences between varchar and nvarchar SQL Server data types?

Best Solution

The real reason you want to use NVARCHAR is when you have different languages in the same column, you need to address the columns in T-SQL without decoding, you want to be able to see the data "natively" in SSMS, or you want to standardize on Unicode.

If you treat the database as dumb storage, it is perfectly possible to store wide strings and different (even variable-length) encodings in VARCHAR (for instance UTF-8). The problem comes when you are attempting to encode and decode, especially if the code page is different for different rows. It also means that the SQL Server will not be able to deal with the data easily for purposes of querying within T-SQL on (potentially variably) encoded columns.

Using NVARCHAR avoids all this.

I would recommend NVARCHAR for any column which will have user-entered data in it which is relatively unconstrained.

I would recommend VARCHAR for any column which is a natural key (like a vehicle license plate, SSN, serial number, service tag, order number, airport callsign, etc) which is typically defined and constrained by a standard or legislation or convention. Also VARCHAR for user-entered, and very constrained (like a phone number) or a code (ACTIVE/CLOSED, Y/N, M/F, M/S/D/W, etc). There is absolutely no reason to use NVARCHAR for those.

So for a simple rule:

VARCHAR when guaranteed to be constrained NVARCHAR otherwise

Related Question