In our last session together, one of the many items we talked about was the fact that I asked that you use the DATALENGTH function instead of the LEN function because you could be using UNICODE data.
The LEN function is used in SQL to find out – surprise, surprise – the length of a string.
Reviewing our NirvanaPerformances table and seeing the length of many of these fields, we see the following:
To get some UNICODE data, I’m going to create a new copy of this glorious table using only UNICODE data.
Notice that instead of the VARCHAR that you’re probably used to seeing, there is NVARCHAR in its place! (In the case of CHAR, there is now NCHAR!)
Looking at the definition of the tables, that’s really the only difference you’ll see. It’s when you look at the data, that you can see the real difference in using UNICODE data versus non-UNICODE (or ASCII) characters.
For future reference, all UNICODE characters can be found here.
Now, we run one command to copy all of the data that is currently in dbo.NirvanaPerformances into our new table.
INSERT INTO dbo.NirvanaPerformancesUnicode
SELECT *
FROM dbo.NirvanaPerformances
Nothing up my sleeve. No magic here.
Now, we look at our new table and the length (and datalength) of a few fields!
Notice that the new DataLength column is twice the size of the length column! The UNICODE characters take twice the amount of space as the ASCII character version!
I leave the checking of the length, and datalength, of the columns for the ASCII version of the table to the reader.
Point to Ponder:
- If you truly need the UNICODE characters in your data, go ahead and use them! If not though, please make your DBA happy by not using them. Since UNICODE characters take up twice the amount of space as the ASCII versions do, then your DBAs will recommend to use the ASCII versions if you are not going to be using any UNICODE characters.
So, hopefully, you know and understand a lot more about UNICODE data now. And with that – go forth and code!
1 comment