TSQL Concatenation

concatenationtsql

I often need to concatenate fields in TSQL…

Two issues TSQL forces you to deal with when using the '+' operator are Data Type Precedence and NULL values.

With Data Type Precedence, the problem is conversion errors.

1) SELECT 1 + 'B' = Conversion ERROR
2) SELECT 1 + '1' = 2
3) SELECT '1' + '1' = '11'

In 2), the varchar '1' is implicitly converted to an int, and the math works. However, in 1), the int 1 is NOT implicitly converted to a varchar. This is where DTP is (IMO) getting in the way. Essentially, it favors Math functions over String functions. I Wish 🙂 that DTP wasn't even a consideration in this case — why wouldn't the '+' operator be configured so that the operation could favor success over specific data-types? I wouldn't mind if it still favored MATH over String functions when possible — but why doesn't it favor String functions over Errors? (The only way to be successful in 1) is to treat it as a string function — so it's not like there's any ambiguity there.) Somebody at Microsoft thought that throwing an error in 1) would be more valuable to the programmer than treating the '+' as a string function. Why? And why didn't they provide a way to override it? (Or did they…that's really the heart of my question.) SET STRING_PREFERENCE ON would have been nice! 😛

In order to deal with this, you have to do more work — you have to explicitly convert the 1 to a varchar, using any number of different string functions — typically CAST/CONVERT, but also many others (like LTRIM()) will work.

Conversions become work-intensive when you deal with table fields when you don't know the data-type. This might work:

SELECT 'Fall '  + ' (' + [Term] + ')' -- Output: Fall (2011)

But then again, it might not. It just depends on what data-type of [Term] is. And to complicate that, the dba might change the dataype at some point without telling anyone (because it came as part of a big upgrade package once the vendor finally realized that there are only ever numbers stored in the [Term] field, or whatever reason).

So if you want to be a boyscount, you do this:

SELECT 'Fall '  + ' (' + LTRIM([Term]) + ')'

So now I'm running this LTRIM function every time, even though it might not be necessary, because I don't know the data-type of [Term] (OK — I can look that up, but that's almost like work, and I don't like interruptions while I'm coding 😛 *grump), and also, I don't know that the data-type will never change.

The second issue you have to confront with TSQL concatenation is how to deal with NULL values. For example, this would fail:

SELECT NULL + 'B'

So you need to do this:

SELECT 'Fall '  + ' (' + LTRIM(ISNULL([Term],'')) + ')'

What a pain — I wish I could just do this:

SELECT 'Fall '  + ' (' + [Term] + ')'

So I'm wondering if there are any (TSQL) ways to avoid having to do explicit data-type conversions and null checks on every field where I have to ensure the '+' operator behaves itself as I need it to.

Thanks!

EDIT
@a1ex07 came up with a great answer for working around the NULL issue (SET CONCAT_NULL_YEILDS_NULL OFF), but as I looked into it, it appears to be problematic as far as forcing stored procedures to re-compile every time they're executed.

Best Answer

SQL Server 2012 does have the CONCAT function which addresses all the issues you raise.

A good summary of the functionality is provided here by SQL Menace

CONCAT takes a variable number of string arguments and concatenates them into a single string. It requires a minimum of two input values; otherwise, an error is raised. All arguments are implicitly converted to string types and then concatenated. Null values are implicitly converted to an empty string. If all the arguments are null, then an empty string of type varchar(1) is returned. The implicit conversion to strings follows the existing rules for data type conversions