There are several things about this that shocked me:
- I encountered this bug over several months in two completely separate pieces of code, unrelated to each other at all.
- The bug is reproducible in SQL Server 2012, 2014, 2016, and 2017 (v11-v14).
- In light of the above, no one had noticed or reported this! Apparently Microsoft was unaware of the bug, and despite me running into it twice in two completely different contexts, no one else had? Just seems odd and not what I would expect given the extremely wide distribution of the product.
Let's reproduce! No, *ahem*, that is not intended as a strikingly unromantic pickup line, what I meant was, let's reproduce the problem in code, first the setup, of two input tables and an output table. Note the input table A has two values, both of which map to the shorter length values of input table B (col1 = 1,2):
USE tempdb
CREATE TABLE InputTblA (col1 INT, col2 CHAR(10));
CREATE TABLE InputTblB (col1 INT, col2 CHAR(10));
CREATE TABLE OutputTbl (col1 INT, col2 CHAR(5));
INSERT INTO InputTblA VALUES (1,'value'),(2,'value');
INSERT INTO InputTblB VALUES
(1,'short'),(1,'short'),(1,'short'),(1,'short'),
(2,'short'),(2,'short'),(2,'short'),(2,'short'),
(3,'longer'),(3,'longer'),(3,'longer'),(3,'longer');
So, now that we have those tables set up, let's do our insert.
INSERT INTO OutputTbl
SELECT a.col1, b.col2
FROM InputTblA a
INNER JOIN InputTblB b ON a.col1 = b.col1
We get, in this case, the expected and valid "(8 row(s) affected)". Looking at the execution plan, we see that it starts with InputTblA and then hash joins to InputTblB.
Normally SQL Server does this right, it's just when the table order gets just so that the bug shows up...so we must force it, using FORCE ORDER.
INSERT INTO OutputTbl
SELECT a.col1, b.col2
FROM InputTblB b
INNER JOIN InputTblA a ON a.col1 = b.col1
OPTION (FORCE ORDER)
Msg 8152, Level 16, State 14, Line 26
String or binary data would be truncated.
The statement has been terminated.
And here's the plan...we can see it scanned InputTblB and ran a compute scalar before joining to TblA which had the constraining data:
That's all fairly easy to duplicate and I went a step further, utilizing the latest feature of SQL 2019 backported to SQL 2017 CU12, the advanced truncation error message 2628 that actually calls out truncated columns and values. Once patched to CU12 I repeated my test with trace flag 460 enabled, and the error message was more descriptive:
Msg 2628, Level 16, State 1, Line 26
String or binary data would be truncated in
table 'tempdb.dbo.OutputTbl', column 'col2'. Truncated value: 'longe'.
The statement has been terminated.
It lists the truncated value that we know by definition is impossible, with that query and data, to be in the result set.
Having successfully hunted a Microsoft SQL Server Query Processor bug, I was pleased to notch my tag and cape the hide for my taxidermist, who will mount it for me for my living room wall. The product team at Microsoft has acknowledged to me (via the Premier Support rep I've been dealing with) that this is in fact a defect and they will be fixing it in a subsequent release (hopefully backported via CU or SP).