I recently made an unpleasant discovery at work with a Postgres database: I ended up with duplicate records, even though I had a unique constraint on the table.
As an example, take this table:
CREATE TABLE test_table(
    id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    a_id int NOT NULL,
    b_id int NOT NULL,
    c_id int NULL,
    CONSTRAINT uq_test_table UNIQUE (a_id, b_id, c_id)
);
If I try to insert this record twice:
INSERT INTO test_table(a_id, b_id, c_id) VALUES (1, 2, 3);
I’ll get a violation of the constraint for the second insert attempt.
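The error looks something like this:

ERROR:  duplicate key value violates unique constraint "uq_test_table"
DETAIL:  Key (a_id, b_id, c_id)=(1, 2, 3) already exists.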
But I can insert this record twice:
INSERT INTO test_table(a_id, b_id, c_id) VALUES (1, 2, NULL);
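Both inserts succeed, and a quick check confirms the two rows are now sitting side by side:

SELECT id, a_id, b_id, c_id FROM test_table WHERE a_id = 1 AND b_id = 2 AND c_id IS NULL;
-- returns two rows, differing only by id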
So there’s an issue with the nullable column and the constraint.
In SQL Server, the second pair of inserts would also have raised a constraint violation. I knew that SQL Server's behavior isn't compliant with the ANSI standard, but after years of working with it, I took that behavior for granted. The Postgres behavior makes sense: the constraint is effectively comparing NULL to NULL, and since that comparison doesn't evaluate to true, the two records can't be considered true duplicates.
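You can see the comparison behavior directly:

SELECT NULL = NULL;  -- returns NULL, not true, so the constraint can't treat the two rows as equal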
So I needed a workaround to keep duplicate records out of my table, and to take NULLs into consideration.
Obviously, the best solution would be not to have nullable columns in the constraint at all, so that there are always values to compare. But in my case, I needed to allow for NULLs.
The best I came up with was to wrap the nullable column in a COALESCE, substituting a sentinel value when it's NULL. I couldn't use a COALESCE expression in a unique constraint, but I could in a unique index.
CREATE UNIQUE INDEX ix_test_table_unique ON test_table (a_id, b_id, COALESCE(c_id, -1));
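With the index in place (after clearing out any existing duplicates, since the index can't be built over them), repeating the NULL insert now fails with an error along these lines:

INSERT INTO test_table(a_id, b_id, c_id) VALUES (1, 2, NULL);
-- ERROR:  duplicate key value violates unique constraint "ix_test_table_unique"

The sentinel (-1 here) just needs to be a value that can never legitimately show up in c_id.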
So this lets me replicate the behavior I'm accustomed to from SQL Server, and helps keep these ‘duplicate’ records out of the table.