When should IGNORE_DUP_KEY option be used on an index?


Someone said it's preferable to craft your queries to avoid duplicate key exceptions, but I'm not convinced that's more performant than just setting IGNORE_DUP_KEY = ON for the index.



My goal is to ensure a row or set of rows exists for one or more users before attempting to update those rows. That way, when I run an update statement like the one below and no rows are affected, I know it's because the [Count] portion of the predicate wasn't satisfied, as opposed to the row not existing at all (i.e. the [ID] portion of the predicate not being satisfied).



UPDATE [Inventory] SET [Count] = [Count] + 1 WHERE [ID] = 3 AND ([Count] + 1) <= @MaxInventory



I could run EXISTS (SELECT 1 FROM [Inventory] WHERE [ID] = 3) to check for that single row, and only insert the row if it does not exist. That simply avoids unnecessary inserts. The insert, if necessary, would still have to contend with concurrent transactions, so duplicate key exceptions can still occur.
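
For illustration, a minimal sketch of that check-then-insert pattern (the [Inventory] schema is assumed from the update above, and 0 as the initial [Count] is just a placeholder):

    IF NOT EXISTS (SELECT 1 FROM [Inventory] WHERE [ID] = 3)
    BEGIN
        -- A concurrent transaction can still insert [ID] = 3 between the check
        -- and this insert, so a duplicate key exception remains possible here.
        INSERT INTO [Inventory] ([ID], [Count]) VALUES (3, 0);
    END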



I'm curious whether it's more performant to just turn on IGNORE_DUP_KEY in this scenario, rather than allowing the error to be thrown and caught. Specifically, I'm curious whether just attempting to insert the record and letting it ignore duplicate keys is as fast as, or possibly even faster than, running an exists check first.
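
For clarity, the option is set on the unique index itself; a hypothetical example (the index name and schema here are assumptions):

    CREATE UNIQUE NONCLUSTERED INDEX UX_Inventory_ID
        ON [Inventory] ([ID])
        WITH (IGNORE_DUP_KEY = ON);

With this on, an insert that collides on [ID] raises the 'Duplicate key was ignored.' warning, skips the offending row, and lets the rest of the statement commit instead of failing.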



This becomes even more important when I'm checking for and initializing multiple records at once. For example, if I need to ensure that records for thousands of users exist before a single update statement, the logic would be much simpler if I just ran that insert statement up-front, letting it ignore duplicate keys. Avoiding duplicates would be more complex, because I'd have to first query the table for which records don't exist, then attempt to add just those records (again, ignoring duplicate keys; a sketch follows below). Just inserting may be faster, even if all the records exist. I could meet it halfway and check whether ANY of the records are missing, such as with a left join or a COUNT comparison, but why bother if the insert ignoring duplicate keys is just faster?
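
For illustration, the 'add just the missing records' variant could look like this (assuming the target IDs are staged in a hypothetical @UserIds table variable):

    INSERT INTO [Inventory] ([ID], [Count])
    SELECT u.[ID], 0                     -- placeholder initial count
    FROM @UserIds AS u
    LEFT JOIN [Inventory] AS i
        ON i.[ID] = u.[ID]
    WHERE i.[ID] IS NULL;                -- only the rows not already present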



Is it a good idea to use IGNORE_DUP_KEY and just attempt inserts, instead of bothering to check for row existence ahead of time? If not, why not?










sql-server index sql-server-2019






asked 8 hours ago by Triynko
edited 8 hours ago by Erik Darling

1 Answer

          It's definitely an atypical setting to enable for an index. I don't think you're going to find many people who jump up and talk about using it.



Of course, there are a couple of helpful posts on the subject by Paul White:



          • IGNORE_DUP_KEY slower on clustered indexes

          • A creative use of IGNORE_DUP_KEY

If your concern is around upsert patterns (or something similar), this article by Michael Swart is really informative:



          • SQL Server UPSERT Patterns and Antipatterns





• In the first article, it sounds like the insert proceeds as usual, still throws the exception, but catches and handles it internally after it tries to insert each row. That seems like a poor implementation, since it should be aware of the option and simply return a non-success code internally, rather than actually generating an exception. That's the whole 'exceptions shouldn't be used for control flow' idea. Good to know, though, and it certainly points towards a solution that doesn't rely on this setting. On the other hand, it also says performance could be faster if there are no duplicates.

            – Triynko
            7 hours ago












          • I'll likely mark this as the answer, because the 1st link alone provides all of the implementation details necessary to determine the best solution for this problem. The 'insert only and always' approach is certainly a no-go, especially for a clustered index, because of all the overhead of latches and errors involved. Since rows are only created once, incurring these costs with every call thereafter is senseless. On the other hand, including an 'exists' check beforehand will avoid that, such that IGNORE_DUP_KEY can remain on for concurrent insert collisions, which are rare in comparison.

            – Triynko
            7 hours ago











          • It's interesting that the 3rd link for UPSERT Patterns mentions use of IGNORE_DUP_KEY as an anti-pattern, but that's only for the 'always and only insert' pattern where we always attempt the insert. What I propose is keeping IGNORE_DUP_KEY on to resolve concurrency conflicts when an insert is needed, but keeping the pre-check to determine whether the insert is needed, which more closely resembles the 'better' patterns mentioned in that article.

            – Triynko
            6 hours ago












• One last thought. Assuming the pre-check is in place, IGNORE_DUP_KEY is probably still more efficient, even with the errors. For example, suppose I run a pre-check and determine there are 534 rows that need to be inserted. Meanwhile, a concurrent operation inserts one of those rows, such that when I attempt to insert the 534 rows I thought I needed, the operation fails. If it fails late in the insert, the transaction has to roll back all the rows it inserted. It's going to be much more efficient to turn IGNORE_DUP_KEY on and ignore that one failed row. It's also more efficient than serializable isolation.

            – Triynko
            5 hours ago












