Lessons from the Django Atomic Block Exception Handling

Today we dive into this obscure exception: Unhandled exception: An error occurred in the current transaction. You can’t execute queries until the end of the ‘atomic’ block.

What Happened

On a sunny Wednesday, we were excited to release a new version of our service to production, which handles around 1k to 5k active requests per second. As usual, we monitored the release for unexpected exceptions, and this one popped up a few times: Unhandled exception: An error occurred in the current transaction. You can’t execute queries until the end of the ‘atomic’ block.

The root cause was found quickly: a race condition in how one of our tables is handled. What troubled us was the strange message, which was not helping at all!

Our Original Code

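from django.db import models, transaction

class A(models.Model):
    pass  # Django adds an auto-incrementing integer `id` primary key

class B(models.Model):
    a = models.OneToOneField(A, on_delete=models.CASCADE)

def send_event(a):
    print(f"{a.id} deleted")

def delete_a(a):
    with transaction.atomic():
        try:
            a.delete()
            send_event(a)
        except A.DoesNotExist:
            pass  # it is ok
        other_db_query()  # placeholder for any later query in the same transaction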

Note that a race condition could already happen here: when two requests delete the same instance of A, we would likely send the event twice.

A refactor that creates this obscure exception

We thought turning send_event into a signal receiver would make the main business logic clearer:

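from django.db import models, transaction
from django.db.models.signals import post_delete
from django.dispatch import receiver

class A(models.Model):
    pass  # Django adds an auto-incrementing integer `id` primary key

class B(models.Model):
    a = models.OneToOneField(A, on_delete=models.CASCADE)

@receiver(post_delete, sender=B)
def send_event(sender, instance, **kwargs):
    b = instance
    a = b.a  # may hit the database to load the related A row
    print(f"{a.id} deleted")

def delete_a(a):
    with transaction.atomic():
        try:
            a.delete()  # cascades to B, which fires post_delete and runs send_event
        except A.DoesNotExist:
            pass  # it is ok
        other_db_query()  # placeholder for any later query in the same transaction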

What we expected to happen under the race condition was:

  1. Request 1 deletes the A instance, which cascade-deletes B and triggers the send_event signal receiver.
  2. Request 2 deletes the same A instance, does the same thing, but finishes first.
  3. In request 1, send_event raises A.DoesNotExist on the line a = b.a.
  4. The exception is caught and disregarded by the try/except block in delete_a.
  5. The other query proceeds as normal.

But in fact, we received Unhandled exception: An error occurred in the current transaction. You can’t execute queries until the end of the ‘atomic’ block. And the error was reported on some other DB query, not on the line that actually failed. Why?

Django’s Official Warning

Django’s official documentation (https://docs.djangoproject.com/en/5.1/topics/db/transactions/#controlling-transactions-explicitly) states that one should avoid catching exceptions inside atomic!

Why? Because inside an atomic block, a failed DB query not only raises an exception, it also marks the transaction as broken. This is so that the __exit__ handling of a with transaction.atomic() context can properly roll back the transaction.

In our case, the exception raised from a = b.a is indeed caught, but by then the transaction has already been marked as broken. So the next DB query fails with this obscure exception, which points at the wrong line and made the root cause much harder to find.
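To make the mechanism concrete, here is a toy model in plain Python. It is not Django's implementation: ToyConnection, toy_atomic, ToyTransactionError and simulate are hypothetical names made up for this sketch, and we assume that a.delete() runs its work inside an inner atomic block without a savepoint, which is what lets the exception from the signal receiver flag the outer transaction.

```python
from contextlib import contextmanager


class ToyTransactionError(Exception):
    """Stand-in for the 'can't execute queries' error (hypothetical name)."""


class ToyConnection:
    """Hypothetical, heavily simplified model of per-connection transaction state."""

    def __init__(self):
        self.needs_rollback = False
        self.depth = 0
        self.executed = []

    def execute(self, sql):
        # Once the transaction is flagged, every later query is refused.
        if self.needs_rollback:
            raise ToyTransactionError(
                "An error occurred in the current transaction. You can't "
                "execute queries until the end of the 'atomic' block."
            )
        self.executed.append(sql)


@contextmanager
def toy_atomic(conn, savepoint=True):
    """Toy version of an atomic block; only the rollback flag is modelled."""
    conn.depth += 1
    try:
        yield
    except Exception:
        if conn.depth > 1 and not savepoint:
            # Inner block without a savepoint: nothing to roll back to,
            # so flag the enclosing transaction as broken.
            conn.needs_rollback = True
        elif conn.depth > 1:
            # Inner block with a savepoint: rolling back to it cleans up.
            conn.needs_rollback = False
        raise
    finally:
        conn.depth -= 1
        if conn.depth == 0:
            # Outermost block ended (commit or rollback): fresh state.
            conn.needs_rollback = False


def simulate(nested_block_in_handler):
    """Replay delete_a; return the error hit by the later query, or None."""
    conn = ToyConnection()
    failure = None

    def handler():
        # Stands in for send_event, where the lookup of b.a fails.
        if nested_block_in_handler:
            try:
                with toy_atomic(conn):  # nested block with a savepoint
                    raise LookupError("A matching query does not exist.")
            except LookupError:
                pass
        else:
            raise LookupError("A matching query does not exist.")

    with toy_atomic(conn):  # the block in delete_a
        try:
            # Assumption: delete() opens an inner block without a savepoint.
            with toy_atomic(conn, savepoint=False):
                conn.execute("DELETE FROM b")
                handler()
                conn.execute("DELETE FROM a")
        except LookupError:
            pass  # "it is ok", but the flag is already set
        try:
            conn.execute("other_db_query")
        except ToyTransactionError as exc:
            failure = exc
    return failure


if __name__ == "__main__":
    print(repr(simulate(nested_block_in_handler=False)))
    print(repr(simulate(nested_block_in_handler=True)))
```

In this model, catching the exception in delete_a comes too late because the flag was set on the way out of the inner block. Catching it inside the handler, around a block that has its own savepoint, keeps the flag clear.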

What should be done in this situation?

Following the suggestion in Django’s documentation, if we need to handle an exception within an atomic block, it is best to wrap the risky code in a nested atomic block and put the try/except around that. This update fixes the issue:

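@receiver(post_delete, sender=B)
def send_event(sender, instance, **kwargs):
    b = instance
    try:
        with transaction.atomic():  # nested block: Django creates a savepoint here
            a = b.a
            print(f"{a.id} deleted")
    except A.DoesNotExist:
        pass  # the inner block has already rolled back to its savepoint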

Since a DoesNotExist exception is not something that actually breaks the database, we let the inner block handle the rollback (which does nothing) and set the transaction state back to normal, so that later DB queries can be executed normally.
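At the database level, a nested atomic block corresponds to a savepoint. The sketch below shows that idea with Python's standard sqlite3 module rather than Django, so the table, the savepoint name inner_block and the function savepoint_demo are all made up for illustration. The failure is rolled back to the savepoint, and the outer transaction carries on and commits.

```python
import sqlite3


def savepoint_demo():
    # isolation_level=None puts sqlite3 in autocommit mode, so the
    # BEGIN / SAVEPOINT / COMMIT statements below are fully explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY)")
    conn.execute("INSERT INTO a (id) VALUES (1)")

    conn.execute("BEGIN")  # outer transaction
    conn.execute("DELETE FROM a WHERE id = 1")

    conn.execute("SAVEPOINT inner_block")  # start of the nested block
    try:
        conn.execute("INSERT INTO a (id) VALUES (2)")
        # Simulate the handler failing part-way through the nested block.
        raise LookupError("A matching query does not exist.")
    except LookupError:
        # Undo only the work done since the savepoint, then discard it.
        conn.execute("ROLLBACK TO SAVEPOINT inner_block")
        conn.execute("RELEASE SAVEPOINT inner_block")

    # The outer transaction is still usable after the inner failure.
    conn.execute("INSERT INTO a (id) VALUES (3)")
    conn.execute("COMMIT")

    rows = [row[0] for row in conn.execute("SELECT id FROM a ORDER BY id")]
    conn.close()
    return rows


if __name__ == "__main__":
    print(savepoint_demo())
```

Row 1 is deleted by the outer transaction, row 2 is undone with the savepoint, and row 3 is inserted after the failure, so only row 3 remains once the transaction commits.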

Wrapping It All Up: The Art of Taming Transactions

In high-concurrency systems, even caught exceptions can corrupt transactions. Django’s atomic blocks demand careful handling — use nested transactions to isolate risks and follow framework warnings. Resilient code isn’t just functional; it’s designed for real-world complexity. Remember, every obscure error is a chance to learn and improve.

“Unless the Lord builds the house, those who build it labor in vain.”

Psalm 127:1

By schwannden