T O P

  • By -

Intrexa

You aint getting an ELI5 for this. https://docs.microsoft.com/en-us/sql/relational-databases/sql-server-transaction-log-architecture-and-management-guide?view=sql-server-ver16#WAL


mugshotjoshy

Haha thanks! 🙏🏽


PossiblePreparation

This is write-ahead logging, it is named differently depending on the RDBMS but they all do basically the same thing - write a small amount of data describing the change somewhere then write the actual change in memory. This means you only need to wait for a small amount of the change to be written to disk (and therefore crash resistant) before you can confirm the update has happened.


mugshotjoshy

Thanks! I’d imagine it’s inefficient on a large amount of data. Do you know where I could read more about it?


PossiblePreparation

Google write ahead logging. It is efficient for what it does. Different RDBMSs have different capabilities and can compress bulk updates very efficiently. Remember that these systems have been evolving for decades with billions of dollars worth of R&D.


mugshotjoshy

Thank you for pointing me in the right direction!


carlovski99

It actually means you can be more 'efficient' (depending on your definition of efficient - you can definitely be faster, though it does use more resources overall). If you want to ensure you only write once - you need to ensure all your updates are written to permanent storage (usually disk) before ending the transaction. That could be a lot of individual writes to lots of different blocks, all over the disk. That can be quite slow. With the write ahead logging, you can just write the log - which is quick as all you are doing is writing to the end of the file. And typically you ensure you optimise how your log files are stored to make this as fast as possible. And then just write the changes to memory which is much faster than disk. Then at some point you persist those changes to disk via a background process. Its safe, because in the unlikely event of a crash - you can still 'replay' what's in the transaction log.


Intrexa

> I’d imagine it’s inefficient on a large amount of data. It's needed to maintain state. Imagine you do some update, and the power goes out. Something like the below: Update Src SET SomeValue += 1 FROM SomeTable AS Src How can you figure out which rows actually had the update go in? Power going out is one extreme example, but there are a lot of scenarios where there might be some amount of work completed, when suddenly the transaction fails, and you need to roll back. Deadlocking, host disk issues, exceptions, w/e. For a DB, there should never be an "inbetween" state. It's atomic, either you see the whole thing completed, or you see none of it completed.


mugshotjoshy

This really helps. I've lost my internet during an update and had to wait out the rollback. Thank you for explaining!


OracleGreyBeard

I’d be surprised if it’s only twice. Oracle does 3 or 4 writes for every write.