Thread by @rakyll, My draft on "Things I wished developers knew about databases" is 80% [...]

My draft on "Things I wished developers knew about databases" is 80% complete. For the first time, I'm going to publish more than 15 articles in an article.

This article is live at https://medium.com/@rakyll/things-i-wished-more-developers-knew-about-databases-2d0178464f78. It touches on a variety of database topics that developers usually get to know through experience, data loss and outages.

1. You are lucky if 99.999% of the time network is not a problem.

How reliable is the network nowadays?

2. ACID has many meanings.

One step is to learn what ACID means, but another good step is to understand what ACID means for your database. There is a large spectrum of tradeoffs when implementing ACID properties, and not everyone agrees on the implementation details.

3. Each database has different consistency and isolation capabilities.

Talking about ACID, consistency and isolation properties deserve their own section because databases differ so much here. Martin Kleppmann’s research provides insights on how concurrency anomalies are handled differently.

4. Optimistic locking is an option when you can’t hold a lock.

Exclusive locks could be expensive and may cause single points of failure or high contention in hot-spots. Learn about optimistic locking and where it could be considered as an alternative.
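One common way to do optimistic locking is a version column that every write checks and bumps. The sketch below uses Python's built-in sqlite3 module; the `accounts` table, the `version` column and the `update_balance` helper are assumptions made up for illustration, not something from the article.

```python
import sqlite3


def update_balance(conn, account_id, new_balance, expected_version):
    """Write only if the row still has the version we read. Returns True on success."""
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when someone else changed the row after we read it.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO accounts VALUES (1, 100, 0)")
conn.commit()

# Two writers read the same row (version 0) without holding any lock.
balance, version = conn.execute(
    "SELECT balance, version FROM accounts WHERE id = 1"
).fetchone()

first_ok = update_balance(conn, 1, balance - 30, version)   # wins, version becomes 1
second_ok = update_balance(conn, 1, balance - 50, version)  # stale version, rejected

final_balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
```

The rejected writer would typically re-read the row and retry, so the cost of a conflict is paid only when a conflict actually happens.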

5. There are anomalies other than dirty reads and data loss.

Concurrency anomalies are a large catalog, although developers are often only good at recognizing dirty reads and data loss. We don't necessarily examine many of the lesser-known anomalies, but there are more.
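Write skew is one example of a lesser-known anomaly. The sketch below is a plain-Python simulation, not a real database: two "transactions" each read their own snapshot, check the same invariant, and write different rows. The on-call scenario and all names are illustrative assumptions.

```python
# Invariant we want to keep: at least one doctor stays on call.
on_call = {"alice": True, "bob": True}


def request_leave(snapshot, doctor):
    """Decide based on a snapshot; return the write this transaction would make."""
    others_on_call = sum(
        1 for name, is_on in snapshot.items() if is_on and name != doctor
    )
    if others_on_call >= 1:
        return {doctor: False}
    return {}


# Both transactions start from the same snapshot of the data.
snapshot_a = dict(on_call)
snapshot_b = dict(on_call)

write_a = request_leave(snapshot_a, "alice")
write_b = request_leave(snapshot_b, "bob")

# The writes touch different rows, so a write-write conflict check sees nothing wrong.
on_call.update(write_a)
on_call.update(write_b)

doctors_on_call = sum(on_call.values())
```

Neither transaction read dirty data and no write was lost, yet the invariant is broken: nobody is on call.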

6. My database and I don’t always agree on ordering.

What you see in your code base is not what you get. The order of transactions or operations cannot always be determined by reading the source code. Poor readability in particular introduces surprising cases and causes bugs.

7. Application-level sharding can live outside the application.

One misconception is that application-level sharding needs to live within the application. Application-level sharding is just application-space sharding. Sharding can be a service, and in large systems it often is.
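As a rough sketch of the idea, the routing decision can sit in its own component that applications call instead of embedding it. The `ShardRouter` class, the hash-modulo scheme and the shard addresses below are all hypothetical; a real sharding service would also have to handle resharding and failover.

```python
import hashlib


class ShardRouter:
    """Maps a key to a shard address. Could run behind an RPC endpoint."""

    def __init__(self, shard_addresses):
        self.shard_addresses = list(shard_addresses)

    def shard_for(self, key: str) -> str:
        # Use a stable hash so every caller gets the same answer for the same key.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shard_addresses)
        return self.shard_addresses[index]


# Hypothetical shard addresses.
router = ShardRouter(["db-0.internal:5432", "db-1.internal:5432", "db-2.internal:5432"])
target = router.shard_for("user:42")
```

Applications then ask the router where a key lives and stay unaware of how many shards exist.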

8. AUTOINCREMENT’ing can be harmful.

AUTOINCREMENT’ing is a common way of generating primary keys. It’s not uncommon to see cases where databases are used as ID generators and there are ID-generation designated tables in a database. I explain where it's better to generate PKs.
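The thread does not say which alternative it recommends. One common option, shown here purely as an assumption, is to generate a random UUID on the client so that no central counter is involved. The `orders` table and `insert_order` helper are made up for illustration.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, item TEXT NOT NULL)")


def insert_order(conn, item):
    """Generate the primary key in the client, then insert the row."""
    order_id = str(uuid.uuid4())  # random UUID, no round trip to an ID generator
    conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
    conn.commit()
    return order_id


first_id = insert_order(conn, "book")
second_id = insert_order(conn, "pen")
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The tradeoff is that random keys are larger and not ordered, which may matter for index locality in some databases.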

9. Stale data can be useful and lock-free.

Multi-version concurrency control (MVCC) allows databases to travel back in time (at least for a while). They might be providing lockless stale data. Learn about the use cases where stale data can be useful.
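To make the "travel back in time" idea concrete, here is a toy multi-version store in plain Python. It only illustrates reading an older version without taking a lock; it is not how any particular database implements MVCC, and the `VersionedStore` class and integer timestamps are assumptions.

```python
import bisect


class VersionedStore:
    """Keeps every version of a key, tagged with the timestamp of the write."""

    def __init__(self):
        # key -> (sorted list of timestamps, list of values in the same order)
        self._versions = {}

    def write(self, key, value, timestamp):
        timestamps, values = self._versions.setdefault(key, ([], []))
        if timestamps and timestamp <= timestamps[-1]:
            raise ValueError("timestamps must increase per key in this sketch")
        timestamps.append(timestamp)
        values.append(value)

    def read_at(self, key, timestamp):
        """Return the newest value written at or before `timestamp`, or None."""
        timestamps, values = self._versions.get(key, ([], []))
        index = bisect.bisect_right(timestamps, timestamp)
        return values[index - 1] if index else None


store = VersionedStore()
store.write("price", 10, timestamp=1)
store.write("price", 12, timestamp=5)

stale = store.read_at("price", 3)   # sees the version written at timestamp 1
fresh = store.read_at("price", 5)   # sees the version written at timestamp 5
```

A reader that can tolerate slightly old data, such as a report or a cache refresh, can use the stale read and never block writers.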

10. Clock skews happen between any clock sources.

Clock skews are everywhere and all timing APIs lie. We can't install atomic and GPS clocks everywhere and we gotta live with the fact we don't have accurate clocks. Learn approaches like TrueTime's and what they do differently.
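TrueTime exposes time as an interval rather than a single instant. The sketch below imitates that idea in plain Python; it is not Spanner's API, and `TimeInterval`, `now_with_uncertainty`, `definitely_before` and the 5 ms uncertainty bound are all hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeInterval:
    earliest: float
    latest: float


def now_with_uncertainty(epsilon_seconds, clock=time.time):
    """Report the current time as a range that should contain the true time."""
    t = clock()
    return TimeInterval(t - epsilon_seconds, t + epsilon_seconds)


def definitely_before(a, b):
    """True only when the intervals do not overlap, so the order is unambiguous."""
    return a.latest < b.earliest


# Fake clocks keep the example deterministic.
a = now_with_uncertainty(0.005, clock=lambda: 100.000)
b = now_with_uncertainty(0.005, clock=lambda: 100.004)  # within the uncertainty of a
c = now_with_uncertainty(0.005, clock=lambda: 100.020)  # clearly later than a

a_before_b = definitely_before(a, b)  # intervals overlap, order unknown
a_before_c = definitely_before(a, c)  # no overlap, order is safe to rely on
```

When two intervals overlap, the honest answer is "unknown", and a system has to either wait out the uncertainty or order the events some other way.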

11. Latency has many meanings.

If you ask ten people in a room what “latency” means, they may all have different answers. In databases, latency often refers to “database latency”, not the latency the client perceives.

12. Evaluate performance requirements per transaction.

Standardized benchmarking is not a healthy way to compare a database against your expectations. A more comprehensive approach is to evaluate critical operations (per query and/or per transaction).

13. Nested transactions can be harmful.

Not every database supports nested transactions but when they do, nested transactions may cause surprising programming errors that are not always easy to identify until it becomes clear that you are seeing anomalies.

14. Transactions shouldn’t maintain application state.

Clients sometimes retry transactions when networking issues happen. If a transaction relies on state that is mutated elsewhere, it might pick up the wrong value when there is a data race.

15. Query planners can tell about databases.

Query planners can tell you about your database, as well as about the things they cannot estimate. They have limited signals, but they also surface their limitations. You can then use that to fine-tune.
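As a small illustration, SQLite's `EXPLAIN QUERY PLAN` shows how the planner's choice changes once an index exists. The `users` table, the `idx_users_email` index and the `plan_for` helper are assumptions for this sketch, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")


def plan_for(conn, sql, params=()):
    """Return the human-readable plan steps for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


query = "SELECT name FROM users WHERE email = ?"

plan_before = plan_for(conn, query, ("a@example.com",))  # full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan_for(conn, query, ("a@example.com",))   # index lookup

for step in plan_before + plan_after:
    print(step)
```

Other databases expose the same kind of information through their own `EXPLAIN` variants, usually with cost and row estimates included.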

16. Online migrations are complex but possible.

Online, realtime or live migrations mean migrating from one database to another without downtime and without compromising data correctness. They are possible with one small and reversible step at a time.

17. Significant database growth introduces unpredictability.

It is not the lack of knowledge of your database internals or your experience with your database that will fail you when your usage is growing significantly. Growth is going to impact everything that surrounds your DB.
