All Posts

Published on
June 12, 2022
Schema Evolution In RDBMS
database paper-reading distributed-system DDL
Database schema evolution (DDL) is hard to run concurrently with transactions while preserving ACID properties. Traditional approaches use metadata locks (MDLs) to serialize DDL against DML/DQL, but this can severely impact performance. This post discusses the issues and explores solutions such as multi-version concurrency control that enable highly concurrent, non-blocking schema changes.
Published on
January 28, 2022
Fatal Distributed MDL Deadlock
database distributed-system lock
This post discusses the fatal problem of distributed metadata lock (MDL) deadlocks, which can occur when distributed transactions and data definition language (DDL) statements execute concurrently across multiple database nodes. It explains the causes and presents a solution for detecting these distributed MDL deadlocks.
Published on
October 24, 2021
Database Vectorization
database vectorization
I explore the motivation and techniques behind building a vectorized execution engine for distributed queries. Traditional tuple-at-a-time evaluation fails to utilize hardware efficiently at scale. By expressing queries as linear algebraic operations on batches of column vectors using generated kernels, significant performance gains can be achieved through improved data locality, reduced interpretation overhead and better utilization of CPU resources like SIMD units.
Published on
March 2, 2021
The Key to Automating Testing for Database Logic Bugs - Test Oracle
database testing
I explore approaches to automatically test databases for logic bugs that can severely impact applications and data integrity. Traditional testing struggles with database complexity, so I examine newer techniques such as Pivoted Query Synthesis (PQS), Non-Optimizing Reference Engine Construction (NoREC) and Ternary Logic Partitioning (TLP). By generating and validating randomized queries, these methods can reveal hidden logic errors not discovered by conventional testing.
Published on
October 8, 2020
CockroachDB Distributed Transactions - A Fascinating Interplay of HLC and MVCC
database paper-reading transaction distributed-system
This blog post explores the fascinating interplay between Hybrid Logical Clocks (HLC) and Multi-Version Concurrency Control (MVCC) in enabling distributed transactions in CockroachDB. It covers how HLC addresses causality issues and how MVCC handles concurrency conflicts in a distributed setting.
