<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Bemi Blog]]></title><description><![CDATA[The latest technical posts, announcements, and guides on automatic audit trail and data change tracking for PostgreSQL.]]></description><link>https://blog.bemi.io/</link><image><url>https://blog.bemi.io/favicon.png</url><title>Bemi Blog</title><link>https://blog.bemi.io/</link></image><generator>Ghost 5.75</generator><lastBuildDate>Mon, 13 Apr 2026 11:04:44 GMT</lastBuildDate><atom:link href="https://blog.bemi.io/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Announcing Our Pre-Seed Round]]></title><description><![CDATA[<p>We&#x2019;re excited to announce that Bemi has closed a pre-seed round led by Night Capital. Among the funds invested were Mucker Capital, Niche Capital, AngelList, Materialized View Capital, and several prominent angels. We&#x2019;re grateful to our investors and supporters for believing in our mission.</p><p>We&apos;</p>]]></description><link>https://blog.bemi.io/pre-seed/</link><guid isPermaLink="false">66b2f0fbd9dfee00016e3fe3</guid><category><![CDATA[Announcement]]></category><dc:creator><![CDATA[Arjun Lall]]></dc:creator><pubDate>Fri, 27 Sep 2024 16:55:00 GMT</pubDate><media:content url="https://blog.bemi.io/content/images/2024/09/blog-1.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.bemi.io/content/images/2024/09/blog-1.png" alt="Announcing Our Pre-Seed Round"><p>We&#x2019;re excited to announce that Bemi has closed a pre-seed round led by Night Capital. Among the funds invested were Mucker Capital, Niche Capital, AngelList, Materialized View Capital, and several prominent angels. 
We&#x2019;re grateful to our investors and supporters for believing in our mission.</p><p>We&apos;re on a mission to make Postgres data tracking infrastructure incredibly easy for developers. We have a team with deep technical knowledge in this space so that companies never have to go down the painful path of building and maintaining complex infrastructure with teams of data engineers, devops, database administrators, and software developers.</p><div class="kg-card kg-callout-card kg-callout-card-blue"><div class="kg-callout-emoji">?</div><div class="kg-callout-text">Read more in our exclusive with <a href="https://betakit.com/universe-alums-aim-to-leverage-developer-infrastructure-experience-with-bemi/?ref=blog.bemi.io">BetaKit</a></div></div><p>We&#x2019;ll use the funds to build out our engineering team and support our growth. Bemi is critical infrastructure for our customers, so we&#x2019;re doubling down on our continued heavy investments in reliability, <a href="https://bemi.io/security?ref=blog.bemi.io">security</a>, and <a href="https://trust.bemi.io/?ref=blog.bemi.io">compliance</a> to ensure we continue to meet their needs. </p><p>Postgres is the world&apos;s <a href="https://survey.stackoverflow.co/2024/?ref=blog.bemi.io">most loved database by developers</a>, and like Postgres, we&#x2019;re committed to transparency, extensibility, open source, and zero hosting vendor lock-in. </p><blockquote>
<p>We love working with the Bemi team. Their customer service is incredible &#x2014; responsive, knowledgeable, and always willing to go the extra mile. They&#x2019;re a talented team, and we have full confidence in their expertise.<br>
&#x2014; &#xC1;lvaro Serrano, CTO, <a href="https://klog.co/en/?ref=blog.bemi.io">KLog</a></p>
</blockquote>
<p>We&#x2019;ve got exciting product updates on the roadmap and can&#x2019;t wait to share our progress with you. Stay tuned!</p>]]></content:encoded></item><item><title><![CDATA[When Postgres Indexing Went Wrong]]></title><description><![CDATA[It’s important to understand the basics of indexing and the best practices around them for preventing system downtime. ]]></description><link>https://blog.bemi.io/indexing/</link><guid isPermaLink="false">66edd4c1d9dfee00016e4191</guid><category><![CDATA[Engineering]]></category><dc:creator><![CDATA[Arjun Lall]]></dc:creator><pubDate>Mon, 23 Sep 2024 05:08:00 GMT</pubDate><media:content url="https://blog.bemi.io/content/images/2024/09/image-671--2-.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.bemi.io/content/images/2024/09/image-671--2-.png" alt="When Postgres Indexing Went Wrong"><p>Indexing in Postgres seems simple, but it&#x2019;s important to understand the basics of how it really works and the best practices for preventing system downtime.</p><p>TLDR: Be careful when creating indexes &#x2014; a lesson I learned the hard way when concurrent indexing failed silently.</p><h2 id="critical-incident">Critical incident</h2><p>At a previous company, we managed a high-volume Postgres instance with billions of rows of transactional data. As we scaled, query performance became a key priority, and one of the first optimizations was adding indexes. To avoid downtime, we used <code>CREATE INDEX CONCURRENTLY</code>, which allows indexing large tables without locking out writes for hours. Initially, p99 query performance improved dramatically.</p><p>A few weeks later, another team launched a new feature that was built to rely heavily on the new index. Everything seemed routine&#x2014;until the traffic spiked.</p><p>At first, the problem was subtle. A few queries took longer than expected. But within hours, the load began to spike. Query response times slowed to a crawl, and some requests were timing out. 
</p><p>We couldn&#x2019;t immediately see why. The index was in place, and a quick <code>EXPLAIN ANALYZE</code> confirmed it was being used. But users were still experiencing massive slowdowns, and we were on the brink of a full-scale production outage.</p><p>It wasn&#x2019;t until we checked the server logs that we pieced together what happened:</p><pre><code class="language-sql">CREATE INDEX CONCURRENTLY idx_email_2019 ON users_2019 (email);
ERROR: deadlock detected
DETAIL: Process 12345 waits for ShareLock on transaction 54321; blocked by process 54322.
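-- The failed build leaves the half-built index behind in an INVALID state.
-- One recovery option (per the Postgres docs) is to drop it and retry:
-- DROP INDEX idx_email_2019;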
</code></pre><h2 id="concurrent-indexing-can-fail-silently"><strong>Concurrent indexing can fail (silently)</strong></h2><p>Concurrent indexing needs more total work than a standard index build and takes much longer to complete. It uses a two-phase approach that helps avoid locking the table:</p><ul><li><strong>Phase 1:</strong> A snapshot of the current data gets taken, and the index is built on that.</li><li><strong>Phase 2:</strong> Postgres then catches up with any changes (inserts, updates, or deletes) that happened during phase 1.</li></ul><p>Because this build runs alongside normal traffic, the <code>CREATE INDEX CONCURRENTLY</code> command can fail partway through, leaving an incomplete index behind. An &#x201C;invalid&#x201D; index is ignored during querying, but this oversight can have serious consequences if not monitored.</p><pre><code>postgres=# \d users_emails_2019
       Table &quot;public.users_emails_2019&quot;
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
  ...   |            |           |          |
Indexes:
    &quot;idx&quot; btree (email) INVALID
</code></pre><p>In our case, the issue was amplified by the fact that our data was partitioned. The index had failed on some partitions but not others, leading to a situation where some queries were using the index while others were hitting unindexed partitions. This imbalance resulted in uneven query performance and significantly increased load on the system.</p><p>If we hadn&#x2019;t caught it when we did, we would have faced a full-blown production outage, impacting every user on the platform.</p><h2 id="best-practices-for-postgres-indexing"><strong>Best practices for Postgres indexing</strong></h2><p>To help others navigate this terrain, here are some best practices for Postgres indexing that can prevent these issues:</p><h3 id="avoid-dangerous-operations"><strong>Avoid dangerous operations</strong></h3><p>Always use the <code>CONCURRENTLY</code> flag when creating indexes in production. Without it, even smaller tables can block writes for unacceptably long, leading to system downtime. While <code>CONCURRENTLY</code> takes more CPU and I/O, the trade-off is worth it to maintain availability. Keep in mind that concurrent index builds can only happen one at a time on the same table, so plan accordingly for large datasets.</p><h3 id="monitor-concurrent-index-creation-closely"><strong>Monitor concurrent index creation closely</strong> </h3><p>Don&#x2019;t take successful index creation for granted. The system table <code>pg_stat_progress_create_index</code> can be queried for progress reporting while indexing is taking place.</p><pre><code class="language-sql">postgres=# SELECT * FROM pg_stat_progress_create_index;
-[ RECORD 1 ]------+---------------------------------------
pid                | 896799
datid              | 16402
datname            | postgres
relid              | 17261
index_relid        | 136565
command            | CREATE INDEX CONCURRENTLY
phase              | building index: loading tuples in tree
lockers_total      | 0
lockers_done       | 0
current_locker_pid | 0
blocks_total       | 0
blocks_done        | 0
tuples_total       | 10091384
tuples_done        | 1775295
partitions_total   | 0
partitions_done    | 0
</code></pre><h3 id="manually-validate-indexes"><strong>Manually validate indexes</strong></h3><p>If you don&#x2019;t check your indexes, you might think you&#x2019;re able to rely on them when you can&#x2019;t. And although an invalid index gets ignored during querying, it still adds overhead to every write. Common causes for index failures include:</p><ul><li>Deadlocks: Index creation might conflict with ongoing transactions, leading to deadlocks.</li><li>Disk Space: Large indexes may fail due to insufficient disk space.</li><li>Constraint Violations: Creating unique indexes on columns with non-unique data will result in failures.</li></ul><p>You can find all invalid indexes by running the following:</p><pre><code>SELECT * FROM pg_class, pg_index WHERE pg_index.indisvalid = false AND pg_index.indexrelid = pg_class.oid;
</code></pre><p>You can also query the <code>pg_stat_all_indexes</code> and <code>pg_statio_all_indexes</code> system views to verify that the index is being accessed.</p><h3 id="fix-invalid-indexes">Fix invalid indexes</h3><p>Invalid indexes can be recovered using the <code>REINDEX</code> command. It&#x2019;s the same as dropping and recreating the index, except it would also lock out reads that attempt to use that index (if not specifying <code>CONCURRENTLY</code>). Note that <code>REINDEX CONCURRENTLY</code> isn&#x2019;t supported in versions below Postgres 12.</p><pre><code class="language-sql">REINDEX INDEX CONCURRENTLY idx_users_email_2019;
</code></pre><p>If a problem occurs while rebuilding the indexes, it&#x2019;d leave behind a new invalid index suffixed with&#xA0;<code>_ccnew</code>. Drop it and retry&#xA0;<code>REINDEX CONCURRENTLY</code>.</p><pre><code class="language-sql">postgres=# \d users_2019
       Table &quot;public.users_2019&quot;
 Column |  Type   | Modifiers
--------+---------+-----------
 email  | text    |
Indexes:
    &quot;idx_users_email_2019&quot; btree (email) INVALID
    &quot;idx_users_email_2019_ccnew&quot; btree (email) INVALID
</code></pre><p>If the invalid index is suffixed with <code>_ccold</code>, it&#x2019;s the original index that wasn&#x2019;t fully replaced. You can safely drop it, as the rebuild has succeeded.</p><h3 id="create-partition-indexes-consistently"><strong>Create partition indexes consistently</strong></h3><p>For newly created partitioned tables or small ones (&lt;100k rows), you can simply create the index synchronously on the parent table, and it&apos;ll automatically propagate to all partitions, including any newly created ones in the future.</p><pre><code>CREATE INDEX idx_users_email ON users (email);
</code></pre><p>But it&#x2019;s currently not possible to use the <code>CONCURRENTLY</code> flag when creating an index on the root partitioned table. What you should use instead is the <code>ONLY</code> keyword. It tells Postgres not to apply the index recursively to child partitions, so they aren&#x2019;t locked.</p><pre><code class="language-sql">-- Create an index on the parent table (metadata-only operation)
CREATE INDEX idx_users_email ON ONLY users (email);
</code></pre><p>This creates an invalid index first. Then we can create indexes for each partition and attach them to the parent index:</p><pre><code class="language-sql">CREATE INDEX CONCURRENTLY idx_users_email_2019
    ON users_2019 (email);
ALTER INDEX idx_users_email
    ATTACH PARTITION idx_users_email_2019;

CREATE INDEX CONCURRENTLY idx_users_email_2020
    ON users_2020 (email);
ALTER INDEX idx_users_email
    ATTACH PARTITION idx_users_email_2020;

-- repeat for all partitions
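
-- Optional sanity check (hypothetical): the parent index becomes valid only
-- after every partition index has been attached.
SELECT indisvalid FROM pg_index WHERE indexrelid = &apos;idx_users_email&apos;::regclass;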
</code></pre><p>Only once all partition indexes are attached will the root table&#x2019;s index automatically be marked as valid. The parent itself is just a &#x201C;virtual&#x201D; table without any storage, but it serves to ensure all partitions maintain a consistent indexing strategy.</p><h3 id="check-the-query-execution-plan"><strong>Check the query execution plan</strong></h3><p>Using the <code>EXPLAIN ANALYZE</code> command provides a comprehensive view of the query execution plan, detailing how Postgres processes your query. This breakdown is essential for verifying that the expected indexes are being utilized effectively.</p><pre><code class="language-sql">EXPLAIN ANALYZE SELECT * FROM users_2019 WHERE email = &apos;arjun@bemi.io&apos;;

Index Scan using idx_users_email_2019 on users_2019  (cost=0.15..0.25 rows=1 width=48) (actual time=0.015..0.018 rows=1 loops=1)
  Index Cond: (email = &apos;arjun@bemi.io&apos;::text)
Planning Time: 0.123 ms
Execution Time: 0.028 ms
</code></pre><h3 id="remove-unused-indexes"><strong>Remove unused indexes</strong></h3><p>Sometimes the indexes we add aren&#x2019;t as valuable as expected. To prune our indexes to optimize write performance, we can check which indexes haven&#x2019;t been used:</p><pre><code class="language-sql">select 
    indexrelid::regclass as index, relid::regclass as table 
from 
    pg_stat_user_indexes 
    JOIN pg_index USING (indexrelid) 
where 
    idx_scan = 0 and indisunique is false;
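
-- Note: these statistics are tracked per server, so check replicas too
-- (an index unused on the primary may still serve reads elsewhere).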
</code></pre><p>By implementing these best practices, you can avoid scary mistakes. Remember to monitor, validate, and understand the implications of your indexing strategy. The cost of overlooking these details can be significant, and a proactive approach will help you maintain a stable and efficient database.</p><p><em>At </em><a href="https://bemi.io/?ref=blog.bemi.io"><em>Bemi</em></a><em>, we specialize in handling audit trails at large volumes, where storage optimization and the right indexing strategies are crucial. We have to deeply understand Postgres storage and indexing internals to ensure 100% reliability and performance. We&#x2019;ve had to build out index health monitoring at scale and also automated safeguards to ensure indexes are always valid and queries optimized. In a future blog, I&#x2019;ll share some of the internal performance tooling and tech we use under the hood.</em></p><p><em>But when Postgres indexing isn&apos;t enough to scale, check out the </em><a href="https://github.com/BemiHQ/BemiDB?ref=blog.bemi.io" rel="noopener"><em>BemiDB GitHub repo</em></a><em> for handling analytical workloads on Postgres.</em></p>]]></content:encoded></item><item><title><![CDATA[It’s Time to Rethink Event Sourcing]]></title><description><![CDATA[The traditional approach to implementing Event Sourcing comes with many challenges. 
In this blog post, I’ll share new ideas on how to achieve 80% of the Event Sourcing benefits with 20% effort.]]></description><link>https://blog.bemi.io/rethinking-event-sourcing/</link><guid isPermaLink="false">66d23304d9dfee00016e405e</guid><category><![CDATA[Engineering]]></category><dc:creator><![CDATA[Evgeny Li]]></dc:creator><pubDate>Tue, 03 Sep 2024 14:24:04 GMT</pubDate><media:content url="https://blog.bemi.io/content/images/2024/09/blog--7-.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.bemi.io/content/images/2024/09/blog--7-.png" alt="It&#x2019;s Time to Rethink Event Sourcing"><p>I&apos;ve always been fascinated by Event Sourcing (ES) and other Domain-Driven Design (DDD) concepts. At some point, I even built a prototype of an event-sourced system inspired by <a href="https://martinfowler.com/articles/lmax.html?ref=blog.bemi.io">LMAX</a>, a high-frequency trading platform that handles 6M orders per second.</p><p>Unfortunately, the traditional approach to implementing Event Sourcing comes with its own set of challenges. In this blog post, I&#x2019;ll share new ideas on how to achieve 80% of the Event Sourcing benefits with 20% effort.</p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-text"><i><em class="italic" style="white-space: pre-wrap;">Event Sourcing is a unicorn idea that captivates many developers, but it is rarely adopted and implemented successfully.</em></i></div></div><h2 id="why-use-event-sourcing">Why Use Event Sourcing</h2><p>At its core, Event Sourcing is a simple architectural design pattern. All data changes are recorded as an immutable sequence of events in an append-only store, which becomes the main source of truth for application data. 
That&#x2019;s it.</p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-text"><i><em class="italic" style="white-space: pre-wrap;">Event Sourcing is a simple, yet powerful concept.</em></i></div></div><p>This design pattern provides many advantages:</p><ul><li><strong>Data integrity</strong>. Unlike typical CRUD (Create/Read/Update/Delete) systems, stored events can&#x2019;t be modified to ensure data integrity.</li><li><strong>Auditability.</strong> The append-only store of events represents an audit trail that make it easy to track and audit changes.</li><li><strong>Traceability</strong>. Events contain the context such as the &#x2018;what&#x2019;, &#x2018;when&#x2019;, &#x2018;why&#x2019; and &#x2018;who&#x2019;, so you can easily trace and verify transactions.</li><li><strong>Compliance</strong>. Event store is a detailed log of all state changes, which is essential in regulated industries like finance, healthcare, etc.</li><li><strong>Rollbacks</strong>. If the current state is lost or corrupted, you can rebuild it by replaying the immutable events.</li><li><strong>Troubleshooting</strong>. The event store can be used for debugging and allows understanding the sequence of events leading to an issue.</li><li><strong>Time travel</strong>. Event Sourcing enables time travel capabilities by allowing you to reconstruct the previous state at any point in time.</li><li><strong>Enhanced analytics</strong>. It allows generating custom data representations (projections) to query historical data and identify patterns.</li><li><strong>Scalability and performance</strong>. 
Events can be handled asynchronously, which can improve performance and scalability.</li></ul><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/09/blog--10-.png" class="kg-image" alt="It&#x2019;s Time to Rethink Event Sourcing" loading="lazy" width="2000" height="1334" srcset="https://blog.bemi.io/content/images/size/w600/2024/09/blog--10-.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/09/blog--10-.png 1000w, https://blog.bemi.io/content/images/size/w1600/2024/09/blog--10-.png 1600w, https://blog.bemi.io/content/images/size/w2400/2024/09/blog--10-.png 2400w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Traditional Event Sourcing system</span></figcaption></figure><h2 id="event-sourcing-examples">Event Sourcing Examples</h2><p>Most of us use existing event-sourced systems every day and can&#x2019;t imagine living without them.</p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-text"><i><em class="italic" style="white-space: pre-wrap;">Git and bank ledger are frequently used Event Sourcing systems.</em></i></div></div><h3 id="bank-ledger-account"><strong>Bank ledger account</strong></h3><p>When you load information about your bank account, most online banks will show you recent ledger transactions, which represent event-sourced records of every money movement in your account.</p><p>The idea of recording ledgers as an Event Sourcing system was used way before computer systems were invented. 
Around 7000 years ago, ledgers were used to record lists of expenditures and goods traded on clay tablets, while temples were considered the banks of the time.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/09/blog--11-.png" class="kg-image" alt="It&#x2019;s Time to Rethink Event Sourcing" loading="lazy" width="1349" height="900" srcset="https://blog.bemi.io/content/images/size/w600/2024/09/blog--11-.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/09/blog--11-.png 1000w, https://blog.bemi.io/content/images/2024/09/blog--11-.png 1349w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Clay tablets as bank ledgers</span></figcaption></figure><h3 id="version-control-system">Version control system</h3><p>Version control systems, such as Git, are examples of Event Sourcing systems. Commits represent code changes that are recorded sequentially and become the main source of truth.</p><p>Additionally, commits record information about &#x2018;who&#x2019; made the change, &#x2018;when&#x2019; the change happened, and &#x2018;why&#x2019; it was made via a commit message.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/09/Table--1-.png" class="kg-image" alt="It&#x2019;s Time to Rethink Event Sourcing" loading="lazy" width="2000" height="752" srcset="https://blog.bemi.io/content/images/size/w600/2024/09/Table--1-.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/09/Table--1-.png 1000w, https://blog.bemi.io/content/images/size/w1600/2024/09/Table--1-.png 1600w, https://blog.bemi.io/content/images/size/w2400/2024/09/Table--1-.png 2400w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Git as an Event Sourcing concept</span></figcaption></figure><p>This means that you can view a history of all changes, time travel by checking out a previous commit, roll back changes, 
troubleshoot issues by using a binary search, analyze code changes, and so on. You get the idea.</p><h2 id="issues-with-traditional-event-sourcing">Issues with Traditional Event Sourcing</h2><p>While Event Sourcing has many benefits, it also comes with many disadvantages that prevent it from being adopted more widely.</p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-text"><i><em class="italic" style="white-space: pre-wrap;">Event Sourcing is a simple idea that is very hard to implement.</em></i></div></div><ul><li><strong>Big paradigm shift</strong>. It is a fundamentally different approach to data management that goes against commonly used techniques such as RESTful APIs and UPDATE/DELETE database operations.</li><li><strong>Extra dependencies</strong>. Implementing Event Sourcing usually requires introducing additional concepts, such as CQRS (Command and Query Responsibility Segregation), which need to handle events, take snapshots, and rebuild projections.</li><li><strong>Steep learning curve</strong>. Event Sourcing introduces new concepts and patterns that developers might not be familiar with, which can require additional time to adapt to the event-centric approach and learn new tools.</li><li><strong>Eventual consistency</strong>. Event processing at scale is generally done asynchronously, which requires rethinking how data is being accessed. For example, when a user submits a multi-step form, you won&#x2019;t be able to show a summary with all saved information and will be required to just show a confirmation page in the UI instead.</li><li><strong>Event versioning</strong>. As your system evolves, you&#x2019;d need to change the format of your events to, for example, start storing additional information. And because events are immutable, you&#x2019;d need strategies for maintaining backwards compatibility or migrating old events.</li><li><strong>Storage and compute needs</strong>. 
Since all events are stored and never deleted, it requires more storage and compute resources, which typically involves implementing an event streaming system.</li><li><strong>Expensive migration</strong>. If you have a large non-event-sourced system, transitioning to Event Sourcing can be a significant undertaking that requires changing almost the entire codebase, backfilling past events, and careful testing.</li><li><strong>Upfront cost.</strong> There is lots of literature on Event Sourcing, but there are no universal and flexible frameworks that can work with any tech stack. That&#x2019;s why most teams DIY and implement all the additional code plumbing around commands, command handlers, validators, aggregates, and so on themselves.</li></ul><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/09/blog--12-.png" class="kg-image" alt="It&#x2019;s Time to Rethink Event Sourcing" loading="lazy" width="2000" height="1334" srcset="https://blog.bemi.io/content/images/size/w600/2024/09/blog--12-.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/09/blog--12-.png 1000w, https://blog.bemi.io/content/images/size/w1600/2024/09/blog--12-.png 1600w, https://blog.bemi.io/content/images/size/w2400/2024/09/blog--12-.png 2400w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Developer productivity over time with a CRUD system vs Event Sourcing</span></figcaption></figure><p>Is there a way to get most of the Event Sourcing benefits while avoiding its disadvantages?</p><h2 id="the-new-approach-to-event-sourcing">The New Approach to Event Sourcing</h2><p>The disadvantages of Event Sourcing listed above make it a complete nonstarter for most companies. 
Let&apos;s reconsider the traditional Event Sourcing approach by taking a closer look at how we use a version control system like Git.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/09/Table-1.png" class="kg-image" alt="It&#x2019;s Time to Rethink Event Sourcing" loading="lazy" width="2000" height="934" srcset="https://blog.bemi.io/content/images/size/w600/2024/09/Table-1.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/09/Table-1.png 1000w, https://blog.bemi.io/content/images/size/w1600/2024/09/Table-1.png 1600w, https://blog.bemi.io/content/images/size/w2400/2024/09/Table-1.png 2400w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Rethinking traditional Event Sourcing through the lens of a version control system</span></figcaption></figure><p>As you can see, with Git, we:</p><ul><li>Reuse an existing tool for different projects without reinventing the wheel.</li><li>Continue editing and working with mutable code files instead of thinking about how to construct diffs.</li><li>Occasionally contextualize and wrap changes into commits, a higher-level standardized data abstraction.</li><li>Get automatic state reconstruction that supports time traveling and rollbacks, plus a full audit log.</li></ul><p>We can&#x2019;t, however, blindly copy the Git model and apply it to build &#x201C;Git for data&#x201D;. The main reason is that Git commits are usually committed manually by developers, while data in applications is frequently changed automatically. Instead, we need to use a slightly different approach.</p><h3 id="change-data-capture-and-its-limitations">Change Data Capture, and its limitations</h3><p>Change Data Capture (CDC) is a design pattern used to identify and capture changes made to data in a database in real time. 
For example, when moving data from an online transaction processing (OLTP) database like PostgreSQL to an online analytical processing (OLAP) system like Snowflake, people typically use CDC to ingest changes and record them in a data warehouse.</p><figure class="kg-card kg-code-card"><pre><code class="language-json">{
   &quot;table&quot;: &quot;shopping_cart_items&quot;,
   &quot;primary_key&quot;: 1,
   &quot;operation&quot;: &quot;UPDATE&quot;,
   &quot;committed_at&quot;: &quot;2024-09-01 17:09:15+00&quot;,
   &quot;before&quot;:{
      &quot;id&quot;: 1,
      &quot;quantity&quot;: 1,
      ...
   },
   &quot;after&quot;:{
      &quot;id&quot;: 1,
      &quot;quantity&quot;: 2,
      ...
   }
}</code></pre><figcaption><p><span style="white-space: pre-wrap;">Captured change</span></p></figcaption></figure><p>We could continue performing CRUD operations in a regular database (which behaves like the latest snapshot) without rearchitecting our application, and use CDC to capture all data changes in the background and store them as an immutable audit log (which behaves like an event store).</p><p>There is, however, one big fundamental difference between Event Sourcing and Change Data Capture:</p><ul><li><strong>Event Sourcing</strong>: Events reflect domain-related processes that happened at the application level. For example, &#x201C;shopping item quantity increased&#x201D;.</li><li><strong>Change Data Capture</strong>: Changes reflect low-level data changes. For example, &#x201C;a database row in a <code>shopping_cart_items</code> table with an ID <code>1</code> was updated&#x201D;.</li></ul><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/09/Table--1--1.png" class="kg-image" alt="It&#x2019;s Time to Rethink Event Sourcing" loading="lazy" width="2000" height="934" srcset="https://blog.bemi.io/content/images/size/w600/2024/09/Table--1--1.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/09/Table--1--1.png 1000w, https://blog.bemi.io/content/images/size/w1600/2024/09/Table--1--1.png 1600w, https://blog.bemi.io/content/images/size/w2400/2024/09/Table--1--1.png 2400w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Similarities between CDC and version control systems</span></figcaption></figure><p>To bridge the gap and make database changes captured with CDC meaningful and consistent, we can use a couple of different approaches.</p><h3 id="approach-1-outbox-pattern-with-change-data-capture">Approach 1: Outbox pattern with Change Data Capture</h3><p>The Outbox pattern allows you to atomically update data in a database and record messages that need to be sent in order to 
guarantee data consistency.</p><p>When performing regular database record changes, we can also insert event records into an &#x201C;ephemeral&#x201D; outbox table within the same transaction:</p><figure class="kg-card kg-code-card"><pre><code class="language-sql">BEGIN;
  UPDATE shopping_cart_items SET quantity = 2 WHERE id = 1;
  UPDATE products SET in_stock_count = in_stock_count - 1 WHERE id = 123;
  INSERT INTO outbox_events (event_type, entity_type, entity_id, payload) VALUES (...);
COMMIT;
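
-- For the example above, the outbox row might look like this
-- (event names and payload shape are illustrative):
-- INSERT INTO outbox_events (event_type, entity_type, entity_id, payload)
-- VALUES (&apos;SHOPPING_CART_ITEM_QUANTITY_UPDATED&apos;, &apos;SHOPPING_CART_ITEM&apos;, 1, &apos;{&quot;quantity&quot;: 2}&apos;);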
</code></pre><figcaption><p><span style="white-space: pre-wrap;">Inserting events using the Outbox pattern</span></p></figcaption></figure><p>After the transaction completes, the domain-specific events can be reliably captured by CDC and permanently stored in an event store, similar to a traditional Event Sourcing approach.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/09/blog--13-.png" class="kg-image" alt="It&#x2019;s Time to Rethink Event Sourcing" loading="lazy" width="2000" height="1334" srcset="https://blog.bemi.io/content/images/size/w600/2024/09/blog--13-.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/09/blog--13-.png 1000w, https://blog.bemi.io/content/images/size/w1600/2024/09/blog--13-.png 1600w, https://blog.bemi.io/content/images/size/w2400/2024/09/blog--13-.png 2400w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Event Sourcing using the Outbox pattern and Change Data Capture</span></figcaption></figure><p>With this approach, we get the simplicity of a typical CRUD system and the benefits of an immutable and consistent append-only event store derived from data changes with CDC.</p><h3 id="approach-2-contextualized-change-data-capture">Approach 2: Contextualized Change Data Capture</h3><p>Another slightly simplified and more practical approach is to contextualize data changes in CDC pipelines without making any modifications to the underlying data structures and database queries.</p><p>With a database like PostgreSQL, it&#x2019;s possible to pass additional context with queries that is visible only to a CDC system. Here is a simple code example written in JavaScript using Prisma ORM:</p><pre><code class="language-js">setContext({
  // Event-related data
  eventType: &apos;SHOPPING_CART_ITEM_QUANTITY_UPDATED&apos;,
  entityType: &apos;SHOPPING_CART_ITEM&apos;,
  entityId: 1,
  quantity: 2,
  // Additional context
  userId: currentUser.id,
  apiEndpoint: req.url,
});
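// Note: setContext is not a Prisma built-in. It stands in for an ORM-level
// helper (for example, one provided by a CDC integration library) that
// attaches the metadata above to the queries below, so the capture pipeline
// can stitch the context and the data changes back together.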

await prisma.shoppingCartItem.update({
  where: { id: 1 },
  data: { quantity: 2 },
});
await prisma.products.update({
  where: { id: 123 },
  data: { inStockCount: product.inStockCount - 1 },
});</code></pre><p>After the changes are committed to the database, we can reliably capture them, stitch them together with the context, and store them as audit trail records.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/09/blog--14-.png" class="kg-image" alt="It&#x2019;s Time to Rethink Event Sourcing" loading="lazy" width="2000" height="1334" srcset="https://blog.bemi.io/content/images/size/w600/2024/09/blog--14-.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/09/blog--14-.png 1000w, https://blog.bemi.io/content/images/size/w1600/2024/09/blog--14-.png 1600w, https://blog.bemi.io/content/images/size/w2400/2024/09/blog--14-.png 2400w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Event Sourcing using Change Data Capture and data change contextualization</span></figcaption></figure><p>With this approach, we can continue using CRUD operations and store all event data as context in an immutable and reliable audit trail. This allows us, for example, to query all events for a particular &#x201C;Shopping Cart Item&#x201D; and see all underlying data changes made as part of these events.</p><h2 id="conclusion">Conclusion</h2><p>It&#x2019;s time to rethink Event Sourcing and stop trying to reinvent the wheel every time we want to implement it in our applications.</p><p>In some regulated industries, like accounting, there are already well-established industry standards for using Event Sourcing in the form of a double-entry bookkeeping system, such as a General Ledger.</p><p>In 95% of other cases, you can get most of the Event Sourcing benefits by using Change Data Capture enriched with your domain-specific information. The Change Data Capture design pattern allows you to reliably track and record all data changes made in a database. 
This, in combination with the Outbox pattern or data change contextualization implemented in the application, allows you to achieve the Event Sourcing advantages mentioned at the beginning of this blog post.</p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-text"><i><em class="italic" style="white-space: pre-wrap;">It is possible to event-source any system by implementing Change Data Capture and enriching it with domain-specific information.</em></i></div></div><p>This essentially flips the paradigm and allows deriving an immutable log of domain-specific events from regular database changes. Note that the described approaches are not meant to replace the business layer in your application. You still need to think about your domain design and implement it in your code.</p><hr><h2 id="about-us">About us</h2><p>If you need help with event-sourcing your system, check out <a href="https://bemi.io/?ref=blog-es" rel="noreferrer">Bemi</a>. Our solution can help you enable automatic data change tracking for your database in a few minutes, integrate it with your ORM for data change contextualization, and have a full audit trail automatically stored in a serverless PostgreSQL database.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/09/Table--2--1.png" class="kg-image" alt="It&#x2019;s Time to Rethink Event Sourcing" loading="lazy" width="2000" height="1242" srcset="https://blog.bemi.io/content/images/size/w600/2024/09/Table--2--1.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/09/Table--2--1.png 1000w, https://blog.bemi.io/content/images/size/w1600/2024/09/Table--2--1.png 1600w, https://blog.bemi.io/content/images/size/w2400/2024/09/Table--2--1.png 2400w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Event Sourcing via CDC vs traditional Event Sourcing</span></figcaption></figure><p><em>For scaling a centralized Postgres data 
store, check out the </em><a href="https://github.com/BemiHQ/BemiDB?ref=blog.bemi.io"><em>BemiDB Github repo</em></a><em>.</em></p>]]></content:encoded></item><item><title><![CDATA[Bemi achieves SOC 2 compliance]]></title><description><![CDATA[<p>At Bemi, security and reliability have always been at the core of what we do. Long before we even considered a SOC 2 audit, we built our systems with security, encryption protocols, and processes that went well beyond the requirements. Here are some of the <a href="https://bemi.io/security?ref=blog.bemi.io">security features</a> Bemi already had</p>]]></description><link>https://blog.bemi.io/soc2/</link><guid isPermaLink="false">66ce30b5d9dfee00016e4038</guid><category><![CDATA[Announcement]]></category><dc:creator><![CDATA[Arjun Lall]]></dc:creator><pubDate>Wed, 28 Aug 2024 01:07:04 GMT</pubDate><media:content url="https://blog.bemi.io/content/images/2024/08/security--3-.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.bemi.io/content/images/2024/08/security--3-.png" alt="Bemi achieves SOC 2 compliance"><p>At Bemi, security and reliability have always been at the core of what we do. Long before we even considered a SOC 2 audit, we built our systems with security, encryption protocols, and processes that went well beyond the requirements. 
Here are some of the <a href="https://bemi.io/security?ref=blog.bemi.io">security features</a> Bemi already had in place:</p><ul><li>AES-256 storage encryption at rest</li><li>TLS in-transit encryption to protect database traffic</li><li>HTTPS in-transit encryption to encrypt all web traffic</li><li>Customers&#x2019; credentials protected with military-grade encryption algorithms</li><li>Restricted IP access rules and password credentials for destination databases</li><li>Static Bemi IPs for allowlisting a connection to source databases</li><li>Isolated internal network SSH tunnelling with certification encryption</li><li>Data and container level customer isolation</li><li>Monitoring and alerting at all stack layers</li><li>Continuous software vulnerability scanning</li></ul><p>As we grew, we realized that transparency is just as important as having strong security in place. And for many of our customers, especially those with stringent legal and security requirements, an external audit is a crucial part of building that trust.</p><h2 id="why-we-pursued-soc-2-now"><strong>Why We Pursued SOC 2 Now</strong></h2><p>SOC 2 or Service Organization Controls 2 is a framework governed by the American Institute of Certified Public Accountants (AICPA). With a SOC 2 audit, an independent service auditor will review an organization&#x2019;s policies, procedures, and evidence to determine if their controls are designed and operating effectively. A SOC 2 report communicates a company&#x2019;s commitment to data security and protection of customer information.</p><p>We decided to pursue SOC 2 compliance because we wanted to make our commitment to security as clear as possible. We&#x2019;ve always been open about our processes&#x2014;just a few months ago, we open-sourced our codebase to give everyone a closer look at what we&#x2019;ve built. 
In the same spirit of transparency, we recognized that an external SOC 2 audit would provide the additional assurance that larger companies&#x2019; legal and security teams look for. It&#x2019;s another step in our ongoing investment in trust.</p><h2 id="our-journey-to-soc-2-certification"><strong>Our Journey to SOC 2 Certification</strong></h2><p>We partnered with <a href="https://www.vanta.com/?ref=blog.bemi.io">Vanta</a>, the leader in Trust Management, to automate the collection of our audit evidence. Vanta provides us with the strongest security foundation to protect our customer data.</p><p>Our audit firm, <a href="https://advantage-partners.com/?ref=blog.bemi.io">Advantage Partners</a>, then stepped in to assess our controls. For the audit, Advantage Partners evaluated the controls we have in place and opined on their state. Shortly after our audit window ended, Advantage Partners drafted and issued our report.</p><p>While SOC 2 can be a big undertaking, our compliance partners greatly streamlined the process. The readiness period can take the most time but we were able to make compliance a priority to get audit ready in a matter of weeks versus months.</p><p>We also found it important to review the audit timeline with Advantage Partners, set an ideal audit date, and then work backwards to be ready in time. Now that controls are implemented, subsequent SOC 2 audits will be even more seamless.</p><h2 id="lessons-we-learned"><strong>Lessons We Learned</strong></h2><h3 id="focus-on-improving-security-posture-not-checking-boxes"><strong>Focus on Improving Security Posture, Not Checking Boxes</strong></h3><p>Compliance isn&#x2019;t a one-size-fits-all approach. It&#x2019;s about continually improving security, not just meeting the minimum requirements. 
At Bemi, we&#x2019;ve always seen security as an ongoing project, something that&#x2019;s woven into the fabric of our company.</p><h3 id="start-the-process-early"><strong>Start the Process Early</strong></h3><p>Implementing security measures is easier when you start early. We&#x2019;ve always prioritized building secure infrastructure, which made our SOC 2 journey smoother. By embedding security in our processes from day one, we were able to meet SOC 2 standards without needing to overhaul our systems.</p><h3 id="security-and-compliance-help-scale-your-business"><strong>Security and Compliance Help Scale Your Business</strong></h3><p>SOC 2 compliance isn&#x2019;t just about security&#x2014;it&#x2019;s also a business enabler. Many of our larger customers require vendor security reviews as part of their procurement process. With our SOC 2 report, we can move through these reviews more quickly, allowing us to scale faster and with greater trust.</p><h3 id="the-right-partners-are-key"><strong>The Right Partners Are Key</strong></h3><p>Choosing the right tools and audit partners is crucial. Vanta and Advantage Partners helped us navigate the SOC 2 process efficiently. Their expertise ensured that our journey to compliance was seamless, saving us time and effort.</p><h2 id="looking-ahead"><strong>Looking Ahead</strong></h2><p>We&#x2019;re proud of what we&#x2019;ve achieved, but this is just one step in our ongoing commitment to security. As we continue to grow, we&#x2019;ll keep investing in the tools and processes that protect our customers and build trust. Achieving SOC 2 compliance is an important milestone, but it&apos;s part of a broader mission. We were already HIPAA compliant, ensuring that we meet strict standards for healthcare data protection. 
Moving forward, we&apos;ll continue to prioritize security and transparency, making Bemi a company you can rely on&#x2014;both now and in the future.</p>]]></content:encoded></item><item><title><![CDATA[How KLog Saved $200,000 by Switching to Bemi Audit Trails]]></title><description><![CDATA[Learn about the engineering challenges faced at KLog in reliably streaming data changes internally and creating a powerful audit trail.]]></description><link>https://blog.bemi.io/case-study-klog/</link><guid isPermaLink="false">668f723ad9dfee00016e3f59</guid><category><![CDATA[Case Study]]></category><dc:creator><![CDATA[Arjun Lall]]></dc:creator><pubDate>Mon, 15 Jul 2024 06:00:00 GMT</pubDate><media:content url="https://blog.bemi.io/content/images/2024/07/Frame-2608240--3-.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.bemi.io/content/images/2024/07/Frame-2608240--3-.png" alt="How KLog Saved $200,000 by Switching to Bemi Audit Trails"><p><a href="https://klog.co/?ref=blog.bemi.io" rel="noreferrer">KLog</a> democratizes international cargo transportation with an intuitive digital platform. In addition to moving cargo, KLog offers a comprehensive solution for more efficient and hassle-free logistics management. With over 5,000 customers, including Hugo Boss, Crocs, Kensington, and Wrangler, KLog has become the de facto logtech in Latin America and around the world.</p><h3 id="klog-was-spending-upwards-of-200k-in-engineering-resources-to-track-data-changes-internally">KLog was spending upwards of $200k in engineering resources to track data changes internally</h3><p>KLog required a highly reliable data audit trail that included context such as the user ID, API endpoint, request payload, and GraphQL mutation name behind a change.</p><p>The company initially built a solution internally through a cross-collaborative effort between software and data engineering teams. 
The system involved first adding an <code>updatedBy</code> field on every Postgres table and creating an application layer middleware that set this field on every data change in serverless functions. From the AWS RDS instances, their team then generated logs using an AWS DMS task and, due to the sheer volume, attempted to store the data in Parquet files for post-processing in S3 buckets. An alternative approach companies consider when building DIY is using Debezium for change data capture. </p><p>The system eventually encountered breakdowns when it came to reading the data, ensuring consistent application context, and maintenance.</p><blockquote>The DIY system got so complex that developers mentioned they were &apos;allergic&apos; to that part of the codebase. There could have been lost upmarket deals since it took months to initially build.<br>&#x2014; &#xC1;lvaro Serrano, CTO KLog</blockquote><h3 id="klog-switched-to-bemi-and-never-looked-back">KLog switched to Bemi and never looked back</h3><p>KLog easily connected their Postgres databases to the Bemi platform and used the <a href="https://github.com/BemiHQ/bemi-prisma?ref=blog.bemi.io"><u>Prisma ORM integration</u></a> to automatically handle the enrichment, structuring, and formatting of the lower-level database events from the Write-Ahead Logs. KLog was then able to use Bemi&#x2019;s intuitive control plane UI to significantly reduce data volumes by tracking only relevant data changes through column and table-level filtering.</p><p>Due to the high reliability and accuracy of the platform, Bemi became the 100% source of truth at KLog. Beyond just an audit trail, KLog&#x2019;s more than 40 customer success and operations team members use the Bemi UI multiple times a day as the definitive, immutable source of truth when troubleshooting shipment updates.</p><blockquote>We love working with the Bemi team. 
Their customer service is incredible &#x2014; responsive, knowledgeable, and always willing to go the extra mile. They&#x2019;re a talented team, and we have full confidence in their expertise.<br>&#x2014; &#xC1;lvaro Serrano, CTO KLog</blockquote><h3 id="future-plans">Future plans</h3><p>Since the Bemi data is also easily consumable with ORM-specific libraries, KLog plans to later build products centered around the data, such as customer shipment feeds. With more internal AI applications injecting data into KLog&#x2019;s platform, their operations teams are becoming auditors of data rather than data inputters, thanks to Bemi. Bemi plans to extend ORM integrations with functionality to consume data change events in real-time, allowing KLog to reliably also power their notifications, AI RAG system, and microservice communication in the future.</p><blockquote>Bemi has been a game-changer for us, highly recommend!! We&#x2019;re not in the business of tracking data changes and are now able to concentrate fully on our core logistics product.<br>&#x2014; &#xC1;lvaro Serrano, CTO KLog</blockquote><h3 id="try-out-bemi">Try out Bemi</h3><p>If you want to use Bemi to track Postgres data changes, star <a href="https://github.com/BemiHQ/bemi?ref=blog.bemi.io">Bemi on GitHub</a> and try <a href="https://dashboard.bemi.io/?ref=blog.bemi.io"><u>Bemi Cloud</u></a> for free.</p>]]></content:encoded></item><item><title><![CDATA[Choosing the Right Audit Trail Approach in Ruby]]></title><description><![CDATA[The Ruby ecosystem offers a wide range of tools for building an audit trail, each with its pros and cons. 
So, which one is the best choice?]]></description><link>https://blog.bemi.io/audit-trail-in-ruby/</link><guid isPermaLink="false">6632b99484ba3700018c7f9a</guid><category><![CDATA[Engineering]]></category><category><![CDATA[Guide]]></category><dc:creator><![CDATA[Evgeny Li]]></dc:creator><pubDate>Wed, 01 May 2024 22:13:12 GMT</pubDate><media:content url="https://blog.bemi.io/content/images/2024/05/Audit-Trail-1.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.bemi.io/content/images/2024/05/Audit-Trail-1.jpg" alt="Choosing the Right Audit Trail Approach in&#xA0;Ruby"><p>Ruby gems such as <a href="https://github.com/paper-trail-gem/paper_trail?ref=blog.bemi.io" rel="noopener">PaperTrail</a> and <a href="https://github.com/collectiveidea/audited?ref=blog.bemi.io" rel="noopener">Audited</a> have been downloaded over a hundred million times and are becoming table stakes in many applications. The Ruby ecosystem offers a wide range of useful tools for building an audit trail, each with its respective pros and cons.</p><h3 id="what-is-an-audit-trail">What is an audit&#xA0;trail?</h3><p>An audit trail (audit log) is a chronological set of records representing documentary evidence of system activities. 
There are many use cases and benefits to having an audit trail, here are some examples:</p><ul><li><strong>Disaster recovery</strong>: selectively find and restore historical records</li><li><strong>Customer observability</strong>: save time tracking customer activity</li><li><strong>Regulatory compliance</strong>: track data access and simplify audits</li><li><strong>Fraud detection</strong>: identify fraudulent or malicious user activity</li><li><strong>Enterprise table stakes</strong>: allow monitoring activity in an organization</li><li><strong>Engineering on-call</strong>: quickly understand reasons behind data changes</li></ul><h3 id="ruby-audit-trail-solutions">Ruby audit trail solutions</h3><p>Let&#x2019;s explore and compare the following approaches to building an audit trail and decide which one of these to choose:</p><ul><li><strong>Callback-based solutions</strong>: PaperTrail, Audited, Mongoid History</li><li><strong>Trigger-based solutions</strong>: Logidze</li><li><strong>Replication log-based solutions</strong>: Bemi Rails</li><li><strong>Manual tracking</strong>: PublicActivity, Ahoy</li><li><strong>Console command logging</strong>: Console1984, Audits1984</li><li><strong>Custom logging</strong>: Marginalia</li></ul><hr><h2 id="callback-based-solutions">Callback-Based Solutions</h2><figure class="kg-card kg-image-card"><img src="https://cdn-images-1.medium.com/max/1600/1*uB_AzdPexdjAku29iDXXPQ.png" class="kg-image" alt="Choosing the Right Audit Trail Approach in&#xA0;Ruby" loading="lazy" width="2000" height="720"></figure><p><a href="https://github.com/paper-trail-gem/paper_trail?ref=blog.bemi.io" rel="noopener"><strong>PaperTrail</strong></a> and<strong> </strong><a href="https://github.com/collectiveidea/audited?ref=blog.bemi.io" rel="noopener"><strong>Audited</strong></a> are very popular gems that integrate with the ActiveRecord object-relational mapper (ORM) by using model callbacks to allow auditing data changes.</p><p>When a record is 
created, updated, or deleted, they insert an additional record that stores the changes in a single audit table. This table stores the before/after state in JSON or JSONB format and a reference pointing to the original record.</p><p>This approach is implemented purely at the application level and can be easily enabled for any ActiveRecord-supported database such as PostgreSQL, MySQL, or SQLite.</p><pre><code class="language-ruby">class MyModel &lt; ApplicationRecord
  has_paper_trail
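
  # Once enabled, every create/update/destroy callback writes a
  # PaperTrail::Version row; history is then readable via my_model.versions,
  # and versions.last.reify rebuilds the previous state. Callback-skipping
  # methods such as update_column are not captured.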
end</code></pre><p>For MongoDB, <a href="https://github.com/mongoid/mongoid-history?ref=blog.bemi.io" rel="noopener"><strong>Mongoid History</strong></a> gem works similarly and integrates with Mongoid, the officially supported object-document mapper (ODM).</p><p>They also allow passing and storing application-specific context with changes, such as a user who performed the changes or an API request where the changes were triggered:</p><pre><code class="language-ruby"># User
Audited.audit_class.as_user(current_user) do
  # Additional context
  audit_comment = { endpoint: &quot;#{request.method} #{request.path}&quot; }.to_json

  my_record.update!(published: true, audit_comment: audit_comment)
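
  # Audited stores the acting user and the comment string on the resulting
  # audit row in the audits table, alongside the before/after values.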
end</code></pre><h3 id="pros">Pros</h3><ul><li><strong>Easy to get started</strong>. An audit trail can be enabled by installing a gem and configuring it with a few lines of code.</li><li><strong>Customization</strong>. It&#x2019;s possible, for example, to use custom serializers for the before/after state or add a complex condition for disabling tracking.</li></ul><h3 id="cons">Cons</h3><ul><li><strong>Reliability and accuracy</strong>. Many ActiveRecord methods such as <code>delete</code>, <code>update_column</code>, <code>update_all</code>, <code>delete_all</code>, and so on don&#x2019;t trigger callbacks. Thus, changes produced by these methods can&#x2019;t be tracked. Additionally, inserting data changes does not always happen atomically, which may lead to data loss and inconsistency if, for example, there is a network issue.</li><li><strong>Performance</strong>. The database workload increases by roughly 2x because each single record change produces an additional database query that inserts an audit record. This affects the application and database performance.</li><li><strong>Scalability</strong>. A single audit table can get very large. I&#x2019;ve seen cases where such tables ran out of integers used for primary keys. 
A large table makes it harder to manage and query at scale while also significantly increasing database resource usage and costs.</li></ul><h2 id="trigger-based-solutions">Trigger-Based Solutions</h2><figure class="kg-card kg-image-card"><img src="https://cdn-images-1.medium.com/max/1600/1*j5ozkmkbtEpHAh1yt3c0LQ.png" class="kg-image" alt="Choosing the Right Audit Trail Approach in&#xA0;Ruby" loading="lazy" width="2000" height="720"></figure><p><a href="https://github.com/palkan/logidze?ref=blog.bemi.io" rel="noopener"><strong>Logidze</strong></a> leverages the PostgreSQL triggers functionality and creates a new <code>log_data</code> JSONB column in each auditable table.</p><p>When a record is created or updated, PostgreSQL executes a row-based trigger which takes the current values of the record and appends them in the <code>log_data</code> column in a separate SQL query within the same transaction behind the scenes. Here is an example of the <code>log_data</code>:</p><pre><code class="language-json">{
  &quot;v&quot;: 2, // current record version
  &quot;h&quot;: [  // list of changes
    {
      &quot;v&quot;: 1,                          // change version
      &quot;ts&quot;: 1460805759352,             // change timestamp
      &quot;c&quot;: { &quot;published&quot;: false },     // new values
      &quot;m&quot;: {
        &quot;_r&quot;: 42,                      // User ID
        &quot;endpoint&quot;: &quot;POST /my_records&quot; // Additional context
      }
    },
    ...
  ]
}</code></pre><p>It also allows passing and storing application-specific context with ActiveRecord changes, for example:</p><pre><code class="language-ruby"># User ID
Logidze.with_responsible(current_user.id) do
  my_record.update!(published: true)
end
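# The recorded history can later be read back through the Logidze model API,
# for example with my_record.log_data, my_record.at(version: 1), or
# my_record.diff_from(version: 1).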

# Additional context
Logidze.with_meta({ endpoint: &quot;#{request.method} #{request.path}&quot; }) do
  my_record.update!(published: true)
end</code></pre><h3 id="pros-1">Pros</h3><ul><li><strong>Improved performance</strong>. It can be almost 100% faster than callback-based solutions for record inserts and about 10% faster for updates. It still makes additional database queries on each record change, but they&#x2019;re triggered on the database level skipping ActiveRecord.</li></ul><h3 id="cons-1">Cons</h3><ul><li><strong>No delete tracking</strong>. Because audit logs are attached and stored with an original record, deleting the record will lead to losing its entire change history. To overcome this limitation, you can use callback-based solutions designed for marking records as soft-deleted and ignoring them when querying, such as <a href="https://github.com/rubysherpas/paranoia?ref=blog.bemi.io" rel="noopener">Paranoia</a>, <a href="https://github.com/jhawthorn/discard?ref=blog.bemi.io" rel="noopener">Discard</a>, or <a href="https://github.com/ActsAsParanoid/acts_as_paranoid?ref=blog.bemi.io" rel="noopener">ActsAsParanoid</a>. But if you decide to use them, be careful and make sure to read our <a href="https://blog.bemi.io/soft-deleting-chaos/" rel="noopener">blog post</a> about their danger and how they can lead to critical incidents.</li><li><strong>Data structure</strong>. The data structure can be hard to work with and query directly. The field names in JSON are shortened to 1-2 characters to save disk space but this worsens the readability. Selecting records with the included JSON can significantly decrease database query performance because of <a href="https://wiki.postgresql.org/wiki/TOAST?ref=blog.bemi.io" rel="noopener">PostgreSQL TOAST</a>, so be careful with <code>SELECT *&#xA0;...</code> SQL statements. It&#x2019;s also difficult to construct, for example, a timeline of all changes across multiple records without knowing and fetching them in advance.</li><li><strong>Complexity</strong>. 
Understanding and changing the code for complex triggers with hundreds of lines in SQL can be challenging. Just a single mistake in an SQL function can break all queries. The context passing can also be tricky. For example, if you use PostgreSQL with a connection pooler such as PgBouncer, you need to wrap your queries into a transaction because Logidze relies on <a href="https://www.postgresql.org/docs/current/sql-set.htm?ref=blog.bemi.io" rel="noopener">PostgreSQL local parameters</a>. But at the same time, if you use transactions, it&#x2019;s impossible to pass application context to changes that are triggered after &#x201C;commit&#x201D; Rails callbacks.</li></ul><h2 id="replication-log-based-solutions">Replication Log-Based Solutions</h2><figure class="kg-card kg-image-card"><img src="https://cdn-images-1.medium.com/max/1600/1*i3FrMjpJktnx1PoG5vDtYg.png" class="kg-image" alt="Choosing the Right Audit Trail Approach in&#xA0;Ruby" loading="lazy" width="2000" height="720"></figure><p><a href="https://github.com/BemiHQ/bemi-rails?ref=blog.bemi.io" rel="noopener"><strong>Bemi Rails</strong></a> uses the native PostgreSQL replication log called <a href="https://www.postgresql.org/docs/current/wal-intro.html?ref=blog.bemi.io" rel="noopener">Write-Ahead Log (WAL)</a> which records all changes before they are flushed on a disk.&#xA0;</p><p>Traditionally, the PostgreSQL WAL is used for data recovery after a database crash by replaying records or replicating changes to standby read replicas. Bemi uses the same functionality:</p><ul><li><a href="https://github.com/BemiHQ/bemi?ref=blog.bemi.io" rel="noopener">Bemi Core</a> connects to the PostgreSQL WAL like a standby replica. 
It ingests and logically decodes all changes asynchronously and then stores them with the before/after states.</li><li>Bemi Rails allows setting the application context and passing it directly to the WAL with data changes without the need to update the database structure.</li></ul><pre><code class="language-ruby"># Custom context
Bemi.set_context(
  user_id: current_user.id,
  endpoint: &quot;#{request.method} #{request.path}&quot;,
)
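# (The context above is serialized and passed into the WAL alongside the
# transaction rather than stored as regular table data.)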

# Data change that will be passed with the context into the PostgreSQL WAL
my_record.update!(published: true)</code></pre><h3 id="pros-2">Pros</h3><ul><li><strong>Reliability and accuracy</strong>. The PostgreSQL WAL is the source of truth for all data changes. Data changes will be captured even when they are produced by executing a direct SQL query within or outside the application.</li><li><strong>Performance</strong>. Audit logs are ingested asynchronously without affecting runtime performance. The application context is passed to the WAL directly from the application, but it has a minimal performance impact since it doesn&#x2019;t get stored and processed as regular PostgreSQL data.</li></ul><h3 id="cons-2">Cons</h3><ul><li><strong>Infrastructure complexity</strong>. Ingesting logically decoded changes requires running a separate worker process that connects to the database&#x2019;s replication log. This can be similar to or even more challenging than trying to run a self-managed database replica instance in a cluster. For example, this solution requires creating a replication slot and maintaining the ingested position in the WAL, implementing heartbeats, ingesting and serializing logically decoded WAL records, stitching them with application context, etc.</li><li><strong>Scalability</strong>. Similarly to the callback-based solutions, all audit records are stored in a single table. At scale, this table can become difficult to query and costly to manage.</li></ul><p><em>Full disclosure: I&#x2019;m one of the Bemi core contributors. 
Check out our </em><a href="https://bemi.io/?ref=blog.bemi.io" rel="noopener"><em>Bemi.io</em></a><em> cloud platform if you want to enable an automatic audit trail without the need to manage the infrastructure and deal with scalability issues yourself.</em></p><h2 id="manual-tracking">Manual Tracking</h2><figure class="kg-card kg-image-card"><img src="https://cdn-images-1.medium.com/max/1600/1*RA3qzROOSLAPkbI9ILxB6g.png" class="kg-image" alt="Choosing the Right Audit Trail Approach in&#xA0;Ruby" loading="lazy" width="2000" height="720"></figure><p><a href="https://github.com/public-activity/public_activity?ref=blog.bemi.io" rel="noopener"><strong>PublicActivity</strong></a><strong> </strong>is a gem similar to callback-based solutions that track data changes. Its main difference is that it also allows creating custom activity events for database records that can be serialized and translated with <a href="https://guides.rubyonrails.org/i18n.html?ref=blog.bemi.io" rel="noopener">Rails i18n</a>.</p><pre><code class="language-ruby">my_record.create_activity(
  key: &apos;my_model.commented_on&apos;,
  owner: current_user
)</code></pre><p><a href="https://github.com/ankane/ahoy?ref=blog.bemi.io" rel="noopener"><strong>Ahoy</strong></a> allows tracking and collecting analytics data in a Ruby on Rails application. It is similar to, for example, automatic page visit tracking in Google Analytics. But it can also record custom events in controllers.</p><pre><code class="language-ruby">def update
  ahoy.track(&apos;Updated&apos;, endpoint: &quot;#{request.method} #{request.path}&quot;)
  MyModel.find(params[:id]).update!(published: true)
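  # ahoy.track above records an Ahoy::Event row (name plus properties) tied
  # to the current visit, independently of the model update it annotates.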
end</code></pre><h3 id="pros-3">Pros</h3><ul><li><strong>Versatility</strong>. Creating custom audit trail records manually can be useful when it is necessary to record activities that didn&#x2019;t change data or didn&#x2019;t map 1-to-1 to records&#x2019; data changes.</li></ul><h3 id="cons-3">Cons</h3><ul><li><strong>Cumbersomeness</strong>. These solutions require manually triggering all actions that need to be recorded and updating the codebase in many places. This can be time-consuming and can increase code complexity. It is also easy to forget to trigger the right action, which may lead to an incomplete audit log.</li><li><strong>Flexibility</strong>. While these solutions give some manual control, they are limited either to controller actions or ActiveRecord models. This may not be flexible enough to record all system activities. For example, recording that a background processing job made changes via an API request in an external system, such as a payment processing service.</li></ul><h2 id="console-command-logging">Console Command&#xA0;Logging</h2><figure class="kg-card kg-image-card"><img src="https://cdn-images-1.medium.com/max/1600/1*SSKjFWjna5r30v567xadKA.png" class="kg-image" alt="Choosing the Right Audit Trail Approach in&#xA0;Ruby" loading="lazy" width="2000" height="720"></figure><p><a href="https://github.com/basecamp/console1984?ref=blog.bemi.io" rel="noopener"><strong>Console1984</strong></a> forces developers to specify a reason when they load a Rails console and record it with all executed console commands by storing them in a database.</p><pre><code class="language-bash">$ rails c

Bob, why are you using this console today?
&gt; Migrating customer data, see ticket #781923

&gt; user = User.find(...)
...</code></pre><p>In a Rails console, it breaks down a session into two access modes. One is the regular &#x201C;protected&#x201D; mode available after specifying a Rails console access reason. The other is the &#x201C;sensitive&#x201D; mode, which requires additional explicit consent when accessing sensitive information, such as executing a method that decrypts sensitive data or making external HTTP requests.</p><p>It also comes with a web UI via the <a href="https://github.com/basecamp/audits1984?ref=blog.bemi.io" rel="noopener"><strong>Audits1984</strong></a> gem, which allows reviewing console sessions, approving or flagging them, and leaving comments.</p><h3 id="pros-4">Pros</h3><ul><li><strong>Auditable console sessions</strong>. Commands executed manually by developers can be logged automatically and reviewed later.</li></ul><h3 id="cons-4">Cons</h3><ul><li><strong>Loose control</strong>. In Ruby, it is very easy to modify any class and method definitions dynamically. This means that someone with access to a Rails console can find workarounds and execute some commands that won&#x2019;t be logged. To make logging more reliable and improve internal controls, teams may want to disable production console access and build workflows for running only pre-approved scripts.</li></ul><h2 id="custom-logging">Custom Logging</h2><figure class="kg-card kg-image-card"><img src="https://cdn-images-1.medium.com/max/1600/1*4-0FLfRJN5prxrowiHMnQA.png" class="kg-image" alt="Choosing the Right Audit Trail Approach in&#xA0;Ruby" loading="lazy" width="2000" height="720"></figure><p>Ruby on Rails logging functionality allows logging anything in any text format.</p><pre><code class="language-ruby">payload = {
  user_id: current_user.id,
  endpoint: &quot;#{request.method} #{request.path}&quot;,
}

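# If the Rails logger is wrapped in ActiveSupport::TaggedLogging (the default
# in generated apps), tags can add extra structure to these entries, e.g.:
#   Rails.logger.tagged(&quot;AUDIT&quot;) { Rails.logger.info(payload.to_json) }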
Rails.logger.info(&quot;CUSTOM_LOG_MESSAGE: #{payload.to_json}&quot;)</code></pre><p>Starting with Ruby on Rails 7 (previously via the <a href="https://github.com/basecamp/marginalia?ref=blog.bemi.io" rel="noopener"><strong>Marginalia</strong></a> gem), it is also possible to pass custom application context to ActiveRecord logs via <code>ActiveSupport::CurrentAttributes</code>.</p><pre><code class="language-ruby">Current.user = current_user
Current.endpoint = &quot;#{request.method} #{request.path}&quot;

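# Assumes a Current class is defined, e.g.:
#   class Current &lt; ActiveSupport::CurrentAttributes
#     attribute :user, :endpoint
#   end
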
config.active_record.query_log_tags = [
  {
    user_id: -&gt; { Current.user.id },
    endpoint: -&gt; { Current.endpoint },
  },
]
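
# Note: query log tags must also be turned on for the comments to be appended:
# config.active_record.query_log_tags_enabled = true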

MyRecord.all
# MyRecord Load (0.3ms)  SELECT `my_records`.* FROM `my_records`
# /*user_id:1,endpoint:POST /my_records*/
</code></pre><h3 id="pros-5">Pros</h3><ul><li><strong>Flexibility</strong>. These are the most flexible solutions that allow recording activities in a custom format in text logs.</li></ul><h3 id="cons-5">Cons</h3><ul><li><strong>Consumption</strong>. Collecting unstructured text logs across all application instances and consuming them might be challenging. Depending on the use case, you may need to clean up the logs, parse them, and save them in a data store in a more structured format that allows quick lookups with filters. You may also need to aggregate the logs by transaction, API request, etc. For example, if one log entry says that a record was created by a user, there could be another entry that says that this record creation was not committed and was rolled back.</li></ul><hr><h2 id="conclusion">Conclusion</h2><p>Many Ruby gems are available to help with building an audit trail. As a rule of thumb, you can choose the right tool or a combination depending on your needs:</p><ul><li><strong>Basic change tracking</strong>: if you need it for troubleshooting purposes, you can use <a href="https://github.com/collectiveidea/audited?ref=blog.bemi.io" rel="noopener">Audited</a>, which can also automatically delete old audit records, keeping only the last N changes.</li><li><strong>Change diffing and rollbacks</strong>: for basic change tracking with additional features for querying them, <a href="https://github.com/paper-trail-gem/paper_trail?ref=blog.bemi.io" rel="noopener">PaperTrail</a> is your best choice.</li><li><strong>MongoDB and Mongoid</strong>: if you use the Mongoid object-document mapper, then go with <a href="https://github.com/mongoid/mongoid-history?ref=blog.bemi.io" rel="noopener">Mongoid History</a>.</li><li><strong>Performance over deletion tracking and simplicity</strong>: if you use PostgreSQL, then you can choose <a
href="https://github.com/palkan/logidze?ref=blog.bemi.io" rel="noopener">Logidze</a>.</li><li><strong>Reliability with zero performance overhead</strong>: if you use PostgreSQL and need complete change tracking accuracy and reliability without runtime overhead, go with <a href="https://github.com/BemiHQ/bemi-rails?ref=blog.bemi.io" rel="noopener">Bemi Rails</a>.</li><li><strong>Simple activity feed UIs</strong>: if you need to build a simple activity feed constructed around your records, then go with <a href="https://github.com/public-activity/public_activity?ref=blog.bemi.io" rel="noopener">PublicActivity</a>, which also supports i18n for multi-language interfaces.</li><li><strong>HTTP request tracking</strong>: if you need to track HTTP requests in a structured format, then <a href="https://github.com/ankane/ahoy?ref=blog.bemi.io" rel="noopener">Ahoy</a> is your best choice.</li><li><strong>Console session auditing</strong>: if you need to log and audit the commands executed in Rails consoles, go with <a href="https://github.com/basecamp/console1984?ref=blog.bemi.io" rel="noopener">Console1984</a> and <a href="https://github.com/basecamp/audits1984?ref=blog.bemi.io" rel="noopener">Audits1984</a>.</li><li><strong>Troubleshooting recent issues</strong>: you can use application logs to troubleshoot issues and use <a href="https://github.com/basecamp/marginalia?ref=blog.bemi.io" rel="noopener">Marginalia</a> to automatically annotate log entries to add more context.</li></ul><hr><figure class="kg-card kg-image-card"><img src="https://blog.bemi.io/content/images/2024/07/Untitled.jpg" class="kg-image" alt="Choosing the Right Audit Trail Approach in&#xA0;Ruby" loading="lazy" width="2000" height="1047" srcset="https://blog.bemi.io/content/images/size/w600/2024/07/Untitled.jpg 600w, https://blog.bemi.io/content/images/size/w1000/2024/07/Untitled.jpg 1000w, https://blog.bemi.io/content/images/size/w1600/2024/07/Untitled.jpg 1600w, 
https://blog.bemi.io/content/images/2024/07/Untitled.jpg 2400w" sizes="(min-width: 720px) 720px"></figure><p><em>Also check out the new </em><a href="https://topenddevs.com/podcasts/ruby-rogues/episodes/navigating-sql-data-changes-tools-and-techniques-for-data-recovery-ruby-645?ref=blog.bemi.io" rel="noreferrer"><em>Ruby Rogues podcast episode</em></a><em> where we talk about tools, patterns, and techniques&#xA0;for data recovery in more detail.</em></p>]]></content:encoded></item><item><title><![CDATA[How Change Data Capture Powers Modern Apps]]></title><description><![CDATA[CDC is becoming an increasingly popular software pattern, with dev tooling startups centered around CDC having cumulatively raised nearly a billion dollars in funding in recent years. The surge in CDC's popularity begs the questions: why has it become so important and how does it work?]]></description><link>https://blog.bemi.io/cdc/</link><guid isPermaLink="false">66160245398ea80001a2e26f</guid><category><![CDATA[Engineering]]></category><dc:creator><![CDATA[Arjun Lall]]></dc:creator><pubDate>Thu, 11 Apr 2024 20:29:16 GMT</pubDate><media:content url="https://blog.bemi.io/content/images/2024/04/Group-15800.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.bemi.io/content/images/2024/04/Group-15800.png" alt="How Change Data Capture Powers Modern Apps"><p>At its core, Change Data Capture (CDC) is a method used to track insert, update, and delete operations made to a database.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/04/661606e3ebeba037394102-1.gif" class="kg-image" alt="How Change Data Capture Powers Modern Apps" loading="lazy" width="1199" height="274" srcset="https://blog.bemi.io/content/images/size/w600/2024/04/661606e3ebeba037394102-1.gif 600w, https://blog.bemi.io/content/images/size/w1000/2024/04/661606e3ebeba037394102-1.gif 1000w, 
https://blog.bemi.io/content/images/2024/04/661606e3ebeba037394102-1.gif 1199w" sizes="(min-width: 720px) 720px"><figcaption><a href="https://blog.bytebytego.com/p/ep92-top-5-kafka-use-cases?ref=blog.bemi.io"><span style="white-space: pre-wrap;">https://blog.bytebytego.com/p/ep92-top-5-kafka-use-cases</span></a></figcaption></figure><p>CDC is becoming an increasingly popular software pattern, with dev tooling startups centered around CDC such as <a href="https://airbyte.com/?ref=blog.bemi.io" rel="noopener noreferrer">Airbyte</a> and <a href="https://www.fivetran.com/?ref=blog.bemi.io" rel="noopener noreferrer">Fivetran</a> having cumulatively raised nearly a billion dollars in funding in recent years. The surge in CDC&apos;s popularity raises the questions: why has it become so important to today&#x2019;s developers, and how does it work?</p><h2 id="why-now">Why now?</h2><p>CDC isn&#x2019;t exactly new, but its surge in popularity can be attributed to a few key reasons.</p><h3 id="data-fragmentation-and-growth"><strong>Data fragmentation and growth</strong></h3><p>Not only is the amount of data that applications now generate exploding, but the data is increasingly fragmented between various isolated databases, making it a nightmare to keep everything in sync.
CDC captures changes across independent data sources, allowing you to unify your data and ensure everyone has the correct info.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/04/image.png" class="kg-image" alt="How Change Data Capture Powers Modern Apps" loading="lazy" width="1432" height="884" srcset="https://blog.bemi.io/content/images/size/w600/2024/04/image.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/04/image.png 1000w, https://blog.bemi.io/content/images/2024/04/image.png 1432w" sizes="(min-width: 720px) 720px"><figcaption><a href="https://www.statista.com/statistics/871513/worldwide-data-created/?ref=blog.bemi.io"><span style="white-space: pre-wrap;">Exponential growth of data volumes</span></a></figcaption></figure><h3 id="real-time-demands"><strong>Real-time demands</strong></h3><p>Application data typically flows downstream to a data warehouse periodically on a schedule to be processed for analytics. Today&apos;s applications need to react to data changes as they happen, not wait for batch updates. For example, to be able to make faster decisions by not having stale data on dashboards. Since CDC lets you react to changes as they happen, it enables real-time analytics and event-driven architectures.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.scylladb.com/wp-content/uploads/Event-Driven-Architecture-diagram.png" class="kg-image" alt="How Change Data Capture Powers Modern Apps" loading="lazy"><figcaption><a href="https://www.scylladb.com/glossary/event-driven-architecture/?ref=blog.bemi.io"><span style="white-space: pre-wrap;">https://www.scylladb.com/glossary/event-driven-architecture/</span></a></figcaption></figure><h3 id="ease-of-adoption"><strong>Ease of adoption</strong></h3><p>The CDC infrastructure ecosystem has matured to a point where it&apos;s now practical for companies at all stages. 
Open-source projects like <a href="https://github.com/debezium/debezium?ref=blog.bemi.io">Debezium</a> and <a href="https://github.com/apache/kafka?ref=blog.bemi.io">Kafka</a> have made it easier to build systems that continuously react to data changes. These tools provide the robustness, scalability, and performance needed to process and distribute large volumes of change data in real-time. As CDC continues to get more approachable, it&apos;s igniting a surge in demand, creating a powerful feedback loop that&apos;s leading to even more tooling development efforts.</p><h2 id="how-does-cdc-work">How does CDC work?</h2><p>There are three main types of CDC implementations:</p><ol><li>Log-based captures changes from the existing database transaction logs and is the newest approach.</li><li>Query-based periodically queries the database to identify changes. This approach is simpler to set up but won&apos;t capture delete operations.</li><li>Trigger-based relies on database triggers to capture changes and write them to a change table. This approach reduces database performance since it requires multiple writes on each data change.</li></ol><p>Log-based CDC is quickly becoming embraced as the de facto approach because it&apos;s the least invasive and most efficient. 
It involves a few steps:</p><ol><li><strong>Log creation</strong>: When a change is made to the database, a log entry is created that captures the details of the change.</li><li><strong>Log consumption</strong>: The change data is processed and made available for use.</li><li><strong>Data distribution</strong>: The data is distributed to the desired systems, such as a data warehouse, cache, or search index.</li></ol><p>Let&#x2019;s take a closer look at each step.</p><h3 id="log-creation"><strong>Log creation</strong></h3><p>Before a database such as PostgreSQL, MySQL, MongoDB, and SQLite stores data to disk, it first writes it to a transaction log.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/04/image-1.png" class="kg-image" alt="How Change Data Capture Powers Modern Apps" loading="lazy" width="2000" height="256" srcset="https://blog.bemi.io/content/images/size/w600/2024/04/image-1.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/04/image-1.png 1000w, https://blog.bemi.io/content/images/size/w1600/2024/04/image-1.png 1600w, https://blog.bemi.io/content/images/2024/04/image-1.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Database transaction logs creation</span></figcaption></figure><p>This write-ahead logging technique allows writes to be more performant since the database can just do the lightweight log append operation before asynchronously making changes to the actual files and indexes. These transaction logs primarily serve as the database&apos;s source of truth to fall back on in case of a failure.</p><p>In Postgres, the volume of information recorded in the Write-Ahead Logs (WAL) can be adjusted. The <code>wal_level</code> setting offers three options, in ascending order of information logged: <code>minimal</code>, <code>replica</code>, and <code>logical</code>. 
CDC leverages these existing logs as the source of truth of all data changes, but requires a <code>logical</code> setting that enables changes to be read row-by-row, instead of by the physical disk blocks.</p><figure class="kg-card kg-code-card"><pre><code class="language-sql">SHOW wal_level;
+-------------+
| wal_level   |
|-------------|
| logical     |
+-------------+
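
-- If it returns &apos;replica&apos; or &apos;minimal&apos; instead, logical decoding
-- can be enabled with the following command plus a server restart:
-- ALTER SYSTEM SET wal_level = logical;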
</code></pre><figcaption><p><span style="white-space: pre-wrap;">SQL command to check PostgreSQL&apos;s wal_level</span></p></figcaption></figure><p>The format and structure of the transaction logs depend on the implementation of the database type. For instance, MySQL generates a binlog, while MongoDB uses oplogs.</p><h3 id="log-consumption"><strong>Log consumption</strong></h3><p>Fortunately, open-source projects like <a href="https://github.com/debezium/debezium?ref=blog.bemi.io">Debezium</a> can now do most of the hard work of consuming entries from the transaction log and abstracting away the database implementation details with connectors that just produce generic abstract events.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0845002f-8497-4e71-af9e-b711074a6dfe_1600x999.png" class="kg-image" alt="How Change Data Capture Powers Modern Apps" loading="lazy"><figcaption><a href="https://blog.bytebytego.com/p/reddits-architecture-the-evolutionary?ref=blog.bemi.io"><span style="white-space: pre-wrap;">https://blog.bytebytego.com/p/reddits-architecture-the-evolutionary</span></a></figcaption></figure><p>The Postgres connector relies on <a href="https://www.postgresql.org/docs/current/static/protocol-replication.html?ref=blog.bemi.io">PostgreSQL&#x2019;s replication protocol</a> to access changes in real-time from the server&#x2019;s transaction logs. It then transforms this information into a specific format, such as Protobuf or JSON, and sends it to an output destination. Each event gets structured as a key/value pair, where the key represents the primary key of the table, and the value includes details such as the before and after states of the change, along with additional metadata.</p><figure class="kg-card kg-code-card"><pre><code class="language-jsx">{
  &quot;schema&quot;: { ... },
  &quot;payload&quot;: {
    &quot;before&quot;: {
      &quot;id&quot;: 1,
      &quot;first_name&quot;: &quot;Mary&quot;,
      &quot;last_name&quot;: &quot;Samsonite&quot;
    },
    &quot;after&quot;: {
      &quot;id&quot;: 1,
      &quot;first_name&quot;: &quot;Mary&quot;,
      &quot;last_name&quot;: &quot;Swanson&quot;
    }
  },
  &quot;source&quot;: {
    &quot;connector&quot;: &quot;postgresql&quot;,
    &quot;name&quot;: &quot;server1&quot;,
    &quot;ts_ms&quot;: 1559033904863,
    &quot;snapshot&quot;: true,
    &quot;db&quot;: &quot;postgres&quot;,
    &quot;sequence&quot;: &quot;[\\&quot;24023119\\&quot;,\\&quot;24023128\\&quot;]&quot;,
    &quot;schema&quot;: &quot;public&quot;,
    &quot;table&quot;: &quot;customers&quot;,
    &quot;txId&quot;: 555,
    &quot;lsn&quot;: 24023128,
    &quot;xmin&quot;: null
  },
  &quot;op&quot;: &quot;c&quot;,
  &quot;ts_ms&quot;: 1559033904863
}
</code></pre><figcaption><p><span style="white-space: pre-wrap;">Example Update event</span></p></figcaption></figure><h3 id="data-distribution"><strong>Data distribution</strong></h3><p>CDC systems typically incorporate a message broker component to propagate the <a href="https://github.com/debezium/debezium?ref=blog.bemi.io">Debezium</a> events. <a href="https://github.com/apache/kafka?ref=blog.bemi.io">Apache Kafka</a> stands out for this purpose because of a few advantages: scalability to handle large volumes of data, persistence of messages, guaranteed ordering per partition, and compaction capability, where multiple changes on the same record can optionally be easily rolled into one. From the message queue, client applications can then read events that correspond to the database tables of interest, and react to every row-level event they receive.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/04/image-5.png" class="kg-image" alt="How Change Data Capture Powers Modern Apps" loading="lazy" width="2000" height="420" srcset="https://blog.bemi.io/content/images/size/w600/2024/04/image-5.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/04/image-5.png 1000w, https://blog.bemi.io/content/images/size/w1600/2024/04/image-5.png 1600w, https://blog.bemi.io/content/images/2024/04/image-5.png 2399w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">CDC distributed message queue system</span></figcaption></figure><h2 id="patterns">Patterns</h2><p>There are countless use cases where CDC systems are invaluable. You can use them to build notification systems instead of relying on callbacks, to invalidate caches, to update search indexes, to migrate data without downtime, to update vector embeddings, or to perform point-in-time data recovery, to name a few. 
I&#x2019;ll highlight below some common CDC system patterns I&apos;ve personally seen in production environments.</p><h3 id="microservice-synchronization">Microservice synchronization</h3><p>In a microservice based architecture, each service often maintains its own standalone database. For instance, a user service might handle user data, while a friends service manages friend-related information. You might want to combine the data into a materialized view or replicate it to Elasticsearch to power queries such as &#x201C;give me a user named Mary who has 2 friends&#x201D;. CDC facilitates the decoupling of systems by enabling real-time data sharing across different components without direct message passing, thus supporting the scalability and flexibility required by these architectures.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/04/image-6.png" class="kg-image" alt="How Change Data Capture Powers Modern Apps" loading="lazy" width="1305" height="730" srcset="https://blog.bemi.io/content/images/size/w600/2024/04/image-6.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/04/image-6.png 1000w, https://blog.bemi.io/content/images/2024/04/image-6.png 1305w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Optimized decoupled local views</span></figcaption></figure><h3 id="audit-trails">Audit Trails</h3><p>CDC offers the most reliable and performant approach for building robust audit trails. The low-level data change events can be stitched with additional metadata to better record who made the change and why it was made. I&apos;m one of the contributors to <a href="https://github.com/BemiHQ/bemi?ref=blog.bemi.io" rel="noreferrer">Bemi</a>, an open-source tool that provides automatic audit trails, and we did this by creating libraries that inserted additional custom application-specific context (userID, API endpoint, etc.) 
in the database transaction logs using a similar technique to <a href="https://google.github.io/sqlcommenter/?ref=blog.bemi.io">Google&apos;s Sqlcommenter</a>. We stitch this information together in a CDC system and then store the enriched data in a queryable database. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/04/image-7.png" class="kg-image" alt="How Change Data Capture Powers Modern Apps" loading="lazy" width="2000" height="349" srcset="https://blog.bemi.io/content/images/size/w600/2024/04/image-7.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/04/image-7.png 1000w, https://blog.bemi.io/content/images/size/w1600/2024/04/image-7.png 1600w, https://blog.bemi.io/content/images/2024/04/image-7.png 2327w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Audit trails CDC architecture</span></figcaption></figure><h2 id="conclusion">Conclusion</h2><p>As demand for CDC grows, understanding it is becoming increasingly essential for today&apos;s developers. And as developer tooling in this space continues to improve, the countless use cases powered by CDC will continue to get more accessible.</p><p>I&apos;ve intentionally glossed over a lot of CDC details in this blog to keep it short. But I&apos;d recommend checking out the <a href="https://github.com/BemiHQ/bemi?ref=blog.bemi.io">Bemi source code</a> to see how CDC systems that have handled billions of data changes actually work under the hood!</p>]]></content:encoded></item><item><title><![CDATA[The Day Soft Deletes Caused Chaos]]></title><description><![CDATA[Discover the critical mistakes and lessons learned from using soft deletes in production systems. 
This blog post explores the complexities, data integrity issues, and alternative solutions to managing deleted data effectively.]]></description><link>https://blog.bemi.io/soft-deleting-chaos/</link><guid isPermaLink="false">65ea0504398ea80001a2def7</guid><category><![CDATA[Engineering]]></category><dc:creator><![CDATA[Arjun Lall]]></dc:creator><pubDate>Tue, 12 Mar 2024 17:48:15 GMT</pubDate><media:content url="https://blog.bemi.io/content/images/2024/03/Group-15733--2---2---2-.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.bemi.io/content/images/2024/03/Group-15733--2---2---2-.png" alt="The Day Soft Deletes Caused Chaos"><p>The worst mistake I made in my software engineering career was merging a seemingly harmless pull request 5 years ago.</p><p>TLDR: Soft deletes <strong>should not</strong> be used in production-grade systems&#x2014;a lesson I learned the hard way when a severe mishap enabled the sale of the same concert seats to unlimited buyers.</p><p>Soft deletion is the easiest way to store deleted data and means just setting a <code>deleted</code> flag instead of performing a <code>DELETE</code> operation directly.</p><figure class="kg-card kg-code-card"><pre><code class="language-sql">+--------------------------------------+------------+------------+---------+
| id                                   | last_name  | first_name | deleted |
+--------------------------------------+------------+------------+---------+
| 12778d88-41c8-4fc2-8be6-68c5d51c3893 | Samsonite  | Mary       | true    | 
+--------------------------------------+------------+------------+---------+</code></pre><figcaption><p><span style="white-space: pre-wrap;">Deleted user using soft deletion</span></p></figcaption></figure><p>You add a new column on a table, perform an update when deleting, and filter out deleted data when querying.</p><figure class="kg-card kg-code-card"><pre><code class="language-sql">SELECT *
FROM user
WHERE id = $1
    AND deleted IS NULL;</code></pre><figcaption><p><span style="white-space: pre-wrap;">Querying with soft deletion</span></p></figcaption></figure><p>Although this approach is simple to set up, it can lead to dangerous consequences if it accidentally returns data not meant to be seen.</p><h2 id="critical-incident">Critical Incident</h2><p>I was working at an events ticketing company and I created a pull request that was similar to this:</p><figure class="kg-card kg-code-card"><pre><code class="language-Ruby">class SeatClaim 
... 
- acts_as_paranoid
... 
+ def remove 
+   move_to_expired
+   destroy
+ end
...
end
</code></pre><figcaption><p><span style="white-space: pre-wrap;">app/models/seat_claim.rb</span></p></figcaption></figure><figure class="kg-card kg-code-card"><pre><code class="language-Ruby">+ class MigrateDeletedSeatClaims &lt; Migration
+  def self.up
+    expired_seat_claims = SeatClaim.where.not(deleted_at: nil)
+    expired_seat_claims.each(&amp;:remove)
+  end 
+
+  def self.down 
... 
+ end </code></pre><figcaption><p><span style="white-space: pre-wrap;">migrations/19700101000000_migrate_deleted_seat_claims.rb</span></p></figcaption></figure><p>In the seating reservation experience, you could claim a seat for 5 minutes during the checkout flow before a background job would delete the claim and release the seat to be bookable again. I was migrating from soft deleting seat claim&#x2019;s to a new collection explicitly meant for storing the deleted rows.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/03/image-1.png" class="kg-image" alt="The Day Soft Deletes Caused Chaos" loading="lazy" width="1486" height="1020" srcset="https://blog.bemi.io/content/images/size/w600/2024/03/image-1.png 600w, https://blog.bemi.io/content/images/size/w1000/2024/03/image-1.png 1000w, https://blog.bemi.io/content/images/2024/03/image-1.png 1486w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Seating Reservation UX</span></figcaption></figure><p>The incident was caused because of this line:</p><pre><code class="language-ruby">- acts_as_paranoid
</code></pre><p>This removed the <a href="https://github.com/rubysherpas/paranoia?ref=blog.bemi.io">Paranoia</a> library on the model that had abstracted away the soft deletion logic i.e. setting a <code>deleted_at</code> field to the current time when you&#xA0;delete a record. What wasn&apos;t top of mind for me was that it also automatically filtered out all soft-deleted records in the ORM. </p><p>Without the automatic exclusion of soft-deleted records and while the migration hadn&apos;t finished, the background worker began collecting claims that had already been &quot;deleted&quot; - inadvertently causing seats that were successfully paid for to be released and available for booking again!</p><p>I&#x2019;ll never forget the sinking feeling and sense of dread when I realized what was happening.</p><p>This meant that the same seat at a Shawn Mendes concert was being sold multiple times over. Amplified by lots of seats, amplified by lots of events around the globe! Yeah it was bad.</p><p>To be fair, soft deletes weren&apos;t the lone culprit and there was a lot I should have done differently for this change, like breaking it out into multiple steps. Automated tests in the CI/CD pipeline should have caught this error, but it managed to slip through the cracks. Luckily there was a lot of observability in this area, so it was detected and remediated almost immediately. But the impact and fallout was still severe with hundreds of double bookings that had to be refunded, orders cancelled, apology emails sent to affected customers, and a late night postmortem written. </p><h2 id="don%E2%80%99t-soft-delete">Don&#x2019;t Soft Delete</h2><p>The instinct to retain deleted data is understandable, even within the regulatory landscape of <a href="https://en.wikipedia.org/wiki/General_Data_Protection_Regulation?ref=blog.bemi.io">GDPR</a>. 
Developers may need it for compliance, reporting, analytics, or just as a safety net &#x2013; a chance to recover from accidental deletions or to examine a deleted record for troubleshooting. Imagine a customer accidentally deleting a crucial invoice, or a social media user deleting a comment that broke the rules. Keeping deleted data for a grace period can be valuable. However, the soft deletes approach creates more problems than it solves.</p><h3 id="complexity">Complexity</h3><p>Soft deletion infects everything and complicates queries. The application ORM layer usually automatically filters out &quot;deleted&quot; records, but this convenience can lead to oversight when constructing complex SQL queries manually. Like me, you might end up retrieving inaccurate results, potentially exposing sensitive data or making bad decisions based on incomplete information. Yes, creating a database <a href="https://en.wikipedia.org/wiki/View_(SQL)?ref=blog.bemi.io">View</a> is safer, but it&#x2019;s still extra complexity and an unneeded appendage. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://pbs.twimg.com/media/E5r2BEfVgAIs8EW.jpg:large" class="kg-image" alt="The Day Soft Deletes Caused Chaos" loading="lazy"><figcaption><span style="white-space: pre-wrap;">Murphy&apos;s Law: anything that can go wrong will go wrong</span></figcaption></figure><p>Indexes, unique constraints, and foreign key relationships all also need to consider the &quot;deleted&quot; state, making them more intricate to create and maintain. </p><figure class="kg-card kg-code-card"><pre><code class="language-sql">CREATE UNIQUE INDEX unique_active_users_email ON users (email)
WHERE deleted_at IS NULL;
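
-- Note: foreign key constraints cannot be scoped with a WHERE clause the
-- same way, so references to soft-deleted rows must be policed by the
-- application instead of the database.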
</code></pre><figcaption><p><span style="white-space: pre-wrap;">Creating a unique index on the email field for active users</span></p></figcaption></figure><p>Even with adding <a href="https://en.wikipedia.org/wiki/Partial_index?ref=blog.bemi.io">partial indexes</a>, soft deletes can lead to significant bloat, adversely affecting table size and performance. In high-volume environments, this can become a bigger issue and require performance tuning or data partitioning to maintain efficiency.</p><h3 id="data-integrity">Data Integrity</h3><p>Handling deletion in the application layer via soft deletes loses one of the benefits of the database, which tries to keep data valid for you. </p><figure class="kg-card kg-code-card"><pre><code class="language-sql">ERROR: delete on table &quot;users&quot; violates foreign key constraint &quot;orders_user_id_fkey&quot; on table &quot;orders&quot;
DETAIL: Key (id)=(456) is still referenced from table &quot;orders&quot;.
</code></pre><figcaption><p><span style="white-space: pre-wrap;">Database foreign key violation error</span></p></figcaption></figure><p>Enforcing referential integrity on your own can be error-prone and adds significant development and maintenance overhead.</p><h2 id="alternatives-to-soft-deletes">Alternatives to Soft Deletes</h2><p>An alternative to soft deletes is to archive deleted data into history tables. It&apos;s still simple to do and removes the long-term liability and maintenance burden of soft deletes. This can be done by inserting the deleted record into a separate table before deleting it.</p><figure class="kg-card kg-code-card"><pre><code class="language-sql">BEGIN;

-- Insert the SeatClaim record into SeatClaimHistory with deletion details
INSERT INTO SeatClaimHistory (id, user_id, seat_id, deleted_at)
SELECT id, user_id, seat_id, NOW() FROM SeatClaim WHERE id = $1;

-- Delete the original SeatClaim record
DELETE FROM SeatClaim WHERE id = $1;

COMMIT;</code></pre><figcaption><p><span style="white-space: pre-wrap;">Transaction to archive data before deleting</span></p></figcaption></figure><p>If you don&apos;t want to manually archive data all over your codebase, the best alternative is building an audit trail at the database layer. The&#xA0;<a href="https://blog.bemi.io/the-ultimate-guide-to-postgresql-data-change-tracking/" rel="noopener noreferrer">Ultimate Guide to PostgreSQL Data Change Tracking</a> outlines the different strategies for PostgreSQL. I&apos;d also recommend checking out an open-source project I contribute to called <a href="https://github.com/BemiHQ/bemi?ref=blog.bemi.io" rel="noopener">Bemi</a>,&#xA0;which aims to simplify this by plugging into a database and application (support for lots of different ORMs e.g. <a href="https://github.com/BemiHQ/bemi-rails?ref=blog.bemi.io" rel="noreferrer">Bemi-rails</a>) to provide a record of contextualized data changes automatically.</p><h2 id="the-bottom-line">The Bottom Line</h2><p>Steer clear of soft deletes. They might look like the easy fix for managing deleted data, but trust me&#x2014;they&apos;re a ticking time bomb. I learned this the hard way years ago, and it&apos;s a mistake you don&apos;t want to repeat. Opt for history or audit tables instead. It&apos;s cleaner, safer, and will save you a world of trouble down the line. 
</p>]]></content:encoded></item><item><title><![CDATA[The Ultimate Guide to PostgreSQL Data Change Tracking]]></title><description><![CDATA[Explore five methods of data change tracking in PostgreSQL available in 2024.]]></description><link>https://blog.bemi.io/the-ultimate-guide-to-postgresql-data-change-tracking/</link><guid isPermaLink="false">65d97ce4e08447000140ff2e</guid><category><![CDATA[Engineering]]></category><dc:creator><![CDATA[Evgeny Li]]></dc:creator><pubDate>Sat, 24 Feb 2024 05:29:43 GMT</pubDate><media:content url="https://blog.bemi.io/content/images/2024/02/og-image-.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.bemi.io/content/images/2024/02/og-image-.png" alt="The Ultimate Guide to PostgreSQL Data Change&#xA0;Tracking"><p>PostgreSQL, one of the most popular databases, was named DBMS of the Year 2023 by <a href="https://db-engines.com/en/blog_post/106?ref=blog.bemi.io" rel="noopener">DB-Engines Ranking</a> and is used more than any other database among startups according to <a href="https://www.hntrends.com/2024/january.html?compare=PostgreSQL&amp;compare=MySQL&amp;compare=MongoDB&amp;compare=SQL+Server&amp;ref=blog.bemi.io" rel="noopener">HN Hiring Trends</a>.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://cdn-images-1.medium.com/max/1600/1*d2nVVieI0nqxZJhS2-BFeQ.png" class="kg-image" alt="The Ultimate Guide to PostgreSQL Data Change&#xA0;Tracking" loading="lazy" width="2626" height="822"><figcaption><span style="white-space: pre-wrap;">PostgreSQL is the most popular database among&#xA0;startups</span></figcaption></figure><p>The SQL standard has included features related to <a href="https://en.wikipedia.org/wiki/Temporal_database?ref=blog.bemi.io" rel="noopener">temporal databases</a> since 2011, which allow storing data changes over time rather than just the current data state. However, relational databases don&#x2019;t completely follow the standards. 
In the case of PostgreSQL, it doesn&#x2019;t support these features, even though there has been a submitted <a href="https://www.postgresql.org/message-id/flat/CALAY4q-cXCD0r4OybD%3Dw7Hr7F026ZUY6%3DLMsVPUe6yw_PJpTKQ%40mail.gmail.com?ref=blog.bemi.io" rel="noopener">patch</a> with some discussions.</p><p>There are PostgreSQL extensions like <a href="https://github.com/xocolatl/periods?ref=blog.bemi.io" rel="noopener">periods</a> and <a href="https://github.com/arkhipov/temporal_tables?ref=blog.bemi.io" rel="noopener">temporal_tables</a> that add support for temporal tables. Unfortunately, cloud providers such as AWS, Azure, and GCP don&#x2019;t allow running custom C extensions with managed databases.</p><p>Let&#x2019;s explore five alternative methods of data change tracking in PostgreSQL available to us in 2024.</p><h2 id="triggers-and-audit-table">Triggers and Audit&#xA0;Table</h2><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://cdn-images-1.medium.com/max/1600/1*z33FB139uoaUOy62JgeS-Q.png" class="kg-image" alt="The Ultimate Guide to PostgreSQL Data Change&#xA0;Tracking" loading="lazy" width="1920" height="1008"><figcaption><span style="white-space: pre-wrap;">A PostgreSQL trigger with an audit&#xA0;table</span></figcaption></figure><p>PostgreSQL allows adding triggers with custom procedural SQL code performed on row changes with <code>INSERT</code>, <code>UPDATE</code>, and <code>DELETE</code> queries. The official PostgreSQL wiki describes a generic <a href="https://wiki.postgresql.org/wiki/Audit_trigger?ref=blog.bemi.io" rel="noopener">audit trigger function</a>. Let&#x2019;s have a quick look at a simplified example.</p><p>First, create a table called <code>logged_actions</code> in a separate schema called <code>audit</code>:</p><pre><code class="language-sql">CREATE schema audit;

CREATE TABLE audit.logged_actions (
  schema_name TEXT NOT NULL,
  table_name TEXT NOT NULL,
  user_name TEXT,
  action_tstamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT current_timestamp,
  action TEXT NOT NULL CHECK (action IN (&apos;I&apos;,&apos;D&apos;,&apos;U&apos;)),
  original_data TEXT,
  new_data TEXT,
  query TEXT
);</code></pre><p>Next, create a function to insert audit records and establish a trigger on a table you wish to track, such as <code>my_table</code>:</p><pre><code class="language-sql">CREATE OR REPLACE FUNCTION audit.if_modified_func() RETURNS TRIGGER AS $body$
BEGIN
  IF (TG_OP = &apos;UPDATE&apos;) THEN
    INSERT INTO audit.logged_actions (schema_name,table_name,user_name,action,original_data,new_data,query)
    VALUES (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),ROW(OLD.*),ROW(NEW.*),current_query());
    RETURN NEW;
  elsif (TG_OP = &apos;DELETE&apos;) THEN
    INSERT INTO audit.logged_actions (schema_name,table_name,user_name,action,original_data,query)
    VALUES (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),ROW(OLD.*),current_query());
    RETURN OLD;
  elsif (TG_OP = &apos;INSERT&apos;) THEN
    INSERT INTO audit.logged_actions (schema_name,table_name,user_name,action,new_data,query)
    VALUES (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),ROW(NEW.*),current_query());
    RETURN NEW;
  END IF;
END;
$body$
LANGUAGE plpgsql;
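-- Optional hardening (illustrative, not part of the wiki example): keep the
-- audit table append-only so tracked history cannot be quietly rewritten.
-- "app_user" is a hypothetical application role; this assumes app_user is
-- neither the table owner nor a superuser.
REVOKE UPDATE, DELETE, TRUNCATE ON audit.logged_actions FROM app_user;
GRANT INSERT, SELECT ON audit.logged_actions TO app_user;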

CREATE TRIGGER my_table_if_modified_trigger
AFTER INSERT OR UPDATE OR DELETE ON my_table
FOR EACH ROW EXECUTE PROCEDURE audit.if_modified_func();</code></pre><p>Once it&#x2019;s done, row changes made in <code>my_table</code> will create records in <code>audit.logged_actions</code>:</p><pre><code class="language-sql">INSERT INTO my_table(x,y) VALUES (1, 2);
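-- Illustrative: updates (and deletes) on my_table are captured the same way
UPDATE my_table SET y = 3 WHERE x = 1;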
SELECT * FROM audit.logged_actions;</code></pre><p>If you want to further improve this solution by using JSONB columns instead of TEXT, ignoring changes in certain columns, pausing auditing a table, and so on, check out the SQL example in this <a href="https://github.com/2ndQuadrant/audit-trigger?ref=blog.bemi.io" rel="nofollow noopener noopener noopener">audit-trigger</a> repo and its forks.</p><p>Another alternative is the <a href="https://github.com/nearform/temporal_tables?ref=blog.bemi.io" rel="noopener">temporal_tables</a> implementation written by using triggers. The main difference is that it stores records in a separate table with a time range during which a version was valid, not just an initial timestamp when a change was recorded. This makes it easier to perform time travel queries by selecting records that were valid at a specific point in time.</p><h3 id="downsides">Downsides</h3><ul><li>Performance. Triggers add performance overhead by inserting additional records synchronously on every <code>INSERT</code>, <code>UPDATE</code>, and <code>DELETE</code> operation.</li><li>Security. Anyone with superuser access can modify the triggers and make unnoticed data changes. It is also recommended to make sure that records in the audit table cannot be modified or removed.</li><li>Maintenance. Managing complex triggers across many constantly changing tables can become cumbersome. 
Making a small mistake in an SQL script can break queries or data change tracking functionality.</li></ul><h2 id="triggers-and-notifylisten">Triggers and Notify/Listen</h2><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://cdn-images-1.medium.com/max/1600/1*XWnUtj_RQuBClo2FdFZEbQ.png" class="kg-image" alt="The Ultimate Guide to PostgreSQL Data Change&#xA0;Tracking" loading="lazy" width="2400" height="832"><figcaption><span style="white-space: pre-wrap;">A PostgreSQL trigger with&#xA0;Notify</span></figcaption></figure><p>This approach is similar to the previous one but instead of writing data changes in the audit table directly, we pass them through a pub/sub mechanism through a trigger to another system dedicated to reading and storing these data changes:</p><pre><code class="language-sql">CREATE OR REPLACE FUNCTION if_modified_func() RETURNS TRIGGER AS $body$
BEGIN
  IF (TG_OP = &apos;UPDATE&apos;) THEN
    PERFORM pg_notify(&apos;data_changes&apos;, json_build_object(
      &apos;schema_name&apos;, TG_TABLE_SCHEMA::TEXT,
      &apos;table_name&apos;, TG_TABLE_NAME::TEXT,
      &apos;user_name&apos;, session_user::TEXT,
      &apos;action&apos;, substring(TG_OP,1,1),
      &apos;original_data&apos;, to_jsonb(OLD),
      &apos;new_data&apos;, to_jsonb(NEW)
    )::TEXT);
    RETURN NEW;
  elsif (TG_OP = &apos;DELETE&apos;) THEN
    PERFORM pg_notify(&apos;data_changes&apos;, json_build_object(
      &apos;schema_name&apos;, TG_TABLE_SCHEMA::TEXT,
      &apos;table_name&apos;, TG_TABLE_NAME::TEXT,
      &apos;user_name&apos;, session_user::TEXT,
      &apos;action&apos;, substring(TG_OP,1,1),
      &apos;original_data&apos;, to_jsonb(OLD)
    )::TEXT);
    RETURN OLD;
  elsif (TG_OP = &apos;INSERT&apos;) THEN
    PERFORM pg_notify(&apos;data_changes&apos;, json_build_object(
      &apos;schema_name&apos;, TG_TABLE_SCHEMA::TEXT,
      &apos;table_name&apos;, TG_TABLE_NAME::TEXT,
      &apos;user_name&apos;, session_user::TEXT,
      &apos;action&apos;, substring(TG_OP,1,1),
      &apos;new_data&apos;, to_jsonb(NEW)
    )::TEXT);
    RETURN NEW;
  END IF;
END;
$body$
LANGUAGE plpgsql;

CREATE TRIGGER my_table_if_modified_trigger
AFTER INSERT OR UPDATE OR DELETE ON my_table
FOR EACH ROW EXECUTE PROCEDURE if_modified_func();</code></pre><p>Now it&#x2019;s possible to run a separate process running as a worker that listens to messages containing data changes and stores them separately:</p><pre><code class="language-sql">LISTEN data_changes;</code></pre><h3 id="downsides-1">Downsides</h3><ul><li>&#x201C;At most once&#x201D; delivery<strong>.</strong> Listen/notify notifications are not persisted meaning if a listener disconnects, it may miss updates that happened before it reconnected again.</li><li>Payload size limit. Listen/notify messages have a maximum payload size of 8000 bytes by default. For larger payloads, it is recommended to store them in the DB audit table and send only references of the records.</li><li>Debugging. Troubleshooting issues related to triggers and listen/notify in a production environment can be challenging due to their asynchronous and distributed nature.</li></ul><h2 id="application-level-tracking">Application-Level Tracking</h2><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://cdn-images-1.medium.com/max/1600/1*JfwZ5-Usnhbim50P6BudTg.png" class="kg-image" alt="The Ultimate Guide to PostgreSQL Data Change&#xA0;Tracking" loading="lazy" width="2400" height="976"><figcaption><span style="white-space: pre-wrap;">Application-level tracking with a PostgreSQL audit&#xA0;table</span></figcaption></figure><p>If you have control over the codebase that connects and makes data changes in a PostgreSQL database, then one of the following options is also available to you:</p><ul><li>Manually record all data changes when issuing <code>INSERT</code>, <code>UPDATE</code>, and <code>DELETE</code> queries</li><li>Use existing open-source libraries that integrate with popular ORMs</li></ul><p>For example, there is <a href="https://github.com/paper-trail-gem/paper_trail?ref=blog.bemi.io" rel="noopener">paper_trail</a> for Ruby on Rails with ActiveRecord and <a 
href="https://github.com/jazzband/django-simple-history?ref=blog.bemi.io" rel="noopener">django-simple-history</a> for Django. At a high level, they use callbacks or middlewares to insert additional records into an audit table. Here is a simplified example written in Ruby:</p><pre><code class="language-ruby">class User &lt; ApplicationRecord
  after_commit :track_data_changes

  private

  def track_data_changes
    # "changes" is reset once the record is saved; previous_changes holds the committed diff
    AuditRecord.create!(auditable: self, changes: previous_changes)
  end
end</code></pre><p>On the application level, <a href="https://martinfowler.com/eaaDev/EventSourcing.html?ref=blog.bemi.io" rel="noopener">Event Sourcing</a> can also be implemented with an append-only log as the source of truth. But it&#x2019;s a separate, big, and exciting topic that deserves a separate blog post.</p><h3 id="downsides-2">Downsides</h3><ul><li>Reliability. Application-level data change tracking is not as accurate as database-level change tracking. For example, data changes made outside an app will not be tracked, developers may accidentally skip callbacks, or there could be data inconsistencies if a query changing the data has succeeded but a query inserting an audit record failed.</li><li>Performance. Manually capturing changes and inserting them in the database via callbacks leads to both runtime application and database overhead.</li><li>Scalability. These audit tables are usually stored in the same database and can quickly become unmanageable, which can require separating the storage, implementing declarative partitioning, and continuous archiving.</li></ul><h2 id="change-data-capture">Change Data&#xA0;Capture</h2><p><a href="https://en.wikipedia.org/wiki/Change_data_capture?ref=blog.bemi.io" rel="noopener">Change Data Capture</a> (CDC) is a pattern of identifying and capturing changes made to data in a database and sending those changes to a downstream system. Most often it is used for <a href="https://en.wikipedia.org/wiki/Extract,_transform,_load?ref=blog.bemi.io" rel="noopener">ETL</a> to send data to a data warehouse for analytical purposes.</p><p>There are multiple approaches to implementing CDC. One of them, which doesn&#x2019;t intersect with what we have already discussed, is a log-based CDC. 
With PostgreSQL, it is possible to connect to the <a href="https://www.postgresql.org/docs/current/wal-intro.html?ref=blog.bemi.io" rel="noopener">Write-Ahead Log</a> (WAL) that is used for data durability, recovery, and replication to other instances.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://cdn-images-1.medium.com/max/1600/1*yN-xiW9G5u6Ghrt7_Xv0uQ.png" class="kg-image" alt="The Ultimate Guide to PostgreSQL Data Change&#xA0;Tracking" loading="lazy" width="2412" height="832"><figcaption><span style="white-space: pre-wrap;">CDC with PostgreSQL logical replication</span></figcaption></figure><p>PostgreSQL supports two types of replications: physical replication and logical replication. The latter allows decoding WAL changes on a row level and filtering them out, for example, by table name. This is exactly what we need to implement data change tracking with CDC.</p><p>Here are the basic steps necessary for retrieving data changes by using logical replication:</p><p>1. Set <code>wal_level</code> to <code>logical</code> in <code>postgresql.conf</code> and restart the database.</p><p>2. Create a publication like a &#x201C;pub/sub channel&#x201D; for receiving data changes:</p><pre><code class="language-sql">CREATE PUBLICATION my_publication FOR ALL TABLES;</code></pre><p>3. Create a logical replication slot like a &#x201C;cursor position&#x201D; in the WAL:</p><pre><code class="language-sql">SELECT * FROM pg_create_logical_replication_slot(&apos;my_replication_slot&apos;, &apos;wal2json&apos;);</code></pre><p>4. Fetch the latest unread changes:</p><pre><code class="language-sql">SELECT * FROM pg_logical_slot_get_changes(&apos;my_replication_slot&apos;, NULL, NULL);</code></pre><p>To implement log-based CDC with PostgreSQL, I would recommend using the existing open-source solutions. 
The most popular one is <a href="https://github.com/debezium/debezium?ref=blog.bemi.io" rel="noopener">Debezium</a>.</p><h3 id="downsides-3">Downsides</h3><ul><li>Limited context. PostgreSQL WAL contains only low-level information about row changes and doesn&#x2019;t include information about an SQL query that triggered the change, information about a user, or any application-specific context.</li><li>Complexity. Implementing CDC adds a lot of system complexity. This involves running a server that connects to PostgreSQL as a replica, consumes data changes, and stores them somewhere.</li><li>Tuning. Running it in a production environment may require a deeper understanding of PostgreSQL internals and properly configuring the system. For example, periodically flushing the position for a replication slot to reclaim WAL disk space.</li></ul><h2 id="integrated-change-data-capture">Integrated Change Data&#xA0;Capture</h2><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://cdn-images-1.medium.com/max/1600/1*WYXhO7O5a3bX-EQQtcaJ9Q.png" class="kg-image" alt="The Ultimate Guide to PostgreSQL Data Change&#xA0;Tracking" loading="lazy" width="2400" height="832"><figcaption><span style="white-space: pre-wrap;">Integrated CDC with application context</span></figcaption></figure><p>To overcome the challenge of limited information about data changes stored in the WAL, we can use a clever approach of passing additional context to the WAL directly.</p><p>Here is a simple example of passing additional context on row changes:</p><pre><code class="language-sql">CREATE OR REPLACE FUNCTION if_modified_func() RETURNS TRIGGER AS $body$
BEGIN
  PERFORM pg_logical_emit_message(true, &apos;my_message&apos;, &apos;ADDITIONAL_CONTEXT&apos;);

  IF (TG_OP = &apos;DELETE&apos;) THEN
    RETURN OLD;
  ELSE
    RETURN NEW;
  END IF;
END;
$body$
LANGUAGE plpgsql;
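-- Illustrative check (run after the trigger below is in place, and assuming
-- the &apos;my_replication_slot&apos;/wal2json slot from the CDC section above):
-- peeking at unconsumed WAL changes shows the emitted &apos;my_message&apos;
-- payload interleaved with the row changes, without advancing the slot.
-- SELECT * FROM pg_logical_slot_peek_changes(&apos;my_replication_slot&apos;, NULL, NULL);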

CREATE TRIGGER my_table_if_modified_trigger
AFTER INSERT OR UPDATE OR DELETE ON my_table
FOR EACH ROW EXECUTE PROCEDURE if_modified_func();</code></pre><p>Notice the <code>pg_logical_emit_message</code> function that was added to PostgreSQL as an internal function for plugins. It allows namespacing and emitting messages that will be stored in the WAL. Reading these messages became possible with the standard logical decoding plugin <code>pgoutput</code> since PostgreSQL v14.</p><p>There is an open-source project called <a href="https://github.com/BemiHQ/bemi?ref=blog.bemi.io" rel="noopener">Bemi</a> which allows tracking not only low-level data changes but also reading any custom context with CDC and stitching everything together. Full disclaimer, I&#x2019;m one of the core contributors.</p><p>For example, it can integrate with popular ORMs and adapters to pass application-specific context with all data changes:</p><pre><code class="language-js">import { setContext } from &quot;@bemi-db/prisma&quot;;
import express, { Request } from &quot;express&quot;;

const app = express();

app.use(
  // Customizable context
  setContext((req: Request) =&gt; ({
    userId: req.user?.id,
    endpoint: req.url,
    params: req.body,
  }))
);</code></pre><h3 id="downsides-4">Downsides</h3><ul><li>Complexity and tuning related to implementing CDC.</li></ul><p>If you need a ready-to-use cloud solution that can be integrated and connected to PostgreSQL in a few minutes, check out <a href="https://bemi.io/?ref=blog.bemi.io" rel="noopener">bemi.io</a>.</p><h2 id="conclusion">Conclusion</h2><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://cdn-images-1.medium.com/max/1600/1*4zoFllSTxsWqqjI4zP8cWw.png" class="kg-image" alt="The Ultimate Guide to PostgreSQL Data Change&#xA0;Tracking" loading="lazy" width="2405" height="1264"><figcaption><span style="white-space: pre-wrap;">PostgreSQL data change tracking approach comparison</span></figcaption></figure><ol><li>If you need basic data change tracking, <strong>triggers with an audit table</strong> are a great initial solution.</li><li><strong>Triggers with listen/notify</strong> are a good option for simple testing in a development environment.</li><li>If you value application-specific context (information about a user, API endpoint, etc.) over reliability, you can use <strong>application-level tracking</strong>.</li><li><strong>Change Data Capture</strong> is a good option if you prioritize reliability and scalability as a unified solution that can be reused, for example, across many databases.</li><li>Finally, <strong>integrated Change Data Capture </strong>is your best bet if you need a robust data change tracking system that can also be integrated into your application. Go with <a href="https://bemi.io/?ref=blog.bemi.io" rel="noopener">bemi.io</a> if you need a cloud-managed solution.</li></ol>]]></content:encoded></item><item><title><![CDATA[From Black Box to Open Source: Embracing Transparency]]></title><description><![CDATA[Bemi, a platform for real-time data tracking, announces it's open-sourcing its code to build trust, expand functionality, and contribute to the developer community. 
This transparency empowers users, attracts diverse perspectives, and fosters collaboration within the developer ecosystem.]]></description><link>https://blog.bemi.io/from-black-box-to-open-source-embracing-transparency/</link><guid isPermaLink="false">65d97c09e08447000140ff1a</guid><category><![CDATA[Announcement]]></category><dc:creator><![CDATA[Arjun Lall]]></dc:creator><pubDate>Fri, 09 Feb 2024 12:00:00 GMT</pubDate><media:content url="https://blog.bemi.io/content/images/2024/02/_-1.webp" medium="image"/><content:encoded><![CDATA[<img src="https://blog.bemi.io/content/images/2024/02/_-1.webp" alt="From Black Box to Open Source: Embracing Transparency"><p>Today, we&#x2019;re thrilled to announce that we&#x2019;re <a href="https://github.com/BemiHQ/bemi?ref=blog.bemi.io" rel="noreferrer">open-sourcing Bemi</a>! ? This is a fundamentally different approach to company building and we&apos;ll explain why this is the right decision for us.</p><p>For context &#x2014; Bemi is a platform to automatically track contextualized PostgreSQL data changes and allows devs to leverage real-time data in their applications.</p><h2 id="building-trust">Building trust</h2><p>At the heart of our decision is the desire to build trust. We&apos;re committed to eliminating data black boxes by providing a direct line of sight into the inner workings of our platform.</p><p>Users sometimes express concerns about access to specific data. To address this, we can now easily point to the code and affirm that unless a table or column is explicitly specified, it remains unseen. 
This tangible proof assures users that their data is handled as intended and instills confidence in our cloud offering.</p><figure class="kg-card kg-image-card"><img src="https://blog.bemi.io/content/images/2024/02/1-2.webp" class="kg-image" alt="From Black Box to Open Source: Embracing Transparency" loading="lazy" width="2000" height="1050" srcset="https://blog.bemi.io/content/images/size/w600/2024/02/1-2.webp 600w, https://blog.bemi.io/content/images/size/w1000/2024/02/1-2.webp 1000w, https://blog.bemi.io/content/images/size/w1600/2024/02/1-2.webp 1600w, https://blog.bemi.io/content/images/2024/02/1-2.webp 2000w" sizes="(min-width: 720px) 720px"></figure><h2 id="the-long-tail">The long tail</h2><p>Transparency comes with a greater surface area for feedback, making open sourcing a key approach for expanding our software&apos;s functionality. It invites a diverse range of perspectives, ensures compatibility with different systems, and uncovers fresh applications&#x2014;especially important for more generic infrastructure or database software.</p><p>Open source acts as a catalyst, guiding us towards a broader range of capabilities that meet the current diverse needs of developers. 
An example of this is the increasing popularity of PostgreSQL and MySQL over proprietary Oracle and MSSQL database incumbents.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/02/2-2.webp" class="kg-image" alt="From Black Box to Open Source: Embracing Transparency" loading="lazy" width="2000" height="861" srcset="https://blog.bemi.io/content/images/size/w600/2024/02/2-2.webp 600w, https://blog.bemi.io/content/images/size/w1000/2024/02/2-2.webp 1000w, https://blog.bemi.io/content/images/size/w1600/2024/02/2-2.webp 1600w, https://blog.bemi.io/content/images/2024/02/2-2.webp 2000w" sizes="(min-width: 720px) 720px"><figcaption><a href="https://db-engines.com/en/ranking_osvsc?ref=blog.bemi.io"><span style="white-space: pre-wrap;">https://db-engines.com/en/ranking_osvsc</span></a></figcaption></figure><h2 id="giving-back-to-the-community">Giving back to the community</h2><p>We&#x2019;re built on top of open source giants like <a href="https://github.com/debezium/debezium?ref=blog.bemi.io" rel="noreferrer">Debezium</a> and <a href="https://github.com/nats-io/nats-server?ref=blog.bemi.io" rel="noreferrer">NATS</a>. Open sourcing is our way of reciprocating the support we&apos;ve received and giving back to the developer community.</p><p>At Bemi, we&#x2019;re developer obsessed and we want to give the best possible developer experience we can. This means no vendor lock-in and ensuring our libraries are easily accessible. This is our contribution to nurturing collaboration within the developer community. 
Who knows, maybe one day there&apos;ll be tools built on top of what we&apos;ve built.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.bemi.io/content/images/2024/02/3-1.webp" class="kg-image" alt="From Black Box to Open Source: Embracing Transparency" loading="lazy" width="2000" height="2540" srcset="https://blog.bemi.io/content/images/size/w600/2024/02/3-1.webp 600w, https://blog.bemi.io/content/images/size/w1000/2024/02/3-1.webp 1000w, https://blog.bemi.io/content/images/size/w1600/2024/02/3-1.webp 1600w, https://blog.bemi.io/content/images/2024/02/3-1.webp 2000w" sizes="(min-width: 720px) 720px"><figcaption><a href="https://xkcd.com/2347/?ref=blog.bemi.io"><span style="white-space: pre-wrap;">https://xkcd.com/2347/</span></a></figcaption></figure><h2 id="exploring-paths-in-open-source">Exploring paths in open source</h2><p>Companies embrace open source for various other reasons. Some, like <a href="https://supabase.com/?ref=blog.bemi.io" rel="noreferrer">Supabase</a>, use it as a key differentiator, positioning themselves as the open-source Firebase alternative. Others, like <a href="https://www.comma.ai/?ref=blog.bemi.io" rel="noreferrer">CommaAI</a>, use it to ignite diverse applications and innovations through encouraged repository forking.</p><p>There&#x2019;s also clearly merit to other approaches to company building as well, evident in the success on each side, such as Gitlab vs. Github or <a href="https://www.vox.com/technology/2023/7/28/23809028/ai-artificial-intelligence-open-closed-meta-mark-zuckerberg-sam-altman-open-ai?ref=blog.bemi.io" rel="noreferrer">Meta AI vs. OpenAI</a>. Open sourcing isn&apos;t a one-size-fits-all strategy, but for us, it emboldens our company vision and goals.</p><h2 id="looking-forward">Looking forward</h2><p>We want to keep focusing on building the best products with our users, and not in isolation. 
We&#x2019;re looking forward to the start of our open source journey ?, check out and&#xA0;star&#xA0;<a href="https://github.com/BemiHQ/bemi?ref=blog.bemi.io" rel="noreferrer">our GitHub</a>&#xA0;to stay in the loop on updates!</p>]]></content:encoded></item><item><title><![CDATA[Reducing Event Sourcing Complexity to Boost Product Velocity]]></title><description><![CDATA[A pragmatic approach to getting the benefits of event sourcing without hindering developer velocity.]]></description><link>https://blog.bemi.io/reducing-event-sourcing-complexity-to-boost-product-velocity/</link><guid isPermaLink="false">65d97b67e08447000140ff0a</guid><category><![CDATA[Guide]]></category><dc:creator><![CDATA[Arjun Lall]]></dc:creator><pubDate>Tue, 30 Jan 2024 12:00:00 GMT</pubDate><media:content url="https://blog.bemi.io/content/images/2024/02/_.webp" medium="image"/><content:encoded><![CDATA[<img src="https://blog.bemi.io/content/images/2024/02/_.webp" alt="Reducing Event Sourcing Complexity to Boost Product Velocity"><p>I won&#x2019;t go into great detail about why <a href="https://chriskiehl.com/article/event-sourcing-is-hard?ref=blog.bemi.io" rel="noreferrer">event sourcing is hard</a>. Generally, it represents a significant paradigm shift from the way typical CRUD-like applications are built and introduces high technical complexity. 
Especially for startups, opting for this architecture comes with a big cost since it slows down how fast developers are able to ship.</p><figure class="kg-card kg-image-card"><img src="https://blog.bemi.io/content/images/2024/02/1-1.webp" class="kg-image" alt="Reducing Event Sourcing Complexity to Boost Product Velocity" loading="lazy" width="1270" height="668" srcset="https://blog.bemi.io/content/images/size/w600/2024/02/1-1.webp 600w, https://blog.bemi.io/content/images/size/w1000/2024/02/1-1.webp 1000w, https://blog.bemi.io/content/images/2024/02/1-1.webp 1270w" sizes="(min-width: 720px) 720px"></figure><h2 id="achieving-benefits-without-slowing-down">Achieving benefits without slowing down</h2><p>At the core of event sourcing is the event log - a record of immutable facts that document every change to an application&apos;s state. Why does this matter? Because sometimes, knowing just the current app state isn&apos;t enough; <a href="https://martinfowler.com/eaaDev/EventSourcing.html?ref=blog.bemi.io" rel="noreferrer">we want to know how we got there</a>.</p><p>A pragmatic approach to getting the advantages of event sourcing is by recording an event log after data changes have already taken place. So you build your application just like developers are used to, but with a bit of extra functionality added around write operations. This way, you get the best of both worlds &#x2013; an append-only log of state changes without sacrificing product velocity! This functionality can be added within the application or at the database level.</p><h3 id="application-level-tracking">Application-level tracking</h3><p>Writing some application code to track data changes is the simplest approach, but comes with some drawbacks. 
Common libraries like <a href="https://github.com/paper-trail-gem/paper_trail?ref=blog.bemi.io" rel="noreferrer">paper_trail</a> and <a href="https://github.com/jazzband/django-simple-history?ref=blog.bemi.io" rel="noreferrer">django-simple-history</a> use callbacks to make additional inserts during write operations. Apart from introducing runtime performance overhead, this approach compromises reliability, since updates made outside the app stack aren&apos;t captured.</p><h3 id="database-level-tracking">Database-level tracking</h3><p>Tracking data history at the database layer is the most reliable approach. In PostgreSQL, this can be done with <a href="https://www.pgaudit.org/?ref=blog.bemi.io" rel="noreferrer">PGAudit</a>, <a href="https://wiki.postgresql.org/wiki/Audit_trigger?ref=blog.bemi.io" rel="noreferrer">Audit Triggers</a>, or a pattern called <a href="https://en.wikipedia.org/wiki/Change_data_capture?ref=blog.bemi.io" rel="noreferrer">Change Data Capture</a> (CDC).</p><p><strong>PGAudit</strong>: Sends detailed audit logs to the standard PostgreSQL output logs, but doesn&apos;t record events to a table.</p><p><strong>Audit Triggers</strong>: Records changes to an audit log table, but runs synchronously within each transaction, impacting the primary DB instance&apos;s performance.</p><p><strong>CDC</strong>: Recommended for scalability; it asynchronously captures data changes by plugging into Postgres <a href="https://www.postgresql.org/docs/current/wal-intro.html?ref=blog.bemi.io" rel="noreferrer">Write-Ahead Logs</a> (WAL).</p><figure class="kg-card kg-image-card"><img src="https://blog.bemi.io/content/images/2024/02/2-1.webp" class="kg-image" alt="Reducing Event Sourcing Complexity to Boost Product Velocity" loading="lazy" width="2000" height="978" srcset="https://blog.bemi.io/content/images/size/w600/2024/02/2-1.webp 600w, https://blog.bemi.io/content/images/size/w1000/2024/02/2-1.webp 1000w, https://blog.bemi.io/content/images/size/w1600/2024/02/2-1.webp 1600w, 
https://blog.bemi.io/content/images/2024/02/2-1.webp 2400w" sizes="(min-width: 720px) 720px"></figure><p>Although CDC is the generally preferred option, it still has drawbacks &#x2014; it&apos;s the hardest to implement and lacks the application context (where, who, how) behind a change. You can check out <a href="https://docs.bemi.io/?ref=blog.bemi.io#architecture-overview" rel="noreferrer">these architecture docs</a> to see how we overcame these challenges at Bemi.</p><p>Subscribe to stay posted about the next blog, where I&apos;ll explain our architecture in greater detail!</p>]]></content:encoded></item><item><title><![CDATA[Why Your Startup Needs a Reliable Source of Truth for Customer Activity]]></title><description><![CDATA[Building data change observability can save valuable engineering hours, empower operations teams, and prevent customer churn.]]></description><link>https://blog.bemi.io/why-your-startup-needs-a-reliable-source-of-truth-for-customer-activity/</link><guid isPermaLink="false">65d975b17e9b2f000176dfe4</guid><category><![CDATA[Guide]]></category><dc:creator><![CDATA[Arjun Lall]]></dc:creator><pubDate>Fri, 19 Jan 2024 12:00:00 GMT</pubDate><media:content url="https://blog.bemi.io/content/images/2024/02/3.webp" medium="image"/><content:encoded><![CDATA[<img src="https://blog.bemi.io/content/images/2024/02/3.webp" alt="Why Your Startup Needs a Reliable Source of Truth for Customer Activity"><p>In the fast-paced world of operationally heavy startups, investing in a comprehensive source of truth for all customer activity can yield unparalleled returns. Imagine saving valuable engineering hours, empowering your operations team, and preventing customer churn &#x2013; it&apos;s all within reach.</p><h2 id="why-do-you-need-this">Why Do You Need This?</h2><p>Consider Joe from Customer Success, faced with a customer inquiry about a delayed shipment.
Without a historical understanding of customer activity, Joe struggles to determine whether it&apos;s a bug on the platform or an account configuration change that the customer made. Joe pings engineering to investigate. An engineer tries to piece together the story of what happened from various logs and database records, then relays that information to Joe, who relays it to the customer.</p><p>This isn&apos;t just a hypothetical situation. In a recent conversation, a CTO recounted an incident where a customer wrongly blamed their platform for a failed email campaign, costing the customer a multimillion-dollar deal. The CTO personally dove into the investigation and discovered that their platform wasn&apos;t the culprit and had sent everything correctly. Instances like these are widespread and highlight the tangible value of tooling in this area &#x2013; not just in hours saved but also in valuable customer relationships safeguarded.</p><h2 id="the-solution-data-change-observability">The Solution: Data Change Observability</h2><p>Building a system that can reliably store and query data changes is key. Here are some options:</p><h3 id="open-source-libraries">Open Source Libraries</h3><p>Using a library like <a href="https://github.com/paper-trail-gem/paper_trail?ref=blog.bemi.io" rel="noreferrer">paper_trail</a> in Rails or <a href="https://github.com/jazzband/django-simple-history?ref=blog.bemi.io" rel="noreferrer">django-simple-history</a> in Django is the easiest approach and should cover most basic use cases. This is ideal for simplicity but may lack reliability and performance. Since they&#x2019;re installed on the application layer, you&apos;d miss out on recording edge cases like updates made via direct SQL queries. There&#x2019;s also some runtime performance overhead, since the libraries make extra database inserts in callbacks.
This shouldn&apos;t be a problem except at larger scale, where the storage cost and performance impact of ever-growing history tables can become a concern.</p><h3 id="event-sourcing">Event Sourcing</h3><p>Building an <a href="https://martinfowler.com/eaaDev/EventSourcing.html?ref=blog.bemi.io" rel="noreferrer">event-sourced</a> system is a comprehensive solution but represents a significant paradigm shift from the way CRUD-like applications are typically built. If an application isn&#x2019;t built with this architecture from the start, it can mean a large rewrite. This is likely not practical for most businesses.</p><h3 id="financial-ledger">Financial Ledger</h3><p>For fintechs, there&#x2019;s also the option of a financial <a href="https://en.wikipedia.org/wiki/Ledger?ref=blog.bemi.io" rel="noreferrer">ledger</a>. This can be built in-house to track payments and account balances, or there are countless ledger-as-a-service offerings that can be used.</p><h3 id="direct-integration-with-database-logs">Direct Integration with Database Logs</h3><p>Some companies opt for direct integration with the database for reliability, e.g. tapping PostgreSQL&#x2019;s <a href="https://www.postgresql.org/docs/current/wal-intro.html?ref=blog.bemi.io" rel="noreferrer">Write-Ahead Logs</a> to capture everything via <a href="https://en.wikipedia.org/wiki/Change_data_capture?ref=blog.bemi.io" rel="noreferrer">CDC</a> at the database level. However, the records would lack application context like the &apos;where&apos; (API endpoint, worker, etc.), &apos;who&apos; (user, cron job, etc.), and &apos;how&apos; behind a change.</p><h3 id="hybrid-approachbemi">Hybrid Approach - Bemi</h3><p><a href="https://bemi.io/use-case/ops?ref=blog.bemi.io" rel="noreferrer">Bemi</a> takes a hybrid approach, integrating with both the database and application layers. While Bemi&apos;s architecture might seem complex, it ensures zero performance penalty, 100% reliability, and an enhanced understanding of each change.
It&apos;s designed to be extremely simple for the user; in fact, we go as far as to claim &#x201C;full data history enabled in under a minute&#x201D;.</p><figure class="kg-card kg-image-card"><img src="https://blog.bemi.io/content/images/2024/02/1.webp" class="kg-image" alt="Why Your Startup Needs a Reliable Source of Truth for Customer Activity" loading="lazy" width="2000" height="1291" srcset="https://blog.bemi.io/content/images/size/w600/2024/02/1.webp 600w, https://blog.bemi.io/content/images/size/w1000/2024/02/1.webp 1000w, https://blog.bemi.io/content/images/size/w1600/2024/02/1.webp 1600w, https://blog.bemi.io/content/images/size/w2400/2024/02/1.webp 2400w" sizes="(min-width: 720px) 720px"></figure><h3 id="user-experience">User Experience</h3><p>Building a user-friendly interface is another consideration. Most use cases would be covered by an internal dashboard showcasing each customer&#x2019;s activity logs, with some basic filtering functionality so Joe can drill down on the relevant entity he&#x2019;s trying to troubleshoot. Bemi goes a step further, leveraging AI to transform complex data changes into human-readable logs, making them accessible even to non-technical users like Joe.</p><figure class="kg-card kg-image-card"><img src="https://blog.bemi.io/content/images/2024/02/2.webp" class="kg-image" alt="Why Your Startup Needs a Reliable Source of Truth for Customer Activity" loading="lazy" width="2000" height="1385" srcset="https://blog.bemi.io/content/images/size/w600/2024/02/2.webp 600w, https://blog.bemi.io/content/images/size/w1000/2024/02/2.webp 1000w, https://blog.bemi.io/content/images/size/w1600/2024/02/2.webp 1600w, https://blog.bemi.io/content/images/2024/02/2.webp 2400w" sizes="(min-width: 720px) 720px"></figure><p>The degree to which this is a problem and the ideal solution vary among companies. Please share how you&#x2019;ve solved this in the past!
If you&#x2019;re dealing with this currently, feel free to <a href="https://calendly.com/arjun-lall/30min?ref=blog.bemi.io" rel="noreferrer"><strong>schedule a chat</strong></a>, and I can share some additional tips along the way.</p><p>By building a reliable source of truth for customer activity, you&apos;re not just saving hours troubleshooting &#x2013; you&apos;re future-proofing your operations and setting the stage for sustainable growth.</p>]]></content:encoded></item></channel></rss>