Teaching the Elephant to Swim: In-Place Analytics with Kafka and pg_duckdb
Thursday, June 25 at 14:20–15:10
-
I am a co-founder of Baremon with more than 25 years of experience in database services, data governance and cloud technologies. My background is rooted in hands-on database administration and ETL, and gradually evolved into leading and shaping data platforms as organisational complexity and scale increased.
Today, I combine deep technical expertise with a pragmatic, forward-looking approach. I focus on building robust, secure and reliable data ecosystems, where performance, stability and informed decision-making are business-critical.
At Baremon, I contribute to both the technical and strategic development of data services, working with platforms such as Oracle, AWS, Collibra and Kubernetes across different industries.
I place strong emphasis on mentoring and knowledge sharing. This includes leading training sessions, proofs of concept and customer enablement initiatives that help teams operate and evolve their data platforms with confidence.
-
I am a Senior Consultant at Baremon, specialising in database services, data governance, and cloud transformation. My background is grounded in hands-on experience with database technologies such as Oracle and PostgreSQL, and is complemented by work with modern platforms including Dataiku and Yugabyte.

I combine technical expertise with a pragmatic, business-focused approach. I work across Azure, AWS, and GCP environments, helping organisations modernise data platforms, optimise performance, and establish effective governance frameworks. My Collibra certifications enable structured, compliant, and scalable governance implementations.

At Baremon, I help build secure, resilient, and future-ready data ecosystems where reliability and informed decision-making are essential. Fluent in multiple languages, I collaborate effectively in international environments to deliver measurable and sustainable outcomes.
Modern real-time fraud detection stacks have become an exercise in infrastructure sprawl. A typical architecture involves PostgreSQL for transactions, Kafka for events, a vector database for embeddings, a data warehouse for analytics, and complex ETL pipelines or event-sourcing frameworks to keep everything in sync. This 'six-box' architecture introduces six points of failure and a significant consistency challenge.
In this session, we demonstrate a 'PostgreSQL-First' architecture that reduces this complexity by 70% without sacrificing performance or correctness. We argue that in high-stakes environments, where a single error can cost millions, PostgreSQL must remain the sovereign Source of Truth, while Kafka should serve strictly as a distributed commit log – not as the authority on state.
We will explore the following, with an illustrative sketch of each point after the list:
- The Write Path: Why we chose the Transactional Outbox pattern over standard CDC to maintain semantic control and domain-driven event boundaries.
- The Recovery Contract: Using the Consumer Inbox pattern and ON CONFLICT DO NOTHING in PostgreSQL to ensure idempotent processing and safe history replays.
- In-Place Intelligence: Replacing dedicated vector databases with pgvector for behavioural fingerprinting and using pg_duckdb on read replicas for zero-ETL, sub-second analytics.
- PostgreSQL 18 Internals: How native UUIDv7 eliminates B-tree fragmentation in high-throughput streams, and how the new AIO subsystem removes structural performance ceilings on modern NVMe storage.
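
To make the Write Path concrete, here is a minimal sketch of the Transactional Outbox pattern. All table and column names are illustrative, not the talk's actual schema: the domain row and its event are committed in one transaction, and a separate relay process publishes outbox rows to Kafka.

```sql
-- Transactional Outbox (illustrative schema): the business row and its
-- event commit atomically, so Kafka can never see an event whose state
-- change was rolled back.
CREATE TABLE card_transactions (
    id         uuid PRIMARY KEY,
    card_id    uuid        NOT NULL,
    amount     numeric     NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE outbox (
    event_id   uuid PRIMARY KEY,
    aggregate  text        NOT NULL,  -- preserves domain-driven event boundaries
    event_type text        NOT NULL,
    payload    jsonb       NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

BEGIN;
INSERT INTO card_transactions (id, card_id, amount)
VALUES (gen_random_uuid(), gen_random_uuid(), 42.50);

INSERT INTO outbox (event_id, aggregate, event_type, payload)
VALUES (gen_random_uuid(), 'card_transaction', 'TransactionRecorded',
        jsonb_build_object('amount', 42.50));
COMMIT;
-- A relay (poller or logical-decoding client) ships outbox rows to Kafka
-- and marks them published once the broker acknowledges.
```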
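The Recovery Contract rests on idempotent consumption. A minimal sketch, assuming each Kafka message carries a unique event id (tables are again illustrative): the inbox row and the side effect commit together, so replaying history inserts zero rows and applies nothing twice.

```sql
-- Consumer Inbox (illustrative): event_id is the primary key, so a
-- replayed message cannot be processed a second time.
CREATE TABLE inbox (
    event_id     uuid PRIMARY KEY,
    processed_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE fraud_scores (
    event_id uuid PRIMARY KEY,
    score    numeric NOT NULL
);

-- Run once per consumed message, inside a single transaction: the score
-- is written only if this is the first time event_id has been seen.
WITH claimed AS (
    INSERT INTO inbox (event_id)
    VALUES ('11111111-1111-1111-1111-111111111111')
    ON CONFLICT (event_id) DO NOTHING
    RETURNING event_id
)
INSERT INTO fraud_scores (event_id, score)
SELECT event_id, 0.97
FROM claimed;  -- empty on replay, so the INSERT is a no-op
```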
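For In-Place Intelligence, a sketch of both halves. The fingerprint dimension (3) is deliberately tiny for readability, and duckdb.force_execution is taken from the pg_duckdb documentation; everything else is an illustrative assumption:

```sql
-- Behavioural fingerprints live next to the transactional data: pgvector
-- replaces the dedicated vector database.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE card_fingerprints (
    card_id     uuid PRIMARY KEY,
    fingerprint vector(3) NOT NULL
);
CREATE INDEX ON card_fingerprints USING hnsw (fingerprint vector_l2_ops);

-- "Which cards behave most like this one?" Nearest neighbours by L2 distance.
SELECT card_id
FROM card_fingerprints
ORDER BY fingerprint <-> '[0.12, 0.40, 0.85]'::vector
LIMIT 10;

-- On a read replica with pg_duckdb installed, route the analytical scan
-- through DuckDB's vectorised engine instead of the row-based executor.
CREATE EXTENSION IF NOT EXISTS pg_duckdb;
SET duckdb.force_execution = true;

SELECT date_trunc('hour', created_at) AS bucket,
       count(*)                       AS tx_count,
       avg(amount)                    AS avg_amount
FROM card_transactions              -- table from the outbox sketch above
GROUP BY bucket
ORDER BY bucket;
```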
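Finally, the two PostgreSQL 18 features from the last bullet, shown as configuration rather than a benchmark (the fraud_events table is hypothetical):

```sql
-- uuidv7() (new in PostgreSQL 18) generates time-ordered keys, so inserts
-- append to the rightmost B-tree pages instead of splitting random ones.
CREATE TABLE fraud_events (
    id      uuid PRIMARY KEY DEFAULT uuidv7(),
    payload jsonb NOT NULL
);

-- Opt in to the PostgreSQL 18 asynchronous I/O subsystem. Takes effect
-- after a restart; 'io_uring' requires Linux and a build with liburing.
ALTER SYSTEM SET io_method = 'io_uring';
```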
The talk concludes with a live demonstration using the Kaggle Credit Card Fraud dataset, where we will deliberately 'break' the system and show how the database-anchored architecture ensures deterministic recovery even under extreme pressure.
Takeaways:
Attendees will leave with a blueprint for building lean, explainable, and operationally 'boring' systems that leverage the latest PostgreSQL 18 features to replace fragmented best-of-breed stacks.