Frankenstein tables

The horror of bloated schemas and premature partitioning
Also known as monster tables, out-of-control tables, or God tables (the ones that try to hold everything because it’s easier and joins are for the weak). I prefer Frankenstein tables. These beasts start innocently enough: a few extra columns here, some quick fixes there. But before you know it, they’ve grown into unmaintainable nightmares that devour performance, sanity, and developer souls.

Introduction

The chainsaw approach to database surgery
I’ve seen it too many times: a database table grows large, queries slow down, and the solution is to hastily split it horizontally into table_part1 and table_part2. To make matters worse, teams sometimes shard these across multiple servers, turning a simple SELECT into a distributed quest for data. It’s like using a chainsaw for surgery: effective in extreme cases, but usually resulting in a bloody, chaotic mess.
Horizontal partitioning (splitting by rows) should be a last resort, not the first panic button. Instead, start by questioning the table’s design. Wide tables with hundreds of columns are red flags: they bloat row sizes, wreck cache efficiency, and make even basic queries drag far more data off disk than they need.
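If you want to hunt for candidates in your own database, here is a quick sketch against the standard information_schema (available in PostgreSQL and MySQL, among others). The limit is arbitrary; the query simply lists the widest tables first.

```sql
-- List the widest tables first: candidates for a Frankenstein check.
SELECT table_schema, table_name, count(*) AS column_count
FROM information_schema.columns
GROUP BY table_schema, table_name
ORDER BY column_count DESC
LIMIT 20;
```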

My personal horror story

The bank’s 400-column beast
Let me share what I experienced firsthand. At the bank where I worked, we rented a banking application from a third-party vendor. The good news? All maintenance was outsourced; no worries for us. The bad news? We often waited months for our requests to be implemented, and we had to blindly trust the vendor’s programmers.

Trust can be a serious problem. A very good programmer can still create a monster database, and they did. One table had ballooned to nearly 400 columns. That’s massive, but it could still work if designed thoughtfully. Much of this growth stemmed from government regulations requiring endless new fields for compliance. Bolting new columns onto an existing table makes sense on paper, but a bit more upfront planning could save mountains of trouble later.
Then came the disaster: the vendor decided to overhaul the application’s structure, which meant splitting this giant table into two smaller ones. It sounded like a great idea at first: smaller tables, right? But tables don’t get that big by accident; the data is interconnected and needed together. Splitting meant constant JOINs to reunite the pieces, complicating queries and tanking performance.

Here’s the kicker: the split wasn’t even for an application improvement. I never figured out the real reason, but I suspect it was just to end up with a smaller table for some arbitrary metric. It was classic backwards problem-solving: “my table is big, let’s split it in two!” No deeper analysis, no alternatives considered.
And to link the new tables? They introduced an extra primary key. The original table already had one, but now there was a new linking key for quick jumps between table1 and table2. Logically, it was still one big table, just disguised as two. And what happens next? The tables start growing again. “Hey, we have two small tables now, let’s store some extra stuff in them!”

FIX IT

If anyone from that bank is reading this, fix it. Stop working from home, get back to the office, grab a whiteboard, and prepare a real solution. March over to the vendor’s programmers and tell them what to do, not ask. It’s time for the IT manager to step up and slay this Frankenstein.

The common tragedy

Panic, split, chaos
People spot a big table → panic sets in → they split it horizontally → chaos ensues. Queries that were simple now require UNIONs or cross-shard operations. Backups double in complexity. Deployments become minefields. And if you’re sharding across servers? Good luck with data consistency and failover.
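To make that concrete, here is a hypothetical before-and-after. The table_part1 and table_part2 names echo the naive split from the introduction; the columns are invented for illustration.

```sql
-- Before the split: one straightforward query.
SELECT customer_id, name, balance
FROM customers
WHERE balance > 10000;

-- After a naive horizontal split: every query must stitch the
-- halves back together, and each new "part" makes it worse.
SELECT customer_id, name, balance FROM table_part1 WHERE balance > 10000
UNION ALL
SELECT customer_id, name, balance FROM table_part2 WHERE balance > 10000;
```

Every index, constraint, and maintenance job now exists twice, and application code has to know which half holds which rows.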

Better first steps: go vertical, not horizontal
Before splitting rows (horizontal partitioning), ask: is this table even properly designed? Start with vertical partitioning: splitting by columns into new tables while keeping the same rows. This reduces bloat without disrupting your data’s natural relationships.
Here’s a simple checklist to evaluate before resorting to splits:

Normalized to at least 3NF? Eliminate obvious redundancy without going overboard.
Extracted rarely-used or large columns? Big text fields, blobs, or audit logs don’t belong in your core table if they’re seldom queried.
Proper indexes and query analysis? Use EXPLAIN or similar tools to spot inefficiencies; covering indexes can work wonders (see the sketch after this list).
Archiving strategy for old data? Move historical rows to an archive table and automate it with triggers or scheduled jobs.
Monitoring showing actual hardware limits? Track growth rates and I/O bottlenecks before assuming size is the issue.
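For the index and archiving items, here is a minimal PostgreSQL-flavored sketch. The table and column names are hypothetical; the point is the pattern, not the specifics.

```sql
-- Spot inefficiencies: does this query scan the whole table?
EXPLAIN ANALYZE
SELECT customer_id, status
FROM customers
WHERE status = 'ACTIVE';

-- A covering index can answer the query from the index alone
-- (PostgreSQL 11+; INCLUDE adds non-key payload columns).
CREATE INDEX idx_customers_status
    ON customers (status)
    INCLUDE (customer_id);

-- Archiving: move rows older than seven years to an archive table
-- in one statement, keeping the core table lean. Schedule it with
-- cron or your job scheduler of choice.
WITH moved AS (
    DELETE FROM transactions
    WHERE booked_on < now() - interval '7 years'
    RETURNING *
)
INSERT INTO transactions_archive
SELECT * FROM moved;
```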

My go-to solution: look for groups of fields that can be combined into a new table. In our banking monster, we had around 50 fields for documentation alone: types, numbers, issue dates, expiries, scans, etc. Pull those out into a dedicated customer_documents table. Boom: instant size reduction, and queries on core customer data speed up.
Another easy win: tax-related fields. Banks deal with tons of tax data: rates, exemptions, filings. Bundle those 50-ish fields into a customer_taxes table, linked via foreign keys, of course.
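Here is roughly what that extraction could look like. This is a sketch, assuming one document record per customer; all names and columns are invented, and a real migration deserves more care (and a backup first).

```sql
-- New side table for the documentation fields,
-- linked back to the core table via a foreign key.
CREATE TABLE customer_documents (
    document_id  bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id  bigint NOT NULL REFERENCES customers (customer_id),
    doc_type     text   NOT NULL,
    doc_number   text,
    issued_on    date,
    expires_on   date,
    scan         bytea
);

-- Migrate the existing data out of the monster table...
INSERT INTO customer_documents
    (customer_id, doc_type, doc_number, issued_on, expires_on, scan)
SELECT customer_id, doc_type, doc_number, issued_on, expires_on, scan
FROM customers
WHERE doc_type IS NOT NULL;

-- ...then drop the extracted columns from the core table.
ALTER TABLE customers
    DROP COLUMN doc_type,
    DROP COLUMN doc_number,
    DROP COLUMN issued_on,
    DROP COLUMN expires_on,
    DROP COLUMN scan;
```

The same recipe applies to the tax fields: create customer_taxes, copy the data across, drop the columns. As a bonus, a side table like this handles customers with multiple documents naturally, something a wide table never could.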

Last resort

Horizontal partitioning (with caution)
If you’ve exhausted vertical options, optimized everything, and still face billions of rows with hardware bottlenecks, then consider horizontal partitioning (splitting by rows, e.g. by date, region, or ID range). Many modern DBMSes (such as PostgreSQL and MySQL) support declarative partitioning, which is far less painful than manual splits.
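In PostgreSQL (10+), declarative partitioning looks something like this; MySQL has comparable syntax. Names are illustrative.

```sql
-- Parent table: rows are routed to partitions by booking date.
CREATE TABLE transactions (
    transaction_id bigint        NOT NULL,
    customer_id    bigint        NOT NULL,
    booked_on      date          NOT NULL,
    amount         numeric(12,2)
) PARTITION BY RANGE (booked_on);

-- One partition per year; the planner prunes partitions
-- automatically when queries filter on booked_on.
CREATE TABLE transactions_2024 PARTITION OF transactions
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE transactions_2025 PARTITION OF transactions
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
```

The application still queries one logical table; the engine handles the routing, which is exactly what the manual table_part1/table_part2 hack never gives you.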
But beware: even after splitting, human nature kicks in. “We have two small tables now, let’s add some extra stuff!” Without discipline, today’s solution becomes tomorrow’s Frankenstein.

Conclusion

Slay the monster early
Frankenstein tables are born from laziness and grow through neglect. Don’t let panic drive you to horizontal splits. Start vertical, refactor smartly, and monitor relentlessly. Your future self (and queries) will thank you.
If you’ve got your own horror stories, share them in the comments; misery loves company. And remember: a well-designed schema is worth a thousand hacks.

This brings us to the end of my post on Frankenstein tables. Thank you for taking the time to read it.

I hope you found it enjoyable and insightful.
Stay tuned for more content coming soon.
If you like what you read, please consider sharing it with others who might find it helpful.

Contact me

If you have any questions or want to contact me, please drop me an email at info@safecomputer.org.

Stay updated with my monthly newsletter

Subscribe to Safe Computer’s monthly newsletter for tips on job applications, cybersecurity, and more! Get summaries of my latest posts, like this Database Crimes one, straight to your inbox. Join now at safecomputer.org!


Copyright

© 2025 Henny Staas/safecomputer.org. Unauthorized use and/or duplication of this material without express and written permission from this site’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Henny Staas/safecomputer.org with appropriate and specific direction to the original content.
