Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

mysql - table partitioning vs "manual" table partitioning

Tags:

sql

php

mysql

I have the following choice:

I have a huge table (9999999999999 rows), let's call it tableHuge, and I would like to split it into multiple tables (to optimize queries). This table contains dates (days of the month), and most queries are made using a specified month as search key in select. This leads me to the following choices:

Choice one: Split the table into multiple tables, using a month as his tail (like lessHugeTable_01, lessHugeTable_02, etc.). Then I can take care in my app to access table that I need. The main downside is loosing the ability to join, in cases that includes more than one month (or join with an union... well.. complications).

Choice two: Use table partitioning.

Since I never used partitioning before (so I don't have knowledge to compare), I would like some advice on how to do it, pros and cons if possible (except obvious things like "if your manual partition table gets broken you loose only that data while in table part you loose whole data").

Thank you for your time.

like image 684
zozo Avatar asked Sep 17 '25 15:09

zozo


1 Answers

The answer here is really "depends".

More specifically it depends on the nature of your data, what accesses your data and how that data is accessed.

From the sounds of it you might be best off with a table partitioned by year and month. I am making wild assumptions here that you will need to access older data less frequently/never and hence will be able to archive it off to keep data volumes down in your main table (like I said "depends"!);

If your table is, and always will be, accessed by one application alone into which you can build logic to handle your 'tail' naming conventions then you might want to go down the multiple tables route.

Here is how I see pros and cons stacking up:

Multiple Tables Pros

  1. Smaller individual table if just selecting data for a single month
  2. Errr. I can only think of one actually

Multiple Tables Cons

  1. Difficulty in querying/updating multi-month datasets
  2. What happens if you get data from January in the February table? "But it will never happen!". Really? Really?!
  3. If multiple applications need to access these tables then they will all have to have your 'tail' naming convention logic in place i.e. lessHugeTable_02 has data from February in it.

and now partitioning:

Partitioning Tables Pros

  1. You're letting MySQL take care of your data sharding for you. So no "this month = this table" logic needed in your application
  2. No risk of January data getting into the February table
  3. Joining becomes easier since you have a single logical (if not physical) table
  4. If you are using MySQL 5.5 or newer then you can truncate partitions. Really handy for any housekeeping you might want to do

Partitioning Tables Cons

  1. Potentially you have a much much larger dataset to query. If you run a query that spans multiple partitions then it will probably take a while. Choose your partition key wisely!
  2. Probably more but I am running out of time and wild assumptions!

PS There is a good answer on some of the points here

like image 103
Tom Mac Avatar answered Sep 20 '25 05:09

Tom Mac