Practices of using MySQL and DBPool

给公司写的文档草稿,稍后会写成中文的。

Summary

In large cluster environment, it is always challenge of manage hundreds MySQL databases.

  • Planed down time of database server maintenance
  • Scalability
  • Change of data structures

Ideally, these all handled by proper development process and you have enough software engineers to support.
But in real world, it is more urgent of solving problems.

Here is the 9 best practices to operate thousands of app servers and databases and we don’t have any “user impact” because of database maintenance.

Database design:

1 Global design

We design our database schema like a one-node-cluster, we can run all the service in one box or 1000 boxes.
It is transparent to software engineers.

2 Simple only

Use only basic MySQL features: table, primary key, index, replication.

Application design:

3 Use app server’s CPU.

App servers are scalable, but MySQL is a bottle neck, Use as much as application CPU, which means:
Do not query without index
Do not sort using database
Do not use any query causes temporary tables

4 Use a abstraction layer of tables

We use DBPool for a long time

Ops workflow:

5 Vertical partitioning

When you want to move a few tables to a different master.
a. Setup new master (B) as the slave of the old master (A)
b. Change the DBPool configuration, point master to B.
c. In the middle of the time, (B) will also receive some update requests from old client.
d. Make sure it is no any queries to tables on B
e. Stop the replication and optionally drop old tables on A.

6 Horizontal shading

When you want to distribute data of one table into more physic servers.
a. Estimate how many is needed, find a proper shading key. You should visit only one instance after shading.
b. A good shading number may be 10 or 100. It is human friendly when debug.
c. You don’t need to have 100 physic servers to deploy all tables, DBPool have the ability of route.
d. Use the same method to move tables to new master as showed in (5)

7 Change of data structures

a. We do add column only, no drop column.
b. Application level compatible is required. Make sure new code is working with both old and new data. (if impossible, see (8))
c. Make the changes
d. Update the application to use new column for new feature.

8 Data migration

This situation always involves a big change to the logic, you need to redesign the structure
a. Create a new master (B) of tables using (5)
b. Create new data structures on (B).
c. Create a trigger to update new structure when old data changed on (B).
d. Migrate your old data into the new structure, pay attention to (c) have already moved some recent data.
e. Create a new abstract instance in DBPool, for new structures.
f. Update the application use the new structure for reading.
g. In the same time, old client and new client have the same data for we have (c).
h. Update the application use the new structure for writing.
i. Stop the replication (a), trigger (c) and drop the old tables

9 Planned maintenance

a. DBPool is enough to move MySQL slave servers.
b. Use (5) to make a new master or promote one slave to master.

Free Website Hosting (PaaS)

自从AppFog的免费主机取消了,剩下免费账号也越来越慢。对比了各种云主机,OpenShift是目前最好的选择了。
Update 2015-Mar-06: Appfog的环境里有一个旧版的DBCP包,非常头疼。

OpenShift的免费账号功能亮点很多:

  • 支持3个512M节点
    • 可以建立2个App,1个MySQL;
    • 两个App可以AutoScaleUp
  • 可以ssh登录;
  • 负载均衡采用HAProxy,不计费;
  • 支持自定义域名;
  • 默认域名rhcloud.com支持SSL;
  • 填个信用卡注册了Bronze账号,控制好用量,也可以免费,但是可以支持自定义域名的SSL了。

试用了一下,性能比AppFog高不少,对个人网站来说,是奢侈配置。做开发的功能测试用,也很方便。