A Quick Guide to Setting Up a MongoDB Sharded Cluster

Environment

Provision three ECS instances on Alibaba Cloud:

No.   Private IP       OS
A     172.16.190.78    CentOS 7.6
B     172.16.242.36    CentOS 7.6
C     172.16.190.77    CentOS 7.6

Deployment Plan

A MongoDB sharded cluster consists of three types of nodes:

  • mongos nodes: handle client connections and act as query routers, dispatching each request to the right data node and hiding the sharding details from the client (a connection example follows this list);
  • configsvr nodes: the config servers, which store the cluster's metadata, such as each shard's data ranges and chunk lists;
  • shard nodes: the data nodes; each shard stores a portion of the overall data set, the primaries of all shards together hold the full data set, and each primary is backed by secondaries holding the same data.
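
Because mongos hides sharding from the application, clients never connect to the configsvr or shard nodes directly; they talk to any mongos exactly as they would to a standalone MongoDB instance. A minimal connection example, assuming the cluster planned below is already running and the mongo shell is installed:

$ mongo --host 172.16.190.78 --port 27017
# A driver would normally list all three mongos instances, e.g.
# mongodb://172.16.190.78:27017,172.16.242.36:27017,172.16.190.77:27017/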

The three servers form a MongoDB cluster with three mongos nodes, three config server nodes, and the data split into three shards, each shard being a replica set with one primary and two secondaries. The table below shows the detailed plan, including the service ports:

172.16.190.78            172.16.242.36            172.16.190.77
mongos (27017)           mongos (27017)           mongos (27017)
configsvr (28017)        configsvr (28017)        configsvr (28017)
shard1 primary (29017)   shard1 secondary (29017) shard1 secondary (29017)
shard2 primary (29018)   shard2 secondary (29018) shard2 secondary (29018)
shard3 primary (29019)   shard3 secondary (29019) shard3 secondary (29019)

Deployment

Installation

Run the following commands on all three servers:

$ cd ~/software
# Download the MongoDB tarball
$ wget https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-rhel70-4.0.4.tgz
$ tar -xf mongodb-linux-x86_64-rhel70-4.0.4.tgz -C ./
$ mv mongodb-linux-x86_64-rhel70-4.0.4/ ../mongodb

$ cd ../mongodb
# Create the required directories
$ mkdir -pv {config,log,data/configsvr,data/shard1,data/shard2,data/shard3}

# Configure environment variables
$ vim /etc/profile
# Append the following lines to the end of the file and save
export MONGODB_HOME=/home/dolphin/mongodb
export PATH=${MONGODB_HOME}/bin:$PATH

$ source /etc/profile
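
Before moving on, it is worth checking that the environment variables took effect; both binaries should be found on the PATH and report version 4.0.4:

$ mongod --version
$ mongos --version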

Configuring the configsvr nodes

Create the configsvr.conf file on all three servers and start the configsvr node on each of them:

$ vim ~/mongodb/config/configsvr.conf
# Contents:
systemLog:
  destination: file
  logAppend: true
  path: "/home/dolphin/mongodb/log/mongodb.log"
storage:
  dbPath: "/home/dolphin/mongodb/data/configsvr"
  journal:
    enabled: true
processManagement:
  fork: true
net:
  port: 28017
  bindIp: 0.0.0.0
replication:
  replSetName: mustone
sharding:
  clusterRole: configsvr

# Start the node
$ numactl --interleave=all mongod -f /home/dolphin/mongodb/config/configsvr.conf
# The following output indicates success
about to fork child process, waiting until server is ready for connections.
forked process: 9689
child process started successfully, parent exiting

Log in on any one of the servers and initialize the replica set:

$ mongo --port 28017
> config = { _id: "mustone",
    configsvr: true,
    members: [
      { _id: 0, host: "172.16.190.78:28017" },
      { _id: 1, host: "172.16.242.36:28017" },
      { _id: 2, host: "172.16.190.77:28017" }
    ]
  }
> rs.initiate(config)
{ "ok" : 1 }

After the initialization succeeds, you can check the status with rs.status():

mustone:PRIMARY> rs.status()
{
"set" : "mustone",
"date" : ISODate("2020-06-29T11:47:14.166Z"),
"myState" : 1,
"term" : NumberLong(1),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"configsvr" : true,
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1593431231, 2),
"t" : NumberLong(1)
},
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1593431231, 2),
"t" : NumberLong(1)
},
"appliedOpTime" : {
"ts" : Timestamp(1593431231, 2),
"t" : NumberLong(1)
},
"durableOpTime" : {
"ts" : Timestamp(1593431231, 2),
"t" : NumberLong(1)
}
},
"lastStableCheckpointTimestamp" : Timestamp(1593431222, 3),
"members" : [
{
"_id" : 0,
"name" : "172.16.190.78:28017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 602612,
"optime" : {
"ts" : Timestamp(1593431231, 2),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1593431231, 2),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2020-06-29T11:47:11Z"),
"optimeDurableDate" : ISODate("2020-06-29T11:47:11Z"),
"lastHeartbeat" : ISODate("2020-06-29T11:47:12.164Z"),
"lastHeartbeatRecv" : ISODate("2020-06-29T11:47:14.037Z"),
"pingMs" : NumberLong(2),
"lastHeartbeatMessage" : "",
"syncingTo" : "172.16.242.36:28017",
"syncSourceHost" : "172.16.242.36:28017",
"syncSourceId" : 1,
"infoMessage" : "",
"configVersion" : 1
},
{
"_id" : 1,
"name" : "172.16.242.36:28017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 602723,
"optime" : {
"ts" : Timestamp(1593431231, 2),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2020-06-29T11:47:11Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"electionTime" : Timestamp(1592828632, 1),
"electionDate" : ISODate("2020-06-22T12:23:52Z"),
"configVersion" : 1,
"self" : true,
"lastHeartbeatMessage" : ""
},
{
"_id" : 2,
"name" : "172.16.190.77:28017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 602612,
"optime" : {
"ts" : Timestamp(1593431231, 2),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1593431231, 2),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2020-06-29T11:47:11Z"),
"optimeDurableDate" : ISODate("2020-06-29T11:47:11Z"),
"lastHeartbeat" : ISODate("2020-06-29T11:47:13.064Z"),
"lastHeartbeatRecv" : ISODate("2020-06-29T11:47:13.836Z"),
"pingMs" : NumberLong(2),
"lastHeartbeatMessage" : "",
"syncingTo" : "172.16.242.36:28017",
"syncSourceHost" : "172.16.242.36:28017",
"syncSourceId" : 1,
"infoMessage" : "",
"configVersion" : 1
}
],
"ok" : 1,
"operationTime" : Timestamp(1593431231, 2),
"$gleStats" : {
"lastOpTime" : Timestamp(0, 0),
"electionId" : ObjectId("7fffffff0000000000000001")
},
"lastCommittedOpTime" : Timestamp(1593431231, 2),
"$clusterTime" : {
"clusterTime" : Timestamp(1593431231, 2),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}

As shown above, 172.16.242.36:28017 is the primary and the other two members are secondaries.
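
If you only want a quick view of each member's role instead of the full rs.status() document, a one-liner such as the following works (a sketch; it assumes the mongo shell is on the PATH):

$ mongo --port 28017 --quiet --eval 'rs.status().members.forEach(function(m) { print(m.name + "  " + m.stateStr); })'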

Configuring the shard nodes

A shard node is set up essentially the same way as a configsvr node: both are replica sets, and only a few settings differ, most importantly the replica set name and the cluster role.

Perform the following on all three servers:

$ vim ~/mongodb/config/shard1.conf
# Contents:
systemLog:
  destination: file
  logAppend: true
  path: /home/dolphin/mongodb/log/shard1.log
storage:
  dbPath: /home/dolphin/mongodb/data/shard1
  journal:
    enabled: true
processManagement:
  fork: true
net:
  port: 29017
  bindIp: 0.0.0.0
replication:
  replSetName: shard1
sharding:
  clusterRole: shardsvr

# Start the node
$ numactl --interleave=all mongod -f /home/dolphin/mongodb/config/shard1.conf
# The following output indicates success
about to fork child process, waiting until server is ready for connections.
forked process: 9905
child process started successfully, parent exiting

Log in on any one of the servers and initialize the replica set:

$ mongo --port 29017
> config = { _id: "shard1",
    members: [
      { _id: 0, host: "172.16.190.78:29017" },
      { _id: 1, host: "172.16.242.36:29017" },
      { _id: 2, host: "172.16.190.77:29017" }
    ]
  }
> rs.initiate(config)
{ "ok" : 1 }

You can check the replica set status with rs.status().

Configure shard2 and shard3 in the same way (sketched below) and initialize their replica sets.
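
shard2.conf and shard3.conf differ from shard1.conf only in the log path, the data path, the port, and the replica set name. A minimal sketch that derives them from the shard1.conf shown above with sed:

$ cd ~/mongodb/config
$ sed -e 's/shard1/shard2/g' -e 's/29017/29018/g' shard1.conf > shard2.conf
$ sed -e 's/shard1/shard3/g' -e 's/29017/29019/g' shard1.conf > shard3.conf

# Start them the same way on all three servers
$ numactl --interleave=all mongod -f /home/dolphin/mongodb/config/shard2.conf
$ numactl --interleave=all mongod -f /home/dolphin/mongodb/config/shard3.conf

Then log in once on port 29018 and once on port 29019 and run rs.initiate() with _id set to shard2 and shard3 respectively, exactly as was done for shard1.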

Configuring the mongos nodes

Create the mongos configuration file on all three servers:

$ vim ~/mongodb/config/mongos.conf
# Contents:
systemLog:
  destination: file
  logAppend: true
  path: /home/dolphin/mongodb/log/mongos.log
processManagement:
  fork: true
net:
  port: 27017
  bindIp: 0.0.0.0
sharding:
  configDB: mustone/172.16.190.78:28017,172.16.242.36:28017,172.16.190.77:28017

Start the mongos node:

$ numactl --interleave=all  mongos -f /home/dolphin/mongodb/config/mongos.conf

Enabling sharding

So far we have only started four independent replica sets (one config server set and three shards). How do we tie them together into MongoDB's sharding mechanism? We still need to register the shards with the cluster.

Log in to a mongos node on any of the servers:

$ mongo --port 27017
mongos> use admin
# Link the mongos router with the shard replica sets
mongos> sh.addShard("shard1/172.16.190.78:29017,172.16.242.36:29017,172.16.190.77:29017")
mongos> sh.addShard("shard2/172.16.190.78:29018,172.16.242.36:29018,172.16.190.77:29018")
mongos> sh.addShard("shard3/172.16.190.78:29019,172.16.242.36:29019,172.16.190.77:29019")

Check the status with sh.status():

mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5ef0a2dad259276fa1ffdd0d")
}
shards:
{ "_id" : "shard1", "host" : "shard1/172.16.190.77:29017,172.16.190.78:29017,172.16.242.36:29017", "state" : 1 }
{ "_id" : "shard2", "host" : "shard2/172.16.190.77:29018,172.16.190.78:29018,172.16.242.36:29018", "state" : 1 }
{ "_id" : "shard3", "host" : "shard3/172.16.190.77:29019,172.16.190.78:29019,172.16.242.36:29019", "state" : 1 }
active mongoses:
"4.0.4" : 3
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
config.system.sessions
shard key: { "_id" : 1 }
unique: false
balancing: true
chunks:
shard1 1
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(1, 0)

Verification

To recap the steps above (a consolidated per-node startup sketch follows the list):

  1. Download and unpack the release tarball.
  2. Create the configsvr configuration file, start the configsvr nodes, and initialize the configsvr replica set.
  3. Create the shard1 configuration file, start the shard1 nodes, and initialize the shard1 replica set.
  4. Create the shard2 configuration file, start the shard2 nodes, and initialize the shard2 replica set.
  5. Create the shard3 configuration file, start the shard3 nodes, and initialize the shard3 replica set.
  6. Create the mongos configuration file and start the mongos nodes.
  7. Enable sharding by adding the shards through mongos.
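
For reference, the per-node startup portion of these steps can be collapsed into one small script. This is only a sketch (the script name is made up here) and assumes the four mongod config files and mongos.conf from this post are already in place:

#!/bin/bash
# start-mongo-node.sh -- start all MongoDB processes on one server
CONF_DIR=/home/dolphin/mongodb/config

# configsvr and the three shards are ordinary mongod replica set members
for conf in configsvr shard1 shard2 shard3; do
    numactl --interleave=all mongod -f ${CONF_DIR}/${conf}.conf
done

# mongos is the router; start it last so the config servers are already up
numactl --interleave=all mongos -f ${CONF_DIR}/mongos.conf

Replica set initialization and sh.addShard() are one-time operations and still only need to be run from a single node.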

Once everything is done, check the processes on each server:

$ ps aux | grep mongo
root 9689 0.4 15.8 1771976 285276 ? Sl Jun22 47:31 mongod -f /home/dolphin/mongodb/config/configsvr.conf
root 9905 0.3 5.6 1562944 102020 ? Sl Jun22 35:52 mongod -f /home/dolphin/mongodb/config/shard1.conf
root 9937 0.3 4.8 1564096 88136 ? Sl Jun22 36:14 mongod -f /home/dolphin/mongodb/config/shard2.conf
root 9968 0.3 5.3 1566636 95600 ? Sl Jun22 36:04 mongod -f /home/dolphin/mongodb/config/shard3.conf
root 10250 0.0 1.2 259324 23164 ? Sl Jun22 9:33 mongos -f /home/dolphin/mongodb/config/mongos.conf

You can see that mongos is the routing process, while configsvr and the shards all run as mongod processes. At heart they are all MongoDB replica sets; they just store different data: configsvr stores the cluster metadata, while the shards store the business data.
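
This division of labour is easy to see by peeking at the config database through mongos: the config servers' metadata (the shard list, the databases, the chunk ranges) is exposed there as ordinary collections. A quick read-only look, assuming the cluster above is running:

$ mongo --port 27017 --quiet --eval 'db.getSiblingDB("config").shards.find().forEach(printjson)'
$ mongo --port 27017 --quiet --eval 'db.getSiblingDB("config").databases.find().forEach(printjson)'
$ mongo --port 27017 --quiet --eval 'print(db.getSiblingDB("config").chunks.count())'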

Although we have built a sharded cluster, so far it only provides the ability to shard. When a client writes data, collections are not sharded by default; all of a collection's data lands in a single shard replica set. To actually distribute data across shards, you must enable sharding explicitly.

For example, the following enables sharding for the mustone database and shards its myuser collection on a hashed _id key:

mongos> use admin
mongos> db.runCommand( { enablesharding : "mustone" } );
# Specify the collection to shard within the database, and its shard key
mongos> db.runCommand( { shardcollection : "mustone.myuser", key : { _id: "hashed" } } )

After this, you can insert a batch of documents into the collection and log in to each shard to see how many of them it holds (a sketch follows).
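
As a quick test, something along these lines inserts a batch of documents through mongos and then counts what landed on each shard (a sketch only; the exact per-shard counts will vary):

# Insert 1000 test documents through mongos
$ mongo --port 27017 --quiet --eval 'var mdb = db.getSiblingDB("mustone"); for (var i = 0; i < 1000; i++) { mdb.myuser.insert({ seq: i, name: "user" + i }); } print("documents seen via mongos: " + mdb.myuser.count());'

# Count the documents stored by each shard (rs.slaveOk() allows reading from a secondary member)
$ for port in 29017 29018 29019; do echo "shard member on port $port:"; mongo --port $port --quiet --eval 'rs.slaveOk(); print(db.getSiblingDB("mustone").myuser.count())'; done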

Check the cluster status again with sh.status():

mongos> sh.status()
--- Sharding Status ---
...... (part of the output omitted)
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
config.system.sessions
shard key: { "_id" : 1 }
unique: false
balancing: true
chunks:
shard1 1
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(1, 0)
{ "_id" : "mustone", "primary" : "shard2", "partitioned" : true, "version" : { "uuid" : UUID("63c8edbe-45b8-4f07-b1a4-c42f201b42a4"), "lastMod" : 1 } }
mustone.myuser
shard key: { "_id" : "hashed" }
unique: false
balancing: true
chunks:
shard1 2
shard2 2
shard3 2
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6148914691236517204") } on : shard1 Timestamp(1, 0)
{ "_id" : NumberLong("-6148914691236517204") } -->> { "_id" : NumberLong("-3074457345618258602") } on : shard1 Timestamp(1, 1)
{ "_id" : NumberLong("-3074457345618258602") } -->> { "_id" : NumberLong(0) } on : shard2 Timestamp(1, 2)
{ "_id" : NumberLong(0) } -->> { "_id" : NumberLong("3074457345618258602") } on : shard2 Timestamp(1, 3)
{ "_id" : NumberLong("3074457345618258602") } -->> { "_id" : NumberLong("6148914691236517204") } on : shard3 Timestamp(1, 4)
{ "_id" : NumberLong("6148914691236517204") } -->> { "_id" : { "$maxKey" : 1 } } on : shard3 Timestamp(1, 5)

You can see that the partitioned value of the mustone database is now true, and the sharding details for the myuser collection are listed, including how its chunks are spread across the three shards.
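
Besides sh.status(), the mongo shell also provides db.collection.getShardDistribution(), which prints a per-shard summary (document count and data size) for a sharded collection:

$ mongo --port 27017
mongos> use mustone
mongos> db.myuser.getShardDistribution()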


Not sure how MongoDB sharding works under the hood? No worries, stay tuned for the next post.