This Terraform repository is used to bootstrap a Solana HBase cluster for storing blockchain data. As with most Hadoop ecosystems, this stack consists of several key components as outlined below:
Configuration Management:
Zookeeper Cluster - Used by the Hadoop ecosystem for coordination and for storing configuration details.
JournalNode Cluster - Used by HDFS to store the NameNode edit log (journal), which keeps the standby NameNode in sync with the active one.
By default, Zookeeper and JournalNode are co-located on the same nodes. The cluster is initially configured with three nodes.
Management Nodes:
HDFS NameNodes - Responsible for managing the HDFS DataNodes, these are crucial to the Hadoop ecosystem. The setup uses an active-passive NameNode configuration that supports failover in case the primary node fails. High availability (HA) is managed through Zookeeper.
HBase Master Nodes - These nodes manage the HBase setup, including overseeing the region servers.
The HDFS NameNodes and HBase Master Nodes are co-located on the same machines. By default, two management nodes are provisioned.
Worker Nodes:
HDFS DataNodes - These nodes store the actual data within the filesystem and are managed by the HDFS NameNodes.
HBase Region Servers - These servers store HBase data and are managed by the HBase Master Nodes.
The pool of worker nodes can be scaled as needed by increasing the number of servers and then reapplying the Terraform configurations and associated commands.
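Scaling the worker pool can be sketched as a small shell helper. This is a minimal sketch that assumes the worker count is controlled by the workers_instance_count input listed in the Inputs table; the function name scale_workers and the TERRAFORM override variable are hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: re-apply Terraform with a new worker count.
# TERRAFORM can be overridden (for example with "echo") for a dry run.
scale_workers() {
  local count="$1"
  # Reject empty, non-numeric, or zero values.
  case "$count" in
    ''|*[!0-9]*|0) echo "usage: scale_workers <positive integer>" >&2; return 1 ;;
  esac
  "${TERRAFORM:-terraform}" apply -var="workers_instance_count=${count}"
}

# Usage: scale_workers 4
```

After the apply completes, the new machines still need the worker-related TAT commands described in the later sections before they join the cluster.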
The following sections describe the steps required to bootstrap a new HBase cluster. First, the infrastructure is provisioned in the cloud. Then, a series of TencentCloud Automation Tools (TAT) commands is executed for each category of machines, as outlined in the subsequent sections.
In this section, the infrastructure is provisioned, creating a default total of seven machines within Tencent Cloud. You can adjust various parameters for each machine category as detailed in the Inputs section.
Before running Terraform, ensure that the providers.tf file contains the necessary credentials for your Tencent Cloud account. To bootstrap the new infrastructure, execute the following commands in sequence:
terraform init
terraform plan
terraform apply
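The three commands above can also be wrapped so that the plan is saved to a file and the apply step executes exactly what was reviewed. This is a sketch under assumptions: the function name bootstrap_infra, the TERRAFORM override variable, and the default plan file name tfplan are hypothetical; only the terraform subcommands and the -out flag are standard Terraform CLI.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around init, plan, and apply.
# The plan is written to a file and that same file is applied,
# and each step runs only if the previous one succeeded.
bootstrap_infra() {
  local tf="${TERRAFORM:-terraform}"
  local plan_file="${1:-tfplan}"
  "$tf" init &&
    "$tf" plan -out="$plan_file" &&
    "$tf" apply "$plan_file"
}

# Usage: bootstrap_infra            (writes the plan to ./tfplan)
# Dry run: TERRAFORM=echo bootstrap_infra
```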
After executing the Terraform code, the three machine categories will be created. You can view these machines in the Tencent Cloud console by navigating to the Cloud Virtual Machine service.
From this section onward, the TAT commands must be used to configure Zookeeper, HDFS, and HBase.
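As an overview, the full command order described in the following sections can be kept at hand as a small shell function. The command names and target groups are taken from this document; the function name tat_runbook is hypothetical, and the function only prints the order rather than invoking TAT.

```shell
#!/usr/bin/env bash
# Hypothetical overview: print each TAT command with the node group it
# targets, in the order given in the sections that follow.
tat_runbook() {
  cat <<'EOF'
1-zookeeper-setup                  zookeeper nodes
2-zookeeper-journal-setup          zookeeper nodes
1-hadoop-hdfs-setup-common-nodes   management and worker nodes
2-management-hdfs-setup-namenodes  primary management node, then secondary
1-worker-hdfs-setup-datanodes      worker nodes
1-hbase-setup-common-nodes         management and worker nodes
2-hbase-setup-master               primary management node
3-hbase-setup-region-servers       worker nodes
EOF
}

# Usage: tat_runbook
```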
The Zookeeper cluster, which also runs the JournalNode service, is configured by executing the following TAT commands in order:
1-zookeeper-setup: Execute this command first, selecting only the Zookeeper nodes (e.g. zookeeper-0, zookeeper-1, and zookeeper-2). This command configures the Zookeeper cluster.
2-zookeeper-journal-setup: After successfully executing the previous command, run this command on the same Zookeeper nodes. It configures the JournalNode cluster on these machines.
At this stage, both Zookeeper and JournalNodes should be up and running.
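One way to confirm that a Zookeeper node is serving requests is to send it the srvr four-letter command and read the reported mode. The sketch below assumes the default client port 2181, that nc is installed, and that srvr is allowed by the four-letter-word whitelist (it is the only command allowed by default in recent Zookeeper releases). The helper name zk_mode is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the output of Zookeeper's "srvr" command on stdin
# and print the server mode (leader, follower, or standalone).
# Exits non-zero if no "Mode:" line is present.
zk_mode() {
  awk -F': ' '$1 == "Mode" { print $2; found = 1 } END { exit !found }'
}

# Usage (host name and port are assumptions):
#   echo srvr | nc zookeeper-0 2181 | zk_mode
```

In a healthy three-node ensemble, one node reports leader and the other two report follower.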
The HDFS cluster, composed of NameNodes and DataNodes, is configured by executing the following TAT commands in sequence:
1-hadoop-hdfs-setup-common-nodes: This command applies a common configuration to all nodes in the HDFS setup. Ensure you select all relevant nodes (e.g. hbase-management-0, hbase-management-1, hbase-worker-0, and hbase-worker-1) to prepare them for the HDFS setup.
2-management-hdfs-setup-namenodes: Since the HDFS cluster's NameNodes are configured for high availability (HA), you must first run this command on the primary NameNode (e.g. hbase-management-0). After it completes successfully, run the same command on the secondary NameNode (e.g. hbase-management-1).
1-worker-hdfs-setup-datanodes: Once the NameNodes are operational, configure the HDFS DataNodes by executing this command and selecting the worker nodes (e.g. hbase-worker-0 and hbase-worker-1).
If all of the above commands have been executed in this exact order, the HDFS cluster should be up and running. To verify the state of the cluster, SSH into one of the NameNode machines (e.g. hbase-management-0) and execute the following commands:
su hadoop
cd /usr/local/hadoop/hadoop-3.4.0/
./bin/hdfs dfsadmin -report
The output should resemble the following, which indicates that the HDFS cluster is healthy:
Configured Capacity: 105552699392 (98.30 GB)
Present Capacity: 81349865472 (75.76 GB)
DFS Remaining: 81349816320 (75.76 GB)
DFS Used: 49152 (48 KB)
DFS Used%: 0.00%
Replicated Blocks:
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
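The same check can be scripted by scanning the report for non-zero block problems. This is a minimal sketch that assumes the report format shown above; the helper name hdfs_report_healthy is hypothetical, and it inspects only the replicated-block counters for missing, corrupt, and under-replicated blocks.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "hdfs dfsadmin -report" output on stdin and
# succeed only if the three replicated-block counters are present and zero.
hdfs_report_healthy() {
  awk -F': *' '
    /^[ \t]*(Missing blocks|Blocks with corrupt replicas|Under replicated blocks):/ {
      seen++
      if ($2 + 0 != 0) bad++
    }
    # Fail when no counter was found or when any counter is non-zero.
    END { exit (seen == 0 || bad > 0) }
  '
}

# Usage (from the Hadoop directory, as the hadoop user):
#   ./bin/hdfs dfsadmin -report | hdfs_report_healthy && echo "HDFS looks healthy"
```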
The HBase cluster is the final component of the Hadoop ecosystem that must be configured. Execute the following TAT commands in sequence:
1-hbase-setup-common-nodes: This command applies common HBase configurations to all nodes. Be sure to select all nodes (e.g. hbase-management-0, hbase-management-1, hbase-worker-0, and hbase-worker-1).
2-hbase-setup-master: This command configures the HBase master nodes. Execute it on the primary management node (e.g. hbase-management-0) and wait for it to complete.
3-hbase-setup-region-servers: Run this command to configure the HBase region servers, ensuring you select only the worker nodes (e.g. hbase-worker-0 and hbase-worker-1).
To verify that the HBase cluster is up and running, execute the following commands:
su hbase
cd /usr/local/hbase/hbase-2.6.0/
./bin/hbase shell
status
The output should list the two worker nodes as region servers in the cluster and report a healthy active master.
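The status check can also be run non-interactively and evaluated by a script. The sketch below assumes the summary line printed by status has the form "1 active master, 0 backup masters, 2 servers, 0 dead, 1.0000 average load"; if your HBase version prints a different format, the parsing needs adjusting. The helper name hbase_status_ok is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read HBase shell "status" output on stdin and succeed
# only if the summary line reports the expected number of live region
# servers and no dead ones.
hbase_status_ok() {
  local expected="$1"
  awk -v want="$expected" '
    / active master, / {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^servers,?$/) live = $(i - 1)
        if ($i ~ /^dead,?$/)    dead = $(i - 1)
      }
      found = 1
    }
    END { exit !(found && live + 0 == want + 0 && dead + 0 == 0) }
  '
}

# Usage (from the HBase directory, as the hbase user; -n runs the shell
# in non-interactive mode):
#   echo "status" | ./bin/hbase shell -n | hbase_status_ok 2 && echo "HBase looks healthy"
```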

Requirements:

Name | Version |
---|---|
terraform | >= 1.5 |
external | >= 2.3.1 |
tencentcloud | >= 1.81.32 |

Providers:

Name | Version |
---|---|
tencentcloud | 1.81.106 |

Modules:

Name | Source | Version |
---|---|---|
acls | ./modules/vpc_acl | n/a |

Inputs:

Name | Description | Type | Default | Required |
---|---|---|---|---|
create_route_table | Enable the creation of the route table | bool | true | no |
create_vpc | Enable the creation of the VPC | bool | true | no |
enable_nat_gateway | Enable the creation of the NAT gateway | bool | true | no |
hadoop_data_dir | Hadoop data directory | string | "/var/lib/hadoop" | no |
hadoop_home | Hadoop home directory | string | "/usr/local/hadoop" | no |
hadoop_version | Hadoop version to use in the infrastructure | string | "3.4.0" | no |
hbase_home | HBase home directory | string | "/usr/local/hbase" | no |
hbase_version | HBase version to use in the infrastructure | string | "2.6.0" | no |
java_home | Java home directory | string | "/usr/lib/jvm/java-1.8.0" | no |
management_availability_zone | n/a | string | "The instance availability zone" | no |
management_data_disk_encrypt | Enable data disk encryption | bool | false | no |
management_data_disk_size | The instance data disk size (GB) | number | 50 | no |
management_data_disk_type | The instance data disk type | string | "CLOUD_BSSD" | no |
management_force_delete | Indicate whether to force delete the instance | bool | false | no |
management_image_id | The HBase management node image id; if provided, it overrides the other image parameters below | string | "img-eb30mz89" | no |
management_image_name_regex | The HBase management node image name regex; used only if management_image_id is empty | string | "Solana" | no |
management_image_type | The HBase management node image type; this parameter and management_image_name_regex are used only if management_image_id is empty | list(string) | [ | no |
management_instance_charge_type | The charge type of the instance | string | "POSTPAID_BY_HOUR" | no |
management_instance_charge_type_prepaid_period | The tenancy (in months) of the prepaid instance | number | 1 | no |
management_instance_charge_type_prepaid_renew_flag | Auto renewal flag | string | "NOTIFY_AND_MANUAL_RENEW" | no |
management_instance_count | The number of HBase management nodes to bootstrap | number | 2 | no |
management_instance_name | The management instance name prefix | string | "hbase-management" | no |
management_instance_project | The project the instance belongs to | number | 0 | no |
management_instance_tags | Specify one or more tags for the instance | map(string) | { | no |
management_instance_type | The instance type | string | "SA5.MEDIUM4" | no |
management_subnet_id | The subnet id for the instance | string | "" | no |
management_system_disk_size | The instance system disk size (GB) | number | 50 | no |
management_system_disk_type | The instance system disk type | string | "CLOUD_BSSD" | no |
nat_gateway_bandwidth | Bandwidth of the NAT gateway (Mbps) | number | 100 | no |
nat_gateway_concurrent | Maximum number of concurrent connections of the NAT gateway | number | 1000000 | no |
nat_gateway_public_ips | The list of public IPs associated with the NAT gateway | list(string) | [] | no |
nat_gateway_tags | Specify one or more tags for the NAT gateway | map(string) | { | no |
route_entries | n/a | list(object({ | [ | no |
route_table_id | Specify a route table id if you want to reuse an existing route table | string | "" | no |
route_table_tags | Specify one or more tags for the route table | map(string) | { | no |
stack | Specify a stack name that is prefixed to each resource created with this module | string | "tencent-" | no |
subnet_cidrs | Specify one or more subnets to create within the VPC; use either this parameter or subnet_ids, but not both | list(object({ | [ | no |
subnet_ids | Specify existing subnet ids instead of creating subnets with this module; if this is specified, subnet_cidrs must NOT be configured | list(string) | [] | no |
subnets_tags | Specify one or more tags for the subnets | map(string) | { | no |
vpc_acl_tags | Specify one or more tags for the VPC ACLs | map(string) | { | no |
vpc_acls | Specify one or more ACLs to attach to the subnets | list(object({ | [ | no |
vpc_cidr | The CIDR block that will be used by the VPC | string | "172.16.0.0/16" | no |
vpc_dns_servers | Specify one or more DNS servers to be used within the VPC | set(string) | [] | no |
vpc_id | Specify a VPC id if you want to deploy the HBase nodes within an existing VPC | string | "" | no |
vpc_is_multicast | Enable or disable VPC multicast | bool | true | no |
vpc_name | Tencent VPC name | string | "tencent_hbase" | no |
vpc_tags | Specify one or more tags for the VPC | map(string) | { | no |
workers_availability_zone | n/a | string | "The instance availability zone" | no |
workers_data_disk_encrypt | Enable worker data disk encryption | bool | false | no |
workers_data_disk_size | The worker instance data disk size (GB) | number | 50 | no |
workers_data_disk_type | The worker instance data disk type | string | "CLOUD_BSSD" | no |
workers_force_delete | Indicate whether to force delete the instance | bool | false | no |
workers_image_id | The HBase worker node image id; if provided, it overrides the other image parameters below | string | "img-eb30mz89" | no |
workers_image_name_regex | The HBase worker node image name regex; used only if workers_image_id is empty | string | "Solana" | no |
workers_image_type | The HBase worker node image type; this parameter and workers_image_name_regex are used only if workers_image_id is empty | list(string) | [ | no |
workers_instance_charge_type | The charge type of the instance | string | "POSTPAID_BY_HOUR" | no |
workers_instance_charge_type_prepaid_period | The tenancy (in months) of the prepaid instance | number | 1 | no |
workers_instance_charge_type_prepaid_renew_flag | Auto renewal flag | string | "NOTIFY_AND_MANUAL_RENEW" | no |
workers_instance_count | The number of HBase worker nodes to bootstrap | number | 2 | no |
workers_instance_name | The worker instance name prefix | string | "hbase-worker" | no |
workers_instance_project | The project the instance belongs to | number | 0 | no |
workers_instance_tags | Specify one or more tags for the instance | map(string) | { | no |
workers_instance_type | The instance type | string | "SA5.MEDIUM4" | no |
workers_subnet_id | The subnet id for the instance | string | "" | no |
workers_system_disk_size | The instance system disk size (GB) | number | 50 | no |
workers_system_disk_type | The instance system disk type | string | "CLOUD_BSSD" | no |
zookeeper_availability_zone | n/a | string | "The instance availability zone" | no |
zookeeper_data_dir | Zookeeper data directory | string | "/var/lib/zookeeper" | no |
zookeeper_data_disk_encrypt | Enable data disk encryption | bool | false | no |
zookeeper_data_disk_size | The instance data disk size (GB) | number | 50 | no |
zookeeper_data_disk_type | The instance data disk type | string | "CLOUD_BSSD" | no |
zookeeper_force_delete | Indicate whether to force delete the instance | bool | false | no |
zookeeper_home | Zookeeper home directory | string | "/usr/local/zookeeper" | no |
zookeeper_image_id | The Zookeeper node image id; if provided, it overrides the other image parameters below | string | "img-eb30mz89" | no |
zookeeper_image_name_regex | The Zookeeper node image name regex; used only if zookeeper_image_id is empty | string | "Solana" | no |
zookeeper_image_type | The Zookeeper node image type; this parameter and zookeeper_image_name_regex are used only if zookeeper_image_id is empty | list(string) | [ | no |
zookeeper_instance_charge_type | The charge type of the instance | string | "POSTPAID_BY_HOUR" | no |
zookeeper_instance_charge_type_prepaid_period | The tenancy (in months) of the prepaid instance | number | 1 | no |
zookeeper_instance_charge_type_prepaid_renew_flag | Auto renewal flag | string | "NOTIFY_AND_MANUAL_RENEW" | no |
zookeeper_instance_count | The number of Zookeeper nodes to bootstrap | number | 3 | no |
zookeeper_instance_name | The Zookeeper instance name prefix | string | "zookeeper" | no |
zookeeper_instance_project | The project the instance belongs to | number | 0 | no |
zookeeper_instance_tags | Specify one or more tags for the instance | map(string) | { | no |
zookeeper_instance_type | The instance type | string | "SA5.MEDIUM4" | no |
zookeeper_java_home | Java home directory | string | "/usr/lib/jvm/java-1.8.0" | no |
zookeeper_subnet_id | The subnet id for the instance | string | "" | no |
zookeeper_system_disk_size | The instance system disk size (GB) | number | 50 | no |
zookeeper_system_disk_type | The instance system disk type | string | "CLOUD_BSSD" | no |
zookeeper_version | Zookeeper version to use in the infrastructure | string | "3.7.2" | no |
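
Since every input has a default, overrides are optional. One common way to set them is a terraform.tfvars file, which Terraform loads automatically from the working directory. The sketch below writes such a file from the shell; the variable names come from the table above, while the values and the helper name write_example_tfvars are purely illustrative assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a terraform.tfvars file that overrides a few
# of the inputs listed above. The values are examples, not recommendations.
write_example_tfvars() {
  local target="${1:-terraform.tfvars}"
  cat > "$target" <<'EOF'
stack                  = "solana-"
vpc_cidr               = "10.10.0.0/16"
workers_instance_count = 4
workers_data_disk_size = 200
EOF
}

# Usage: write_example_tfvars            (writes ./terraform.tfvars)
#        write_example_tfvars /tmp/example.tfvars
```

A file written to a non-default path can be passed explicitly with terraform plan -var-file=PATH.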
No outputs.