Tencent HBase setup

Description

This Terraform repository is used to bootstrap a Solana HBase cluster for storing blockchain data. As with most Hadoop ecosystems, this stack consists of several key components as outlined below:

Configuration Management:
Zookeeper Cluster - Used by the Hadoop ecosystem for coordination and for storing configuration details.
JournalNode Cluster - Used by HDFS to store the NameNode edit log (journal data), which is shared between the active and standby NameNodes.
By default, Zookeeper and JournalNode are co-located on the same nodes. The cluster is initially configured with three nodes.

Management Nodes:
HDFS NameNodes - Responsible for managing the HDFS DataNodes, these are crucial to the Hadoop ecosystem. The setup uses an active-passive NameNode configuration that supports failover in case the primary node fails. High availability (HA) is managed through Zookeeper.
HBase Master Nodes - These nodes manage the HBase setup, including overseeing the region servers.
Both the HDFS NameNodes and HBase Master Nodes are co-located on the same nodes. By default, two management nodes are provisioned.

Worker Nodes:
HDFS DataNodes - These nodes store the actual data within the filesystem and are managed by the HDFS NameNodes.
HBase Region Servers - These servers store HBase data and are managed by the HBase Master Nodes.
The pool of worker nodes can be scaled as needed by increasing the number of servers and then reapplying the Terraform configurations and associated commands.
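
As a minimal sketch of the scaling step, the worker count can be raised through a variable file rather than by editing the module. The file name workers.auto.tfvars and the value 4 are assumptions chosen for illustration; workers_instance_count is the variable listed in the Inputs section, and Terraform loads any *.auto.tfvars file found in the working directory.

```shell
# Hypothetical example: scale the worker pool from the default of 2 to 4 nodes.
WORKERS=4

# Write the override to a variable file that Terraform picks up automatically.
cat > workers.auto.tfvars <<EOF
workers_instance_count = ${WORKERS}
EOF

# Show what was written.
cat workers.auto.tfvars

# Next, from the repository root, review and apply the change:
#   terraform plan
#   terraform apply
```

After the apply completes, the new machines would still need the worker-related TAT commands described in the Provisioning section before they join HDFS and HBase.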

Provisioning

The following sections describe the steps required to bootstrap a new HBase cluster. First, the infrastructure must be provisioned in the cloud. Then, a series of TencentCloud Automation Tools (TAT) commands must be executed for each category of machines, as outlined in the subsequent sections.

Provisioning the Infrastructure

This step provisions the infrastructure, which by default consists of seven machines in Tencent Cloud (three Zookeeper nodes, two management nodes, and two worker nodes). You can adjust various parameters for each machine category as detailed in the Inputs section.
Before running Terraform, ensure that the providers.tf file contains the necessary credentials for your Tencent Cloud account. To bootstrap the new infrastructure, execute the following commands in sequence:

terraform init
terraform plan
terraform apply

After executing the Terraform code, the three machine categories will be created. You can view these machines in the Tencent Cloud console by navigating to the Cloud Virtual Machine service.
From this point onward, TAT commands are used to configure Zookeeper, HDFS, and HBase.
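
Before moving on, the created machines can also be checked from the command line instead of the console. The sketch below counts resource addresses in the output of terraform state list; the helper name count_by_prefix is hypothetical, and it assumes Terraform is run from the repository root so that addresses are not prefixed with a module path.

```shell
# Count Terraform-managed resources whose address starts with a given prefix.
# stdin: output of `terraform state list`; $1: address prefix (plain string, not a regex).
count_by_prefix() {
  awk -v p="$1" 'index($0, p) == 1 { n++ } END { print n + 0 }'
}

# Usage from the repository root, after `terraform apply`:
#   terraform state list | count_by_prefix tencentcloud_instance.
# With the default counts (3 Zookeeper + 2 management + 2 worker) this should print 7.

# Self-contained demonstration on illustrative sample addresses (prints 2):
printf '%s\n' \
  'tencentcloud_instance.zookeeper_node[0]' \
  'tencentcloud_instance.hbase_workers_node[0]' \
  'tencentcloud_vpc.vpc[0]' | count_by_prefix tencentcloud_instance.
```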

Configuring the Zookeeper cluster

The Zookeeper cluster, which also runs the JournalNode service, is configured by executing the following TAT commands in order:

1-zookeeper-setup: Execute this command first, selecting only the Zookeeper nodes (e.g. zookeeper-0, zookeeper-1, and zookeeper-2). This command configures the Zookeeper cluster.
2-zookeeper-journal-setup: After successfully executing the previous command, run this command on the same Zookeeper nodes. It configures the JournalNode cluster on these machines.
At this stage, both Zookeeper and JournalNodes should be up and running.
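
One way to confirm that both services are running on a Zookeeper node is to look for their Java processes. The sketch below assumes the JDK tool jps is available on the node and is run as a user that can see the service processes; the helper name has_daemons is hypothetical. QuorumPeerMain and JournalNode are the main-class names that jps reports for Zookeeper and the HDFS JournalNode.

```shell
# Check that every expected Java main class appears in `jps` output.
# stdin: output of `jps` (lines of "<pid> <MainClass>"); args: expected class names.
has_daemons() {
  jps_out=$(cat)
  for name in "$@"; do
    if ! printf '%s\n' "$jps_out" | awk '{ print $2 }' | grep -qx "$name"; then
      echo "missing: $name"
      return 1
    fi
  done
  echo "all present"
}

# Usage on each Zookeeper node:
#   jps | has_daemons QuorumPeerMain JournalNode

# Self-contained demonstration on illustrative sample output (prints "all present"):
printf '101 QuorumPeerMain\n102 JournalNode\n103 Jps\n' | has_daemons QuorumPeerMain JournalNode
```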

Configuring the HDFS Cluster

The HDFS cluster, composed of NameNodes and DataNodes, is configured by executing the following TAT commands in sequence:

1-hadoop-hdfs-setup-common-nodes: This command applies a common configuration to all nodes in the HDFS setup. Ensure you select all relevant nodes (e.g. hbase-management-0, hbase-management-1, hbase-worker-0, and hbase-worker-1) to prepare them for the HDFS setup.
2-management-hdfs-setup-namenodes: Since the HDFS cluster's NameNodes are configured for high availability (HA), you must first run this command on the primary NameNode (e.g. hbase-management-0). After it completes successfully, run the same command on the secondary NameNode (e.g. hbase-management-1).
1-worker-hdfs-setup-datanodes: Once the NameNodes are operational, configure the HDFS DataNodes by executing this command and selecting the worker nodes (e.g. hbase-worker-0 and hbase-worker-1).
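
Because the NameNodes run as an active/standby pair, it can be useful to confirm that exactly one of them is active once the NameNode command has run on both management nodes. The sketch below parses the output of hdfs haadmin -getAllServiceState, which prints one line per NameNode ending in its state. The helper name ha_pair_ok is hypothetical, and the host names and port in the demonstration are illustrative only.

```shell
# Succeed only if exactly one NameNode is active and at least one is standby.
# stdin: output of `hdfs haadmin -getAllServiceState` (last field is the state).
ha_pair_ok() {
  awk '
    $NF == "active"  { active++ }
    $NF == "standby" { standby++ }
    END { exit !(active == 1 && standby >= 1) }
  '
}

# Usage on a NameNode machine, as the hadoop user, from the Hadoop directory:
#   ./bin/hdfs haadmin -getAllServiceState | ha_pair_ok && echo "HA pair looks healthy"

# Self-contained demonstration on illustrative sample output:
printf 'hbase-management-0:8020 active\nhbase-management-1:8020 standby\n' \
  | ha_pair_ok && echo "HA pair looks healthy"
```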

If the commands above were executed in this exact order, the HDFS cluster should be up and running. To verify the state of the cluster, SSH into one of the NameNode machines (e.g. hbase-management-0) and execute the following commands:

su hadoop
cd /usr/local/hadoop/hadoop-3.4.0/
./bin/hdfs dfsadmin -report

The output should resemble the following, which indicates that the HDFS cluster is healthy:

Configured Capacity: 105552699392 (98.30 GB)
Present Capacity: 81349865472 (75.76 GB)
DFS Remaining: 81349816320 (75.76 GB)
DFS Used: 49152 (48 KB)
DFS Used%: 0.00%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
Erasure Coded Block Groups: 
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
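
The same report can be checked automatically rather than by eye. The following is a minimal sketch, not part of the repository: it treats the cluster as healthy when the under-replicated, corrupt, and missing counters in the report are all zero. The helper name hdfs_report_ok is hypothetical.

```shell
# Succeed only if the block health counters in a dfsadmin report are all zero.
# stdin: output of `hdfs dfsadmin -report`.
hdfs_report_ok() {
  awk -F': *' '
    /(Under replicated blocks|Blocks with corrupt replicas|Missing blocks|Missing block groups)/ {
      seen++
      if ($2 + 0 != 0) bad++
    }
    END { exit !(seen > 0 && bad == 0) }
  '
}

# Usage on a NameNode machine, as the hadoop user, from the Hadoop directory:
#   ./bin/hdfs dfsadmin -report | hdfs_report_ok && echo "HDFS report looks healthy"

# Self-contained demonstration on an excerpt of the report shown above:
hdfs_report_ok <<'EOF' && echo "HDFS report looks healthy"
DFS Used%: 0.00%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
EOF
```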

Configuring the HBase cluster

The HBase cluster is the final component of the Hadoop ecosystem that must be configured. Execute the following TAT commands in sequence:

1-hbase-setup-common-nodes: This command applies common HBase configurations to all nodes. Be sure to select all nodes (e.g. hbase-management-0, hbase-management-1, hbase-worker-0, and hbase-worker-1).
2-hbase-setup-master: This command configures the HBase master nodes. Execute this command on the primary management node (e.g. hbase-management-0) and wait for it to complete.
3-hbase-setup-region-servers: Run this command to configure the HBase region servers, ensuring you select only the worker nodes (e.g. hbase-worker-0 and hbase-worker-1).

To verify that the HBase cluster is up and running, execute the following commands (status is entered at the HBase shell prompt):

su hbase
cd /usr/local/hbase/hbase-2.6.0/
./bin/hbase shell
status

The output should show a healthy active master and the two worker nodes registered in the cluster.
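
The check can also be scripted by running the shell in non-interactive mode (hbase shell -n) and parsing the summary line printed by status. This is a sketch under the assumption that the summary has the form "N active master, N backup masters, N servers, N dead, X average load"; the exact wording may differ between HBase versions, and the helper name hbase_status_ok is hypothetical.

```shell
# Succeed only if the status summary reports the expected number of
# region servers and no dead servers.
# $1: expected number of region servers; stdin: output of the `status` command.
hbase_status_ok() {
  awk -v want="$1" '
    /active master/ {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^servers,?$/) servers = $(i - 1)
        if ($i ~ /^dead,?$/)    dead = $(i - 1)
      }
      found = 1
    }
    END { exit !(found && servers == want && dead == 0) }
  '
}

# Usage as the hbase user, from the HBase directory:
#   echo "status" | ./bin/hbase shell -n 2>/dev/null | hbase_status_ok 2

# Self-contained demonstration on an illustrative summary line:
printf '1 active master, 0 backup masters, 2 servers, 0 dead, 1.0000 average load\n' \
  | hbase_status_ok 2 && echo "HBase status looks healthy"
```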

Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.5 |
| external | >= 2.3.1 |
| tencentcloud | >= 1.81.32 |

Providers

| Name | Version |
|------|---------|
| tencentcloud | 1.81.106 |

Modules

| Name | Source | Version |
|------|--------|---------|
| acls | ./modules/vpc_acl | n/a |

Resources

| Name | Type |
|------|------|
| tencentcloud_eip.nat_eip | resource |
| tencentcloud_instance.hbase_management_node | resource |
| tencentcloud_instance.hbase_workers_node | resource |
| tencentcloud_instance.zookeeper_node | resource |
| tencentcloud_nat_gateway.nat | resource |
| tencentcloud_route_table.route_table | resource |
| tencentcloud_route_table_entry.route_entry | resource |
| tencentcloud_security_group.hbase_management_sg | resource |
| tencentcloud_security_group.hbase_workers_sg | resource |
| tencentcloud_security_group.zookeeper_sg | resource |
| tencentcloud_security_group_rule_set.hbase_management_sg_rule | resource |
| tencentcloud_security_group_rule_set.hbase_workers_sg_rule | resource |
| tencentcloud_security_group_rule_set.zookeeper_sg_rule | resource |
| tencentcloud_subnet.subnet | resource |
| tencentcloud_tat_command.hbase-setup-common-nodes | resource |
| tencentcloud_tat_command.hbase-setup-master | resource |
| tencentcloud_tat_command.hbase-setup-region-servers | resource |
| tencentcloud_tat_command.hdfs-setup-common-nodes | resource |
| tencentcloud_tat_command.hdfs-setup-namenodes | resource |
| tencentcloud_tat_command.hdfs-setup-workernodes | resource |
| tencentcloud_tat_command.qjournal-setup | resource |
| tencentcloud_tat_command.zookeeper-setup | resource |
| tencentcloud_vpc.vpc | resource |
| tencentcloud_images.hbase_management_image | data source |
| tencentcloud_images.hbase_workers_image | data source |
| tencentcloud_images.zookeeper_image | data source |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| create_route_table | Enable the creation of the route table | `bool` | `true` | no |
| create_vpc | Enable the creation of the VPC | `bool` | `true` | no |
| enable_nat_gateway | Enable the creation of the NAT gateway | `bool` | `true` | no |
| hadoop_data_dir | Hadoop data directory | `string` | `"/var/lib/hadoop"` | no |
| hadoop_home | Hadoop home directory | `string` | `"/usr/local/hadoop"` | no |
| hadoop_version | Hadoop version to use in the infrastructure | `string` | `"3.4.0"` | no |
| hbase_home | HBase home directory | `string` | `"/usr/local/hbase"` | no |
| hbase_version | HBase version to use in the infrastructure | `string` | `"2.6.0"` | no |
| java_home | Java home directory | `string` | `"/usr/lib/jvm/java-1.8.0"` | no |
| management_availability_zone | n/a | `string` | `"The instance availability zone"` | no |
| management_data_disk_encrypt | Enable data disk encryption | `bool` | `false` | no |
| management_data_disk_size | The instance data disk size | `number` | `50` | no |
| management_data_disk_type | The instance data disk type | `string` | `"CLOUD_BSSD"` | no |
| management_force_delete | Indicate whether to force delete the instance | `bool` | `false` | no |
| management_image_id | The HBase management node image id; if provided, it overrides the other image parameters below | `string` | `"img-eb30mz89"` | no |
| management_image_name_regex | The HBase management node image name regex; used only if management_image_id is set to an empty value | `string` | `"Solana"` | no |
| management_image_type | The HBase management node image type; this parameter and management_image_name_regex are used only if management_image_id is set to an empty value | `list(string)` | `["PUBLIC_IMAGE"]` | no |
| management_instance_charge_type | The charge type of the instance | `string` | `"POSTPAID_BY_HOUR"` | no |
| management_instance_charge_type_prepaid_period | The tenancy (time unit is month) of the prepaid instance | `number` | `1` | no |
| management_instance_charge_type_prepaid_renew_flag | Auto renewal flag | `string` | `"NOTIFY_AND_MANUAL_RENEW"` | no |
| management_instance_count | The number of HBase management nodes to bootstrap | `number` | `2` | no |
| management_instance_name | The management instance name prefix | `string` | `"hbase-management"` | no |
| management_instance_project | The project the instance belongs to | `number` | `0` | no |
| management_instance_tags | Specify one or more tags for the instance | `map(string)` | `{"network": "tencent", "type": "management"}` | no |
| management_instance_type | The instance type | `string` | `"SA5.MEDIUM4"` | no |
| management_subnet_id | The subnet id for the instance | `string` | `""` | no |
| management_system_disk_size | The instance system disk size | `number` | `50` | no |
| management_system_disk_type | The instance system disk type | `string` | `"CLOUD_BSSD"` | no |
| nat_gateway_bandwidth | Bandwidth of the NAT gateway | `number` | `100` | no |
| nat_gateway_concurrent | Maximum number of concurrent connections of the NAT gateway | `number` | `1000000` | no |
| nat_gateway_public_ips | The list of public IPs associated with the NAT gateway | `list(string)` | `[]` | no |
| nat_gateway_tags | Specify one or more tags for the NAT gateway | `map(string)` | `{"network": "tencent", "type": "hbase"}` | no |
| route_entries | n/a | `list(object({ destination_cidr_block = string, next_type = string, next_hub = string }))` | `[{"destination_cidr_block": "0.0.0.0/0", "next_hub": "0", "next_type": "NAT"}]` | no |
| route_table_id | Specify a route table id if you want to reuse an existing route table | `string` | `""` | no |
| route_table_tags | Specify one or more tags for the route table | `map(string)` | `{"network": "tencent", "type": "hbase"}` | no |
| stack | Specify a stack name that is prefixed to each resource created with this module | `string` | `"tencent-"` | no |
| subnet_cidrs | Specify one or more subnets to create within the VPC; use either this parameter or subnet_ids, but not both | `list(object({ name = string, cidr_block = string, is_multicast = string, availability_zone = string }))` | `[{"availability_zone": "eu-frankfurt-1", "cidr_block": "172.16.1.0/24", "is_multicast": true, "name": "hbase_subnet_1"}, {"availability_zone": "eu-frankfurt-2", "cidr_block": "172.16.2.0/24", "is_multicast": true, "name": "hbase_subnet_2"}]` | no |
| subnet_ids | Specify existing subnet ids instead of creating subnets with this module; if this is specified, subnet_cidrs must NOT be configured | `list(string)` | `[]` | no |
| subnets_tags | Specify one or more tags for the subnets | `map(string)` | `{"network": "tencent", "type": "hbase"}` | no |
| vpc_acl_tags | Specify one or more tags for the VPC ACLs | `map(string)` | `{"network": "tencent", "type": "hbase"}` | no |
| vpc_acls | Specify one or more ACLs to attach to the subnets | `list(object({ name = string, ingress = list(string), egress = list(string) }))` | `[{"egress": ["ACCEPT#0.0.0.0/0#ALL#ALL"], "ingress": ["ACCEPT#0.0.0.0/0#ALL#ALL"], "name": "egress-acl"}]` | no |
| vpc_cidr | The CIDR block that will be used by the VPC | `string` | `"172.16.0.0/16"` | no |
| vpc_dns_servers | Specify one or more DNS servers to be used within the VPC | `set(string)` | `[]` | no |
| vpc_id | Specify a VPC id if you want to deploy the HBase nodes within an existing VPC | `string` | `""` | no |
| vpc_is_multicast | Enable or disable VPC multicast | `bool` | `true` | no |
| vpc_name | Tencent VPC name | `string` | `"tencent_hbase"` | no |
| vpc_tags | Specify one or more tags for the VPC | `map(string)` | `{"network": "tencent", "type": "hbase"}` | no |
| workers_availability_zone | n/a | `string` | `"The instance availability zone"` | no |
| workers_data_disk_encrypt | Enable workers data disk encryption | `bool` | `false` | no |
| workers_data_disk_size | The worker instance data disk size | `number` | `50` | no |
| workers_data_disk_type | The worker instance data disk type | `string` | `"CLOUD_BSSD"` | no |
| workers_force_delete | Indicate whether to force delete the instance | `bool` | `false` | no |
| workers_image_id | The HBase worker node image id; if provided, it overrides the other image parameters below | `string` | `"img-eb30mz89"` | no |
| workers_image_name_regex | The HBase worker node image name regex; used only if workers_image_id is set to an empty value | `string` | `"Solana"` | no |
| workers_image_type | The HBase worker node image type; this parameter and workers_image_name_regex are used only if workers_image_id is set to an empty value | `list(string)` | `["PUBLIC_IMAGE"]` | no |
| workers_instance_charge_type | The charge type of the instance | `string` | `"POSTPAID_BY_HOUR"` | no |
| workers_instance_charge_type_prepaid_period | The tenancy (time unit is month) of the prepaid instance | `number` | `1` | no |
| workers_instance_charge_type_prepaid_renew_flag | Auto renewal flag | `string` | `"NOTIFY_AND_MANUAL_RENEW"` | no |
| workers_instance_count | The number of HBase worker nodes to bootstrap | `number` | `2` | no |
| workers_instance_name | The worker instance name prefix | `string` | `"hbase-worker"` | no |
| workers_instance_project | The project the instance belongs to | `number` | `0` | no |
| workers_instance_tags | Specify one or more tags for the instance | `map(string)` | `{"network": "tencent", "type": "worker"}` | no |
| workers_instance_type | The instance type | `string` | `"SA5.MEDIUM4"` | no |
| workers_subnet_id | The subnet id for the instance | `string` | `""` | no |
| workers_system_disk_size | The instance system disk size | `number` | `50` | no |
| workers_system_disk_type | The instance system disk type | `string` | `"CLOUD_BSSD"` | no |
| zookeeper_availability_zone | n/a | `string` | `"The instance availability zone"` | no |
| zookeeper_data_dir | Zookeeper data directory | `string` | `"/var/lib/zookeeper"` | no |
| zookeeper_data_disk_encrypt | Enable data disk encryption | `bool` | `false` | no |
| zookeeper_data_disk_size | The instance data disk size | `number` | `50` | no |
| zookeeper_data_disk_type | The instance data disk type | `string` | `"CLOUD_BSSD"` | no |
| zookeeper_force_delete | Indicate whether to force delete the instance | `bool` | `false` | no |
| zookeeper_home | Zookeeper home directory | `string` | `"/usr/local/zookeeper"` | no |
| zookeeper_image_id | The Zookeeper node image id; if provided, it overrides the other image parameters below | `string` | `"img-eb30mz89"` | no |
| zookeeper_image_name_regex | The Zookeeper node image name regex; used only if zookeeper_image_id is set to an empty value | `string` | `"Solana"` | no |
| zookeeper_image_type | The Zookeeper node image type; this parameter and zookeeper_image_name_regex are used only if zookeeper_image_id is set to an empty value | `list(string)` | `["PUBLIC_IMAGE"]` | no |
| zookeeper_instance_charge_type | The charge type of the instance | `string` | `"POSTPAID_BY_HOUR"` | no |
| zookeeper_instance_charge_type_prepaid_period | The tenancy (time unit is month) of the prepaid instance | `number` | `1` | no |
| zookeeper_instance_charge_type_prepaid_renew_flag | Auto renewal flag | `string` | `"NOTIFY_AND_MANUAL_RENEW"` | no |
| zookeeper_instance_count | The number of Zookeeper nodes to bootstrap | `number` | `3` | no |
| zookeeper_instance_name | The Zookeeper instance name prefix | `string` | `"zookeeper"` | no |
| zookeeper_instance_project | The project the instance belongs to | `number` | `0` | no |
| zookeeper_instance_tags | Specify one or more tags for the instance | `map(string)` | `{"network": "tencent", "type": "zookeeper"}` | no |
| zookeeper_instance_type | The instance type | `string` | `"SA5.MEDIUM4"` | no |
| zookeeper_java_home | Java home directory | `string` | `"/usr/lib/jvm/java-1.8.0"` | no |
| zookeeper_subnet_id | The subnet id for the instance | `string` | `""` | no |
| zookeeper_system_disk_size | The instance system disk size | `number` | `50` | no |
| zookeeper_system_disk_type | The instance system disk type | `string` | `"CLOUD_BSSD"` | no |
| zookeeper_version | Zookeeper version to use in the infrastructure | `string` | `"3.7.2"` | no |

Outputs

No outputs.