Update instructions for Elasticsearch node replacements
krysal committed Jan 15, 2025
1 parent 912d6e1 commit 12e23a4
Showing 1 changed file with 17 additions and 4 deletions.
21 changes: 17 additions & 4 deletions documentation/meta/maintenance/elasticsearch_cluster.md
@@ -354,6 +354,9 @@ In the steps below, `<env>` refers to a subfolder in the `environments/`
directory, and can be one of "dev" or "prod". This is different from
`<environment>` which is part of the ES module name and can be one of "staging"
or "production".
+ Consider pausing the `<environment>_elasticsearch_cluster_healthcheck` to avoid
+ getting redundant alerts during this process.
```

1. Find the instance in the AWS management console using the ID from the
@@ -381,9 +384,13 @@ or "production".
with this change in production. Apply this change. The plan should include:

- creation of a new Elasticsearch data node
- - creation of new alarms and changes to existing alarms
+ - creation of new alarms and changes to existing alarms (for production)
- addition of the new instance's IP to the Route53 record

+ ```bash
+ just tf <env> apply -target='module.staging-elasticsearch-8-8-2'
+ ```

4. Wait for a new instance to be provisioned by Terraform.

Record the public IPv4 DNS and Terraform index of the new instance, they will
@@ -394,7 +401,7 @@ or "production".
pass the public IPv4 DNS step from step 3 to the `-l`/`--limit` flag.

```bash
- just ansible/playbook <env> elasticsearch/sync_config.yml -e apply=true -l <public_ipv4_dns>`
+ just ansible/playbook <environment> elasticsearch/sync_config.yml -e apply=true -l <public_ipv4_dns>
```

```{note}
@@ -413,7 +420,7 @@ or "production".
`http://localhost:9220`.

```{tip}
- Use the Elasticvue extension or app because the web interface cannot connect
+ Use the Elasticvue browser extension or app because the web interface cannot connect
to Elasticsearch due to CORS protection.
```

@@ -422,7 +429,9 @@ or "production".

```json
{
- "transient.cluster.routing.allocation.exclude.name": "<private_ipv4_address>"
+ "transient": {
+ "cluster.routing.allocation.exclude.name": "<private_ipv4_address>"
+ }
}
```

@@ -458,6 +467,10 @@ or "production".
- subtraction of the retired instance's IP from the Route53 record
- destruction of extra alarms and changes to existing alarms
+ ```bash
+ just tf <env> apply -target='module.staging-elasticsearch-8-8-2'
+ ```
14. If the deletion of the individual alarms fails due to them being part of a
composite alarm, go into the AWS management console and remove the alarms
that are supposed to be deleted from any composite alarms that they are a
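
The nested `transient` request body introduced in this commit matches the shape expected by Elasticsearch's cluster update settings API (`PUT /_cluster/settings`). As a rough sketch only, the body could be built and sent from a shell instead of Elasticvue. The helper name `build_exclude_body` and the address `10.0.0.12` are hypothetical placeholders, not part of the repository, and using `curl` here is an assumption; `http://localhost:9220` is the tunnelled endpoint mentioned in the guide.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): builds the nested
# request body shown in the diff for a given private IPv4 address.
build_exclude_body() {
  local private_ip="$1"
  printf '{"transient":{"cluster.routing.allocation.exclude.name":"%s"}}\n' "$private_ip"
}

# Placeholder address; substitute the retiring node's private IPv4 address.
body="$(build_exclude_body "10.0.0.12")"
echo "$body"

# Assumed invocation against the tunnelled endpoint from the guide.
# Left commented out so the sketch has no side effects when run as-is:
# curl -X PUT "http://localhost:9220/_cluster/settings" \
#   -H 'Content-Type: application/json' \
#   -d "$body"
```

Keeping the body construction separate from the request makes it easy to inspect the JSON before anything is sent to the cluster.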
