KB: add the article for potential risk with fstrim
- Also mentioned how to avoid this risk. Signed-off-by: Vicente Cheng <vicente.cheng@suse.com> Co-authored-by: Kiefer Chang <kiefer.chang@suse.com>
1 parent
27e6a70
commit 9b644d0
Showing 1 changed file with 44 additions and 0 deletions.
@@ -0,0 +1,44 @@
---
title: The potential risk with fstrim
description: The potential risk with fstrim and how to avoid it
slug: the_potential_risk_with_fstrim
authors:
  - name: Vicente Cheng
    title: Senior Software Engineer
    url: https://github.com/Vicente-Cheng
    image_url: https://github.com/Vicente-Cheng.png
tags: [harvester, rancher integration, longhorn, fstrim]
hide_table_of_contents: false
---

`fstrim` is a common way to release unused space in a filesystem. However, there is a known issue with running `fstrim` on a Longhorn volume. This article describes the potential risk with `fstrim` and how to avoid it.

The known issue is that running `fstrim` on a Longhorn volume may result in I/O errors if the volume is rebuilding. You can find more details in the related issues:
- https://github.com/harvester/harvester/issues/4739
- https://github.com/longhorn/longhorn/issues/7103

## The potential risk and impact of fstrim

If you hit the known issue described above, the volume returns I/O errors, and the VM that uses the volume gets stuck. If the VM is critical, the application it hosts becomes unavailable. For example, Harvester usually uses a Longhorn volume as the VM disk. After encountering this issue, the VM flaps between the paused and running states until the volume rebuild completes.

This does not affect data integrity, but it can alarm users: the VM hangs and its applications become unavailable. Consider the guest Kubernetes cluster scenario. When a VM is unavailable, the etcd member running on it is unavailable too. If half or more of the etcd members are unavailable, etcd loses quorum and the Kubernetes cluster becomes unavailable, along with any services running on that cluster.
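
The quorum reasoning above can be made concrete with a small sketch. It assumes the standard etcd majority rule (quorum is `floor(n / 2) + 1` members) and one etcd member per guest-cluster VM; `etcd_has_quorum` is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# etcd needs a majority (quorum) of members: quorum = floor(n / 2) + 1.
# etcd_has_quorum is a hypothetical helper, not part of etcd or Harvester.
etcd_has_quorum() {
    members="$1"       # total number of etcd members
    unavailable="$2"   # members on VMs that are currently stuck
    quorum=$(( members / 2 + 1 ))
    [ $(( members - unavailable )) -ge "$quorum" ]
}

# Three guest-cluster VMs, each assumed to host one etcd member.
for down in 0 1 2; do
    if etcd_has_quorum 3 "$down"; then
        echo "3 members, $down unavailable: quorum kept"
    else
        echo "3 members, $down unavailable: quorum lost"
    fi
done
```

Under these assumptions, a three-member cluster tolerates one stuck VM but not two, which is why several VMs trimming during the same rebuild window is the risky case.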

## How to avoid the potential risk

The way to avoid the potential risk is to disable `fstrim` in the VMs. `fstrim` is enabled by default on various modern Linux distributions.
Check the following items to find where `fstrim` may be triggered.

:::note
The following items apply to VMs that use a Longhorn volume, because that is where `fstrim` can cause the issue above.
:::

- Check the `fstrim.timer` service. You can **disable** it or **edit** the unit file so that `fstrim` does not run at almost the same time across VMs.

Check the following section and modify it to spread out the `fstrim` runs.
```
[Timer]
OnCalendar=weekly
AccuracySec=1h
# Run a missed trigger after the next boot.
Persistent=true
# Delay each run by a random amount of up to 6000 seconds (100 minutes).
RandomizedDelaySec=6000
```
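
As a hedged sketch of the "edit" option, the delay can also be changed through a systemd drop-in instead of modifying the packaged unit file. The helper name `write_fstrim_dropin` and the `6h` value are assumptions chosen for illustration; the drop-in directory shown in the comments is the standard systemd location for overrides of `fstrim.timer`.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that widens the random delay of fstrim.timer.
# write_fstrim_dropin is a hypothetical helper name; the delay is an example value.
write_fstrim_dropin() {
    dropin_dir="$1"   # e.g. /etc/systemd/system/fstrim.timer.d
    delay="$2"        # e.g. 6h
    mkdir -p "$dropin_dir" || return 1
    # Only RandomizedDelaySec is overridden; the other [Timer] settings
    # keep the values from the packaged unit file.
    printf '[Timer]\nRandomizedDelaySec=%s\n' "$delay" > "$dropin_dir/override.conf"
}

# On a real VM (as root), one might then run:
#   write_fstrim_dropin /etc/systemd/system/fstrim.timer.d 6h
#   systemctl daemon-reload
#   systemctl restart fstrim.timer
#   systemctl list-timers fstrim.timer   # shows the next scheduled run
#
# To turn periodic trimming off entirely instead:
#   systemctl disable --now fstrim.timer
```

A larger `RandomizedDelaySec` spreads the weekly runs of different VMs over a wider window, which lowers the chance that many of them trim while a volume is rebuilding; it does not remove the risk the way disabling the timer does.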