vm-under-test: Adjust disk image to realtime #35

Merged
merged 6 commits into from
Nov 27, 2023

Conversation

RamLavi
Collaborator

@RamLavi RamLavi commented Nov 16, 2023

This PR introduces the configuration required for realtime.

@RamLavi RamLavi force-pushed the adjust_disak_image_to_rt branch 3 times, most recently from ea9ccbf to 61a5cbc Compare November 19, 2023 08:10
@RamLavi
Collaborator Author

RamLavi commented Nov 19, 2023

It looks like isolcpus=managed_irq,domain,1 only shows up after I did a manual soft reboot:

# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-348.rt7.130.el8.x86_64 root=UUID=a9332d7d-1762-41cd-a702-6b2cc556c248 ro console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=UUID=d4ac9572-e828-4c87-9b76-e59c9fa6e426 console=ttyS0,115200 skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 isolcpus=managed_irq,domain,1-2 intel_pstate=disable nosoftlockup nohz=on nohz_full=1-2 rcu_nocbs=1-2 irqaffinity=0 iommu=pt intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=1

@matosatti Is it necessary to reboot the image before running it?
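
A minimal sketch for checking this from inside the guest, assuming bash. The helper name `missing_kernel_args` is hypothetical and not part of this PR. It treats each expected argument as a token prefix, so `isolcpus=` matches `isolcpus=managed_irq,domain,1-2`.

```shell
# Hypothetical helper: print every expected kernel argument (given as a
# token prefix) that is absent from a kernel command line string.
# Returns 0 when nothing is missing, 1 otherwise.
missing_kernel_args() {
    local cmdline="$1" arg status=0
    shift
    for arg in "$@"; do
        case " $cmdline " in
            *" $arg"*) ;;                          # some token starts with $arg
            *) printf '%s\n' "$arg"; status=1 ;;   # report the missing prefix
        esac
    done
    return "$status"
}

# Example call inside the VM (not executed here):
#   missing_kernel_args "$(cat /proc/cmdline)" isolcpus= nohz_full= rcu_nocbs=
```

If the function prints anything before the reboot and nothing after it, the tuned profile's boot arguments were only applied by the reboot.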

@RamLavi RamLavi force-pushed the adjust_disak_image_to_rt branch 3 times, most recently from 596e51b to a5a6124 Compare November 20, 2023 09:45
@RamLavi RamLavi requested a review from orelmisan November 20, 2023 09:46
@orelmisan
Member

@RamLavi could you please share the results from manually running oslat with this container disk image?

Member

@orelmisan orelmisan left a comment

Thank you for the PR @RamLavi
Please see the inline comments.

@@ -21,3 +21,5 @@ set -e

systemctl disable NetworkManager-wait-online
systemctl disable sshd
systemctl disable irqbalance
systemctl stop irqbalance

Member

I don't think there is a point in stopping this service: when the VM boots later, it will already be disabled by the previous command.

Collaborator Author

When I run the VMI, the sshd service is active, so I'm not so sure.
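
The question in this thread comes down to two separate unit states: whether a unit is enabled (started at boot) and whether it is active (running right now). A minimal sketch for reporting both inside the booted VM, assuming bash and systemd. The `unit_state` name and the `SYSTEMCTL` override are hypothetical and only exist to keep the sketch testable.

```shell
# Command used to query systemd; overridable for testing (assumption).
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Print the boot-time (is-enabled) and runtime (is-active) state of a unit.
# Both systemctl subcommands exit non-zero for "disabled"/"inactive",
# so their exit status is deliberately ignored.
unit_state() {
    local unit="$1" enabled active
    enabled="$("$SYSTEMCTL" is-enabled "$unit" 2>/dev/null || true)"
    active="$("$SYSTEMCTL" is-active "$unit" 2>/dev/null || true)"
    printf '%s: is-enabled=%s is-active=%s\n' \
        "$unit" "${enabled:-unknown}" "${active:-unknown}"
}

# Example calls inside the VM (not executed here):
#   unit_state irqbalance
#   unit_state sshd
```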

vms/vm-under-test/scripts/customize-vm (resolved)
vms/vm-under-test/scripts/customize-vm (outdated, resolved)
@@ -27,4 +27,4 @@ systemctl stop irqbalance
dnf --enablerepo=rt install -y tuned-profiles-realtime
dnf --enablerepo=nfv install -y tuned-profiles-nfv-guest

-grubby --args="default_hugepagesz=1GB hugepagesz=1G hugepages=1" --update-kernel=$(grubby --default-kernel)
+grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=1" --update-kernel=$(grubby --default-kernel)

Member

Why is it required?

Collaborator Author

@RamLavi RamLavi Nov 20, 2023

This is how Nini configured it. It is supposed to make things more optimized on Intel nodes that support it.

These packages are needed in order to run oslat.

Signed-off-by: Ram Lavi <ralavi@redhat.com>
Adding the -e flag in order to know if the script failed during the container-disk build.

Signed-off-by: Ram Lavi <ralavi@redhat.com>
@RamLavi RamLavi force-pushed the adjust_disak_image_to_rt branch from a5a6124 to b696eaa Compare November 20, 2023 12:52
@matosatti

It looks like isolcpus=managed_irq,domain,1 only shows up after I did a manual soft reboot:

# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-348.rt7.130.el8.x86_64 root=UUID=a9332d7d-1762-41cd-a702-6b2cc556c248 ro console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=UUID=d4ac9572-e828-4c87-9b76-e59c9fa6e426 console=ttyS0,115200 skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 isolcpus=managed_irq,domain,1-2 intel_pstate=disable nosoftlockup nohz=on nohz_full=1-2 rcu_nocbs=1-2 irqaffinity=0 iommu=pt intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=1

@matosatti Is it necessary to reboot the image before running it?

After applying tuned-adm profile realtime-virtual-host? Yes.

@RamLavi
Collaborator Author

RamLavi commented Nov 21, 2023

It looks like isolcpus=managed_irq,domain,1 only shows up after I did a manual soft reboot:

# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-348.rt7.130.el8.x86_64 root=UUID=a9332d7d-1762-41cd-a702-6b2cc556c248 ro console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=UUID=d4ac9572-e828-4c87-9b76-e59c9fa6e426 console=ttyS0,115200 skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 isolcpus=managed_irq,domain,1-2 intel_pstate=disable nosoftlockup nohz=on nohz_full=1-2 rcu_nocbs=1-2 irqaffinity=0 iommu=pt intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=1

@matosatti Is it necessary to reboot the image before running it?

After applying tuned-adm profile realtime-virtual-host? Yes.

you mean realtime-virtual-guest, right?

@RamLavi RamLavi force-pushed the adjust_disak_image_to_rt branch from b696eaa to c096df2 Compare November 21, 2023 20:17
Disable the irqbalance service before starting to use tuned, since
tuned also manages the system IRQs and that might collide with
irqbalance.

Signed-off-by: Ram Lavi <ralavi@redhat.com>
@RamLavi RamLavi force-pushed the adjust_disak_image_to_rt branch 2 times, most recently from ec65d05 to 6d3e662 Compare November 22, 2023 13:39
@RamLavi
Collaborator Author

RamLavi commented Nov 22, 2023

manually checking the image:

[Checking packages]:

dnf list | grep kernel-rt
kernel-rt-core.x86_64                                  4.18.0-348.rt7.130.el8                                @baseos   

dnf list | grep tuned
tuned.noarch                                           2.21.0-1.el8                                          @baseos   
tuned-profiles-nfv-guest.noarch                        2.21.0-1.el8                                          @nfv      
tuned-profiles-realtime.noarch                         2.21.0-1.el8                                          @rt       
tuned-gtk.noarch                                       2.21.0-1.el8                                          appstream 
tuned-profiles-atomic.noarch                           2.21.0-1.el8                                          baseos    
tuned-profiles-compat.noarch                           2.21.0-1.el8                                          baseos    
tuned-profiles-cpu-partitioning.noarch                 2.21.0-1.el8                                          baseos    
tuned-profiles-mssql.noarch                            2.21.0-1.el8                                          baseos    
tuned-profiles-oracle.noarch                           2.21.0-1.el8                                          baseos    
tuned-profiles-postgresql.noarch                       2.20.0-1.el8                                          appstream 
tuned-utils.noarch                                     2.21.0-1.el8                                          appstream 
tuned-utils-systemtap.noarch                           2.21.0-1.el8                                          appstream 

dnf list | grep rt-tests
rt-tests.x86_64                                        2.6-1.el8                                             @appstream

dnf list | grep tuned-profiles-realtime
tuned-profiles-realtime.noarch                         2.21.0-1.el8                                          @rt       

dnf list | grep tuned-profiles-nfv-guest
tuned-profiles-nfv-guest.noarch                        2.21.0-1.el8                                          @nfv      

[Checking HugePages]:
HugePages_Total:       1
HugePages_Free:        1
HugePages_Rsvd:        0
HugePages_Surp:        0

[Checking cmdline]:
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-348.rt7.130.el8.x86_64 root=UUID=a9332d7d-1762-41cd-a702-6b2cc556c248 ro console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=UUID=d4ac9572-e828-4c87-9b76-e59c9fa6e426 console=ttyS0,115200 iommu=pt intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=1 skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 isolcpus=managed_irq,domain,1-2 intel_pstate=disable nosoftlockup nohz=on nohz_full=1-2 rcu_nocbs=1-2 irqaffinity=0

[Checking uname]:
4.18.0-348.rt7.130.el8.x86_64

[Checking irqbalance]:
● irqbalance.service - irqbalance daemon
   Loaded: loaded (/usr/lib/systemd/system/irqbalance.service; disabled; vendor preset: enabled)
   Active: inactive (dead)

[Checking tuned-adm]:
- realtime                    - Optimize for realtime workloads
- realtime-virtual-guest      - Optimize for realtime workloads running within a KVM guest
Current active profile: realtime-virtual-guest

[Checking swapoff]:
              total        used        free      shared  buff/cache   available
Mem:          3.6Gi       1.2Gi       2.1Gi       8.0Mi       373Mi       2.4Gi
Swap:            0B          0B          0B

[Checking oslat]:
# taskset -c 1 oslat --cpu-list 1 --rtprio 1 -D 5m -w memmove -m 4K
oslat V 2.60
Total runtime: 		300 seconds
Thread priority: 	SCHED_FIFO:1
CPU list: 		1
CPU for main thread: 	0
Workload: 		memmove
Workload mem: 		4 (KiB)
Preheat cores: 		1

Pre-heat for 1 seconds...
Test starts...
Test completed.

        Core:	 1
Counter Freq:	 2096 (Mhz)
    001 (us):	 0
    002 (us):	 3089086899
    003 (us):	 0
    004 (us):	 28
    005 (us):	 13
    006 (us):	 0
    007 (us):	 0
    008 (us):	 0
    009 (us):	 1
    010 (us):	 11
    011 (us):	 6
    012 (us):	 1
    013 (us):	 0
    014 (us):	 0
    015 (us):	 0
    016 (us):	 0
    017 (us):	 0
    018 (us):	 0
    019 (us):	 0
    020 (us):	 0
    021 (us):	 0
    022 (us):	 0
    023 (us):	 1
    024 (us):	 1
    025 (us):	 2
    026 (us):	 0
    027 (us):	 0
    028 (us):	 0
    029 (us):	 0
    030 (us):	 0
    031 (us):	 0
    032 (us):	 0 (including overflows)
     Minimum:	 1 (us)
     Average:	 2.000 (us)
     Maximum:	 24 (us)
     Max-Min:	 23 (us)
    Duration:	 299.864 (sec)

I think this hits all the marks. @matosatti @orelmisan can you take a look?
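
The maximum latency in an oslat report like the one above can also be extracted by a script rather than read by eye. A minimal sketch, assuming bash and awk and the report layout shown in this comment. The function names and the 40 us threshold in the example are hypothetical and are not values used by the checkup.

```shell
# Read an oslat report on stdin and print the largest value found on the
# "Maximum:" line (one value per measured core), in microseconds.
oslat_max_us() {
    awk -F: '
        /^[ \t]*Maximum:/ {
            n = split($2, fields, /[ \t]+/)
            max = 0
            for (i = 1; i <= n; i++)
                if (fields[i] ~ /^[0-9]+$/ && fields[i] + 0 > max)
                    max = fields[i] + 0
            print max
            exit
        }'
}

# Succeed only if the report on stdin has a maximum latency that does not
# exceed the given threshold (microseconds).
check_oslat_max() {
    local threshold_us="$1" max
    max="$(oslat_max_us)"
    [ -n "$max" ] && [ "$max" -le "$threshold_us" ]
}

# Example call (not executed here):
#   taskset -c 1 oslat --cpu-list 1 --rtprio 1 -D 5m -w memmove -m 4K | check_oslat_max 40
```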

@RamLavi
Collaborator Author

RamLavi commented Nov 22, 2023

Passed CI on a CNV 4.14 cluster:

make e2e-test
mkdir -p /home/ralavi/go/src/github.com/kiagnose/kubevirt-realtime-checkup/_go-cache
podman run --rm \
	-v /home/ralavi/go/src/github.com/kiagnose/kubevirt-realtime-checkup:/go/src/github.com/kiagnose/kubevirt-realtime-checkup:Z \
	-v /home/ralavi/go/src/github.com/kiagnose/kubevirt-realtime-checkup/_go-cache:/root/.cache/go-build:Z \
	-v /home/ralavi/.kube/sno03-cnvqe2-rdu2:/root/.kube:Z,ro \
	--workdir /go/src/github.com/kiagnose/kubevirt-realtime-checkup \
	-e KUBECONFIG=/root/.kube/kubeconfig \
	-e TEST_NAMESPACE=realtime-checkup-1 \
	-e TEST_CHECKUP_IMAGE=quay.io/ramlavi/kubevirt-realtime-checkup:devel \
	-e VM_UNDER_TEST_CONTAINER_DISK_IMAGE=quay.io/ramlavi/kubevirt-realtime-checkup-vm:latest \
	docker.io/library/golang:1.19.4-bullseye \
	go test -v ./tests/...
=== RUN   TestTests
Running Suite: Tests Suite - /go/src/github.com/kiagnose/kubevirt-realtime-checkup/tests
========================================================================================
Random Seed: 1700673951

Will run 1 of 1 specs
•

Ran 1 of 1 Specs in 112.277 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
--- PASS: TestTests (112.28s)
PASS
ok  	github.com/kiagnose/kubevirt-realtime-checkup/tests	112.287s

Member

@orelmisan orelmisan left a comment

Thank you for the changes @RamLavi

vms/vm-under-test/scripts/customize-vm (resolved)
@@ -26,4 +26,4 @@ systemctl disable irqbalance
dnf --enablerepo=rt install -y tuned-profiles-realtime
dnf --enablerepo=nfv install -y tuned-profiles-nfv-guest

-grubby --args="default_hugepagesz=1GB hugepagesz=1G hugepages=1" --update-kernel=ALL
+grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=1" --update-kernel=ALL

Member

@matosatti is this setting necessary in our case?

Member

After an offline meeting, we have decided to drop it for now.

Collaborator Author

DONE

echo isolate_managed_irq=Y >> /etc/tuned/realtime-virtual-guest-variables.conf
tuned-adm profile realtime-virtual-guest

reboot

Member

Is there a way to avoid this reboot here?
I'm afraid that it will interfere with the checkup's detection of when the VM has completed booting.

Collaborator Author

Unless we reboot manually from the checkup, I'm afraid not.

@@ -27,3 +27,6 @@ dnf --enablerepo=rt install -y tuned-profiles-realtime
dnf --enablerepo=nfv install -y tuned-profiles-nfv-guest

grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=1" --update-kernel=ALL

swapoff -a

Member

Isn't this command redundant if the swap is removed from /etc/fstab?

Collaborator Author

No, I tried.

@@ -22,6 +22,8 @@ set -e
mkdir /mnt/huge
mount /mnt/huge --source nodev -t hugetlbfs -o pagesize=1GB

systemctl mask "$(systemctl --type swap | grep '.swap' | awk '{print $1}')"

Member

Is this command needed if the swap is removed from /etc/fstab?

Collaborator Author

yes
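
The mask command under review passes the whole command substitution as a single quoted argument, so it would likely misbehave with zero or several swap units, and `grep '.swap'` matches any character before `swap`. A more defensive variant could look like the sketch below, assuming bash and systemd. The function names are hypothetical, and this is not the code that was merged.

```shell
# Read `systemctl list-units --type=swap --plain --no-legend` style output
# on stdin and print only the unit names that end in ".swap".
swap_units() {
    awk '$1 ~ /\.swap$/ { print $1 }'
}

# Mask every swap unit known to systemd, one unit per call.
# Does nothing when no swap unit exists.
mask_swap_units() {
    local unit
    systemctl list-units --type=swap --all --plain --no-legend | swap_units |
        while IFS= read -r unit; do
            systemctl mask "$unit"
        done
}

# Example call inside the first-boot script (not executed here):
#   mask_swap_units
```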

Comment on lines 22 to 23
mkdir /mnt/huge
mount /mnt/huge --source nodev -t hugetlbfs -o pagesize=1GB

Member

After an offline meeting, we have decided to drop huge pages for now.
PR #38 drops it from the VMI spec as well.

Collaborator Author

DONE

@RamLavi RamLavi force-pushed the adjust_disak_image_to_rt branch from 6d3e662 to 3915f5d Compare November 23, 2023 13:37
virt-builder does not seem to support enabling repos, so this
is done in the customize script instead.

Signed-off-by: Ram Lavi <ralavi@redhat.com>
The tuned-adm command needs a reboot in order to take effect.
In order to reboot during the virt-builder operation, the command
is run in a first-boot script.

Signed-off-by: Ram Lavi <ralavi@redhat.com>
Disable swap space, since swapping can introduce latency.
To disable it permanently [0], this commit:
- uses the swapoff command.
- comments out the swap entries in /etc/fstab.
- masks the swap unit with systemctl (done during the first boot).

[0]
https://askubuntu.com/questions/440326/how-can-i-turn-off-swap-permanently/984777#984777

Signed-off-by: Ram Lavi <ralavi@redhat.com>
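
The conversation does not show how the commit edits /etc/fstab, so the following is only a sketch of one way the "comment out the swap entries" step could be done, assuming bash and awk. The `comment_out_swap` name is hypothetical. It only touches uncommented lines whose filesystem type (third field) is `swap`.

```shell
# Comment out every active swap entry in the given fstab-style file.
# Lines that are already comments, and non-swap entries, are left as-is.
comment_out_swap() {
    local fstab="$1" tmp
    tmp="$(mktemp)"
    awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' \
        "$fstab" > "$tmp"
    cat "$tmp" > "$fstab"   # overwrite in place to keep ownership and mode
    rm -f "$tmp"
}

# Example calls, matching the steps listed in the commit message
# (not executed here):
#   swapoff -a
#   comment_out_swap /etc/fstab
```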
@RamLavi RamLavi force-pushed the adjust_disak_image_to_rt branch from 3915f5d to c688998 Compare November 23, 2023 13:37
Member

@orelmisan orelmisan left a comment

Thank you

We should strive to remove the reboot from the first-boot script in a follow-up.

@orelmisan orelmisan merged commit ffb5a74 into kiagnose:main Nov 27, 2023
6 checks passed