I’ve been trying to get my new GitLab instance working at home, and am having issues getting the frontend and backend nodes to work together.
# sudo gitlab-rake gitlab:gitaly:check
Checking Gitaly ...
Gitaly: ... default ... FAIL: 7:permission denied. debug_error_string:{"created":"@1638634177.607305857", "description":"Error received from peer ipv4:192.168.0.7:8075","file":"src/core/lib/surface/call.cc", "file_line":1055,"grpc_message":"permission denied","grpc_status":7}
After double and triple checking everything, I stumbled over this in the GitLab docs
Ensure the Gitaly clients and servers are synchronized, and use an NTP time server to keep them synchronized.
And so commenced a journey down the NTP rabbit hole.
yup, it’s broken
$ ssh rails date
Sat 4 Dec 16:49:20 GMT 2021
$ ssh gitaly date
Sat 4 Dec 16:07:07 GMT 2021
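To put a number on the skew rather than eyeballing two date strings, a small helper can compare epoch seconds from each host. This is only a sketch: `skew_seconds` is a name made up for this post, and it assumes both machines are reachable over SSH under the `rails` and `gitaly` aliases used above.

```shell
# Print the absolute difference, in seconds, between two epoch timestamps.
skew_seconds() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))
    fi
    echo "$diff"
}

# Usage, assuming the SSH host aliases "rails" and "gitaly":
#   skew_seconds "$(ssh rails date +%s)" "$(ssh gitaly date +%s)"
```

For the two readings above (16:49:20 and 16:07:07) the gap is 2533 seconds, a little over 42 minutes, give or take the delay between the two SSH calls.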
Having not had to worry about this before, I found it hard to figure out how it’s supposed to work. Googling suggested this was the right package:
# rpm -qa | grep ntp
ntpdate-4.2.6p5-29.el7.centos.2.x86_64
But, it’s not active
# systemctl list-units --all | grep ntp
● ntpd.service       not-found inactive dead ntpd.service
  ntpdate.service    loaded    inactive dead Set time via NTP
● sntp.service       not-found inactive dead sntp.service
And my working systems all looked the same
systemd to the rescue
It does get a bit tedious to find systemd doing new stuff all the time, particularly when it’s broken.
The userspace tool is timedatectl:
gitaly # timedatectl
      Local time: Sat 2021-12-04 15:58:45 GMT
  Universal time: Sat 2021-12-04 15:58:45 UTC
        RTC time: Sat 2021-12-04 15:58:44
       Time zone: Europe/London (GMT, +0000)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: yes
      DST active: no
 Last DST change: DST ended at
                  Sun 2021-10-31 01:59:59 BST
                  Sun 2021-10-31 01:00:00 GMT
 Next DST change: DST begins (the clock jumps one hour forward) at
                  Sun 2022-03-27 00:59:59 GMT
                  Sun 2022-03-27 02:00:00 BST

rails # timedatectl
      Local time: Sat 2021-12-04 16:58:49 GMT
  Universal time: Sat 2021-12-04 16:58:49 UTC
        RTC time: Sat 2021-12-04 16:14:14
       Time zone: Europe/London (GMT, +0000)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: yes
      DST active: no
 Last DST change: DST ended at
                  Sun 2021-10-31 01:59:59 BST
                  Sun 2021-10-31 01:00:00 GMT
 Next DST change: DST begins (the clock jumps one hour forward) at
                  Sun 2022-03-27 00:59:59 GMT
                  Sun 2022-03-27 02:00:00 BST
And there’s a service
# systemctl status systemd-timedated.service
● systemd-timedated.service - Time & Date Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-timedated.service; static; vendor preset: disabled)
   Active: active (running) since Sat 2021-12-04 17:09:48 GMT; 37s ago
     Docs: man:systemd-timedated.service(8)
           man:localtime(5)
           http://www.freedesktop.org/wiki/Software/systemd/timedated
 Main PID: 24502 (systemd-timedat)
   CGroup: /system.slice/systemd-timedated.service
           └─24502 /usr/lib/systemd/systemd-timedated

Dec 04 17:09:48 gander systemd[1]: Starting Time & Date Service...
Dec 04 17:09:48 gander systemd[1]: Started Time & Date Service.
The Rails server is showing as “NTP synchronized: no”, and the system clock is not set the same as the real time clock.
- Calling ‘timedatectl set-ntp true’ didn’t fix that
- Calling ‘timedatectl --adjust-system-clock’ didn’t reset the system clock from the RTC, as the man page seemed to imply it would.
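The “NTP synchronized” line is the one worth watching, and it is easy to check from a script. A minimal sketch follows; `clock_is_synced` is a made-up helper name, and it accepts both the label this CentOS 7 box prints and the “System clock synchronized” label that newer systemd releases use.

```shell
# Read `timedatectl` output on stdin and succeed only if it reports
# the clock as synchronized. Both the older and the newer label are accepted.
clock_is_synced() {
    grep -Eq '^[[:space:]]*(NTP synchronized|System clock synchronized):[[:space:]]*yes'
}

# Usage:
#   timedatectl | clock_is_synced || echo "clock not synchronized" >&2
#   ssh rails timedatectl | clock_is_synced
```

Reading from stdin keeps the check usable against a remote host or a saved copy of the output.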
face palm
The man page for timedatectl is a bit unclear.
--adjust-system-clock
    If set-local-rtc is invoked and this option is passed, the system clock is synchronized from the RTC again, taking the new setting into account. Otherwise, the RTC is synchronized from the system clock.

set-local-rtc [BOOL]
    Takes a boolean argument. If "0", the system is configured to maintain the RTC in universal time. If "1", it will maintain the RTC in local time instead. Note that maintaining the RTC in the local timezone is not fully supported and will create various problems with time zone changes and daylight saving adjustments. If at all possible, keep the RTC in UTC mode. Note that invoking this will also synchronize the RTC from the system clock, unless --adjust-system-clock is passed (see above). This command will change the 3rd line of /etc/adjtime, as documented in hwclock(8).

So --adjust-system-clock only has an effect alongside set-local-rtc. No warning is generated when timedatectl is called with the option but no command. I think it just prints the status.
I wanted to avoid resetting the RTC, because it was more correct. At the bottom of the status output, it was also stating:
Warning: The system is configured to read the RTC time in the local time zone. This mode can not be fully supported. It will create various problems with time zone changes and daylight saving time adjustments. The RTC time is never updated, it relies on external facilities to maintain it. If at all possible, use RTC in UTC by calling 'timedatectl set-local-rtc 0'.
So I tried it.
Wrong. It’s skewed the RTC off by an hour.
# timedatectl status
      Local time: Sat 2021-12-04 18:01:05 GMT
  Universal time: Sat 2021-12-04 18:01:05 UTC
        RTC time: Sat 2021-12-04 18:01:03
       Time zone: Europe/London (GMT, +0000)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: no
Regardless of the warnings, it didn’t fix the problem, so it’s going back.
# timedatectl set-local-rtc 1
# timedatectl set-time "2021-12-04 17:24:30"
Failed to set time: Automatic time synchronization is enabled
# timedatectl set-ntp false
# timedatectl set-time "2021-12-04 17:24:30"
# timedatectl set-ntp true
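Since set-time refuses to run while NTP is enabled, the off/set/on dance can be wrapped up so that NTP is switched back on even if setting the time fails. This is a sketch only: `set_time_manually` and the `TIMEDATECTL` override are names invented here, the latter so the sequence can be dry-run with `echo`.

```shell
# Set the clock by hand on a host with NTP enabled: switch NTP off,
# set the time, then switch NTP back on even if set-time failed.
# TIMEDATECTL is a hypothetical override for dry runs (e.g. TIMEDATECTL=echo).
set_time_manually() {
    cmd="${TIMEDATECTL:-timedatectl}"
    "$cmd" set-ntp false || return 1
    "$cmd" set-time "$1"
    status=$?
    "$cmd" set-ntp true
    return "$status"
}

# Usage (as root):
#   set_time_manually "2021-12-04 17:24:30"
```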
Now the time’s right, though it’s drifting already, so there’s a problem to fix there.
chronyd
Rooting around in the timedatectl man page, I saw a mention of chronyd, and sure enough:
# systemctl status chronyd.service
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2021-12-04 17:24:49 GMT; 4min 50s ago
     Docs: man:chronyd(8)
           man:chrony.conf(5)
  Process: 24823 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
  Process: 24820 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 24822 (chronyd)
   CGroup: /system.slice/chronyd.service
           └─24822 /usr/sbin/chronyd
Seems to be configured using /etc/chrony.conf
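chrony also ships a client, chronyc, which reports what the daemon thinks is going on. I didn’t capture its output at the time, so the following is a sketch rather than a record of what I ran: `chrony_leap_status` is a made-up helper that pulls the “Leap status” field out of `chronyc tracking`, which should read “Normal” once chronyd has synchronized.

```shell
# Read `chronyc tracking` output on stdin and print the "Leap status" value.
chrony_leap_status() {
    sed -n 's/^Leap status[[:space:]]*:[[:space:]]*//p'
}

# Usage:
#   chronyc tracking | chrony_leap_status
#   chronyc sources -v    # list the configured time sources and their state
```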
on clock sources
(Update +2hrs) I suspect a problem tracking the passage of time. I vaguely recall having similar issues with VMs on this hardware. (An old HP microserver with AMD CPU).
There’s a good public Red Hat knowledge base article.
# cat /sys/devices/system/clocksource/*/current_clocksource
tsc
# cat /sys/devices/system/clocksource/*/available_clocksource
tsc hpet acpi_pm
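The same sysfs files can be used to try a different clock source at runtime, before committing anything to the boot configuration. A minimal sketch, assuming the directory is named `clocksource0` (the usual name, but check yours); `clocksource_available` is a helper invented for this post.

```shell
# Succeed if clock source $1 appears in the space-separated list $2.
clocksource_available() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (as root; the change lasts only until the next reboot):
#   dir=/sys/devices/system/clocksource/clocksource0
#   if clocksource_available hpet "$(cat "$dir/available_clocksource")"; then
#       echo hpet > "$dir/current_clocksource"
#   fi
```

Checking the list first avoids writing a name the kernel doesn’t offer on this hardware.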
I added clocksource=hpet to GRUB_CMDLINE_LINUX in /etc/default/grub. The steer from the articles was that “On modern hardware (ACPI power management timer) should be used as a clock source of last resort only.”
# check this exists. RHEL will be 'redhat', YMMV on other related distros
# /boot/grub2/grub.cfg for non-EFI
grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
After rebooting, the clock looked OK initially, and there was nothing worrying in the logs. I also set the RTC to UTC as requested by timedatectl, and this time it didn’t get confused and pick the wrong UTC.
egrep -i 'time|tfc|tsc|hpet' /var/log/dmesg
timedatectl status
timedatectl set-local-rtc 0
And GitLab is working now.
to be continued ..
I think there’ll be more before I’m done.