Time Synchronization
The Network Time Protocol, better known as NTP, helps computer systems synchronize their clocks. In the past, it was not a big issue if your system was a few minutes off. That changed with the interconnected world we now live in. A good example is a network that relies on the Kerberos authentication protocol. If your system time is not correct, you may not be able to authenticate, because granted tickets carry timestamps as a built-in protection against replay attacks. While you may not be an attacker, the system will refuse to work when it sees requests that appear to come from the past or the future.
When your local clock is not correct, serious damage can follow. Database data and log files could be incorrect, resulting in data loss at worst. For forensics, it might become very hard to reconstruct the steps that occurred during a security incident. So having your Linux systems happily synchronized is a must. Let’s have a look at how things work and how we can troubleshoot when they don’t.
History of Time
In the past, we relied on the system itself to keep time. This was done with a hardware component named the real-time clock (RTC). But no device or component is 100% reliable, so your system time could slowly drift. If the clock ran a little too fast, your computer would be living in the future; if it ran too slow, it would be living in the past. Nowadays systems are connected to other networks, which makes it possible to synchronize our time with very precise clocks. We call those atomic clocks. Instead of relying on an electronic oscillator, they measure the radiation emitted or absorbed by atoms. That time can then be shared, for example over radio waves, so other systems can synchronize to it.
Linux and Time
Most Linux systems use one of the following options to synchronize time:
- No synchronization
- NTP daemon
- NTP client
- Other clients
No Synchronization
The first option, “none”, is obvious: there is no software installed on the system to maintain the time. While this may sound like a guarantee of getting out of sync, that isn’t always the case. Virtualized systems, for example, may use the host system to get the right time. When such a system starts, it gets the time from the host and is able to maintain it correctly during uptime. There is a risk of “skewing” (getting out of sync) if the client system is not able to count the cycles correctly, e.g. when the CPU speed is adjusted. Another risk is that the host system does not always give each client the same amount of CPU time, resulting in small variations in counting.
NTP Daemon
The next option is an NTP daemon. On Linux this is typically a running process, or daemon, with the name ntpd. This process receives time from several trusted sources. When it knows with sufficient certainty what the time is, it instructs the kernel to use this new time, and usually synchronizes it with the hardware clock as well. This way the hardware clock, the Linux kernel and the NTP daemon have the same understanding of the time. When the NTP daemon sees some skewing again, it adjusts the time again.
Time adjustment usually happens in small steps. This way other software on the system doesn’t suddenly get confused. For example: it is now 4:43:52 PM and we log something to a file. Then our NTP daemon decides to set the time back 10 minutes. Three minutes later we log another line to our file, which will suddenly be stamped 4:36:52 PM. Not only is this confusing in log files, it may also corrupt data in databases and in processes relying on network synchronization.
Common daemons
- ntpd
- openntpd (OpenBSD project)
NTP Client
A much simpler option is using an NTP client. It does a similar thing to an NTP daemon, except that it does not track the time from many sources. Instead, it requests the time from a trusted source and acts upon that information. A tool like ntpdate or rdate is used this way, scheduled by a cron job to regularly check the time and synchronize.
Common clients
- ntpdate
- rdate
Other Clients
The last category covers the other clients. This option might be used on virtualized systems. A toolkit like the VMware tools is then installed on the client, which does system housekeeping in the background. It exchanges data with the host system to ensure things are in sync, including the time.
Time Troubles
As with most software, things can go wrong. Many of us rarely check whether our time sources are properly configured and still work correctly. We just assume the time is correct and the system does the synchronization correctly, right? Especially when using an NTP daemon, things can go wrong. Its configuration needs to be set up correctly and checked regularly. If not, sooner or later the time will skew and end up a few minutes off.
False-tickers
The first category of NTP troubles is the so-called “false-ticker”. Just as our own system can be incorrect, a trusted time source can be incorrect too. This can happen on purpose, through misconfiguration, or because of hardware issues. If we rely on such a source, our time will be wrong as well. If you are using the NTP daemon together with ntpq, these false-tickers can be recognized by an “x” in front of the entry.
Stratum 16
Another thing to check for is “stratum 16” entries. We refer to an atomic clock or a reference clock as stratum 0. Stratum 1 devices collect the time from a stratum 0 device, usually via radio waves (GPS, CDMA, etc). Our own systems are then usually at stratum 2 or 3. If an entry shows stratum 16, something is wrong: the system is not able to synchronize its time. This may occur when it can’t reach the source, for a reason as simple as iptables filtering too much traffic.
Unreliable Sources
The last category consists of sources which are unreliable. Because the NTP daemon receives time information from a configured set of systems, it checks them at regular intervals. It compares the data received from the sources and takes factors like distance and network delay into account. When it finds that a source provides unexpected results, the source is marked as unreliable. You can solve this by using different sources which are closer to you, or even internal. If it is already an internal network source, then something might be wrong with the device. Most likely multiple systems will mark the same system as unreliable. When using an NTP daemon (and ntpq), these items are marked with a minus (-).
Time out of sync
It is good to know that NTP daemons usually won’t synchronize in big steps, as previously described. If the time is too far off, the daemon may even stop functioning, which is on purpose. This is an indirect warning that the time should be corrected manually. The best way to handle this is to first stop all processes relying on time synchronization, then manually synchronize the time with a tool like ntpdate or rdate.
Discover Time Issues
So now we know it is important to track the time and keep it properly synchronized. Using the ntpq utility we can query the details of our time synchronization. In particular, we can see what sources are used, and any issues with them.
Figure: ntpq output when no sources can be reached, showing stratum 16
The best way to discover time synchronization issues is by monitoring the output of ntpq when using an NTP daemon. If you are using an NTP client, it makes sense to compare the local time to a trusted source and check that it does not differ too much (e.g. by more than a few seconds). You could add tests to your monitoring tool to validate your time configuration on a regular basis.
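The markers described in the sections above (“x” for a false-ticker, “-” for an unreliable source, stratum 16 for an unsynchronized entry) lend themselves to a small monitoring check. The following is only a sketch: check_ntpq_peers is a hypothetical helper name, and it assumes the usual column layout of ntpq -pn, with the tally code in the first character and the stratum in the third column.

```shell
# check_ntpq_peers is a hypothetical helper, not part of the ntp package.
# It reads `ntpq -pn` output on stdin and prints one line per finding.
# Exit status: 0 if a sync peer (*) exists and nothing was flagged, 1 otherwise.
check_ntpq_peers() {
    awk '
        $1 == "remote" || /^=+$/ { next }   # skip the two header lines
        NF < 3 { next }                     # skip blank or truncated lines
        {
            tally = substr($0, 1, 1)        # first character is the tally code
            name = $1
            if (tally != " ") name = substr($1, 2)
            if (tally == "*") synced = 1
            if (tally == "x") { print "false-ticker: " name; bad = 1 }
            if (tally == "-") { print "outlier: " name; bad = 1 }
            if ($3 == 16)     { print "stratum 16: " name; bad = 1 }
        }
        END {
            if (!synced) { print "no sync peer selected"; bad = 1 }
            exit bad + 0
        }
    '
}

# Possible use on a host running ntpd (left commented so the sketch
# can be sourced on systems without ntpq):
#   ntpq -pn | check_ntpq_peers || echo "time synchronization needs attention"
```

Because the helper only parses text, it can be tried against saved ntpq output before being wired into a monitoring tool.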
Although ntpstat shows the status as synchronised to NTP server at stratum 3, timedatectl still shows NTP synchronized: no
Resolution
Workaround
Run the ntpd daemon without the -x option:
- Edit the ntpd configuration file and remove the -x option
# vi /etc/sysconfig/ntpd
- Restart the ntpd.service
# systemctl restart ntpd.service
Root Cause
The kernel maintains an "unsynchronized" flag for the system clock. The timedatectl program will print "NTP synchronized: yes" only if this flag is cleared (set to zero). It doesn't support the protocol which ntpstat uses to query the state of ntpd.
ntpd can control the system clock using two different system functions:
- ntp_adjtime() enables a phase-locked loop implemented in the kernel (aka kernel discipline), which automatically corrects the frequency offset of the clock (drift), so it needs to be called only when a new measurement is made. It clears the "unsynchronized" flag in the kernel. The main limitation is that it cannot correct offsets larger than 0.5 seconds.
- adjtime() is an older method, which makes a one-time adjustment of the clock (slew). It doesn't correct the frequency offset, so it needs to be called frequently to compensate for it, even when no measurement is made. It cannot clear the "unsynchronized" flag in the kernel, but it can correct any offset.
ntpd can use ntp_adjtime() or adjtime(), but not both at the same time. By default it uses ntp_adjtime(). If the step threshold is set to a larger value than 0.5 seconds (e.g. by enabling the -x option), it has to switch to adjtime(), because ntp_adjtime() does not work with larger offsets.
That means the kernel "unsynchronized" status will not be cleared and timedatectl will report "NTP synchronized: no" when ntpd is started with the -x option.
How to verify that timedatectl does not show the correct status:
# timedatectl | grep NTP
NTP enabled: yes
NTP synchronized: no
# ntpstat
synchronised to NTP server (10.0.0.1) at stratum 3
time correct to within 1026 ms
Problem: ntpd in "slew" mode (ntpd -x). With ntpd running in "slew" mode, timedatectl always shows "NTP synchronized: no" while ntpstat correctly shows "synchronised to NTP server (10.11.160.238) at stratum 2".
Version-Release number of selected component (if applicable): RHEL7.6 systemd-219-62.el7.x86_64
How reproducible: remove the chrony package, install the ntp package, configure ntpd to run in "slew" mode, wait some time for it to synchronize, then compare timedatectl with ntpstat.
Steps to Reproduce:
1. yum remove chrony
2. yum install ntp
3. vi /etc/sysconfig/ntpd and add -x at the end of the OPTIONS line -> OPTIONS="-g -x"
4. systemctl enable ntpd.service
5. systemctl start ntpd.service / or / timedatectl set-ntp 1
6. *wait some time to be synchronized*
7. ntpstat
8. timedatectl | grep NTP
Actual results:
# timedatectl | grep NTP
NTP enabled: yes
NTP synchronized: no
# ntpstat
synchronised to NTP server (10.11.160.238) at stratum 2
time correct to within 10 ms
polling server every 64 s
# ps -wauxxx | grep ntp
ntp 4343 0.0 0.0 29944 2140 ? Ss 11:33 0:00 /usr/sbin/ntpd -u ntp:ntp -g -x
Expected results:
# timedatectl | grep NTP
NTP enabled: yes
NTP synchronized: yes
In short: ntpd running with the -x option doesn't tell the kernel that the clock is synchronized, and timedated only uses the information returned by the kernel. It doesn't talk to ntpd or chronyd.
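To find out quickly whether a host is affected, one can look for the -x flag in the OPTIONS line of /etc/sysconfig/ntpd. The sketch below is an illustration only: ntpd_uses_slew_option is a hypothetical helper name, and it assumes the OPTIONS value is unquoted or double-quoted, as in the example OPTIONS="-g -x".

```shell
# ntpd_uses_slew_option is a hypothetical helper, not part of the ntp package.
# It reads the sysconfig file contents on stdin and succeeds (exit 0) when an
# uncommented OPTIONS= line contains a standalone -x flag.
ntpd_uses_slew_option() {
    grep -E '^[[:space:]]*OPTIONS=' |
        grep -Eq '(^|[[:space:]"=])-x([[:space:]"]|$)'
}

# Possible use (left commented so the sketch runs on any system):
#   if ntpd_uses_slew_option < /etc/sysconfig/ntpd; then
#       echo "ntpd runs with -x: expect 'NTP synchronized: no' from timedatectl"
#   fi
```

Commented-out OPTIONS lines are ignored, so a disabled "-x" setting is not reported.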
How to check NTP is working?
- ntpq – standard NTP query program
- ntpstat – show network time synchronisation status
- timedatectl – show or set info about ntp using systemd
Let us see all commands and examples in detail.
Verify NTP is working or not with ntpstat command
The ntpstat command will report the synchronisation state of the NTP daemon running on the local machine. If the local system is found to be synchronised to a reference time source, ntpstat will report the approximate time accuracy.
exit status of ntpstat command
You can use the exit status (return values) to verify its operations from a shell script or command line itself:
- exit status 0 – Clock is synchronised.
- exit status 1 – Clock is not synchronised.
- exit status 2 – Clock state is indeterminate, for example if ntpd is not contactable.
Type the command as follows:
$ ntpstat
Sample outputs:
synchronised to NTP server (149.20.54.20) at stratum 3
time correct to within 42 ms
polling server every 1024 s
Use the echo command to display the exit status of the ntpstat command:
$ echo $?
Sample outputs:
0
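In a shell script, the three exit statuses listed above can be turned into readable messages. This is a minimal sketch; describe_ntpstat_status is a hypothetical function name chosen for illustration.

```shell
# describe_ntpstat_status is a hypothetical helper; it maps an ntpstat
# exit status (passed as the first argument) to a short message.
describe_ntpstat_status() {
    case "$1" in
        0) echo "clock is synchronised" ;;
        1) echo "clock is not synchronised" ;;
        2) echo "clock state is indeterminate (is ntpd running?)" ;;
        *) echo "unexpected exit status: $1" ;;
    esac
}

# Possible use on a host with ntpstat installed (left commented so the
# sketch runs anywhere):
#   ntpstat >/dev/null 2>&1
#   describe_ntpstat_status "$?"
```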
Checking the status of NTP with ntpq command
The ntpq utility program is used to monitor the operations of the NTP daemon ntpd and determine its performance. The program can be run either in interactive mode or controlled using command line arguments. Type the following command on your Linux or Unix-based system:
$ ntpq -pn
OR
$ ntpq -p
Sample outputs:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*dione.cbane.org 204.123.2.5      2 u  509 1024  377   51.661   -3.343   0.279
+ns1.your-site.c 132.236.56.252   3 u  899 1024  377   48.395    2.047   1.006
+ntp.yoinks.net  129.7.1.66       2 u  930 1024  377    0.693    1.035   0.241
 LOCAL(0)        .LOCL.          10 l   45   64  377    0.000    0.000   0.001
The * marks the source you are synchronized to (syspeer). The above is an example of a working NTP client. Where,
- -p : Print a list of the peers known to the server as well as a summary of their state.
- -n : Output all host addresses in dotted-quad numeric format rather than converting to the canonical host names.
Another reliable check is to run the following command:
$ ntpq -c rv
Look for the leap code in the output. Leap code 0 (leap_none) means a normal synchronized state, and leap code 3 (leap_alarm) means NTP was never synchronized.
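The leap check can be scripted by searching the ntpq -c rv output for the two keywords mentioned above. This is a sketch under assumptions: leap_state is a hypothetical helper name, and the function relies only on the leap_none and leap_alarm keywords appearing in the output; any other leap keyword is reported as unknown.

```shell
# leap_state is a hypothetical helper, not part of the ntp package.
# It reads `ntpq -c rv` output on stdin and prints a verdict.
# Exit status: 0 = leap_none, 1 = leap_alarm, 2 = neither keyword found.
leap_state() {
    input=$(cat)
    case "$input" in
        *leap_alarm*) echo "never synchronized"; return 1 ;;
        *leap_none*)  echo "synchronized"; return 0 ;;
        *)            echo "unknown"; return 2 ;;
    esac
}

# Possible use on a host running ntpd (left commented so the sketch
# runs anywhere):
#   ntpq -c rv | leap_state
```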
A note about timedatectl command
If you are using a systemd-based system, run the following command to check the service status:
# timedatectl status
Is my NTP (systemd-timesyncd) Working?
systemd-timesyncd configuration
If NTP enabled is set to no, try configuring it by editing the /etc/systemd/timesyncd.conf file as follows:
# vi /etc/systemd/timesyncd.conf
In the [Time] section, add time servers or change the provided ones: uncomment the relevant line and list their host names or IP addresses separated by spaces. The default from a Debian 8.x server is shown here; newer systemd releases name this setting NTP= rather than Servers=:
[Time]
Servers=0.debian.pool.ntp.org 1.debian.pool.ntp.org 2.debian.pool.ntp.org 3.debian.pool.ntp.org
Save and close the file. Finally, start and enable the service, then check its status:
# timedatectl set-ntp true
# timedatectl status
Sample outputs:
Local time: Mon 2019-09-30 18:25:38 IST
Universal time: Mon 2019-09-30 12:55:38 UTC
RTC time: Mon 2019-09-30 12:55:38
Time zone: Asia/Kolkata (IST, +0530)
System clock synchronized: yes
systemd-timesyncd.service active: yes
RTC in local TZ: no
The above is an easy way to verify that NTP is working on Linux.
Only time when chrony sets time
When the chrony service starts, some settings in the /etc/chrony/chrony.conf file tell it to actually set the time if specific conditions occur:
# Force system clock correction at boot time.
makestep 1000 10
which means that if chrony detects during the first 10 measurements after its start that the time is off by more than 1000 seconds it will set the clock.
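The rule can be written down as a small decision function. This only illustrates the condition described above and is not chrony's own code: would_step is a hypothetical name, its defaults mirror the makestep 1000 10 line, and "updates so far" is assumed to count the clock updates since chronyd started.

```shell
# would_step OFFSET_SECONDS UPDATES_SO_FAR [THRESHOLD] [LIMIT]
# Hypothetical helper illustrating the makestep rule: print "step" when the
# absolute offset exceeds THRESHOLD seconds and no more than LIMIT updates
# have happened since start; print "slew" otherwise.
would_step() {
    awk -v off="$1" -v n="$2" -v thr="${3:-1000}" -v lim="${4:-10}" 'BEGIN {
        if (off < 0) off = -off            # compare the absolute offset
        if (off > thr && n <= lim) print "step"; else print "slew"
    }'
}

# Examples:
#   would_step 1500 3     # large offset shortly after start
#   would_step 1500 11    # same offset, but past the first 10 updates
```

Outside the startup window, chrony corrects the clock gradually, which is why a large offset that appears later is not fixed by an immediate jump.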
Some useful commands
Below are some useful commands which can be used for the troubleshooting of chrony related issues.
# chronyc tracking
# chronyc sources
# chronyc sourcestats
# systemctl status chronyd
# chronyc activity
# timedatectl
Check chronyd status
To check the status of the chronyd daemon:
# systemctl status -l chronyd
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2016-08-12 13:22:22 IST; 1s ago
  Process: 33263 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
  Process: 33259 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 33261 (chronyd)
   CGroup: /system.slice/chronyd.service
           └─33261 /usr/sbin/chronyd
Aug 12 13:22:22 NVMBD1S11BKPMED03 systemd[1]: Starting NTP client/server...
Aug 12 13:22:22 NVMBD1S11BKPMED03 chronyd[33261]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH)
Aug 12 13:22:22 NVMBD1S11BKPMED03 chronyd[33261]: Frequency 0.000 +/- 1000000.000 ppm read from /var/lib/chrony/drift
Aug 12 13:22:22 NVMBD1S11BKPMED03 systemd[1]: Started NTP client/server.
The chronyc sources command
Running chronyc sources -v shows the current state of the NTP servers configured in the system. Here is an example output, in which ntp.example.com shows as a valid server which is online:
# chronyc sources -v
210 Number of sources = 1
  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current synced, '+' = OK for sync, '?' = unreachable,
| /   'x' = time may be in error, '~' = time is too variable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||                                                /   xxxx = adjusted offset,
||         Log2(Polling interval) -.             |    yyyy = measured offset,
||                                               |    zzzz = estimated error.
||                                   |           |
MS Name/IP address         Stratum Poll LastRx Last sample
============================================================================
^* ntp.example.com               3    6    40    +31us[  -98us] +/-  118ms
Note that a Source state other than ‘*’ (or ‘+’, which marks a source that is OK for sync) usually indicates a problem with the NTP server.
Source state ‘~’ means that the time is too variable
If the Source state is ‘~’, it probably means that the server is accessible but the time is too variable. This can happen if the server responds too slowly, or responds sometimes slower and sometimes faster. You could check the response time of pings to the server to see if they are slow or variable. This state has also been noticed when the server is running on virtual machines which are too slow, causing timing issues.
Chrony check and restart every hour
Once an hour, a cron job checks the output of the chronyc sources -v command by running the script /usr/sbin/palladion_chrony_healthcheck, which runs /usr/sbin/palladion_check_chrony and checks its return value:
- if /usr/sbin/palladion_check_chrony returns 1 – there was no online source (no source with Source state = ‘*’), so chrony is restarted in an attempt to re-initialize the server status
- if /usr/sbin/palladion_check_chrony returns 0 – everything is ok; chrony does not need to be restarted because it already has a valid online source
# cat /etc/cron.d/chrony
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
#
# Check chrony every hour and restart if necessary.
#
16 * * * * root /usr/sbin/palladion_chrony_healthcheck
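The contents of the palladion scripts are not shown here, but the kind of test they are described as performing (is there a source with state ‘*’?) can be sketched in a few lines. has_synced_source is a hypothetical helper that follows the same return convention (0 = a synced source exists, 1 = none); it assumes that source lines begin with a mode character (^, = or #) followed by the state character.

```shell
# has_synced_source is a hypothetical helper, not one of the palladion scripts.
# It reads `chronyc sources` output on stdin.
# Exit status: 0 if some source line has '*' as its state, 1 otherwise.
has_synced_source() {
    awk '
        substr($0, 1, 1) ~ /[=#^]/ && substr($0, 2, 1) == "*" { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Possible use (left commented so the sketch runs without chrony installed):
#   chronyc sources | has_synced_source || echo "no synced source"
```

Header, legend and separator lines are ignored because their second character is never ‘*’.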
Chrony logs
There are several chrony logs that can be used to troubleshoot. Most of them are located in /var/log/chrony/. Note that the most recent file is not always the *.log one; sometimes the *.log.2 or *.log.3 files are more recent. Here is an example that lists the files sorted by modification time, most recent first:
# ls -lisaht /var/log/chrony/
total 1.5M
3801115 580K -rw-r--r-- 1 root root 574K Oct 21 14:56 measurements.log.3
3801131 544K -rw-r--r-- 1 root root 540K Oct 21 14:56 statistics.log.3
3801166 356K -rw-r--r-- 1 root root 350K Oct 21 14:56 tracking.log.3
3801089 4.0K drwxr-xr-x 16 root root 4.0K Oct 21 00:01 ..
3801114 4.0K drwxr-xr-x 2 root root 4.0K Oct 21 00:01 .
3801128 0 -rw-r--r-- 1 root root 0 Oct 21 00:01 tracking.log
3801110 0 -rw-r--r-- 1 root root 0 Oct 21 00:01 measurements.log
3801120 0 -rw-r--r-- 1 root root 0 Oct 21 00:01 statistics.log
3801167 0 -rw-r--r-- 1 root root 0 Oct 20 00:01 tracking.log.1
3801165 0 -rw-r--r-- 1 root root 0 Oct 20 00:01 statistics.log.1
3801159 0 -rw-r--r-- 1 root root 0 Oct 20 00:01 measurements.log.1
............
Try setting only one NTP server by entering its IP address
If until now you have been using two or more NTP servers (either because several were configured or because you entered an FQDN that resolves to several IP addresses), try setting a single NTP server by entering only one IP address. This may solve your NTP related issue.
Tracing the communication with the NTP server
To double check whether the NTP server is answering, it is possible to trace the traffic between chrony and the NTP server for a period of time while monitoring the server:
1. Start a pcap trace with tcpdump on the NTP port 123 and leave it running until the issue appears (run it in ‘screen’ or with ‘nohup’ to prevent it from being stopped if you disconnect from the shell).
2. As soon as the issue re-appears, get a System Diagnostics covering the entire history from the moment you set the server to a DNS name until the gap reoccurred. If this produces a file that is too big, just get the System Diagnostics for Current data and in addition copy all the files from /var/log/chrony/, and all files called /var/log/syslog*. Remember to stop the trace you started at step 1.
Commands list:
timedatectl
systemctl start ntpd
systemctl enable ntpd
systemctl disable chronyd
timedatectl set-ntp on
timedatectl set-ntp true
systemctl restart ntpd.service
systemctl enable ntpd.service
chkconfig ntpd on
timedatectl set-ntp 1
vi /etc/sysconfig/ntpd
ntpstat
ntpdate ntp.1.com
ps -aux | grep ntp
ntp 4343 0.0 0.0 29944 2140 ? Ss 11:33 0:00 /usr/sbin/ntpd -u ntp:ntp -g -x
# timedatectl | grep NTP