NTP Basics
The NTP reference implementation, ntpd, has been designed to query the time from one or more configured reference time sources, synchronize its own system time to those reference time sources, and at the same time work as an NTP server to make its own synchronized system time available to other NTP clients on the network.
The program was originally written for Unix-like systems, but has also been ported to Windows. Meinberg makes a pre-compiled NTP package for Windows available to simplify installation on Windows.
Unlike some other programs for time synchronization which simply step the system time at periodic intervals, ntpd runs in the background and continuously adjusts its own system time as accurately as possible, in a way that is not even noticeable to applications.
It even measures and compensates for its own system clock drift, so that a significant time offset cannot even arise during continuous operation of the service.
The NTP software package includes a couple of executable programs. ntpd is the NTP daemon (the NTP service on Windows) that runs in the background, and ntpq is the most important command line tool for checking the status of that daemon/service.
Startup Behavior
Right after startup, ntpd sets its internal status to stratum 16 and leap bits 11 to indicate that it is not yet synchronized.
This status information is also put into the packets that are sent out to the network to synchronize clients, so NTP clients know the server is alive but is not yet able to provide an accurate time.
Large initial time offsets are only accepted and quickly corrected immediately after startup. See chapter Handling Large Time Offsets for the reasons.
Polling And Accepting Time Sources
It's a policy of ntpd that any reference time source is only accepted if the time source claims to be synchronized, no matter whether the time source is a so-called hardware refclock, e.g. a GPS clock, long wave receiver, etc., or another NTP server on the network:
- If the time source is a GPS receiver then the receiver needs to be synchronized to the signals from the GPS satellites.
- If the time source is another (so-called “upstream”) NTP server then that server needs to be synchronized to some time source, too, which can in turn be a GPS receiver, or some other upstream NTP server(s).
So a time source is only considered reachable, and may only be accepted by an NTP client, if it provides a time and claims to be synchronized.
As mentioned above, a server instance of ntpd that has just been started will not immediately be accepted by any client.
Only after the server has been able to synchronize to its own configured time source(s) does it start claiming to be synchronized, and thus may be accepted by NTP clients.
Each configured reference time source is polled at regular intervals. Polling means that the time and status are queried from the time source.
Polling Delays, Jitter And Accuracy
Depending on the type of time source, it takes some time until a submitted request arrives at its destination, and similarly it takes some time until a reply is received. Specifically, queries across the network require sending a request packet to a server, and waiting for a reply packet from the server. See this article for details.
The time accuracy of the client is not affected by the absolute magnitude of the polling delays, as long as the delays are exactly the same for requests and replies. However, if the delay for requests is always shorter than for replies, or vice versa, e.g. on an ADSL network connection with different upload and download speeds, the client cannot detect this automatically, and the asymmetry results in a systematic time error at the client side that depends on the ratio of the request and reply delays. This means that the real time offset may be e.g. a few milliseconds, even though the time offset computed by the client is reported as “0”.
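For reference, the standard NTP on-wire calculation derives the offset and round-trip delay from the four time stamps of a polling action, where T1 is the client's transmit time, T2 the server's receive time, T3 the server's transmit time, and T4 the client's receive time:

  offset = ((T2 - T1) + (T3 - T4)) / 2
  delay  = (T4 - T1) - (T3 - T2)

The offset formula is only exact if request and reply take the same time on the wire; any delay asymmetry shows up as an offset error of half the difference between the two path delays.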
In real life, network packet transport generally suffers from a mean network delay caused by routers and switches, and from variations of that delay between subsequent pollings, if individual packets are delayed more or less than others.
If ntpd queries the time from a GPS clock and/or PPS source then the jitter is usually at the microsecond level.
The network jitter on a LAN between different NTP nodes can be some tens of microseconds, and over a WAN connection it may be even milliseconds.
So a small time offset from a GPS receiver can quickly be identified, but if several pollings over a WAN yield different time offsets, it is not clear whether this is really a time offset, or whether it just looks like a time offset because a network packet has been queued in a switch or router.
So jitter reduces the accuracy of the computed time offset, and some filtering is required to reduce the jitter as much as possible.
This is why ntpd always evaluates the results from several polling actions over several polling intervals before it starts adjusting its own system time.
ntpd has very powerful adaptive filters to determine the mean packet delay, the network jitter, its own time offset, and how the time offset evolves, i.e. how much its own system time drifts.
Classification of Time Sources
Each time source that is reachable (i.e., replies to queries, and is synchronized) is basically accepted by ntpd.
However, if several reference time sources have been configured and are reachable, then the time provided by each source, its jitter, etc. are evaluated to find a majority of time sources that provide the same, accurate time, and to select the “best” source of this group as the so-called system peer. See:
- NTP docs: Mitigation Rules and the prefer Keyword
https://www.meinbergglobal.com/download/ntp/docs/html/prefer.html
Please note that the prefer keyword has only limited effect on the selection process. Usually the best approach is to simply let ntpd select the system peer.
The selected system peer is marked with a * (or an o in case of a PPS refclock) in the output of the command ntpq -p.
If there are several good reference time sources available then other so-called survivors of the selection process are also potential system peers, so they are called candidates, which are marked with a + in the output of the command ntpq -p.
If there are other time sources which provide a time that differs from the survivors' time, these time sources are called falsetickers, which are marked with a -.
System Time Correction Value
The correction value for ntpd's own system time is derived from the weighted time offsets of the system peer and the candidates. While this is a good approach for pure network clients, it unfortunately also means that the system time accuracy can be degraded, for example if the main time source is a GPS refclock. The GPS clock provides very accurate and precise time, but if there are additional reference time sources configured which are accessed via the network, those time sources can become candidates and thus may contribute to the weighted clock adjustment. Since network time sources usually provide significantly less accuracy and precision than a GPS refclock, this makes the system time adjustment worse than it could be if the GPS clock alone was used as reference time source.
Multiple Time Sources
The described behavior suggests that it's good practice to configure either only a single time source, so its time is always accepted, or more than 2 time sources, so that the selection algorithm can always determine a majority of good time sources. Due to the way the selection algorithm works, there are certain quantities of time sources to be configured which yield the best results.
Specifically, please note that configuring exactly 2 reference time sources is the worst you can do: if both time sources provide a slightly different time then the client ntpd is unable to determine which one provides the “right” time, and may finally ignore both time sources.
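As an illustration, a client configuration with four independent time sources could look roughly like the sketch below. The host names are just placeholders, not a recommendation of specific servers; the iburst option only speeds up the initial synchronization.

  # /etc/ntp.conf (sketch): 4 time sources, so a majority can always be found
  driftfile /var/lib/ntp/ntp.drift   # path varies by distribution

  server ntp1.example.com iburst
  server ntp2.example.com iburst
  server ntp3.example.com iburst
  server ntp4.example.com iburst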
Polling Intervals
The polling interval is adjusted automatically by ntpd, depending on the stability of the local system clock.
The default range for the polling interval is 2^6 s = 64 s up to 2^10 s = 1024 s.
Normally it's a good idea to let ntpd itself adjust the polling intervals for its time sources.
However, there are a few specific cases where this should be limited by configuration, e.g. for the Windows port of ntpd, where the polling interval should be fixed to 2^6 s = 64 s.
Also for refclocks it may make sense to specify a fixed polling interval, depending on the refclock type.
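Limiting the polling interval is done with the minpoll and maxpoll options of the server directive, which take the poll exponent as a power of 2. A sketch for fixing the interval to 64 s, with a placeholder host name:

  # fix the polling interval to 2^6 s = 64 s for this server
  server ntp1.example.com iburst minpoll 6 maxpoll 6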
Of course, if a time source is polled at a large interval it takes longer until an unexpected time step is detected than with a short polling interval, but it takes a couple of minutes anyway until such a step is eventually accepted.
On the other hand, if polling intervals shorter than 2^6 s = 64 s are used with public NTP servers, the operators of those public servers usually consider this abusive, so it should be avoided.
System Time Adjustment
Once a time source has been selected as system peer, ntpd starts to adjust its own system time and changes its leap bits from 11 to 00.
However, if a leap second is announced for the end of the current UTC day, the leap bits become 01 in case of a positive leap second (insertion), or 10 in case of a negative leap second (deletion).
The latter has never happened so far. All leap bit combinations except 11 indicate that ntpd is synchronized.
Stratum Numbers
A synchronized ntpd also changes its stratum number to the stratum of its system peer, plus 1.
So if the system peer is a hardware refclock, then the reference time source has an internal stratum 0, so this instance of ntpd becomes a stratum 1 NTP server on the network.
A client that selects this server as its system peer becomes itself a stratum 2 server, a client of that stratum 2 server becomes a stratum 3 server, and so on.
Unlike the stratum number used in the context of telecom applications, which defines a specific accuracy class, the stratum number used with NTP just indicates a hierarchy level, but does not guarantee a specific accuracy.
If clients receiving NTP response packets from this ntpd server see that the leap bits from this server are not 11, and the stratum is not 16, they start accepting the time from this server in the same way as described in the chapter Polling And Accepting Time Sources.
A control loop evaluates the determined time offset and clock drift, and applies corrections to ntpd's own system time.
Adjusting Small Time Offsets
As long as the system time offset determined by the filter algorithm is below a certain limit (the so-called step threshold, 128 ms by default), the system time is adjusted slowly and smoothly in a way that both the time offset and the system clock drift become as small as possible, so that a new significant time offset doesn't even accumulate. See also:
- NTP docs: Step and Stepout Thresholds
https://www.meinbergglobal.com/download/ntp/docs/html/clock.html#step
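If an application really requires a different step threshold, it can be changed with the tinker directive in ntp.conf; normally the default should be left alone. A sketch:

  # set the step threshold, in seconds (0.128 s is the default anyway)
  tinker step 0.128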
So during normal operation, the system time is adjusted in a way that applications don't even notice the small corrections. However, how quickly and accurately a time offset is determined and compensated depends on the jitter seen by the filter algorithms from the last polling actions.
The rate for the system time adjustment is limited to 500 ppm, i.e. 500 microseconds per second, or 1.8 seconds per hour, which is usually sufficient for real applications.
Handling Large Time Offsets
If a large time offset is observed which exceeds the step threshold then the system time has to be stepped to correct this.
Normally this should only happen once, immediately after ntpd has started, if the system time is not yet very accurate, so only in this case is the system time stepped quickly to get the time offset below the step threshold limit and continue with the smooth adjustment.
However, if the system time has already been accurately disciplined, but afterwards a system time offset is detected that exceeds the step threshold, ntpd waits for the so-called stepout interval (300 s by default since ntpd 4.2.8, 900 s by default up to ntpd 4.2.6) to see if the large time offset persists, and then checks whether the time offset is above or below the so-called panic threshold (1000 s by default). See:
- NTP docs: Panic Threshold
https://www.meinbergglobal.com/download/ntp/docs/html/clock.html#panic
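The stepout interval and the panic threshold can also be changed with the tinker directive; this is sometimes done on virtual machines where large steps are expected, but it should be used with care. A sketch with the default values:

  # stepout interval and panic threshold, in seconds (defaults shown)
  tinker stepout 300
  tinker panic 1000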
If a large time offset occurs while 'ntpd' is already running, this can be due to one of the following reasons:
- An operator has changed the system time. This requires admin rights, so it's the operator's own problem if he thinks he has to mess up the timekeeping.
- Another time synchronization software is running. It's never a good idea to have more than one program running in parallel to discipline the system time, so all programs but a single one should be disabled.
- The system timekeeping is broken. There have been cases where the time on a Windows server lost more than 30 seconds whenever a huge database application ran some maintenance tasks at night. A program like ntpd is unable to compensate for this, so the bad programs should be fixed instead.
So if the system time offset still exceeds the panic threshold after the stepout interval, ntpd terminates itself with a message saying something like “set clock manually”.
The reason behind this behavior is that ntpd assumes: “The system time has been changed. That must have been done by the administrator, who should know what he's doing. So I can't do anything else and terminate myself.”
So a huge time offset that exceeds the panic threshold is accepted only once, at startup, if ntpd is started with the '-g' option (which is usually the case).
Stepping The System Time
Whenever ntpd steps the system time, all filter values from previous polls are discarded, and the control loop starts over from scratch.
Depending on the logging options configured for ntpd, a “time reset” message may be written to the operating system's logging utility whenever ntpd steps the system time.
Depending on the type and version of the operating system, there may also be a log message from the operating system whenever the system time is stepped by some application.
Don't Change The System Time While 'ntpd' Is Running
As mentioned earlier, ntpd has not been designed to immediately correct large time steps that suddenly occur during operation.
If the system time is changed by some other program, or by the administrator (“to see if NTP really works”), while ntpd is already running, the system time is not corrected as quickly by ntpd as some folks might expect.
The reason is that the control loop used by ntpd to accurately adjust the system time is totally messed up if someone else fiddles with the system time. In any case it takes a few minutes (the stepout interval) until the time step is accepted, the system time is stepped, and ntpd has to start polling/filtering from scratch.
Eventually ntpd even terminates itself, if the offset is too large.
For details see the chapter Handling Large Time Offsets.
The next section shows a better way to monitor the performance of ntpd.
Checking ntpd's Time Adjustment Performance
A good way to check that ntpd is working properly is to run the command ntpq -p periodically.
The output contains a table with one line of status information for each configured time source.
The table has the following columns:
remote | The (possibly truncated) host name or IP address of the time source. |
---|---|
refid | An informational indicator telling where this time source gets its time from. Can be a 4 character string, an IPv4 address, or the hash of an IPv6 address displayed like an IPv4 address. |
st | The stratum of the time source, which can be 16 if the source is not reachable or not synchronized. |
t | The type of the time source, e.g. l for a local hardware refclock, or u for an upstream NTP server accessed via unicast data packets. |
when | The time elapsed since the last poll event. When when reaches the value of poll, the next polling action occurs. |
poll | The current polling interval, in seconds. |
reach | An octal display of the reach status. Whenever a polling event is successful, i.e. the time source is accessible and synchronized, a logic 1 bit is shifted in from the right, else a logic 0 bit. So right after startup the reach value is 0, and after each successful polling it increases: 1, 3, 7, 17, 37, etc., up to 377, which means the last 8 pollings were successful. During continuous operation the reach value stays at 377 for a time source that is continuously reachable. |
delay | The mean packet delay, in milliseconds. This is the mean time required to send a request to the time source and receive the reply from that source. |
offset | The mean time offset, in milliseconds. |
jitter | The time jitter, in milliseconds. This indicates how much packet delays from individual pollings vary from the mean packet delay. |
Please note that the delay, offset, and jitter are all computed from the same four time stamps provided by each polling action, so they are related to each other, and all values settle as the control loop which adjusts the system time settles.
Next are some examples of the output of the ntpq -p command, run on a Linux workstation with a built-in Meinberg GPS PCI card.
The time source labeled SHM(0) with refid .shm0. represents a hardware refclock, where the time from the GPS PCI card is fed into ntpd's shared memory driver.
lt-martin is a GPS-controlled Meinberg LANTIME NTP server on the local network.
The three ptbtime nodes are NTP servers accessed via the internet.
The example below shows the result of an ntpq -p command immediately after ntpd was started:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 SHM(0)          .shm0.           0 l    -    8    0    0.000    0.000   0.000
 lt-martin.py.me .INIT.          16 u    -   64    0    0.000    0.000   0.000
 ptbtime1.ptb.de .INIT.          16 u    -   64    0    0.000    0.000   0.000
 ptbtime2.ptb.de .INIT.          16 u    -   64    0    0.000    0.000   0.000
 ptbtime3.ptb.de .INIT.          16 u    -   64    0    0.000    0.000   0.000
The reach column for all time sources is 0, so none of the time sources has been polled yet.
For the upstream NTP servers this is also indicated by a stratum value of 16, and a refid reading .INIT..
Also, no line has an asterisk * mark at the beginning, so there is no system peer yet, and thus ntpd has a status saying it is not synchronized.
A short time later the output has changed:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*SHM(0)          .shm0.           0 l    1    8   37    0.000   -0.221   0.121
+lt-martin.py.me .MRS.            1 u   21   64    1    0.097   -0.116   0.036
 ptbtime1.ptb.de .INIT.          16 u   32   64    0    0.000    0.000   0.000
+ptbtime2.ptb.de .PTB.            1 u   19   64    1  186.367  -87.007  43.283
+ptbtime3.ptb.de .PTB.            1 u   20   64    1  192.954  -90.638  24.156
Now some of the sources have already been polled, and the GPS PCI card is marked as the system peer with a * at the beginning of the line.
Some other sources are considered candidates for the system peer and are thus marked with a +.
Again some time later the control loop has settled:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*SHM(0)          .shm0.           0 l    7    8  377    0.000    0.002   0.003
+lt-martin.py.me .MRS.            1 u   60   64  377    0.080   -0.004   0.015
+ptbtime1.ptb.de .PTB.            1 u   38   64  377   11.665    0.021  29.236
-ptbtime2.ptb.de .PTB.            1 u   60   64  377   12.184    0.312 103.407
-ptbtime3.ptb.de .PTB.            1 u   56   64  377   12.257    0.342  81.159
The GPS PCI card is still the system peer, and shows only 2 microseconds offset and 3 microseconds jitter, and it stays continuously at this level.
lt-martin on the LAN currently shows 4 microseconds offset, and 15 microseconds jitter.
ptbtime1 on the internet has 21 microseconds offset, which is not much if you take into account that the jitter is 29 milliseconds (!).
Also the other ptbtime servers show only about 300 microseconds time offset, even though their jitter is even higher.
Anyway, they are classified as falsetickers since they are worse than the other time sources.
The GPS PCI card as system peer, as well as the candidates lt-martin and ptbtime1, are used to adjust the system time.
As mentioned above, the jitter from the upstream NTP servers, which is much higher than the jitter from the GPS time source, makes the time adjustment worse than it could be.
A quick test shows that the results are better if we use only the GPS PCI card as a real time source, and append the keyword noselect to the configuration lines for the upstream NTP servers.
The keyword noselect tells ntpd to poll a time source as usual, but not to consider it a valid time source to which it can synchronize.
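With the sketch configuration from above, the server lines might then look like this (again only an illustration, with the LANTIME's full host name omitted):

  # other sources are polled for monitoring only, never selected
  server lt-martin iburst noselect
  server ptbtime1.ptb.de iburst noselect
  server ptbtime2.ptb.de iburst noselect
  server ptbtime3.ptb.de iburst noselect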
So in the next example the GPS PCI card connected via the SHM driver is the only real time source, and the other sources are only monitored:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*SHM(0)          .shm0.           0 l    1    8  377    0.000    0.001   0.000
 lt-martin.py.me .MRS.            1 u   23   64  377    0.091   -0.021   0.016
 ptbtime1.ptb.de .PTB.            1 u   21   64  377   12.200    0.289 135.419
 ptbtime2.ptb.de .PTB.            1 u   10   64  377   12.808    0.083 241.332
 ptbtime3.ptb.de .PTB.            1 u   24   64  377   12.196    0.411 174.997
We can see here that the offset and jitter from the GPS PCI card are smaller than in the original configuration with the additional upstream servers, but the drawback here is that the other sources are really only monitored, so they can't become candidates or even system peer in case the GPS card fails. So there is more accuracy but no redundancy with this configuration.
Please note that the jitter for the NTP servers on the WAN is even higher here than before. This is coincidental, just because the network connection is currently very busy.
Debugging Large Time Offsets
Normally, ntpd steps the system time at most once, shortly after it was started, to compensate for an initial large time offset.
If there are periodic “time reset” events then this may either be due to an excessive clock drift, or because there's another program that also fiddles with the system time, continuously or only periodically.
So in any case it is helpful to know whether the system time offset increases slowly and continuously for some reason, or whether it is continuously low for some time and then suddenly becomes large.
Excessive Clock Drift
In some virtualization environments, or with a bad operating system or drivers where e.g. timer ticks get lost, the undisciplined system time might drift so much that ntpd is unable to compensate for the drift.
Thus the time offset quickly increases and exceeds the step threshold, so that after the stepout interval the system time is set correctly, and the game starts over again.
The only possible fix is to find out what causes the excessive clock drift, and fix this, which can be very hard to do.
Possibly there is another piece of software running that also continuously applies corrections to the system time, and thus works against ntpd.
This can be any other time synchronization software, not only an NTP client.
Sudden Huge Time Steps
If the system time is corrected whenever ntpd is (re-)started, or the time offset is constantly low over a certain interval and then suddenly becomes large, then probably the system time has been set by a user with administrator privileges, or by some other application running with sufficient privileges.
For example, if the system is a virtual machine then the VM may have been configured such that the system time is periodically adjusted by the virtualization system itself.
In VMware there is a “VMware Tools” configuration parameter Time Sync that should be set to Off if ntpd is running in the virtual machine.
If this parameter is set to On then the time in the VM is periodically set to the time of the physical host, causing a time offset that can be small or huge, depending on how well or badly the time in the virtualization system on the physical host is synchronized.
So again, to fix a problem like this you have to find out who or what sets the system time.
Detecting On Windows Who Has Set The System Time
Unless the Windows version is very old, the Windows kernel writes a log entry to the system event log whenever the system time is changed, but of course there are no log entries if the system time is only adjusted smoothly.
The Windows event viewer application can be used to inspect such log entries. If you open the properties of such an event and look at the “details” page then you can find the numeric process ID of the process that has changed the system time.
To find out which process has that specific process ID, you can for example open a PowerShell command line window and type the command Get-Process, which prints a current list of processes with names and IDs.
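For example, to look up a particular process ID found in the event details, something like the following PowerShell commands can be used (the PID 1234 is just a placeholder):

  # list all processes with their IDs
  Get-Process | Sort-Object Id

  # look up one specific process ID
  Get-Process -Id 1234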
So whenever ntpd had to set the system time, you will find an associated system log entry with the process ID of ntpd mentioned in the event details.
Similarly, if another process has set the system time you can identify that process. For example, if the time in a VM is periodically set by the VMware Tools then the process ID may belong to a process named vmwared, so you know you have to change the parameter Time Sync in the virtual machine settings and set it to Off.
Please keep in mind that a new instance of a program is assigned a new process ID, so if a service is restarted, the new instance of the service has a different process ID that may not match the process ID found in older system events.
Also, if a program runs, sets the system time, and then terminates, it will not be shown in the process list anymore after it has terminated.
Redundancy And Safety
The NTP reference implementation (ntpd) uses a different approach to redundancy than is usually known from other server setups.
It is not possible to configure a “master” and a “slave” time source, expecting the client to use only the master and to switch to the slave only when the master becomes unavailable.
However, as explained earlier, you can simply configure several time sources at the client, so the client ntpd itself checks all servers periodically, and selects the ones to use.
If one of the configured reference time sources becomes unreachable, this time source is automatically discarded by the selection algorithm.
Specifically, if the system peer becomes unreachable then simply a new system peer is selected from the remaining candidates, as long as at least one candidate is available.
Since the system clock adjustment has been derived from the previous system peer and the candidates, switching can be done very smoothly.
Also, if the time provided by a specific source starts to drift away from the time provided by other sources, the drifting time source becomes a falseticker and is also discarded. So even if a GPS clock is spoofed by some bad guys, this can be detected and the GPS clock can be discarded and outvoted, as long as there are other time sources available which provide and agree on the right time.
So this provides a high level of built-in redundancy and safety of operation.
Holdover Behavior And Root Dispersion
A special case is when one or more configured time sources have been reachable for some time, and then suddenly all time sources become unreachable. This may mean that e.g. the antenna has been disconnected from a GPS receiver used as the single reference time source, or another NTP server on the network used as the single reference time source has been shut down (powered off), or the network connection to the remote server(s) is broken.
In this case ntpd normally does not change its leap bits back to 11, and does not change its stratum back to 16.
Instead, it keeps the stratum value it had before, and just starts to increase its so-called root dispersion value over time.
This state is called holdover mode.
The root dispersion can be interpreted as a very coarse estimate of how much the local time has drifted away from some reference time.
Normally it increases at a constant rate, but it is reset to a low value whenever the time could be queried successfully from a reference time source.
The value to which the root dispersion is reset depends on the precision of the reference time source.
Anyway, in holdover mode there are no more successful queries to a reference time source, so the root dispersion keeps increasing continuously over time.
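The current root dispersion of an ntpd instance can be checked locally with ntpq's readvar command; the rootdisp field in its output is given in milliseconds:

  ntpq -c rv

In holdover mode this value can be seen growing from one invocation to the next.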
The root dispersion is also put into the NTP packets sent to clients, so a client can see that the root dispersion is increasing and thus that the time of the server has started drifting, and each client can itself decide what to do:
- If the client has another time source configured which is not drifting, it can switch to a better time source and discard the drifting NTP server.
- If the client has no other time source configured then it can keep accepting the drifting server anyway, so all clients of that server will at least keep the same time. If clients immediately discarded this server even though they had no other time source available, this would be even worse, since the times on the different clients would start to drift apart.
LANTIME NTP Server In Holdover With Trust Time
Generally, ntpd disciplines its own system time as long as its time sources are accepted, and starts sending its free-wheeling system time when all configured time sources have become unreachable.
On the other hand, if a refclock (e.g. a GPS receiver) provides a good, stable oscillator which is disciplined during normal operation, this oscillator usually drifts much less than e.g. the cheap crystal on an embedded microprocessor system or on a PC's mainboard.
So in this case it usually makes sense to let ntpd accept the GPS receiver for quite some time even after GPS reception has failed.
The parse refclock driver (driver 8) from the NTP software package, which is used for Meinberg GPS receivers, supports the concept of a trust time. Please note that only the parse refclock driver supports this; other refclock drivers which might be used for different GPS receivers (e.g. NMEA) don't support this.
The trust time interval starts when GPS reception suddenly fails, and only after the trust time has expired does ntpd notice that the GPS receiver has failed and is unsynchronized. So ntpd discards the GPS time source only after the trust time interval.
This feature provides a stable time for a much longer holdover interval than the free-wheeling clock of an embedded microprocessor board, or a standard PC.
The trust time interval needs to be determined according to the quality of the oscillator, and the time offset due to clock drift that is acceptable after reception has failed, which is a requirement of the specific application.
For example, if the acceptable drift is 10 milliseconds the trust time interval can be much longer than if the acceptable drift is only 100 microseconds.
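As a rough, purely illustrative calculation, assume the disciplined oscillator drifts by about 10 nanoseconds per second (1·10^-8) in holdover:

  acceptable offset 10 ms:   0.010 s  / 1e-8 = 1,000,000 s  (about 11.6 days)
  acceptable offset 100 µs:  0.0001 s / 1e-8 =    10,000 s  (about 2.8 hours)

The real drift rate of a specific oscillator has to be taken from its data sheet or measured, so these numbers are only meant to show how the trust time scales with the drift rate and the acceptable offset.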
On Meinberg LANTIME devices the trust time interval as well as the stratum number in holdover mode can be configured via the web interface.
Should An NTP Client Generally Discard A Server In Holdover Mode?
A basic question is why a client should stop accepting that server if there is no alternate time source available.
Usually, the time on a client drifts much more if the client stops synchronizing to a dedicated NTP server, since the server provides a much more stable time even when in holdover.
So in most cases a better approach is to let clients still accept the time from a stable time source, but generate an alert e.g. if the time of the server starts drifting. For example, if you configure 10 days trust time on a Meinberg LANTIME, then the LANTIME can send a notification (e.g. log message, email, SNMP trap, …) when GPS reception fails, but can still provide a pretty accurate time to its clients during the trust time / holdover interval. So there's plenty of time for investigation, and to fix the reception problem.
Compatibility With Dumb NTP Clients
Described above is the default behavior of the NTP reference implementation in its client and server roles. However, other clients, specifically simple SNTP clients, may behave differently.
There are SNTP implementations out there which only look at the stratum value received from the NTP server, and expect the stratum to change back to 16 if the time sources of the server aren't synchronized anymore.
With some specific configuration you can force this behavior for the NTP server, e.g. if you configure orphan mode or the so-called local clock as a fallback time source, with a stratum of 15.
In this case the server ntpd discards its time source when it becomes unreachable, and switches to the configured substitute time source which has stratum 15, and thus becomes stratum 15 plus 1, i.e. stratum 16.
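A fallback configuration along these lines might look like the following sketch; only one of the two variants should be used, and the details depend on the ntpd version:

  # Variant 1: orphan mode, acting as a stratum-15 fallback source
  tos orphan 15

  # Variant 2: the local clock driver as a stratum-15 fallback source
  server 127.127.1.0
  fudge  127.127.1.0 stratum 15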
So in special cases the trust time can be set to a very short interval only, so that the stratum changes quickly to 16, as mentioned above. However, as explained before, the basic question is whether this is the best approach for the application.
LANTIME Clustering Feature
Meinberg LANTIME NTP servers provide a clustering feature which is an extension of the standard NTP functionality.
If there are simple NTP clients which don't provide the powerful functionality of ntpd, but rely on a time source which is always available, then 2 or more LANTIMEs can be configured as a cluster which shares an additional, common cluster IP address.
Only one of the LANTIMEs uses this IP address to provide NTP services. However, the LANTIME devices monitor each other, and if the active LANTIME fails, another one becomes the active device and starts servicing NTP requests via the shared cluster IP address.
So a client which synchronizes to the cluster IP address doesn't even notice if one device fails since the service is taken over by another device.
LANTIME Accuracy After Power Cycle
If a LANTIME is powered off, the time is only kept in a battery-buffered RTC chip, and after power-up the initial time is read from that RTC chip. Unfortunately the high quality oscillator, which often even includes an oven (OCXO), requires much more power than can be provided by a small backup battery, and thus accuracy is lost after power cycling.
This also means that after power cycling the GPS receiver claims to be not synchronized, and ntpd also has its stratum set to 16 and its leap bits to 11, so clients don't accept the ntpd running on the LANTIME as a time source after power cycling until the GPS receiver has synchronized to the satellites again, so that ntpd can accept it as a time source and synchronize to the GPS receiver.
— Martin Burnicki martin.burnicki@meinberg.de, last updated 2022-08-25