Time Synchronization Accuracy With NTP

Of course, the time accuracy you can achieve on the client side depends on the accuracy of the available reference time source, but it also depends strongly on a number of preconditions on the client side, e.g. the operating system type (Windows, Linux, …) and the particular version of that OS, the CPU type (good TSC support, or not), the chipset on the mainboard, and the quality of the oscillator on the mainboard, which determines the general stability of the system time. In addition, there are hardware limitations introduced by the design of the system buses, e.g. PCI or USB, interrupt latencies, etc.

Linux, FreeBSD, etc. provide a good programming interface to read the current system time accurately, and to apply adjustments to the system time properly. Windows is much worse in this respect.

There are Intel x86-compatible CPU types providing reliable TSC counters which can be used to measure, and thus at least partially compensate, some latencies. On other CPU types the TSCs of different CPU cores may not be synchronized to each other, or the TSC clock frequency changes when the CPU clock changes due to power saving. Non-Intel CPUs may not even provide a high resolution counter similar to the TSC, so with these CPU types it's much harder to achieve accurate time synchronization.

The frequency of the oscillator on the particular mainboard has an individual mean offset from its nominal frequency, and in addition the frequency varies more or less with changes of the ambient temperature inside or outside the PC housing. This can be due to changes in the CPU load causing more or less power consumption and associated heat inside the PC housing, air conditioning cycling on and off in a server room, or just the daily temperature variations between day and night.

Since the time drift depends on the frequency variations, the control loop of good time synchronization software tries to determine and compensate the frequency error of the oscillator, and steers the system clock so that a time offset doesn't even develop. However, how well this works depends on the factors mentioned above.
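To get a feeling for why frequency compensation matters, the following minimal sketch converts a constant frequency error into accumulated time drift. The function name and the 10 ppm sample value are assumptions chosen for illustration, not measurements from this article.

```python
SECONDS_PER_DAY = 86_400


def drift_seconds(freq_error_ppm: float, elapsed_seconds: float) -> float:
    """Time error accumulated by a clock with a constant frequency error."""
    # 1 ppm means the clock gains or loses 1 microsecond per second.
    return freq_error_ppm * 1e-6 * elapsed_seconds


# Hypothetical example: an uncompensated oscillator running 10 ppm fast.
per_day = drift_seconds(10.0, SECONDS_PER_DAY)
print(f"10 ppm over one day: {per_day:.3f} s")
```

Even a small uncompensated frequency error therefore adds up to a large time offset within hours, which is why the control loop steers the frequency and not only the time.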

Whenever a network packet is sent from an application running on one node to another application running on a different node, the packet has to traverse a path which includes quite a number of unknown delays or latencies:

Network Latencies Graph

All the yellow lines indicate unknown processing delays. If the packet has to go through a switch, but the output port is currently busy with a different packet (red lines), then the packet is delayed. Also, different link speeds on the two switch ports involved cause an asymmetry of the packet delay, which results in a time offset. Only the cable delays (green lines) are really constant.

Hardware time stamping of network packets can be used to measure and thus compensate the variable packet delays. However, this is mainly used by the PTP protocol and requires hardware support for time stamping specific packets, and it has to be done on every network port (blue flashes). Unfortunately there are only switches available which support this for PTP, but there are no switches which support this for NTP. On the other hand, even if hardware time stamping of NTP packets were supported, the current version of the NTP protocol doesn't even provide a proper way to forward the measured delays.

The reference implementation of the NTP protocol, ntpd, can nevertheless achieve pretty good accuracy with standard hardware (no time stamping support) due to its built-in filter algorithms.

However, there are still possible time offset errors due to network asymmetries.

Assuming a time source provides accurate time, the accuracy you can achieve when synchronizing your system time on specific client hardware depends mostly on the implementation of the software on the system to be synchronized. The only things an NTP server has to do when it receives an NTP request packet from a client are to:

  • Add a time stamp to the packet when the request packet comes in
  • Possibly check whether some restrictions apply, and prepare the packet as a reply to the client
  • Add another time stamp to the reply packet immediately before the packet is sent

On the other hand, the task of an NTP client is to:

  • Prepare a request packet and add a time stamp of its own system time immediately before the packet is sent to the server
  • Wait for the reply packet from the server, and get another time stamp of its own system time as soon as the packet arrives
  • Evaluate the 4 time stamps associated with the reply packet to estimate the packet delay and determine its own time offset from the server
  • Adjust its own system time so that the time offset is minimized
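The evaluation of the 4 time stamps can be sketched as follows, using the standard NTP on-wire calculation. The type and function names as well as the sample values are assumptions made for this illustration; a positive offset means the server clock is ahead of the client clock.

```python
from typing import NamedTuple, Tuple


class Exchange(NamedTuple):
    t1: float  # client transmit time, read from the client clock
    t2: float  # server receive time, read from the server clock
    t3: float  # server transmit time, read from the server clock
    t4: float  # client receive time, read from the client clock


def offset_and_delay(x: Exchange) -> Tuple[float, float]:
    """Return (estimated clock offset, round-trip network delay) in seconds."""
    offset = ((x.t2 - x.t1) + (x.t3 - x.t4)) / 2.0
    delay = (x.t4 - x.t1) - (x.t3 - x.t2)
    return offset, delay


# Hypothetical exchange: the client is 0.5 s behind the server, each
# direction takes 10 ms, and the server needs 1 ms to process the request.
sample = Exchange(t1=100.000, t2=100.510, t3=100.511, t4=100.021)
offset, delay = offset_and_delay(sample)
print(f"offset = {offset:.3f} s, delay = {delay:.3f} s")
```

Note that the offset estimate is only exact if the delay is the same in both directions, because the calculation implicitly splits the round-trip delay in half.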

So obviously it's the task of the client software to determine and compensate the packet delays on the network, to filter out outliers where necessary if a request and/or reply packet has been delayed more than usual, and to adjust its own system time more or less accurately. The server can't do this; it can only provide accurate time.

A good client implementation like ntpd evaluates the time stamps from a series of subsequent packet exchanges, tries to filter out network delays and jitter, and takes great care to adjust the system time as smoothly as possible, so that applications don't notice the adjustments.
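One ingredient of such filtering can be sketched as follows: among the most recent samples, prefer the one with the smallest round-trip delay, since that packet exchange was least likely to have been held up in a queue. This is a strongly simplified illustration of the idea, not ntpd's actual implementation; the class name, the window of 8 samples, and the sample values are assumptions.

```python
from collections import deque
from typing import Deque, Optional, Tuple

Sample = Tuple[float, float]  # (offset in seconds, round-trip delay in seconds)


class MinDelayFilter:
    """Keep the last few samples and report the offset of the lowest-delay one."""

    def __init__(self, window: int = 8) -> None:
        # A bounded deque silently drops the oldest sample when it is full.
        self._samples: Deque[Sample] = deque(maxlen=window)

    def add(self, offset: float, delay: float) -> None:
        self._samples.append((offset, delay))

    def best_offset(self) -> Optional[float]:
        if not self._samples:
            return None
        offset, _delay = min(self._samples, key=lambda s: s[1])
        return offset


flt = MinDelayFilter()
flt.add(0.0021, 0.0102)
flt.add(0.0019, 0.0100)
flt.add(0.0150, 0.0360)  # hypothetical outlier: packet delayed in a busy switch
flt.add(0.0020, 0.0098)
print(flt.best_offset())  # offset of the sample with the smallest delay
```

The outlier with the large delay is ignored even though its offset differs strongly from the others, which is the behavior a simple "use the latest reply" client lacks.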

A simple client implementation could just pick up the time from the server reply and set the system time whenever it has queried the time from a server. This is what some SNTP (Simple NTP) implementations do, and this obviously yields less accuracy than a more sophisticated implementation.
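The following sketch illustrates how little such a simple client has to do: it builds a minimal client request and extracts the server's transmit time stamp from a reply. It deliberately neither sends anything over the network nor sets the system time, and the function names are assumptions made for this example. Using the transmit time stamp directly ignores the packet delay completely, which is one reason for the reduced accuracy.

```python
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_EPOCH_DELTA = 2_208_988_800


def build_request() -> bytes:
    """Build a minimal 48-byte client request (LI = 0, VN = 4, Mode = 3)."""
    first_byte = (0 << 6) | (4 << 3) | 3
    return bytes([first_byte]) + bytes(47)


def transmit_time_unix(reply: bytes) -> float:
    """Extract the server transmit time stamp as seconds since the Unix epoch."""
    if len(reply) < 48:
        raise ValueError("NTP packet too short")
    # Bytes 40..47: 32-bit seconds and 32-bit binary fraction, big-endian.
    seconds, fraction = struct.unpack("!II", reply[40:48])
    return seconds - NTP_UNIX_EPOCH_DELTA + fraction / 2**32


# Synthetic reply for illustration: transmit time 1000.5 s after the Unix epoch.
fake_reply = bytes(40) + struct.pack("!II", NTP_UNIX_EPOCH_DELTA + 1000, 2**31)
print(transmit_time_unix(fake_reply))
```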

Normally the time synchronization software shows the time offset it has determined. The precision and accuracy of this time offset after the system time control loop has settled depend on many conditions: the accuracy and precision provided by the time source (NTP server, GPS receiver, or whatever), the way the time source is accessed (via the network, via USB, via PCI, via 1 PPS signal, …), the type and version of the client operating system (Linux, Windows, …), the client hardware (CPU, chipset, physical or virtual machine), and in case of a VM the type, version, and configuration of the virtualization software (VMware, XEN, Hyper-V, …).

For any of the combinations above the computed time offset may be correct, or be more or less off due to some systematic errors. E.g., an asymmetric network connection (such as an ADSL line) can cause a real time offset of a few milliseconds which the client software can't determine, and thus such systematic offsets can't be compensated. In case of a 1 PPS hardware signal a time offset error of up to a few tens of microseconds can be caused by interrupt latency, depending on the system characteristics.
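The size of the error caused by an asymmetric path follows from the way the offset is computed: the client can only measure the round-trip delay and has to assume that each direction takes half of it, so the error is half of the difference between the two one-way delays. A minimal sketch, with a function name and delay values that are purely illustrative assumptions:

```python
def asymmetry_error(forward_delay: float, return_delay: float) -> float:
    """Systematic offset error, in seconds, caused by unequal one-way delays.

    forward_delay: client -> server delay, return_delay: server -> client delay.
    A positive result means the computed offset is too large by that amount.
    """
    return (forward_delay - return_delay) / 2.0


# Hypothetical ADSL-like link: 12 ms towards the server, 4 ms back.
error = asymmetry_error(0.012, 0.004)
print(f"undetectable offset error: {error * 1000:.1f} ms")
```

Since the round-trip delay of 16 ms looks the same to the client whether the path is symmetric or not, no amount of filtering on the client side can reveal this error.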

The example output below is from an ntpq -p command on a Linux machine synchronized with a Meinberg GPS180PEX PCI card as time reference listed as SHM(0):

     remote      refid     st t when poll reach   delay   offset  jitter
========================================================================
*SHM(0)          .shm0.     0 l    3   16  377    0.000   -0.000   0.001
 lt-martin.py.me .MRS.      1 u   15   64  377    0.067   -0.004   0.014
 ntp-master-1.py .PPS.      1 u    7   64  377    0.201   -0.010   0.151
 ptbtime1.ptb.de .PTB.      1 u  109  128  377   12.249    0.023   0.235
 ptbtime2.ptb.de .PTB.      1 u   35  128  377   11.738    0.247   8.509
 ptbtime3.ptb.de .PTB.      1 u   16  128  377   11.889    0.266   8.491
 host-24-56-178- .ACTS.     1 u  148  128  376  146.136   -0.959   0.149

All sources except SHM(0) have been configured with the noselect keyword, so ntpd only polls them and computes the estimated time offset, but doesn't use them for synchronization.

lt-martin and ntp-master-1 are GPS-controlled Meinberg LANTIME devices on the same subnet, where lt-martin is closer (fewer switches in the path) and thus shows less jitter and a slightly smaller offset than ntp-master-1.

The 3 PTB servers and the host with refid ACTS are only reachable across the WAN/internet, but the determined time offset is still below 1 ms even though the packet delay to the latter is 146 ms. These are quite good results for a setup without special switches and routers. However, this level of accuracy can't be guaranteed since it depends on a number of factors as explained above.
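The reach column in the output above is an 8-bit shift register displayed in octal: on every poll it is shifted left by one bit, and the lowest bit is set when a reply has been received. The following sketch decodes such a value; the function name is an assumption made for this example.

```python
from typing import List


def decode_reach(reach_octal: str) -> List[bool]:
    """Decode the ntpq 'reach' value; index 0 is the most recent poll."""
    value = int(reach_octal, 8)
    return [bool((value >> bit) & 1) for bit in range(8)]


for reach in ("377", "376"):
    history = decode_reach(reach)
    print(reach, "->", sum(history), "of the last 8 polls answered,",
          "latest poll", "answered" if history[0] else "not answered")
```

A value of 377 thus means that all of the last 8 polls have been answered, while 376, as shown for the last server in the list, means that the most recent poll has not been answered.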

Here is a graph of the loopstats file recorded over a few hours:

Loopstats Linux with GPS PCI card - Detailed

Though the graph above looks very nice, it has been generated from only a small section of a loopstats file which has been recorded over about 4 days. The full graph is shown below, and in the frequency offset curve you can see the cyclic daily frequency changes due to cyclic daily temperature changes, resulting in a varying time offset which ntpd tries to compensate:

Loopstats Linux with GPS PCI card - 4 Days

This shows how the stability / temperature sensitivity of the client affects the accuracy of the system time.
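To produce such graphs yourself, the loopstats file has to be parsed first. The sketch below assumes the field layout described in the ntpd monitoring documentation (Modified Julian Day, seconds past UTC midnight, time offset in seconds, frequency offset in ppm, RMS jitter in seconds, Allan deviation in ppm, clock discipline time constant). The names and the sample line are made up for illustration and are not taken from the recordings shown here.

```python
from typing import NamedTuple


class LoopstatsEntry(NamedTuple):
    mjd: int              # Modified Julian Day
    seconds: float        # seconds past UTC midnight
    offset_s: float       # time offset in seconds
    freq_ppm: float       # frequency offset in ppm
    jitter_s: float       # RMS jitter in seconds
    wander_ppm: float     # Allan deviation in ppm
    time_constant: int    # clock discipline time constant


def parse_loopstats_line(line: str) -> LoopstatsEntry:
    """Parse one whitespace-separated loopstats record."""
    f = line.split()
    if len(f) < 7:
        raise ValueError(f"unexpected loopstats line: {line!r}")
    return LoopstatsEntry(int(f[0]), float(f[1]), float(f[2]), float(f[3]),
                          float(f[4]), float(f[5]), int(f[6]))


# Hypothetical sample line, not taken from a real recording.
entry = parse_loopstats_line(
    "58577 43200.123 0.000012345 -15.250 0.000004321 0.001234 6")
print(f"offset {entry.offset_s * 1e6:.1f} us, frequency {entry.freq_ppm} ppm")
```

Plotting freq_ppm over time is what makes the daily temperature cycles visible.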

Time synchronization under Windows in general can be pretty tricky, mostly due to limitations in individual Windows versions. An overview of time synchronization support in Windows can be found in the article NTP And Windows History.

Specifically with ntpd the control loop seems to become less stable if the polling interval increases above 7 (128 s). This can be seen in the graph below. Whenever the poll number increases, the time offset increases, too, i.e. the accuracy degrades:

Loopstats Windows XP - No Poll Limit

On the other hand, if the polling interval is clamped to 6 (64 s), time synchronization becomes much smoother:

Loopstats Windows XP - Poll Clamped to 6

Windows Vista and Windows 7 contain a bug where small time corrections are not accepted by the Windows system timekeeping.

ntpd-4.2.7p349 (a development version) and later versions including the subsequent release version ntpd-4.2.8* contain a workaround for this problem.

The next graph shows results from a Windows 7 machine with unlimited polling interval:

Loopstats Windows 7 - Poll 4-max

Again, you can see that this is still not very stable for higher polling intervals. On the other hand, the workaround for the 16 ms bug in Windows can only work properly if the polling interval is at least 6 (64 s), so it doesn't make much sense to reduce the polling interval below 6. As the next graph shows, the time discipline under Windows 7 only becomes smooth after the polling interval has ramped up to 6:

Loopstats Windows 7 - Poll 4-6

So the suggested configuration of upstream NTP servers for ntpd under Windows includes lines like this:

server aa.bb.cc.dd iburst minpoll 6 maxpoll 6

where aa.bb.cc.dd has to be replaced by the real IP address or hostname of the external NTP server.
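The minpoll and maxpoll values are exponents of 2, i.e. the polling interval in seconds is 2 raised to the given number, as the pairs 6 (64 s) and 7 (128 s) mentioned above indicate. A trivial sketch, with a function name chosen for this example:

```python
def poll_interval_seconds(poll_exponent: int) -> int:
    """Convert an ntpd poll exponent (minpoll/maxpoll) to seconds."""
    if poll_exponent < 0:
        raise ValueError("poll exponent must not be negative")
    return 2 ** poll_exponent


for exponent in (4, 6, 7):
    print(f"poll {exponent} -> {poll_interval_seconds(exponent)} s")
```

Setting both minpoll and maxpoll to 6 therefore fixes the polling interval at 64 s.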

All loopstats displayed above have been recorded using a Linux machine with GPS-controlled ntpd on the local subnet.

Starting with Windows 8 a new Windows API has been introduced which allows applications to read high resolution system time stamps, and thus yield better precision.

The graph below shows the loopstats recorded by ntpd 4.2.8p8 under Windows 10 over about 6 days, with the polling interval fixed to 6:

Loopstats Windows 10 - 6 days (with reboot)

Unfortunately the machine was rebooted during the experiment, but we can still see the daily frequency changes of the oscillator on the mainboard due to temperature changes, which have to be compensated by ntpd.

On the other hand, the same experiment on a different PC, run at the same time, in the same room, also with ntpd 4.2.8p8 under Windows 10, doesn't show these daily temperature cycles:

Loopstats Windows 10 - 4 days

This shows once more the importance of the stability of the oscillator assembled on the PC's mainboard.

Again, all loopstats displayed above have been recorded using a Linux machine with GPS-controlled ntpd on the local subnet.

Timekeeping in general can be pretty tricky, and the more accuracy and precision you need, the more effort has to be taken to achieve it, and the more limiting conditions come into effect. Network characteristics can degrade the resulting accuracy, and virtualization is one more factor affecting the time synchronization performance.


Martin Burnicki martin.burnicki@meinberg.de 2019-04-04

  • Last modified: 2021-03-12 16:27