Linux Driver Package - Troubleshooting
This page contains some hints that may be helpful for finding out why something doesn't work as expected.
Driver Installed Properly, But System Time Isn't Synchronized
If ntpd doesn't see the device, or doesn't accept it as a reference time source, but mbgstatus can access the device successfully, some possible reasons are:
- The device is not synchronized, and thus not used by ntpd. In this case mbgstatus reports that the device is not synchronized.
- mbgsvcd is not running, thus no time stamps are fed into ntpd's shared memory driver, and thus ntpd doesn't get any reference time. In this case no mbgsvcd process is listed in the output of the ps ax command.
- AppArmor or SELinux is installed, active, and preventing ntpd from accessing the shared memory segment. In this case there should be some log messages from ntpd in the syslog saying "Permission denied" or similar. See the README file of the driver package for hints on how to fix this.
- ntpd has been compiled without support for the Shared Memory Driver. In this case there should be some log messages from ntpd in the syslog saying that driver 28 is unknown, or similar.
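Two of the checks above can be scripted. The following is a minimal sketch and not part of the driver package: the helper names has_process and ntpd_permission_errors are made up for illustration, and where the ntpd log messages end up (a syslog file or the systemd journal) depends on the distribution, so the log file path in the usage comment is only an assumption.

```shell
# Hypothetical helpers, not shipped with the driver package.

# Succeed if the `ps ax` output on stdin lists a process whose
# command name (without its directory) equals $1.
has_process() {
    awk -v name="$1" '
        NR > 1 { n = split($5, part, "/"); if (part[n] == name) found = 1 }
        END    { exit !found }'
}

# Print ntpd log lines from stdin that report a permission problem.
ntpd_permission_errors() {
    grep 'ntpd' | grep -i 'permission denied'
}

# Usage:
#   ps ax | has_process mbgsvcd || echo "mbgsvcd is not running"
#   ntpd_permission_errors < /var/log/messages   # log location is an assumption
```

Both helpers read from standard input, so they can also be applied to output that was saved on a different machine.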
Kernel Driver Fails To Be Loaded
If the driver software for Linux has been compiled and installed correctly but the kernel driver doesn't load, or a device is not detected even though it is obviously installed, the following instructions may help to determine the reason for this problem. In such a case even a command like mbgstatus only reports something like “No Device found.”.
Kernel Module Not Found
If the kernel module file mbgclock.ko can't be found when the command modprobe mbgclock is executed, possible reasons are:
- The driver hasn't been installed yet, i.e. the make install command hasn't been executed.
- The driver has been built and installed, but for a different kernel version.
The second case can occur if the kernel has been updated to a new version, but the system hasn't been rebooted since, so an older kernel is still running when the driver package is built. The kernel module is then installed in a subdirectory associated with the version of the running kernel, but after a reboot the kernel version has changed, and thus the module file is in a different directory than expected.
The following commands can be used to check this:
uname -r
find /lib/modules/ -name mbgclock.ko
The output may look similar to:
#> uname -r
3.16.7-35-desktop
#> find /lib/modules -name mbgclock.ko
/lib/modules/3.16.7-24-desktop/extra/mbgclock.ko
/lib/modules/3.16.7-35-desktop/extra/mbgclock.ko
/lib/modules/3.16.7-29-desktop/extra/mbgclock.ko
/lib/modules/3.16.7-21-desktop/extra/mbgclock.ko
/lib/modules/3.16.7-7-desktop/extra/mbgclock.ko
In the case above there have been several kernel updates, and there is one instance of the kernel module per kernel version. This is OK, but it is important that there is also a module listed for the running kernel 3.16.7-35-desktop, which is /lib/modules/3.16.7-35-desktop/extra/mbgclock.ko in this case.
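The comparison between the running kernel and the installed module files can be combined into a single check. The sketch below is only an illustration: the function name module_for_kernel is hypothetical, and the second parameter exists merely so that a base directory other than /lib/modules can be examined.

```shell
# Hypothetical helper: print the path of mbgclock.ko for a kernel version,
# or fail with a message if no module file exists for that version.
#   $1: kernel version, e.g. the output of `uname -r`
#   $2: module base directory (optional, defaults to /lib/modules)
module_for_kernel() {
    kver=$1
    base=${2:-/lib/modules}
    found=
    if [ -d "$base/$kver" ]; then
        found=$(find "$base/$kver" -name mbgclock.ko | head -n 1)
    fi
    if [ -n "$found" ]; then
        printf '%s\n' "$found"
    else
        printf 'no mbgclock.ko for kernel %s below %s\n' "$kver" "$base" >&2
        return 1
    fi
}

# Usage:
#   module_for_kernel "$(uname -r)"
```

If the function fails for the running kernel, the driver package most likely has to be rebuilt and installed after the reboot.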
Kernel Module Found But Not Loaded
The Linux kernel can optionally be built with a requirement that out-of-tree modules are only loaded if they have been built with the same compiler version as the target kernel. In this case an extra module does not load if it has been built with a different compiler version, even if the proper kernel headers have been used.
This has been observed e.g. on Red Hat Enterprise Linux (RHEL) 5.3 where the kernel was evidently built with gcc-4.1 and the constraint that the compiler version must be the same for modules built out-of-tree.
A customer additionally installed the newer gcc-4.3 on this system. The original gcc-4.1 was still there but gcc-4.3 became the default compiler, so an extra kernel module which was properly built with the new default compiler failed to be loaded with these messages reported by dmesg:
mbgclock: version magic '2.6.18-128.1.10.el5 SMP mod_unload gcc-4.3' should be '2.6.18-128.1.10.el5 SMP mod_unload gcc-4.1'
So in this case the compiler version is part of the version magic string (in most cases it is not), and the module failed to be loaded because it was built with gcc-4.3 while the kernel itself had been built with gcc-4.1.
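To make such a mismatch easier to spot in a long kernel log, the two version magic strings can be printed on separate lines. This is a minimal sketch that assumes the message has the same layout as the dmesg line quoted above; the function name vermagic_pairs is made up for illustration. The version magic of a module file can also be inspected before loading with modinfo -F vermagic ./mbgclock.ko.

```shell
# Hypothetical helper: read kernel log lines from stdin and, for each
# "version magic" message, print the module's string and the string
# the running kernel expects.
vermagic_pairs() {
    # The two strings are enclosed in single quotes, so splitting each
    # line at ' puts them into fields 2 and 4.
    awk -F"'" '/version magic/ { print "module: " $2; print "kernel: " $4 }'
}

# Usage:
#   dmesg | vermagic_pairs
```

For the message shown above, the two printed lines differ only in the last token, gcc-4.3 versus gcc-4.1.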
A fix in this case is to make sure the expected compiler version is installed:
# ls -l /usr/bin/gcc*
-rwxr-xr-x 2 root root 216016 Jan 21 14:54 /usr/bin/gcc
-rwxr-xr-x 2 root root 97728 Jan 9 2007 /usr/bin/gcc41
-rwxr-xr-x 2 root root 234936 Jan 21 13:29 /usr/bin/gcc43
The driver package containing the extra kernel module can then be built with specification of the appropriate compiler version:
make clean; make CC=gcc41 && make install
No Device Found
If the kernel module loads normally, but no PCI card or USB device is found even though it is obviously installed, some debugging is required to find out what might be the reason.
The first step is to find out whether the device is recognized by the kernel. In case of a PCI card:
lspci -d 1360:*
should list the device, for example:
#> lspci -d 1360:*
01:00.0 System peripheral: Meinberg Funkuhren GPS180PEX GPS Receiver (PCI Express) (rev 01)
In case of a USB device this command can be used:
lsusb -d 1938:
and if a device is detected the output should look similar to:
#> lsusb -d 1938:
Bus 003 Device 023: ID 1938:0301
If no device is listed by the commands above then the kernel module can't find the device, either.
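Both bus checks can be wrapped in one small helper. The sketch below is an illustration under the assumption that lspci and lsusb are installed; the function name report_bus is hypothetical. Note that the PCI ID pattern is quoted so that the shell doesn't try to expand the asterisk as a file name pattern.

```shell
# Hypothetical helper: run a listing command and report whether it
# printed any device.
#   $1:      label for the report, e.g. PCI or USB
#   $2 ...:  the command to run, with its arguments
report_bus() {
    label=$1
    shift
    # A failing or missing command is treated like an empty listing.
    out=$("$@" 2>/dev/null) || out=
    if [ -n "$out" ]; then
        printf '%s: found\n%s\n' "$label" "$out"
    else
        printf '%s: no device listed\n' "$label"
    fi
}

# Usage:
#   report_bus PCI lspci -d '1360:*'
#   report_bus USB lsusb -d 1938:
```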
The reason may be that the USB port isn't connected, or the PCI slot is disabled, so it's definitely worth trying a different USB port or PCI slot.
Please note that on some machines (e.g. some Dell servers) individual PCI slots can be disabled in the BIOS setup, so you may need to enter the BIOS setup, check for such a setting, and enable the affected slot if it is disabled.
If the device is listed this just means that it is recognized according to its vendor and device ID. There is still no guarantee that the driver is capable of talking to the microcontroller on the device.
So if the device is listed, but the kernel module doesn't detect it, the kernel module may be recompiled with DEBUG enabled to provide some more detailed information. Assuming the current directory is the base directory of the driver package, change into the directory of the kernel driver
cd mbgclock/
and run the following command to rebuild only the kernel module with debug information:
make clean; make DEBUG_FULL=1 # was DEBUG=15 for driver packages before 4.2.0
The kernel module can be built with a number of debug options associated with different topics, which are all listed if make is run with DEBUG_FULL=1, e.g.:
Calling kernel build system to make "clean"
Kernel build system finished making "clean"
DEBUG_FULL:1
REPORT_IO_ERRORS: [1]
REPORT_CFG: [1]
REPORT_CFG_DETAILS: [1]
DEBUG_RSRC: [1]
DEBUG_SERNUM: [1]
DEBUG_ACCESS_TIMING: [1]
DEBUG_IO_TIMING: [1]
DEBUG_IO: [1]
DEBUG_USB_IO: [1]
DEBUG_DRVR: [1]
DEBUG_DEV_INIT: [1]
DEBUG_IOCTL: [1]
Calling kernel build system to make "modules"
Kernel build system finished making "modules"
When the kernel driver is built this way, loading the module produces a huge number of messages in the output of dmesg.
To enable only a limited subset of the debug output, make can also be called with just the options of interest, e.g.:
make clean; make REPORT_CFG=1 REPORT_CFG_DETAILS=1 DEBUG_RSRC=1
Once the kernel module has been compiled with the desired debug information, the freshly compiled module can be loaded without having to install it first:
insmod ./mbgclock.ko
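Whether the module has actually been loaded can be verified from the module list. The following is a minimal sketch; the function name module_loaded is made up for illustration, and it simply looks for the module name in the first column of the lsmod output (or of /proc/modules, which uses the same first column).

```shell
# Hypothetical helper: succeed if the module named $1 appears in the
# first column of the module list read from stdin.
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}

# Usage:
#   lsmod | module_loaded mbgclock && echo "mbgclock is loaded"
```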
After you have tried to load the module, a large number of debug messages is printed by the dmesg command, and the output can easily be copied to a text file, e.g.:
dmesg | tee dmesg-output.txt
displays the messages and also writes them to the file dmesg-output.txt. The messages can be helpful to identify what's going wrong when the driver probes the device, and the result may be that the device is faulty. E.g., if the microprocessor on the device is dead, the device doesn't reply when the driver tries to read some data. In this case the debug output might contain a message like "Failed to read firmware ID" or similar.
Please contact Meinberg support at techsupport@meinberg.de and provide some information on the problem. If possible, attach the text file with the dmesg output to your email, preferably archived, e.g. as a .tar.gz or .zip file.
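Packing the saved output into a .tar.gz archive takes a single tar call. The sketch below wraps it in a hypothetical helper named archive_log, which stores the file without its directory part so that the archive unpacks into a single file.

```shell
# Hypothetical helper: pack the file $1 into $1.tar.gz next to it
# and print the name of the archive.
archive_log() {
    log=$1
    dir=$(dirname "$log")
    name=$(basename "$log")
    # -C makes tar store only the file name, not the leading directories.
    tar -czf "$log.tar.gz" -C "$dir" "$name" && printf '%s\n' "$log.tar.gz"
}

# Usage:
#   dmesg > dmesg-output.txt
#   archive_log dmesg-output.txt
```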
Don't run make install for the DEBUG version of the kernel module. Due to the debug messages the module is pretty slow, system timekeeping is significantly worse, and the system log is quickly filled with debug messages. So after debugging has finished, the kernel module, or simply the whole package, should be recompiled and installed without DEBUG, e.g. by running the following command in the base directory of the driver package:
make clean; make && make install
— Martin Burnicki martin.burnicki@meinberg.de, last updated 2022-01-13