MAS malfunction diagnosis
Latest revision as of 16:24, 30 August 2016
Here's how to determine what is wrong with MAS. Ask yourself each of the following questions. "Yes" answers are good, "No" answers are bad.
Does the PC boot?
This can be tested by turning on the machine. If the machine boots relatively cleanly, then great. This isn't your problem.
If you get a kernel panic or some other serious error during boot, try each of the following:
- power cycle the computer, leaving it powered down for at least 5 seconds. This is to reset the state of the PCI card in case it is angry with the PC.
- boot using the Ubuntu default kernel instead of the patched one. To do this, press "Escape" at the beginning of the boot sequence to enter the boot-loader menu (GRUB), then select "Ubuntu, kernel 2.6.15-26-server" from the menu.
If the power cycle seems to fix the problem, then the PCI card was angry with the PC. If this problem recurs, make sure that the driver is running in quiet mode, and start to be suspicious about your hardware.
If the bigphys kernel does not boot but the Ubuntu default kernel does, then there may be a problem with the boot options; see the section on "Boot Menu" in MAS OS setup.
Is the bigphys kernel loaded?
From a terminal, run the command
uname -r
You should receive the response
2.6.15.7-bigphys
If you do not, the bigphys kernel is not being loaded. See the section on "Boot Menu" in MAS OS setup.
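The kernel check above can be scripted. This is a minimal sketch, assuming that any release string ending in "-bigphys" identifies the patched kernel (the only example given here is 2.6.15.7-bigphys); the function name is_bigphys_kernel is made up for illustration.

```shell
# Returns 0 if the given kernel release string ends in "-bigphys".
# Assumption: that suffix is what marks the patched kernel.
is_bigphys_kernel() {
    case "$1" in
        *-bigphys) return 0 ;;
        *)         return 1 ;;
    esac
}

# Check the running kernel.
if is_bigphys_kernel "$(uname -r)"; then
    echo "bigphys kernel is running"
else
    echo "bigphys kernel is NOT running; check the boot menu"
fi
```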
Next, check the size of the bigphys allocation:
cat /proc/bigphysarea
This should return a message like this:
Big physical area, size 32768 kB
                          free list:      used list:
number of blocks:                 1               1
size of largest block:        23000 kB         9768 kB
total:                        23000 kB         9768 kB
If it doesn't, the "bigphysarea" boot option has probably not been configured. See "Boot Menu" in MAS OS setup. The "Big physical area, size" number doesn't need to be larger than about 10000 kB.
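To pull the size out of that report automatically, something like the following may help. It is a sketch that assumes the first line of /proc/bigphysarea has the form "Big physical area, size N kB", as in the sample output; bigphys_size_kb is a hypothetical helper name.

```shell
# Prints the total bigphysarea size in kB, read from report text on stdin.
# Assumption: the header line looks like "Big physical area, size 32768 kB".
bigphys_size_kb() {
    awk '/^Big physical area, size/ { print $5; exit }'
}

if [ -r /proc/bigphysarea ]; then
    size=$(bigphys_size_kb < /proc/bigphysarea)
    echo "bigphysarea size: ${size:-unknown} kB"
else
    echo "/proc/bigphysarea not found; the bigphysarea boot option is probably not configured"
fi
```

Since the function reads standard input, it can also be tried on a saved copy of the report from another machine.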
Is the MAS driver loaded?
The easiest way to check for the driver is
cat /proc/mce_dsp
If the file doesn't exist, the driver is not loaded. If the driver is loaded, a bunch of diagnostic messages will be produced. A working system will look something like this:
mce_dsp driver version gamow/mas:196
fakemce: no
realtime: no
bigphys: yes
data buffer:
  virtual: 0xc1833000
  bus: 0x01833000
  count: 4882
  head: 1000
  tail: 1000
  drops: 0
  size: 0x800
  data: 0x4d0
  mode: quiet mode
mce commander:
  state: idle
dsp commander:
  state: idle
dsp pci registers:
  hstr: 0x0003
  hctr: 0x0900
The most important things in this list are:
fakemce: no
bigphys: yes
mode: quiet mode
hstr: 0x0003
hctr: 0x0900
If bigphys says "no", the driver has been compiled without bigphysarea support. Go re-compile and reinstall the driver.
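If the product of "count" and "size" is much smaller than about 10M, the driver data buffer is too small. In the sample output above, 4882 × 0x800 (2048) = 9998336, which is fine. With a buffer that is too small you should still be able to issue MCE commands cleanly, but multi-frame acquisitions will likely fail.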
If "mode" is not "quiet mode", you may be using outdated PCI card firmware. Inspect /var/log/messages for messages from the driver (they are marked "mceds"; you can grep for "dsp_query_version"). The "PCI card DSP code version" should be at least U0104.
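The field checks above can be automated along these lines. This is only a sketch: it assumes /proc/mce_dsp prints one "key: value" pair per line (possibly indented), which may not match the real layout, and the helper names mce_field and buffer_bytes are invented for this example.

```shell
# Prints the value of the first "key: value" line whose key matches $1.
# Assumption: one pair per line, optionally indented; input on stdin.
mce_field() {
    awk -v key="$1" -F': *' '
        { k = $1; sub(/^[ \t]+/, "", k) }
        k == key { print $2; exit }
    '
}

# Prints count * size (the data buffer size) from status text on stdin.
# "size" is reported in hex (e.g. 0x800); shell arithmetic accepts that.
buffer_bytes() {
    text=$(cat)
    count=$(printf '%s\n' "$text" | mce_field count)
    size=$(printf '%s\n' "$text" | mce_field size)
    if [ -n "$count" ] && [ -n "$size" ]; then
        echo $(( $count * $size ))
    fi
}

if [ -r /proc/mce_dsp ]; then
    for key in fakemce bigphys mode hstr hctr; do
        echo "$key: $(mce_field "$key" < /proc/mce_dsp)"
    done
    echo "buffer size: $(buffer_bytes < /proc/mce_dsp)"
else
    echo "/proc/mce_dsp not found; the driver is not loaded"
fi
```

Compare the printed values by eye against the list of important fields; the script deliberately does not decide pass or fail on its own.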
Are the MAS device nodes present?
Run:
ls -l /dev/mce_*
You should see something like
crw-rw-r-- 1 root mce 252, 0 2008-09-03 00:05 /dev/mce_cmd0
crw-rw-r-- 1 root mce 251, 0 2008-09-03 00:05 /dev/mce_data0
crw-rw-r-- 1 root mce 253, 0 2008-09-03 00:05 /dev/mce_dsp0
If you do not, run
sudo mas_mknodes
and check again. If the mknodes script fails, it is probably because the driver is not loaded.
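A small script can report which nodes are missing before you reach for mas_mknodes. The sketch below assumes the expected nodes are exactly the three shown in the listing (mce_cmd0, mce_data0, mce_dsp0); missing_mce_nodes is a hypothetical helper, and the script only suggests the fix rather than running sudo itself.

```shell
# Prints the name of each expected MAS node that is absent, or is not a
# character device, under the given directory (default /dev).
# Assumption: the expected nodes are the three from the sample listing.
missing_mce_nodes() {
    dir=${1:-/dev}
    for node in mce_cmd0 mce_data0 mce_dsp0; do
        [ -c "$dir/$node" ] || echo "$node"
    done
}

missing=$(missing_mce_nodes /dev)
if [ -n "$missing" ]; then
    echo "missing device nodes:" $missing
    echo "try: sudo mas_mknodes"
else
    echo "all MAS device nodes are present"
fi
```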