Dsp status

Related topics:
MAS

dsp_status prints a formatted summary of the PCI fibre card firmware status. It requires dsp_cmd to run. Because of a bug in dsp_status, do not run it while acquiring data: doing so may wedge the firmware.

Example

For a healthy, running DSP the output should look something like:

user@ubuntu:~$ dsp_status
 CODE_VERSION        : 0x550106
 STATUS              : 0x0000
  APPLICATION_RUNNING : 0
  SEND_IO_TO_HOST    : 0
  FATAL_ERROR        : 0
  FO_WRD_RCV         : 0
  HST_NFYD           : 0
  CON_DEMAND         : 0
  CON_MCE            : 0
  PCIDMA_RESTART     : 0
  PCIDMA_RESUME      : 0
  PCIDMA_RETRY       : 0
  QT_FLUSH           : 0
  RP_BUFFER_FULL     : 0
  FREEZER            : 0
  MAIN_LOOP_POLL     : 0
 MODE                : 0x000e
  MODE_APPLICATION   : 0
  MODE_MCE           : 1
  MODE_QT            : 1
  MODE_RP_BUFFER     : 1
 MAIN_LOOP           : running

Notably, most (possibly all) of the status bits should be zero. MODE may be 0x0008, 0x000a, 0x000c, or 0x000e. MAIN_LOOP should read "running".
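The checks described above can be scripted. The following is a minimal sketch in Python, assuming the output has been captured as text and that every field appears as a "NAME : value" line as in the example. The function names parse_dsp_status and looks_healthy are hypothetical and not part of MAS. It treats any non-zero STATUS as unhealthy, which is stricter than "most bits should be zero".

```python
import re

# MODE values listed above as acceptable for a healthy DSP.
ALLOWED_MODES = {0x0008, 0x000A, 0x000C, 0x000E}

def parse_dsp_status(text):
    """Collect 'NAME : value' lines from captured dsp_status output."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^\s*([A-Z_]+)\s*:\s*(\S+)\s*$", line)
        if m:
            fields[m.group(1)] = m.group(2)
    return fields

def looks_healthy(fields):
    """Apply the three checks: STATUS zero, MODE allowed, main loop running."""
    status_ok = int(fields["STATUS"], 16) == 0      # strict: all bits zero
    mode_ok = int(fields["MODE"], 16) in ALLOWED_MODES
    loop_ok = fields["MAIN_LOOP"] == "running"
    return status_ok and mode_ok and loop_ok

# Abbreviated copy of the example output above.
SAMPLE = """\
 CODE_VERSION        : 0x550106
 STATUS              : 0x0000
 MODE                : 0x000e
 MAIN_LOOP           : running
"""

if __name__ == "__main__":
    print(looks_healthy(parse_dsp_status(SAMPLE)))
```

Since dsp_status should not be run during acquisition, a script like this is best applied to output captured while the system is idle.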

Explanation

For firmware versions U0105 and U0106, the following is presented:

  • CODE_VERSION: The firmware revision. ASCII "U" is 0x55 in hex, so this should read 0x550105 or 0x550106.
  • STATUS: The DSP status word, a bitfield made up of the following status bits:
    • APPLICATION_RUNNING: A special application is running. Never happens, so always zero. See also MODE_APPLICATION below.
    • SEND_IO_TO_HOST: Set if a packet is being transferred via DMA to the computer's RAM, or if such a packet is waiting to be transferred.
    • FATAL_ERROR: Indicates an error was detected while communicating over the PCI bus. This should trigger a soft restart of the DSP program which will reset this flag.
    • FO_WRD_RCV: A packet is available in the fibre receiver FIFO.
    • HST_NFYD: The host (PC) has been notified that a packet is available, but the host has not replied yet.
    • CON_DEMAND: There is an unreceived command packet waiting from the host (PC).
    • CON_MCE: A command packet has been received from the host (PC) and is waiting for transmission to the MCE.
    • PCIDMA_RESTART: A DMA error has occurred and the DSP needs to restart the PCI burst (DMA).
    • PCIDMA_RESUME: A DMA error has occurred and the DSP needs to retry the PCI burst (DMA).
    • PCIDMA_RETRY:
      • U0105: A DMA error has occurred and the DSP needs to retry the PCI burst (DMA). Although set, it's never checked nor cleared and is the source of the U0105 "resume bug".
      • U0106: Unused and always zero; merged with PCIDMA_RESUME to fix the "resume bug".
    • QT_FLUSH: Set when it is time to inform the host (PC) of the current buffer position.
    • RP_BUFFER_FULL: Set if the reply buffer is full (subsequent MCE reply packets will be discarded).
    • FREEZER: DSP is "frozen": all packets are simply discarded, and the main loop just idles.
    • MAIN_LOOP_POLL: Set if the main loop isn't running (i.e. if the DSP has locked up). See also MAIN_LOOP below.
  • MODE: The mode of operation of the DSP, a bitfield made up of the following mode bits:
    • MODE_APPLICATION: A special application (downloaded over the PCI bus) is scheduled for execution. See also APPLICATION_RUNNING above. "Special applications" are a legacy from the original ARC code which aren't used on modern MCE systems. As a result, this (and APPLICATION_RUNNING above) should always be zero.
    • MODE_MCE: Packets sent by the MCE are being processed. Although off at start-up, sending a command to the MCE will turn this on. A dsp_reset will turn it back off.
    • MODE_QT: Quiet transfer mode for data packets is enabled (QT mode). On by default. A dsp_reset will turn it off, although acquiring data normally will turn it back on.
    • MODE_RP_BUFFER: Quiet transfer mode for command reply packets is enabled (QT-RP mode). On by default. An unexpected restart of the DSP program will turn it off (see, for instance, FATAL_ERROR above). A dsp_reset will turn it back on.
  • MAIN_LOOP: Essentially a restatement of the MAIN_LOOP_POLL bit: this will be "running" if the main loop is running (MAIN_LOOP_POLL = 0) or "stalled" if it is not (MAIN_LOOP_POLL = 1). Because the MAIN_LOOP_POLL bit is set only after the MAIN_LOOP_POLL line under STATUS has been printed, it is possible to get "stalled" here even though MAIN_LOOP_POLL was reported as zero. In that case, the DSP really is stalled, and a second dsp_status should report MAIN_LOOP_POLL = 1.
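
Two of the fields described above lend themselves to a short worked example: turning CODE_VERSION back into a revision name, and reading MAIN_LOOP together with MAIN_LOOP_POLL. The Python sketch below is illustrative only. The helper names are hypothetical, and it assumes the two low bytes of CODE_VERSION map to the revision digits as two hex digits each, which matches 0x550105 and 0x550106 but is not confirmed for other revisions.

```python
def decode_code_version(value):
    """Turn e.g. 0x550106 into 'U0106'.

    Assumption: top byte is an ASCII letter, the two low bytes are
    rendered as two hex digits each.
    """
    letter = chr((value >> 16) & 0xFF)   # 0x55 -> 'U'
    high = (value >> 8) & 0xFF
    low = value & 0xFF
    return f"{letter}{high:02x}{low:02x}"

def main_loop_state(main_loop_poll, main_loop):
    """Combine the MAIN_LOOP_POLL bit with the MAIN_LOOP line.

    main_loop_poll: 0 or 1, as printed under STATUS.
    main_loop: 'running' or 'stalled', as printed on the MAIN_LOOP line.
    """
    if main_loop == "running":
        return "running"
    if main_loop_poll == 0:
        # The bit was printed before it was set; the DSP is still stalled.
        return "stalled (a second dsp_status should show MAIN_LOOP_POLL = 1)"
    return "stalled"

if __name__ == "__main__":
    print(decode_code_version(0x550106))
    print(main_loop_state(0, "stalled"))
```

The second helper encodes the rule that the MAIN_LOOP line is the one to trust when the two disagree.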