PCI card bug list

From MCEWiki

This is a list of things to fix in subsequent releases.

U0106 bugs

  • A race condition in dsp_status prevents it from being run at any time other than when the main loop is idle. This could be fixed by moving MAIN_LOOP_POLL somewhere less critical. Also affects U0105.

U0105 bugs

  • STATUS[PCIDMA_RETRY] flag is set but not handled or cleared. Fixed in U0106.
  • STATUS[HST_NFYD] flag is half-used, not cleared. Innocuous.
  • The asynchronous MCE reply NFY does not seem to play well with QTS TAIL updates: some part of the system breaks when a QTS is issued between a CON and the associated NFY REP.
  • Does CON_TRANSMIT need to disable PCI interrupts while it is sending the command to MCE? What is the MCE input time-out?
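The two flag bugs above (PCIDMA_RETRY set but never handled or cleared, HST_NFYD never cleared) share one remedy: test each flag and clear it in the same step, once per main-loop pass. The following is a minimal sketch in C of that pattern, not the actual firmware. The bit positions, the card_state struct, and both function names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions; the real STATUS word layout is not given here. */
#define STATUS_PCIDMA_RETRY (1u << 0)
#define STATUS_HST_NFYD     (1u << 1)

/* Illustrative stand-in for the card state (an assumption, not the firmware's). */
struct card_state {
    uint32_t status;       /* STATUS flag word */
    unsigned dma_retries;  /* number of retries actually serviced */
};

/* Test a flag and clear it in one step; returns nonzero if it was set. */
int status_test_and_clear(struct card_state *c, uint32_t flag)
{
    assert(c != NULL);
    int was_set = (c->status & flag) != 0;
    c->status &= ~flag;
    return was_set;
}

/* Service pending flags once per main-loop pass.
 * Returns the number of flags that were handled and cleared. */
int service_status_flags(struct card_state *c)
{
    int handled = 0;

    if (status_test_and_clear(c, STATUS_PCIDMA_RETRY)) {
        c->dma_retries++;  /* placeholder for restarting the DMA burst */
        handled++;
    }
    if (status_test_and_clear(c, STATUS_HST_NFYD)) {
        handled++;         /* nothing to do except clear the flag */
    }
    return handled;
}
```

Because every flag is cleared on the same path that handles it, a flag cannot stay set across passes, which is the property the two bullets above report as missing.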

U0103 bugs

  • Data packets can fail to HST properly. This was observed in the MAS driver, and only recently in DAS. In one instance, DAS failed with a timeout after receiving 1381/2000 frames and could not take data (though other MCE commands worked) until the driver was reloaded. This looks to me like lost HST replies. The NFY comes through and the HST gets sent, but then everything hangs.
  • After some failures (card lookup?), a one-behind state emerges. "Reset" doesn't help, nor does "reset_mce". A command must be issued (which returns as unmatched), and then reset dsp will fix the problem. I'm not even sure where the packet is trapped.

A1.5 / U0103 design issues

  • PCI reset doesn't empty the FIFO! The FIFO is cleared on preamble error, but that means you will definitely lose an MCE packet if the buffer gets out of sync. Reset is 2 ms long; perhaps this is a way of waiting for an entire reply to have been transmitted? A PCI reset routine should clear the FIFO. Either that or a separate vector command, but that seems like overkill.
  • Since packet choke is on after a reset, any words in the FIFO are discarded without a preamble check; this means that the FIFO can't be reset before the first MCE command. This is silly; reset should clear the FIFO!
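The behaviour asked for above, a reset that explicitly empties the FIFO rather than relying on a later preamble error, can be modelled in a few lines. This is a software sketch in C under stated assumptions, not the DSP code: the FIFO depth, the fifo_model and card_model structs, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 8u  /* hypothetical depth; must be a power of two */

/* Software model of the receive FIFO (an assumption for illustration). */
struct fifo_model {
    uint32_t words[FIFO_DEPTH];
    unsigned head;  /* free-running write index */
    unsigned tail;  /* free-running read index */
};

struct card_model {
    struct fifo_model fifo;
    int packet_choke;  /* nonzero while incoming packets are choked */
};

/* Number of words currently held in the FIFO. */
unsigned fifo_count(const struct fifo_model *f)
{
    return f->head - f->tail;
}

/* Append one word; returns 0 if the FIFO is full. */
int fifo_push(struct fifo_model *f, uint32_t w)
{
    if (fifo_count(f) == FIFO_DEPTH)
        return 0;
    f->words[f->head % FIFO_DEPTH] = w;
    f->head++;
    return 1;
}

/* Remove one word; returns 0 if the FIFO is empty. */
int fifo_pop(struct fifo_model *f, uint32_t *w)
{
    if (fifo_count(f) == 0)
        return 0;
    *w = f->words[f->tail % FIFO_DEPTH];
    f->tail++;
    return 1;
}

/* Reset that drains the FIFO explicitly, then turns packet choke on.
 * Returns the number of stale words discarded. */
unsigned card_reset(struct card_model *c)
{
    uint32_t discard;
    unsigned dropped = 0;

    assert(c != NULL);
    while (fifo_pop(&c->fifo, &discard))
        dropped++;
    c->packet_choke = 1;
    return dropped;
}
```

With the drain inside the reset itself, the FIFO is known to be empty before the first MCE command is sent, so no stale words remain to put the next reply out of sync.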