Troubleshoot a slow or stalling plan

Problem

A plan gets stuck or takes an unexpectedly long time to complete. We want to discover where the problem is.

Approach

Print to the screen each time a new messsage from the plan in processed by the RunEngine. Then we will see the last message received before the plan freezes and/or slows.

Example Solution

The RunEngine provides a special hook, a function that is calls every time it is about to process the next message in the plan.

RE.msg_hook = print

Now the RunEngine will call print(msg) before processing each msg in the plan. Execute the plan that is causing the issue and watch the output. Suppose we are running a count that freezes up while triggering a detector. That would look like this:

In [7]: RE(count([det]))
stage: (<bluesky.examples.SynGauss object at 0x10f1bfb38>), (), {}
open_run: (None), (), {'plan_args': {'num': 1, 'detectors': ['<bluesky.examples.SynGauss object at 0x10f1bfb38>']}, 'detectors': ['det'], 'plan_name': 'count'}
checkpoint: (None), (), {}
trigger: (<bluesky.examples.SynGauss object at 0x10f1bfb38>), (), {'group': 'trigger-ff93f4'}
wait: (None), (), {'group': 'trigger-ff93f4'}

The last message is is a ‘wait’ message, and it waiting for the ‘trigger’ just above it to complete. If the plan freezes at this point, a likely culprit is the triggering mechanism of the detector. In this example, we can see the detector in question is a simulated detector, bluesky.examples.SynGauss. From here we would investigate whether the hardware was behaving properly and whether its triggering behavior was programmed properly on the Python side.

When finished troubleshooting, set RE.msg_hook = None.

We used print for simple debugging. Any user-defined function that accepts a bluesky.Msg namedtuple as its argument could also be used.