Determining if os/firmware crashes remotely.

siftd106 United States

Is there currently a way to determine if the OS/firmware crashes and reboots?
I would like to keep track of how often this happens across our fleet of devices, but I do not currently know of a way to determine whether a boot was a fresh power-up, a reboot due to a crash, or a planned reboot.
I am imagining something I can check during boot, so that I can queue up a message to our MQTT server and report on it. This would help us identify when newly implemented features have bugs in them.

Thanks in advance!


  • ulso Stockholm
    edited March 5

    Wouldn't this be a perfect use for the MQTT "will" message?
    I haven't tried using it myself yet, but maybe you could try this

    mos config-set mqtt.will_message="Connection lost"

    and see if you get that MQTT message when the device reboots.

    I may be wrong, but I think the "will" message setting tells the MQTT server what message to send when the connection to the device is lost.

  • siftd106 United States

    Very interesting, I like the out-of-the-box thinking on this one.
    We are using AWS IoT for our MQTT implementation, and I don't think they offer a "will" message.
    They may offer some other way of getting the same info, such as connection info that we could query regularly from the backend to see when a device disconnects or stops pinging.
    I'm not sure that knowing when a device disconnects will get me the info I need, since a disconnect doesn't mean a reboot or crash. Our devices are often installed in locations with very poor wifi signals, so connections are lost and re-acquired often.
    I am also currently getting a message every time a device acquires an IP address, which it sends along via MQTT. This lets me track how often the network reconnects, which is another very useful bit of information.

    I wonder if I can hook into the system time to get this info. Does the system time get reset to the epoch every time a crash happens? I know that the system time is maintained somewhat accurately across a planned reboot, but I am not sure about after a crash.

    Does anyone know if there is any difference between a reboot, a crash, and a fresh boot that I could look at to tell a fresh boot from a crash?


  • ulso Stockholm

    I know that for instance the CC3220 SDK has a function called PRCMSysResetCauseGet() that you can use to find out the reset reason. The ESP32 seems to have something similar in the rtc_get_reset_reason() function that is discussed in the following thread

    If you happen to use a CC3220 or an ESP32 you might try to call one of those functions as soon as your fw is starting, to see if there's maybe reason to take special action.

  • siftd106 United States

    Ohhh nice find. We are using the ESP32. I'll check this out to see how the reset reason differs between crashes, powerups, and system reboots.
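
To make the idea discussed above a bit more concrete, here is a minimal sketch in C of how the raw ESP32 reset code might be sorted into the three cases asked about (fresh power-up, planned reboot, crash) and turned into a payload for MQTT. Everything here is an assumption rather than a confirmed recipe: the names classify_reset, boot_kind_name, format_boot_report and the JSON field names are hypothetical, and the numeric codes are as recalled from the RESET_REASON enum in the ESP32 ROM header (rom/rtc.h), so check them against your SDK version. One caveat worth testing on real hardware: a firmware panic that restarts the chip in software may well report a software reset code rather than a watchdog code, in which case it would land in the "planned" bucket below.

```c
#include <stddef.h>
#include <stdio.h>

/* Coarse boot categories the thread asks about (names are hypothetical). */
typedef enum {
    BOOT_POWER_ON,  /* fresh power-up */
    BOOT_PLANNED,   /* software-requested restart or deep-sleep wake */
    BOOT_CRASH,     /* watchdog or brownout reset */
    BOOT_UNKNOWN    /* anything this sketch does not recognize */
} boot_kind_t;

/* ASSUMPTION: numeric codes as recalled from the ESP32 RESET_REASON enum
 * in rom/rtc.h; verify them against the header in your SDK version. */
enum {
    RR_POWERON = 1,
    RR_SW = 3,
    RR_OWDT = 4,
    RR_DEEPSLEEP = 5,
    RR_TG0WDT_SYS = 7,
    RR_TG1WDT_SYS = 8,
    RR_RTCWDT_SYS = 9,
    RR_TGWDT_CPU = 11,
    RR_SW_CPU = 12,
    RR_RTCWDT_CPU = 13,
    RR_BROWNOUT = 15,
    RR_RTCWDT_RTC = 16
};

/* Map a raw reset code to one of the coarse categories. */
boot_kind_t classify_reset(int reason) {
    switch (reason) {
    case RR_POWERON:
        return BOOT_POWER_ON;
    case RR_SW:
    case RR_SW_CPU:
    case RR_DEEPSLEEP:
        return BOOT_PLANNED;
    case RR_OWDT:
    case RR_TG0WDT_SYS:
    case RR_TG1WDT_SYS:
    case RR_RTCWDT_SYS:
    case RR_TGWDT_CPU:
    case RR_RTCWDT_CPU:
    case RR_BROWNOUT:
    case RR_RTCWDT_RTC:
        return BOOT_CRASH;
    default:
        return BOOT_UNKNOWN;
    }
}

/* Short label for the payload. */
const char *boot_kind_name(boot_kind_t kind) {
    switch (kind) {
    case BOOT_POWER_ON: return "power_on";
    case BOOT_PLANNED:  return "planned";
    case BOOT_CRASH:    return "crash";
    default:            return "unknown";
    }
}

/* Write a small JSON boot report into buf. Returns the payload length,
 * or -1 if buf is too small. device_id is not escaped, so this assumes
 * it contains no quotes or backslashes. */
int format_boot_report(char *buf, size_t len, const char *device_id, int reason) {
    int n = snprintf(buf, len, "{\"id\":\"%s\",\"boot\":\"%s\",\"reason\":%d}",
                     device_id, boot_kind_name(classify_reset(reason)), reason);
    if (n < 0 || (size_t) n >= len) {
        return -1;
    }
    return n;
}
```

On the device, the raw code would presumably come from calling rtc_get_reset_reason() early in startup, and the resulting string would be held until the MQTT connection is up and then published. Including the raw code alongside the category keeps the backend able to re-bucket reports later if the mapping turns out to be wrong for some reset cause.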


