Since I started using the #debian kernel instead of an #armbian legacy kernel on my #RockPro64, I have had a few crashes. Most of them seem to have been related to #sata, because they stopped after I swapped the SATA PCIe controller for another one of the same type.
I had another #oops afterwards and decided I should look for a #watchdog to reboot the system in case of trouble.
After reading a bit about watchdogd, the simplest solution I found is:
root@TEST:~# cat /etc/cron.d/watchdog
@reboot root wdctl -s 180
* * * * * root echo "1" > /dev/watchdog
I'm testing it on a non-production board and it seems to work: the board reboots after a forced oops (echo c > /proc/sysrq-trigger) and if I stop cron.
But it doesn't work after a simple halt: the system tries to start and hangs after showing the first line of u-boot output.
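A possible refinement, as a sketch only: the cron entry above feeds the watchdog as long as cron itself is alive, whatever state the rest of the system is in. The script below feeds it only while a simple health check passes. The check (listing a directory), the names WATCHDOG_DEV and CHECK_PATH, and the "feed" argument are my own assumptions for illustration and are not part of the setup described here.

```shell
#!/bin/sh
# Sketch: feed the watchdog only while a health check passes.
# WATCHDOG_DEV and CHECK_PATH are illustrative defaults (assumptions),
# both can be overridden from the environment.
WATCHDOG_DEV="${WATCHDOG_DEV:-/dev/watchdog}"
CHECK_PATH="${CHECK_PATH:-/}"

# Healthy here means: the checked directory can still be listed.
# For example, point CHECK_PATH at a mount on the SATA disks.
is_healthy() {
    ls "$CHECK_PATH" > /dev/null 2>&1
}

# A write to the device resets the countdown. When the check fails,
# nothing is written, so the timer can expire and reset the board.
feed_watchdog() {
    if is_healthy; then
        echo "1" > "$WATCHDOG_DEV"
    else
        return 1
    fi
}

# Only act when called with "feed", so sourcing the file has no side effects.
if [ "${1:-}" = "feed" ]; then
    feed_watchdog
fi
```

Saved under a hypothetical path such as /usr/local/sbin/feed-watchdog, it could replace the echo in the cron file: * * * * * root /usr/local/sbin/feed-watchdog feed. With a 180 second timeout and a one-minute cron interval, the check would have to fail about three times in a row before the board resets.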
Follow-up to this note:
On one of the occasions when the current #Armbian #linux kernel booted on my #RockPro64 instead of crashing, I installed #GotoSocial and yes: no more strange error messages running it. It's the only time I remember updating a Linux kernel to make an application work.
Preparations to get the production system running with the newer kernel:
- in case of problems I'd need to know my way back to boot the old kernel: u-boot recap and learning
- test suspected problems with the #sata #pcie ctrl beforehand (get a second Pine64 sata ctrl for the test system)
Meanwhile Dragan had patched the device tree and asked me to check over a few reboots whether this makes a difference or causes any regressions.
Further down the #RabbitHole the tunnel forks…