Keith Thorne took a look at the situation today and had some suggestions that might have helped:
Reorder ini file list in master file. Apparently the EDCU.ini file (C0EDCU.ini in our case), which describes the EPICS subscriptions to be recorded by the daq, now has to be specified *after* all other front end ini files. It's unclear why, but it has something to do with RTS 2.8, which changed all slow channels to be transported over the mx network. This alone did not fix the problem, though.
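As a rough illustration of the reordering, here is a minimal Python sketch. It assumes the master file is a plain list with one ini path per line, and the helper name `move_edcu_last` and the example paths are hypothetical, not taken from our actual setup.

```python
def move_edcu_last(lines, marker="EDCU"):
    """Return the master-file lines with any EDCU entries moved to the end.

    Assumption: one ini path per line. Blank lines are dropped, and the
    relative order of all other entries is preserved.
    """
    entries = [ln for ln in lines if ln.strip()]
    others = [ln for ln in entries if marker not in ln]
    edcu = [ln for ln in entries if marker in ln]
    return others + edcu


# Hypothetical master-file contents; only C0EDCU.ini is a real name here.
example = [
    "/hypothetical/path/C0EDCU.ini",
    "/hypothetical/path/FRONTEND_A.ini",
    "/hypothetical/path/FRONTEND_B.ini",
]
reordered = move_edcu_last(example)
```

Writing `reordered` back out one entry per line would give a master file with the EDCU ini listed after all the front end ini files.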
Increase second trend frame size. Interestingly, this might have been the key. The second trend frame size was increased to 600 seconds:
start trender 600 60;
The two numbers are the lengths in seconds for the second and minute trends respectively. They had been set to "60 60", but Keith suggested that longer second trend frames are better, for whatever reason. It seems he may be right, given that daqd has been running and writing full and trend frames for 1.5 hours now without issue.
As I'm writing this, though, daqd just crashed again. I note that the crash came right after the hour, immediately after a one-hour minute trend file was written out. We've been seeing these hourly, on-the-hour crashes of daqd for quite a while now, so maybe this is nothing new. I've been wondering whether the hourly daqd crashes are associated with writing out the minute trend frames, and I think this is more evidence pointing to that.
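One way to test the on-the-hour hypothesis would be to collect the daqd crash times and count how many fall just after the top of the hour. The sketch below is only an illustration: the timestamps are made up, and the function names and the 120 second window are arbitrary assumptions.

```python
from datetime import datetime


def seconds_past_hour(ts):
    """Seconds elapsed since the most recent top of the hour."""
    return ts.minute * 60 + ts.second


def crashes_near_hour(timestamps, window_s=120):
    """Return the crash times that fall within window_s seconds after the hour."""
    return [ts for ts in timestamps if seconds_past_hour(ts) <= window_s]


# Made-up crash times for illustration only, not real log data.
crash_times = [
    datetime(2000, 1, 1, 14, 0, 40),
    datetime(2000, 1, 1, 15, 0, 25),
    datetime(2000, 1, 1, 15, 37, 10),
]
near = crashes_near_hour(crash_times)
fraction_near_hour = len(near) / len(crash_times)
```

If most of the real crash times land in that window, that would support the idea that the crashes are tied to writing the hourly minute trend file.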
If increasing the size of the second trend frames from 60 seconds (35M) to 600 seconds (70M) made a difference in stability, could there be an issue with writing out files that are smaller than some threshold size? The full frames are 60M, and the minute trends are 35M.