Quote: |
I tried a manual test of the new user space model. Since this is a user space process, running it should have no effect on the rest of the front end system (and it didn't):
- Manually started the c1dnn EPICS IOC:
- Tried running the model user-space process directly:
Unfortunately, the process died with an "ADC TIMEOUT" error. I'm investigating why.
Once we confirm the model runs, we'll add the appropriate SHMEM IPC connections to connect it to the c1lsc model.
|
I tried moving the model to c1ioo, where there are plenty of free cores sitting idle, and the model seems to run fine. I think the problem was just CPU contention on the c1lsc machine, where there were only two free cores and the kernel was using both for all the rest of the normal user space processes.
So there are two options:
- Use cpuset on c1lsc to tell the kernel to remove all other processes from CPU6 and reserve it for the c1dnn model. This should not have any impact on the running of c1lsc, since that's exactly what would be happening if we were running the model in kernel space (i.e. isolating the core for the front end model). The auxiliary support user space processes (epics seq/ioc, awgtpman) should all run fine on CPU0, since that's what usually happens. Linux is only using the additional core because it's there. We don't have much experience with cpuset yet, though, so more offline testing will be required first.
- Run the model on c1ioo and ship the needed signals to/from c1lsc via PCIe dolphin. This is potentially a slightly more invasive change, and would put more load on the dolphin network, but the network should be able to handle it.
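The idea behind the first option can be sketched in a few lines. This is only an illustration, not the cpuset procedure itself: it uses per-process affinity masks (Python's `os.sched_setaffinity`, Linux only) rather than cpuset, it does not move kernel threads, and the 8-core count and the helper names `partition_cpus` and `pin_process` are assumptions made up for the example.

```python
import os


def partition_cpus(available, reserved_cpu):
    """Split CPUs into (housekeeping, isolated) sets.

    'available' is any iterable of CPU numbers; 'reserved_cpu' is the one
    core to keep free for the real-time model (CPU6 in the text above).
    """
    available = set(available)
    if reserved_cpu not in available:
        raise ValueError(f"CPU {reserved_cpu} is not in the available set")
    housekeeping = available - {reserved_cpu}
    if not housekeeping:
        raise ValueError("reserving this CPU would leave none for other processes")
    return housekeeping, {reserved_cpu}


def pin_process(pid, cpus):
    """Restrict 'pid' (0 means the calling process) to 'cpus'; return the new mask."""
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # Hypothetical 8-core machine with CPU6 reserved for the model.
    housekeeping, isolated = partition_cpus(range(8), 6)
    print("housekeeping CPUs:", sorted(housekeeping))
    print("reserved for model:", sorted(isolated))

    # Harmless demonstration: re-apply this process's existing mask.
    current = os.sched_getaffinity(0)
    print("current affinity:", sorted(pin_process(0, current)))
```

In this picture, every ordinary process would be pinned to the housekeeping set and only the model process to the reserved set. A real cpuset configuration does this at the kernel level for whole groups of tasks, which is what the offline testing needs to work out.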
I'm going to start testing cpuset offline to figure out exactly what would need to be done. |