DTT can now get testpoint data, and the synchronization errors have gone away.
I am not sure why the sync errors went away. I did install ntp while trying to fix something else, so maybe that brought our clock close enough to the cymac's.
The testpoint problem was fixed with the help of Chris Wipf. There was some trouble with the DTT RPC communication with the awgtpman processes running on the cymac. The command "diag -i" shows what DTT thinks the network configuration is. It was giving this before:
controls@gaston:~$ diag -i
Diagnostics configuration:
awg 5 0 127.0.1.1 822095877 1 10.0.5.11
awg 7 0 127.0.1.1 822095879 1 10.0.5.11
awg 8 0 127.0.1.1 822095880 1 10.0.5.11
nds * * 10.0.5.11 8088 * 127.0.0.1
tp 5 0 127.0.1.1 822091781 1 10.0.5.11
tp 7 0 127.0.1.1 822091783 1 10.0.5.11
tp 8 0 127.0.1.1 822091784 1 10.0.5.11
The problem turned out to be all those 127.0.1.1 entries. 127.0.1.1 is a loopback address, so DTT running on another machine cannot reach awgtpman through it. My guess is that the problem is new because we have our framebuilder and nds running on the same machine as our frontend. When the nds tries to get the ip address of the frontend (where awgtpman is running), the host lookup for cymac1 returns 127.0.1.1 instead of the correct ip address.
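A quick way to check a lookup like the one described above is to resolve the hostname and test whether the result is a loopback address. This is only a minimal sketch using the Python standard library, not part of DTT or the CDS tools. The function name resolves_to_loopback and the resolver argument are made up for illustration.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname, resolver=socket.gethostbyname):
    """Resolve hostname and report whether it lands on a loopback address.

    Returns a tuple (address, is_loopback). The resolver argument is a
    hypothetical hook so the check can be exercised without a real lookup.
    """
    address = resolver(hostname)
    # Everything in 127.0.0.0/8 counts as loopback, including 127.0.1.1.
    return address, ipaddress.ip_address(address).is_loopback
```

Running resolves_to_loopback("cymac1") on the cymac itself would show what the local hosts file hands back for that name. A True in the second slot means other machines will be given an address they cannot use.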
The fix is a bit of a hack: I edited the hosts file on the cymac so that cymac1 points to 10.0.5.11 instead of 127.0.1.1. Now diag -i gives:
controls@gaston:~$ diag -i
Diagnostics configuration:
awg 5 0 10.0.5.11 822095877 1 10.0.5.11
awg 7 0 10.0.5.11 822095879 1 10.0.5.11
awg 8 0 10.0.5.11 822095880 1 10.0.5.11
nds * * 10.0.5.11 8088 * 127.0.0.1
tp 5 0 10.0.5.11 822091781 1 10.0.5.11
tp 7 0 10.0.5.11 822091783 1 10.0.5.11
tp 8 0 10.0.5.11 822091784 1 10.0.5.11
With this configuration, DTT can get testpoints.
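To catch this kind of misconfiguration again, the diag -i output can be scanned for loopback addresses. The sketch below assumes that the fourth whitespace-separated field of each awg/tp/nds line is the address that was wrong above. That reading is inferred from the two listings in this entry, not from DTT documentation, and the function name loopback_entries is made up for illustration.

```python
import ipaddress


def loopback_entries(diag_output):
    """Return the lines of `diag -i` output whose fourth field is a loopback address.

    Assumption: the fourth field is the host address of the awg/tp/nds
    server, as the listings in this entry suggest.
    """
    flagged = []
    for line in diag_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # prompt and header lines have fewer fields
        try:
            address = ipaddress.ip_address(fields[3])
        except ValueError:
            continue  # fourth field is not an IP address
        if address.is_loopback:
            flagged.append(line.strip())
    return flagged
```

Fed the first listing above, this should flag the six awg and tp lines. Fed the second listing, it should return an empty list.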
I am not sure if there is a better solution than messing with the hosts file. It is not great because the entry will have to be changed if the network is reconfigured.