I spent the past few days troubleshooting errors that came up while creating the basic gym environment for thermal control of the vacuum can, and finally got it to work. I also ran some tests in a Jupyter notebook, and the environment seems to function as expected.
Features of this environment:
1. The state is a numpy array containing the current temperature of the can and the ambient temperature. The can temperature is initially set to a random value between 15 and 30 C, and the ambient temperature to 20 C.
2. The ambient temperature is currently modelled as a sinusoid with an amplitude of 5 C about a mean of 20 C and a period of 6 hours.
3. Allowed actions are integer-valued heating powers between 0 and 20 W.
4. Each time-step lasts 10 seconds, and a single action is applied for this duration. The state is updated after each time-step using scipy.integrate.odeint to calculate the evolution of the vacuum can temperature. Both heat conduction through the foam and the applied heating drive this evolution. The heat conduction is modelled as in the previous simulations and calculations.
5. A single episode of the game runs for as long as the temperature of the can stays between 15 and 60 C.
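The features above could be sketched roughly as follows. This is only a minimal illustration, not the actual VacCan-v0 code: it is a plain class that mimics the reset()/step() pattern without depending on gym, the names VacCanSketch, HEAT_CAPACITY and THERMAL_RESISTANCE are made up, and the two thermal parameters are placeholder values rather than the ones from the earlier simulations. The phase of the ambient sinusoid (starting at 20 C and rising) is also an assumption, and no reward is returned since none is described here.

```python
import numpy as np
from scipy.integrate import odeint

# Placeholder thermal parameters (assumptions, NOT the values from the earlier simulations).
HEAT_CAPACITY = 500.0       # J/K, lumped heat capacity of the can
THERMAL_RESISTANCE = 10.0   # K/W, conduction through the foam
DT = 10.0                   # s, duration of one time-step
T_MIN, T_MAX = 15.0, 60.0   # C, the episode runs while the can stays in this range
PERIOD = 6 * 3600.0         # s, period of the ambient oscillation


def ambient_temperature(t):
    """Sinusoidal ambient temperature: 5 C amplitude about 20 C, 6 h period."""
    return 20.0 + 5.0 * np.sin(2.0 * np.pi * t / PERIOD)


def can_dTdt(T, t, power):
    """Lumped model: heater power plus conduction to ambient through the foam."""
    return (power + (ambient_temperature(t) - T) / THERMAL_RESISTANCE) / HEAT_CAPACITY


class VacCanSketch:
    """Hypothetical stand-in with a gym-like reset()/step() interface."""

    def __init__(self, seed=None):
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.t = 0.0
        # State = [can temperature, ambient temperature]
        self.state = np.array([self.rng.uniform(15.0, 30.0), ambient_temperature(0.0)])
        return self.state

    def step(self, action):
        power = int(action)
        if not 0 <= power <= 20:
            raise ValueError("heating power must be an integer between 0 and 20 W")
        # Integrate the can temperature over one time-step at constant power.
        sol = odeint(can_dTdt, [self.state[0]], [self.t, self.t + DT], args=(power,))
        self.t += DT
        T_can = sol[-1, 0]
        self.state = np.array([T_can, ambient_temperature(self.t)])
        done = not (T_MIN <= T_can <= T_MAX)
        return self.state, done


env = VacCanSketch(seed=0)
state, done = env.step(20)  # apply 20 W for one 10 s time-step
print(state, done)
```

A real gym environment would additionally declare action and observation spaces and return a reward from step(); those are left out because they depend on details not covered in this entry.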
Tests performed:
1. gym.make('VacCan-v0') ran without any unexpected errors.
2. The state, the sampled actions, and the output of step() were all as predicted.
3. Multiple iterations of step() with zero, constant, and random heating all behaved as physically expected.
4. The env was tested with a random agent, i.e., one that applies a random action at every step until the game terminates. Each time, the game terminated (the temperature of the can rose above 60 C) in 150-200 time-steps (25-35 min, the expected time when running in the lab).
It seems that this basic testing environment is ready to be used with a learning algorithm that would try to maintain the temperature.
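The random-agent test can be sketched as the loop below. To keep the snippet runnable on its own, it uses a hypothetical StandInEnv instead of the real VacCan-v0: the stand-in ignores conduction and the ambient temperature, uses a placeholder heat capacity, and returns only (temperature, done) from step(). Only the loop structure and the conversion from time-steps to minutes (10 s per step, so 150 steps is 25 min) are meant to carry over.

```python
import random

DT = 10.0  # s per time-step, as in the environment description


class StandInEnv:
    """Hypothetical stand-in so the loop runs without the real environment.

    Crude assumption: the can only heats up (no conduction, no ambient term).
    """

    HEAT_CAPACITY = 500.0  # J/K, placeholder value

    def reset(self):
        self.temperature = 20.0
        return self.temperature

    def step(self, action):
        # Energy deposited in one time-step raises the temperature.
        self.temperature += action * DT / self.HEAT_CAPACITY
        done = not (15.0 <= self.temperature <= 60.0)
        return self.temperature, done


def run_random_agent(env, rng, max_steps=100_000):
    """Apply uniformly random integer powers in [0, 20] W until the episode ends."""
    env.reset()
    for n_steps in range(1, max_steps + 1):
        _, done = env.step(rng.randint(0, 20))
        if done:
            return n_steps
    return max_steps


steps = run_random_agent(StandInEnv(), random.Random(0))
print(f"terminated after {steps} steps = {steps * DT / 60:.1f} min")
```

With the real environment, the same loop would call the gym step() and unpack whatever tuple that version of gym returns.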