multithreading / multiprocessing for optimization #343

dietmarwo · 2022-05-31T11:44:44Z

Did some experiments regarding
https://tespy.readthedocs.io/en/main/tutorials_examples.html#thermal-power-plant-efficiency-optimization
Using the given optimizer, I noticed that only 3% of my 16-core CPU was utilized. So I tried an optimizer
capable of parallel function evaluation, but got strange results and exceptions. The following code
reproduces the effect:

    import threading

    # PowerPlant is the model class from the tutorial linked above.

    def test():

        class calc(threading.Thread):

            def __init__(self, threadID):
                threading.Thread.__init__(self)
                self.threadID = threadID
                self.model = PowerPlant()  # separate model instance per thread

            def run(self):
                print("Starting " + str(self.threadID))
                for i in range(100):
                    try:
                        print(self.threadID, i, self.model.calculate_efficiency([40.0, 1.0]))
                    except Exception as ex:
                        pass  # exceptions are ignored here; wrong values are still printed

        for i in range(32):
            calc(i).start()

You see errors like

13:24:46-ERROR-Singularity in jacobian matrix, calculation aborted! Make sure your network does not have any linear dependencies in the parametrisation. Other reasons might be
-> given temperature with given pressure in two phase region, try setting enthalpy instead or provide accurate starting value for pressure.
-> given logarithmic temperature differences or kA-values for heat exchangers,
-> support better starting values.
-> bad starting value for fuel mass flow of combustion chamber, provide small (near to zero, but not zero) starting value.

As you can see, I create a separate model instance for each thread, but the instances do not seem to be independent.
Why is that? Is there anything I can do (such as setting a configuration parameter) to get this working?

For optimizing bigger models it would be nice if multithreaded execution were supported.
I can filter out exceptions, but unfortunately sometimes wrong values are returned without any error.
So I could "optimize" the power plant to nearly 100% efficiency, but re-evaluating the resulting x-value later
showed a "normal" efficiency.
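
One way to catch such silently wrong values is to compare parallel results against a serial reference for the same input. The following is only a sketch: `find_deviations` and `make_evaluator` are hypothetical names, and the stand-in evaluator at the bottom is not a TESPy model.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def find_deviations(make_evaluator, x, n_workers=8, n_calls=64, rel_tol=1e-9):
    """Return (index, value) pairs of parallel results that differ from a serial reference."""
    reference = make_evaluator()(x)  # single-threaded baseline

    def worker(_):
        try:
            return make_evaluator()(x)  # fresh evaluator for every call
        except Exception:
            return float("nan")  # treat exceptions as deviations

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(worker, range(n_calls)))

    # math.isclose is False for nan, so failed evaluations are reported too.
    return [
        (i, r) for i, r in enumerate(results)
        if not math.isclose(r, reference, rel_tol=rel_tol)
    ]


# Stand-in evaluator (an assumption for illustration only).
deviations = find_deviations(lambda: (lambda x: x[0] * 0.01), [40.0, 1.0])
```

With the tutorial model, `make_evaluator` could presumably be `lambda: PowerPlant().calculate_efficiency`; a non-empty result would then point to evaluations that disagree with the serial run.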

My OS is Linux Mint 20.2 based on Ubuntu, python is anaconda for python 3.8.

dietmarwo · 2022-05-31T12:26:21Z

Tried a workaround:

Using 'memorise_fluid_properties=False' in the network config

class PowerPlant():

    def __init__(self):
        self.nw = Network(
            fluids=['BICUBIC::water'],
            p_unit='bar', T_unit='C', h_unit='kJ / kg',
            iterinfo=False, memorise_fluid_properties=False)

did not completely solve the problem; I still got strange values.
Additionally using thread local models in the fitness function:

        def __init__(self):
...
            self.local = threading.local()
            
        def get_model(self):
            try:
                return self.local.model
            except AttributeError:
                return self.new_model()
            
        def new_model(self):
            self.local.model = PowerPlant()
            return self.local.model
        
        def efficiency(self, x):   
            try: 
                eff = self.get_model().calculate_efficiency(x)    
                if not np.isfinite(eff):
                    return 0
            except Exception as ex:
                return 0  
            return eff
                         
        def fitness(self, x):
            y = -self.efficiency(x)
...

I still get strange outliers. Maybe the usage of static values like

# create dict for tespy fluids
TESPyFluid.fluids = {}

is generally a bad idea. Instead you could define fluids as an instance member
of TESPyFluid (self.fluids = {}). The same goes for the Memorise dictionaries, which should
be instance members of Memorise. Static variables have strange side effects under
parallel execution; see for instance https://volosoft.com/blog/Prefer-Singleton-Pattern-over-Static-Class
This issue is not specific to Python; it applies to all programming languages.
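
To illustrate the point, here is a toy sketch of how a class-level dictionary is shared by all instances and threads, while an instance-level one is not. The class names `SharedCache` and `OwnCache` are hypothetical and are not TESPy classes.

```python
import threading


class SharedCache:
    # Class-level dict: a single object shared by every instance (a "static").
    values = {}

    def remember(self, key, value):
        self.values[key] = value


class OwnCache:
    def __init__(self):
        # Instance-level dict: each object gets its own.
        self.values = {}

    def remember(self, key, value):
        self.values[key] = value


def run(cache_cls, n_threads=4):
    """Let each thread write its id into its own cache object, then read back."""
    caches = [cache_cls() for _ in range(n_threads)]
    threads = [
        threading.Thread(target=c.remember, args=("last_writer", i))
        for i, c in enumerate(caches)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [c.values["last_writer"] for c in caches]


shared = run(SharedCache)  # every entry is identical: all threads wrote to one dict
own = run(OwnCache)        # each instance keeps its own value
```

Even though every thread has its own `SharedCache` object, all of them see whichever value was written last, which is the kind of cross-talk that separate model instances alone cannot prevent.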

Now I can only recreate the model for each function evaluation, which slows things down
significantly. My CPU could handle about 600 model evaluations per second if singletons
were used instead of statics. And there are much bigger CPUs available already.

Update:

Found out that recreating the model in case of any error helps - together with using thread local models:

    def get_model(self):
        try:
            return self.local.model
        except AttributeError:
            return self.new_model()

    def recreate_model(self):
        self.local.model = PowerPlant()

    def efficiency(self, x):
        try:
            eff = self.get_model().calculate_efficiency(x)
            if not np.isfinite(eff):
                self.recreate_model()
                return 0
            return eff
        except Exception as ex:
            self.recreate_model()
            return 0

So it seems that at some point an error occurs, and recreating the model in that case helps. The

                if not np.isfinite(eff):
                    self.recreate_model()

is required. It seems that sometimes a nan value is computed, which leaves the model in "bad shape".
I am not sure this really has to do with multithreading. I could also reproduce this issue with single-threaded
optimizers, but not with pg.ihs from your example.
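
Put together, the thread-local model plus recreation on failure can be sketched as a self-contained example. `FlakyModel` is a hypothetical stand-in that mimics the observed behaviour (nan once, implausible values afterwards); it is not the tutorial's PowerPlant, and the `Fitness` wrapper is only an assumption about how the pieces could be arranged.

```python
import math
import threading


class FlakyModel:
    """Hypothetical stand-in for a model that can end up in a bad state."""

    def __init__(self):
        self.corrupted = False

    def calculate_efficiency(self, x):
        if self.corrupted:
            return 0.99  # implausibly high value from a model in a bad state
        if x[0] < 0:
            self.corrupted = True  # a failed solve leaves bad internal state
            return float("nan")
        return 0.4


class Fitness:
    def __init__(self, model_factory):
        self.model_factory = model_factory
        self.local = threading.local()  # separate attribute namespace per thread

    def get_model(self):
        model = getattr(self.local, "model", None)
        if model is None:
            model = self.local.model = self.model_factory()
        return model

    def efficiency(self, x):
        try:
            eff = self.get_model().calculate_efficiency(x)
        except Exception:
            eff = float("nan")
        if not math.isfinite(eff):
            self.local.model = None  # discard; the next call builds a fresh model
            return 0.0
        return eff


fit = Fitness(FlakyModel)
results = [fit.efficiency(x) for x in ([40.0, 1.0], [-1.0, 1.0], [40.0, 1.0])]
# results == [0.4, 0.0, 0.4]; without the reset the last value would be 0.99
```

Discarding the model lazily (setting it to `None`) avoids paying for a rebuild when the thread never evaluates again; rebuilding eagerly, as in the snippet above, works equally well.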

dietmarwo · 2022-06-01T15:16:28Z

Now I have uploaded a Python script to actually reproduce the issue:

Execute https://github.com/dietmarwo/fast-cma-es/blob/master/examples/powerplant.py
after commenting out
https://github.com/dietmarwo/fast-cma-es/blob/61335c0278a7fcd9f241caedd5e7c003c01a987b/examples/powerplant.py#L260
which recreates the model in case of an error.

See the corresponding tutorial on how to do parallel optimization with tespy:
https://github.com/dietmarwo/fast-cma-es/blob/master/tutorials/PowerPlant.adoc

I noticed that BLAS parallelization as configured by anaconda actually harms the performance of tespy simulations.
So I do now:

        def fitness(self, x):
            with threadpoolctl.threadpool_limits(limits=1, user_api="blas"):
                f1 = 1 / self.model.calculate_efficiency(x)

for pygmo fitness which both speeds up the simulation and reduces CPU consumption.
This setting is essential for parallel fitness evaluation, fcmaes does this automatically.
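
For other optimizers, the same limit can be applied to any fitness callable with a small decorator. This is a sketch: `limit_blas` is a hypothetical helper name, the objective is a placeholder, and the fallback branch simply runs without a limit when threadpoolctl is not installed.

```python
import contextlib
import functools

try:
    from threadpoolctl import threadpool_limits
except ImportError:  # threadpoolctl not installed: run without a limit
    threadpool_limits = None


def limit_blas(func):
    """Run func with BLAS restricted to one thread if threadpoolctl is available."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if threadpool_limits is None:
            ctx = contextlib.nullcontext()
        else:
            ctx = threadpool_limits(limits=1, user_api="blas")
        with ctx:
            return func(*args, **kwargs)
    return wrapper


@limit_blas
def fitness(x):
    # Placeholder objective; a real one would call the model here.
    return sum(v * v for v in x)
```

Entering the context manager on every call has some overhead, so for very cheap objective functions it may be better to apply the limit once around the whole optimization run.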

fwitte · 2022-06-01T16:43:33Z

Dear @dietmarwo,

thank you very much for reaching out, your investigations and suggestions are highly appreciated! I did not yet have the time to go through the code in detail, since I am currently working on TESPy in my free time only. Also, I do not have a lot of experience with multi-processing, so I will have a couple of questions, I guess.

Generally, implementing parallel processing might require some refactoring in the back-end. However, this might be beneficial for further development of the software in any case. There are some parts that I might have set up differently if I restarted now. If you'd be interested in supporting or advising, please let me know.

I will come back to this in a couple of weeks (I will be on holiday :)) and I am looking forward to your improvement and example. Maybe we could link that from the documentation as well. Also, I would like to invite you to the user meeting we are planning (https://tespy.readthedocs.io/en/main/regular_meeting.html).

Best regards

Francesco

dietmarwo · 2022-06-02T17:52:54Z

Hi Francesco,
in fact I finally got the parallelism working (by using thread local models).
I described the remaining issue here
https://github.com/dietmarwo/fast-cma-es/blob/master/tutorials/PowerPlant.adoc
Sometimes the model gets corrupted, even single-threaded. I get nan as the return value and after that strange efficiencies which are impossibly high. This can be mitigated by recreating the model after receiving nan. So I think it is a minor issue now, which nevertheless should be resolved.
Refactoring would be nice, but as parallelism works now it is probably not the highest priority. I will try to participate in the meeting, but I am not really a user. I maintain fcmaes, a kind of "competitor" to pygmo. Dario (the owner of pygmo) and I probably disagree about what works better :-). I am mostly interested in having real-world application examples for fcmaes, so maybe we can cooperate on producing a more complex TESPy optimization example, perhaps involving more constraints and multiple objectives. Since parallelism works, I don't see an issue with the performance of TESPy.

fwitte mentioned this issue Jun 1, 2022

Package revisions #295
