How to make best use of multithreading with monte carlo simulations?

James Source

I am writing a code to benchmark simulation algorithms using the basics of Monte carlo simulations - so I am generating a random system (just an integer) and running the simulation algorithms on the randomly generated system. I need to do this many many times and although the algorithms are relatively conceptually simple they take a few seconds to run because they contain many loops.

for i=1:number of algorithms
    for i=1:number of repeats
        if algo = 1
            //run the first algorithm [for loops]
        if algo = 2
            //run the second algorithm [while]
        if algo = 3
            //run the third algorithm [while]

where each algorithm works differently. The first algorithm can be further broken down into for loops where it is run many times and the highest score is selected so I imagine even the algorithm could be multithreaded. The other two would be much more complex to make multithreaded.

My question is how to split the program into different threads. There appears to be many different ways I could approach this and I am very new to multithreading so I have no idea what would be best.

Option 1: Split the threads immediately and run different algorithms on each thread.

Option 2: Split the threads with the second for loop, so the number of repeats are split up over the different threads.

Option 3: Try to break down the algorithm steps into smaller chunks which can be parallelized

cmultithreadingalgorithmperformanceparallel-processing

Answers

answered 1 week ago myradio #1

It depends in how long would it take to each repetition and each algorithm to be executed. Assuming that each repetition takes the same as the others (for each particular algorithm), most likely the best for these kind of cases is to trivially split the outer for in different threads (switching the two for loops to have repetitions that take the same time).

Running each algorithm in a different thread instead would have no advantage, since the algorithms are not going to take exactly the same time and you will end up wasting computational power.

Option 3 sounds very unlikely for a case like this. Besides the fact that you will have to think and make a significantly more complex program, I doubt you can gain something from paralellizing the different parts of the algorithm and I think is more likely that the code will be slower due to the different threads having to wait each other.

As a side note, as I said in the comments, for this very simple cases of parallelization I would recommend you to consider splitting the runs outside the C code but in a shell script. Each job you launch will be run in a different core and you will gain a lot of flexibility. You will also be able to run in in a cluster with almost no changes if any.

comments powered by Disqus