Parallel Parameter Scan

We revisit the basic parameter scan example (see Parameter Scan) and show how to parallelize the scan on a single computer or on a computer cluster. To manage job scheduling, JCMsuite starts a central daemon on the local machine, which delegates jobs to the registered computer resources.

Let’s come back to our example:

import jcmwave
import numpy as np
import os

jcmwave.daemon.shutdown()
jcmwave.daemon.add_workstation(Hostname ='localhost', 
                               Multiplicity = 2, 
                               NThreads = 1)

radii = np.linspace(0.3, 0.5, 40)
job_ids = []

# loop over radius values
for radius in radii:

    keys = {'radius': radius}
    # jcmwave.solve immediately returns a job identifier
    job_id = jcmwave.solve('mie2D.jcmp', keys=keys, 
        temporary = True
        )
    job_ids.append(job_id)

# here we wait until all jobs are finished
results, logs = jcmwave.daemon.wait(job_ids = job_ids)    

# now we collect the computed scattering cross sections
# as in the sequential example
scattering_cross_section_scan = []
for iR in range(len(radii)):
    scs = results[iR][1]['ElectromagneticFieldEnergyFlux'][0][0].real
    scattering_cross_section_scan.append(scs)

# plot scattering cross section against rod radius
import matplotlib.pyplot as plt
plt.plot(radii, scattering_cross_section_scan, '-+', linewidth=2, markersize=14)
plt.xlabel(r'radius [$\mu$m]', fontsize=14)
plt.ylabel('integral scattering cross section', fontsize=14)
plt.axis('tight')
plt.show()

The command in line 5 (jcmwave.daemon.shutdown) is explained later. In line 6 we add a computer resource. Here, we use only our local computer (localhost), but we allow two simultaneous executions of the solver (Multiplicity=2), each using one CPU core (NThreads=1). Below we detail how to register a remote machine. Once a computer resource has been added, JCMsuite runs in daemon mode.
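How many simultaneous jobs a workstation can sensibly take depends on its core count. The following minimal sketch derives a Multiplicity value from the number of logical CPUs. The helper name local_resource_options and the policy of dividing the core count by NThreads are assumptions made for illustration; they are not part of JCMsuite.

```python
import os

def local_resource_options(nthreads=1, cpu_count=None):
    """Return keyword arguments for registering localhost (hypothetical helper)."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1   # os.cpu_count() may return None
    # assumed policy: never request more threads in total than there are cores
    multiplicity = max(1, cpu_count // nthreads)
    return {'Hostname': 'localhost',
            'Multiplicity': multiplicity,
            'NThreads': nthreads}

# cpu_count is fixed here only to make the example reproducible
options = local_resource_options(nthreads=1, cpu_count=2)
print(options)
```

With two cores and one thread per job this reproduces the settings used in the script above. The dictionary could then be passed on as jcmwave.daemon.add_workstation(**options).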

Lines 14-21 contain the scan over the radius of the rod. The solver is called in line 18. Setting the option temporary to True copies the project files to a unique temporary directory, which is deleted after the computation. This is necessary because simultaneous computations cannot run in the same project folder. Alternatively, one can use the option working_dir to set a unique directory name for each parameter value (e.g. working_dir = 'tmp/radius%s' % radius). Since JCMsuite is in daemon mode, jcmwave.solve immediately returns a job identifier for the submitted project. We collect these job ids in the list job_ids.
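The working_dir alternative only works if every scan point really gets its own directory name. The sketch below shows one way to build such names; the helper working_dir_for and its naming scheme are hypothetical and chosen for illustration only.

```python
import numpy as np

radii = np.linspace(0.3, 0.5, 40)

def working_dir_for(index, radius, root='tmp'):
    """Build a directory name that is unique per scan point (hypothetical helper)."""
    # the index guarantees uniqueness even if two radii format identically
    return '%s/radius_%03d_%.4f' % (root, index, radius)

working_dirs = [working_dir_for(i, r) for i, r in enumerate(radii)]

# every job gets its own folder, so simultaneous runs cannot collide
assert len(set(working_dirs)) == len(radii)
print(working_dirs[0])
```

Inside the loop, working_dir = working_dirs[i] would then take the place of temporary = True in the call to jcmwave.solve. Unlike temporary directories, these folders are kept after the computation.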

In line 24 the script waits until all jobs are finished. The command jcmwave.daemon.wait then returns all computed results in a list. For example, results[4] corresponds to the job with id job_ids[4] and has exactly the same format as the return value of a sequential call of jcmwave.solve. Since we filled the list job_ids in the same order as the array radii, this is precisely the result for the parameter radii[4]. The logging information (error messages, etc.) for each individual job is returned correspondingly in logs. With the command jcmwave.daemon.wait the finished jobs are released from the daemon, so that their job identifiers become invalid.
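Because the results come back in submission order, pairing them with the scanned parameters is a plain zip. The sketch below illustrates this with placeholder data that only mimics the nesting accessed in the script above (results[i][1]['ElectromagneticFieldEnergyFlux'][0][0]); the numbers are made up, and the helper collect_cross_sections is a hypothetical name.

```python
import numpy as np

def collect_cross_sections(results):
    """Extract the real part of the first energy-flux entry of every job result."""
    return [res[1]['ElectromagneticFieldEnergyFlux'][0][0].real for res in results]

# Placeholder data with the same nesting as in the script above;
# the values carry no physical meaning.
radii = [0.3, 0.4, 0.5]
fake_results = [
    [None, {'ElectromagneticFieldEnergyFlux': [np.array([complex(i + 1, 0.5)])]}]
    for i in range(len(radii))
]

scan = collect_cross_sections(fake_results)

# results[i] belongs to radii[i] because jobs were submitted in that order
for radius, scs in zip(radii, scan):
    print(radius, scs)
```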

Adding a remote computer

To add a remote computer, we pass the hostname of the computer (say ‘maxwell.xzy.com’), the login name of the user (say ‘james’), and the JCMsuite installation directory on the remote machine:

jcmwave.daemon.add_workstation(Hostname = 'maxwell.xzy.com',
                               Login = 'james',
                               JCMROOT = '<JCMROOT>',
                               Multiplicity = 20,
                               NThreads = 2)

This command allows us to use the remote computer with 20 simultaneous jobs, each running with two threads.

Note

A password-free login to the remote server is required, either over ssh or PowerShell (Windows).

JCMsuite manages all necessary file transfers between our local machine and the remote server. After a job has finished, we find all result files in the project folder on the local machine, just as we would expect from a sequential call.

Warning

Be careful not to add the same computer resource repeatedly!

Calling the above command jcmwave.daemon.add_workstation twice will allow 40 simultaneously running jobs on the remote computer! It is therefore advisable to configure your computer cluster separately from starting a scan.
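One way to guard against accidental double registration within a single Python session is a small wrapper that remembers which hosts it has already added. This is a sketch only: make_once_guard is a hypothetical helper, and fake_add_workstation stands in for jcmwave.daemon.add_workstation so that the example runs without JCMsuite. The guard knows nothing about resources registered elsewhere.

```python
def make_once_guard(add_func):
    """Wrap add_func so each (Hostname, Login) pair is registered only once."""
    registered = set()

    def add_once(**options):
        key = (options.get('Hostname'), options.get('Login'))
        if key in registered:
            return False          # already registered, do nothing
        add_func(**options)
        registered.add(key)
        return True

    return add_once

# Stand-in for jcmwave.daemon.add_workstation (records the calls it receives).
calls = []
def fake_add_workstation(**options):
    calls.append(options)

add_once = make_once_guard(fake_add_workstation)
first = add_once(Hostname='maxwell.xzy.com', Login='james',
                 Multiplicity=20, NThreads=2)
second = add_once(Hostname='maxwell.xzy.com', Login='james',
                  Multiplicity=20, NThreads=2)
print(first, second, len(calls))
```

In a real script the guard would be created as make_once_guard(jcmwave.daemon.add_workstation). The second call is then skipped, so the remote computer stays at 20 simultaneous jobs.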

It is also possible to chain successive logins to reach a computer behind a central gateway (say gateway.xzy.com):

jcmwave.daemon.add_workstation(Hostname = 'gateway.xzy.com;maxwell',
                               Login = 'james1;james2',
                               JCMROOT = '<JCMROOT>',
                               Multiplicity = 20,
                               NThreads = 2)

This logs in to the computer ‘gateway.xzy.com’ with user name ‘james1’ and from there to ‘maxwell’ with login name ‘james2’.
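For longer login chains it can be less error-prone to assemble the semicolon-separated strings from a list of hops. The helper chain_options below is a hypothetical convenience function, not part of JCMsuite; it only builds the Hostname and Login values in the format shown above.

```python
def chain_options(hops):
    """Join (hostname, login) pairs into semicolon-separated option strings."""
    hostnames = ';'.join(host for host, _ in hops)
    logins = ';'.join(login for _, login in hops)
    return {'Hostname': hostnames, 'Login': logins}

# hops are listed in login order: gateway first, target machine last
options = chain_options([('gateway.xzy.com', 'james1'),
                         ('maxwell', 'james2')])
print(options)
```

The resulting dictionary could be combined with the remaining options, for example jcmwave.daemon.add_workstation(JCMROOT = '<JCMROOT>', Multiplicity = 20, NThreads = 2, **options).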

Daemon shutdown

The command jcmwave.daemon.shutdown terminates a running daemon, and JCMsuite returns to sequential mode. We used this command at the top of the above script to avoid adding computer resources repeatedly.

The next section Persistent Parameter Scan demonstrates how to organize and store the results of parameter scans.