How do I use MPI_Comm_spawn to start worker processes on a remote node?
I'm using Open MPI 1.4.3, and I tried this code:
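    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", "node2");   /* ask for the workers to run on node2 */
    MPI_Comm intercom;
    MPI_Comm_spawn("worker",               /* executable to launch */
                   MPI_ARGV_NULL,          /* no command-line arguments */
                   nprocs,                 /* number of workers to start */
                   info,                   /* placement hints set above */
                   0,                      /* root rank within MPI_COMM_SELF */
                   MPI_COMM_SELF,
                   &intercom,              /* intercommunicator to the workers */
                   MPI_ERRCODES_IGNORE);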
But it fails with this error message:
--------------------------------------------------------------------------
There are no allocated resources for the application
  worker
that match the requested mapping:

Verify that you have mapped the allocated resources properly using the
--host or --hostfile specification.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
If I replace "node2" with the name of my local machine, it works fine. If I ssh into node2 and run the same thing there (with "node2" in the info dictionary), it also works fine.
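I don't want to launch the parent process with mpirun, so I'm just looking for a way to dynamically spawn processes on remote nodes. Is this possible?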
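Thanks. The reason I want to avoid mpirun is that I'm writing a MATLAB mex file to offload some computation. So all I have is a C file that MATLAB calls for me, which means the hostname needs to be specified programmatically. I guess that means I would have to somehow call mpirun in a new process from my mex file? – krashalot 2010-11-24 00:25:59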