Does anyone know the method to update cuda on many nodes at the same time?

kwang · August 7, 2012, 1:38pm

Hello all,

Recently, after a discussion with the administrator of the cluster
at our institute, he promised to update the CUDA driver and
CUDA toolkit to a newer version (the current versions are
CUDA driver 256 and CUDA toolkit 3.1). As a result,
I will be able to accelerate my calculations with the new user-cuda package.

But he also posed a question to me: because the number of
nodes in this cluster is large (100+), updating the driver
and toolkit will be time-consuming. He has already searched for
a way to update these nodes in one batch (I mean updating them at the same time),
but found no answers.

Does anyone know a way to update CUDA on many nodes at the same time?
This question may go beyond the scope of this mailing list, but I have no
idea where I should ask it properly.

Thank you in advance !

Kai Wang
Institute of Metal Research, CAS

akohlmey · August 7, 2012, 2:16pm

there are a bunch of ways to do this.
if your sysadmin doesn't know, you should
get a new one. this is basic admin knowledge.

there are some commercial cluster deployment tools.
there are also things like SystemImager, or one can
just build custom rpms and use software like c3
to run commands on groups of nodes in parallel.

axel.
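
To make the "run commands on groups of nodes in parallel" idea concrete, here is a minimal sketch in Python. It is not how c3 or SystemImager work internally; it only illustrates the general pattern of fanning one command out over ssh. It assumes passwordless (key-based) ssh to every node. The function names and the runner parameter are hypothetical, introduced only for this illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_ssh_command(node, remote_command):
    """Return the argv list that runs remote_command on node via ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # so one misconfigured node cannot stall the whole batch.
    return ["ssh", "-o", "BatchMode=yes", node, remote_command]


def run_on_node(node, remote_command, runner=subprocess.run):
    """Run remote_command on one node and return (node, exit_code)."""
    # runner is a hypothetical hook: it defaults to subprocess.run but can
    # be swapped for a stub when trying the code without a real cluster.
    result = runner(
        build_ssh_command(node, remote_command),
        capture_output=True,
        text=True,
    )
    return node, result.returncode


def run_on_nodes(nodes, remote_command, max_workers=16, runner=subprocess.run):
    """Run remote_command on all nodes, at most max_workers at a time.

    Returns a dict mapping each node name to the exit code of its command.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pairs = pool.map(
            lambda node: run_on_node(node, remote_command, runner), nodes
        )
        return dict(pairs)
```

Under these assumptions, an admin would build a list of node names, call run_on_nodes(nodes, command) with whatever install command suits the site (for example an rpm upgrade of a custom-built driver package), and then re-check any node whose exit code is non-zero.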