Speeding up Structural Featurization

I’m trying to do some structural featurization and as expected it’s starting to take time. I would like to know how I can speed things up.

Specifically: (1) Would using Numba help? (2) I don’t have a GPU, but I can try using Colab. I don’t have any experience with CUDA. Is there any documentation on how to use it with matminer?

Hi @R_Walser, we currently do not have support for Numba, but you can use Python’s built-in multiprocessing. Matminer supports this through the set_n_jobs method available on any featurizer.

For the most speedup, set the number of jobs equal to the total number of cores on the machine and run on a large multicore single node with plenty of memory. You can also just do it on a laptop: typically 4 jobs (calling set_n_jobs(4) before featurize_dataframe) works pretty well on most laptops and gives almost a 4x speedup.
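To make the advice above concrete, here is a minimal sketch of setting the job count before featurizing. The helper names pick_n_jobs and featurize_in_parallel are hypothetical (they are not part of matminer); the only matminer calls assumed are set_n_jobs and featurize_dataframe, as mentioned above.

```python
import os


def pick_n_jobs(cap=None):
    """Return the number of worker processes to use.

    Defaults to every core the OS reports; `cap` (hypothetical parameter)
    limits that, e.g. cap=4 on a laptop.
    """
    cores = os.cpu_count() or 1  # cpu_count() can return None
    return min(cores, cap) if cap else cores


def featurize_in_parallel(df, featurizer, col_id, n_jobs=None):
    """Set the job count on a featurizer, then featurize the dataframe.

    `featurizer` is expected to be a matminer featurizer (anything that
    provides set_n_jobs and featurize_dataframe).
    """
    # The job count must be set before featurize_dataframe is called.
    featurizer.set_n_jobs(n_jobs if n_jobs else pick_n_jobs())
    return featurizer.featurize_dataframe(df, col_id)
```

With a real structure featurizer, for example DensityFeatures from matminer.featurizers.structure, and a dataframe whose structures live in a column assumed here to be named "structure", the call would look like featurize_in_parallel(df, DensityFeatures(), "structure", n_jobs=4). On Windows, multiprocessing code generally needs to run under an if __name__ == "__main__": guard.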
