I am trying to learn how to write parallel programs, or how to parallelize existing codes, for scientific computing, but I am struggling a bit to find the right place or resource to start with the basics. I would be happy to take suggestions on any books, courses or website links to start with. Also, if you know any other good ways to become proficient in parallelization, please do share. Thank you so much for your time.
Do any of you guys have references for starting to learn this?
(Hari is a nice colleague so I am tagging people I think know this kind of “fancy computer stuff” to help him out with references)
It has been too long since I started learning this myself, so my memory is a bit hazy.
I certainly cannot tell which is the best way. Here are some points worth considering:
programming scientific software - and thus by extension parallel programming - is more like a craft than a science, so finding and memorizing “rules” from a textbook is not going to work well
it is important to make sure that one has the basic skills, i.e. improve regular programming skills before even thinking about parallel programming. I strongly recommend doing this with a compiled language like Fortran or C (C++ can be a bit tricky), as this helps to understand the data model. Graduating from there to parallel programming in Python is straightforward. Starting with Python is not.
like any craft, it is best learned by finding a “master” for in-person tutoring. Many (super)computing centers have parallel programming courses as a way to get started
I always learn best when I acquire a skill with a specific project in mind. However, it is important to pick reachable goals and not start with something too complex and convoluted.
as with most crafts, one is never done learning, so keeping your skill alive requires regular practice.
I don’t think the material that you study from matters a lot. I’ve probably done close to 50 parallel programming courses and workshops during my scientific career and I know my talk slides are passable at best, but it mattered the most whether I was able to inspire the participants to sit down and give it a try and not give up on trying to figure things out on their own (with a few pointers from the faculty here and there). I have recently decided to stop doing them, because they have had less and less impact since many of the participants do not have the desire to get their “hands dirty”. YouTube videos and AI tools seem to give people an undeserved confidence in what they are trying to do and it is very tiresome to correct that.
To not only have warm words and cautionary tales, here is a link to the website of an HPC workshop we did in São Paulo in December 2019 at the ICTP-SAIFR.
I agree with @akohlmey: best way to learn is to try to solve a problem you have more than going through the reference doc. Following a course can give foundations but you’ll have to put it into practice to grok the language/skill you want to learn.
I am not very proficient in writing parallel code, but I can go through MPI instructions and understand what is going on. That is more because I have worked through some routines in LAMMPS (and other codes) than because of the time I’ve spent reading the documentation linearly.
I think “native” parallel Python is still a pain to grasp because of the GIL. Fortunately there is mpi4py, but I agree that MPI is far easier to understand in a low-level language (C, Fortran, Rust in the near future?).
As I know you are in France, have you heard of the IDRIS classes on the topic? The slides and materials are available online. PhD students and employees from French universities or CNRS can follow the course on-site for free (at Saclay) through the CNRS. They have a very good reputation.
Ahh nice !
Maybe if we had the funding for this in Brazil, one of these workshops could have become something like this MIT OpenCourseWare course: https://www.youtube.com/watch?v=mXkPCaZUXhg
Maybe not the “hands on” part, but at least the lectures. I always loved all the courses I watched on MIT OpenCourseWare and I always found them ultra valuable :v
I think this was mentioned by Rocio (my and Hari’s supervisor). But I remember there was some problem (maybe it was because they were in French? @hariharansudhakar ). In any case I am not the one looking for it right now: it is Hari who wants to learn. He has a SUPER fancy PhD project that requires some super fancy computing skills :"D
Me I want to learn this one day, but there are many things I think I should be learning before “how to parallel program”. But in my case I wanted more like some decent (like basic) skills. The IT guy here in the CNRS gave me some suggestions of courses that I noted down ! I think I also have saved the reply you gave me once I asked something about “learning computer stuff” in a private message ^^
Whenever I have a less charged head and thus more energy I will totally get into it (together with learning astronomy and relativity stuff :v - really want to buy one of those books by Stephen Hawking)
Worth noting that modern large language models are reasonably good at plagiarizing code, or rather at coding from scratch.
You can’t just plug them into an existing code base: coding “into” an existing code base takes understanding, and LLMs don’t understand things. But you can pose a little toy problem to an LLM and ask it how it would write the MPI code, and then try out its code (or try debugging it!). This little process can help you to learn quite quickly.
For example, here is an MPI code for checking if a number is prime, as generated by Claude.ai. Note some fundamental patterns of MPI parallelism:
Breaking the problem into a state-free for loop (in pseudocode):
    for each possible divisor:
        if input number was evenly divisible, return "true"
    if "true" was returned at any point:
        input number was divisible
Using a deterministic, if possibly less efficient, protocol to divide up the for loop between workers
Using a single MPI gather at the end and reporting the result (notice that each worker already starts out “knowing” the total number of workers and its own worker “number” – so it immediately knows which part of the loop to work on without needing an independent communication).
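The three patterns above can be sketched in plain Python. This is not the Claude-generated MPI code mentioned earlier, only an illustration under some assumptions: the names `has_divisor_in_slice` and `is_prime` are made up for this example, the strided split and the square-root bound are just one possible choice, and the "workers" are emulated one after another inside a single process. In an actual mpi4py program, `rank` and `size` would typically come from the communicator (`comm.Get_rank()`, `comm.Get_size()`) and the list of per-worker answers from a single `comm.gather`.

```python
import math


def has_divisor_in_slice(n: int, rank: int, size: int) -> bool:
    """State-free loop over the candidate divisors owned by one worker.

    Worker `rank` checks 2 + rank, 2 + rank + size, ... up to sqrt(n).
    The split depends only on (rank, size), so no communication is needed
    to decide who does what.
    """
    limit = math.isqrt(n)
    for d in range(2 + rank, limit + 1, size):
        if n % d == 0:
            return True
    return False


def is_prime(n: int, size: int = 4) -> bool:
    """Emulate `size` workers serially, then combine their answers."""
    if n < 2:
        return False
    # Stand-in for the single gather at the end: one boolean per worker.
    gathered = [has_divisor_in_slice(n, rank, size) for rank in range(size)]
    # n is prime only if no worker found a divisor in its slice.
    return not any(gathered)


if __name__ == "__main__":
    for n in (97, 91, 7919):
        print(n, is_prime(n))
```

Changing `size` changes which worker checks which divisor, but not the final answer, which is a handy sanity check when you later port such a loop to real MPI.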
I’m a little bit late for the discussion, so there isn’t much I can add, but I would like to underline one thing - the best parallel program is the one where each process works concurrently with no communication except gathering the results (something like the code @srtee shared).
Before you start to learn all about parallel programming, think whether your project can be similarly divided, because maybe you don’t need to learn much.
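As a rough illustration of a project that divides this way, here is a small Python sketch using only the standard library. Everything in it is an assumption made for the example: the toy task (a Monte Carlo estimate of pi), the function names `estimate_pi`, `run_all` and `run_in_processes`, and the choice of `concurrent.futures` rather than MPI. Each task depends only on its own seed, so the only "communication" is collecting the results at the end.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def estimate_pi(seed: int, samples: int = 10_000) -> float:
    """One independent task: a seeded Monte Carlo estimate of pi."""
    rng = random.Random(seed)  # private generator, no shared state
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


def run_all(seeds, mapper=map):
    """Apply the task to every seed; `mapper` decides serial or parallel."""
    return list(mapper(estimate_pi, seeds))


def run_in_processes(seeds, workers: int = 4):
    """Same work, spread over a pool of processes; results come back in order."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return run_all(seeds, mapper=pool.map)
```

`run_all(range(8))` runs the tasks serially, while `run_in_processes(range(8))` should give the same list using several processes; call the latter from under an `if __name__ == "__main__":` guard. If your real problem fits this shape, the serial and parallel versions differ by a single line.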
Personally, I never had any formal training in parallel programming and I learnt mostly from googling the problems I encountered :D. One protip - when searching for a solution to any problem, add before:2021 to your Google query to filter out AI-generated responses, which are often useless.