Thanks for the reply. Yes, I have them deserialized; they were not using explicit serialization previously. After switching to explicit serialization, I can see that the time is no longer dominated by importing. Unfortunately, it is still quite slow.
The FW spec is not very large either: it should be 10K at most.
I tried a profiling tool (py-spy) to see where most of the time is spent.
Here is the profiling result for getting 1000 FWs.
You can see that most of the time is spent in get_fw_dict_by_id. My understanding is that this function has to be called once for each FW to be retrieved, and each call runs one query against the fireworks collection to get the FW, followed by two queries against the launch collection to get its launches. Each query returns only a small amount of data, so repeating them for every FW is bound to be slow: the number of round trips scales linearly with the number of FWs to be retrieved.
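To illustrate what I mean, here is a minimal sketch of the two access patterns. It is not the FireWorks code: CountingCollection is a hypothetical in-memory stand-in for a MongoDB collection that only counts queries, and the field names (fw_id, launch_id, launches, archived_launches) are my assumptions about the document layout.

```python
# Hypothetical in-memory stand-in for a MongoDB collection; it only counts queries.
class CountingCollection:
    def __init__(self, docs):
        self.docs = docs
        self.queries = 0

    def find_in(self, key, values):
        """Return all docs whose `key` is in `values` (like one $in query)."""
        self.queries += 1
        wanted = set(values)
        return [d for d in self.docs if d[key] in wanted]


def fetch_one_by_one(fireworks, launches, fw_ids):
    """Per-FW pattern: one fireworks query plus two launch queries per id."""
    result = []
    for fw_id in fw_ids:
        fw = dict(fireworks.find_in("fw_id", [fw_id])[0])
        fw["launches"] = launches.find_in("launch_id", fw["launches"])
        fw["archived_launches"] = launches.find_in(
            "launch_id", fw["archived_launches"]
        )
        result.append(fw)
    return result


def fetch_batched(fireworks, launches, fw_ids):
    """Batched pattern: one fireworks query and one launch query in total."""
    fws = [dict(d) for d in fireworks.find_in("fw_id", fw_ids)]
    launch_ids = [
        lid for fw in fws for lid in fw["launches"] + fw["archived_launches"]
    ]
    by_id = {l["launch_id"]: l for l in launches.find_in("launch_id", launch_ids)}
    for fw in fws:
        fw["launches"] = [by_id[i] for i in fw["launches"]]
        fw["archived_launches"] = [by_id[i] for i in fw["archived_launches"]]
    return fws


def make_collections(n):
    """Build n made-up FWs, each with exactly one launch (illustrative data only)."""
    fireworks = CountingCollection(
        [{"fw_id": i, "launches": [i], "archived_launches": []} for i in range(n)]
    )
    launches = CountingCollection(
        [{"launch_id": i, "state": "COMPLETED"} for i in range(n)]
    )
    return fireworks, launches
```

In this toy model, fetching N FWs one by one issues 3N queries, while the batched version issues 2 regardless of N. Against a real database each query is a network round trip, which is where I would expect the difference to show up.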
Maybe this part of the code is worth refactoring? I feel this is getting too technical; perhaps I should open an issue on the GitHub page…
When you say “seconds”, do you mean that getting something like 500 FWs should finish within seconds? It takes around 2 minutes on my laptop.