Using "c=0" with maps

Hello:

I am trying to use maps with "c=0", i.e., assuming the same computational cost for all structures regardless of their size.

However, maps gets stuck finding the next structure when doing so. I thought maps chose the next structure based on the gain:
G_i = tr[…] / C_i

where tr[…] is the trace of an expression involving the design matrix X (whose columns are correlation functions) of the structures seen so far plus candidate structure i, and C_i is the estimated cost of structure i, which by default is N_i^3 (N_i being the number of atoms in the structure's unit cell).

There should not be anything wrong with using C_i = N_i^0 = 1, should there? In this case, we simply compare the variance reduction of structures without accounting for their relative costs.

I think this is what happens:
Maps stops looking at structures with larger unit cells when:
V_max / C_i < G_max

where G_max is the largest gain seen so far. V_max is the theoretical upper limit to the gain.

But if C_i = 1, we only stop when some structure seen so far has a gain larger than the theoretically possible value, which rarely (if ever) happens. I am not even sure a "<=" would help (it would make things better, but is not guaranteed to work): it can be tough to find the structure with exactly the largest possible gain.

Maybe there should be a user-defined threshold relative to the theoretically largest possible gain (in the case of no cost)? Something like: "when the best structure seen in the pool reaches 80 % of the largest possible gain, stop".

In any event, the conclusion seems to be that maps, in its current implementation, cannot run in a mode that assumes no computational cost for structures. I would love to hear that I am wrong about this.

Finally, I understand that maps should have a tough time with c=0, because I am asking for the correct ground-state line among an infinite set of structures. At least c>0 limits the search by restricting it to smaller structures. But there could be scenarios where you want c=0 and then rely on some other means to restrict the search (such as the proposed 80 % user-defined threshold).

Any thoughts?

With "-c=0", you are using the code in a way I had not intended :wink:
You are correct that the code would never stop, because it will not find a structure yielding a reduction in variance better than the theoretical max (which exceeds the feasible max even over an infinite set of structures).
Since you’ve found the line of code responsible, feel free to include your 80 % factor, but you will hit another hurdle: the enumeration problem becomes very expensive for larger cells.
(It helps to remove the option -DSLOWENUMALGO from the makefile, but only up to a point.)

I am not sure what you are trying to do. Perhaps a systematic enumeration is what you want, e.g.,
as in the genstr code of ATAT?

Thanks for your reply, Dr. van de Walle.

Basically, I am trying to let maps choose, with no cost weighting (that is, microcanonically? ha), from a pre-specified structure pool of bounded size (e.g., up to 12 atoms/unit cell).

Actually, I need maps to start making informed choices about which structure to add next only once it has seen some minimum-sized structure pool (that is, the user says: among all structures up to 6 atoms/unit cell, which is the next best to add? maps, as it stands, will not necessarily do this; it could stop at 2 atoms/unit cell). And in my case I want this pool to stay fixed for all structure additions (i.e., maps should not worry about expanding the pool, though I realize this is not currently possible).

This feature is useful for benchmarking purposes.

I think the solution is to extract the algorithm from maps and implement my own fixed-data-pool maps. I appreciate you making this code available as open source; it is very useful.

You could also play with the function calc_structure_cost to achieve a similar effect. But feel free to code around my stuff for your own experimentation.