Using associated data types to create unpacked data structures

Thu Aug 12 07:47:48 EDT 2010

On 12 August 2010 12:28, Johan Tibell <johan.tibell at gmail.com> wrote:
> As I understand it the generated code is not exported from the translation
> unit so there are no collisions at link time. We could do the same if we
> could force the generated type class instance to not be exported from the
> module.

Minor point: I think the standard practice is to export the code, but
mark it with an attribute that tells the linker to drop any duplicate
copies of the code associated with the name. So if I instantiate
vector<int> in A.cpp and B.cpp, then both A.o and B.o contain the code
for vector<int>, but upon linking these get commoned up so the final
executable only has one copy of the code (the mechanism is COMDAT
sections or, on ELF, weak symbols).

This produces a small space saving over the simple "instantiate at
every call site" model.
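
The link-time merging described above can be observed from inside a
program. What follows is only a minimal sketch: the names twice,
address_seen_by_a and address_seen_by_b are made up for illustration,
and the two "files" are collapsed into one so that it compiles on its
own. Splitting them into a real A.cpp and B.cpp that include the same
header should behave the same way, because the language requires both
to see a single function.

```cpp
// Would live in a shared header (hypothetical name).
template <typename T>
T twice(T x) {
    return x + x;
}

using IntFn = int (*)(int);

// Stand-in for A.cpp: using twice<int> here instantiates the template.
IntFn address_seen_by_a() {
    return &twice<int>;
}

// Stand-in for B.cpp: a second, independent use of the same instantiation.
IntFn address_seen_by_b() {
    return &twice<int>;
}

// True when both users ended up with the same single copy of the code.
bool instantiations_are_merged() {
    return address_seen_by_a() == address_seen_by_b();
}
```

With separate files compiled by GCC on an ELF platform, running
nm -C on each object file typically shows twice<int> as a weak
symbol in both, and the linked executable keeps one of them.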

It is likely that GHC could do slightly better because we could see
whether any *modules we were dependent on* had previously generated
the necessary specialisation, and reuse that code directly if it had.
We would still need the COMDAT stuff to improve situations where we
depend on two modules that have independently generated the same
specialisation, and to deal with cycles in the module dependency
graph.
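
For comparison only, the closest C++ analogue of "reuse the
specialisation a dependency already generated" that I know of is an
explicit instantiation declaration (extern template). This is a
sketch of that C++ feature, not of anything GHC does; sum_to and
sum_from_importer are hypothetical names, and the three "files" are
again collapsed into one so it compiles on its own.

```cpp
// --- would live in a shared header ---
template <typename T>
T sum_to(T n) {
    T total = T();
    for (T i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// Tells importers that sum_to<int> is provided by some other
// translation unit, so they need not generate their own copy.
extern template int sum_to<int>(int);

// --- would live in exactly one .cpp file (the "upstream module") ---
// Explicit instantiation definition: the one place the code is emitted.
template int sum_to<int>(int);

// --- would live in an importing .cpp file ---
// Calls the specialisation without instantiating it again.
int sum_from_importer() {
    return sum_to<int>(4);
}
```

The difference from the automatic scheme sketched above is that here
the programmer has to decide by hand which file owns the
instantiation.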

None of the machinery needed to make this happen exists in GHC at the
moment. It's an engineering problem that just needs time to be thrown
at it.

Cheers,
Max