# Reducing Memory Access
POLFED performance is often limited by memory bandwidth, not FLOPs. Reducing unnecessary reads and writes in the mapping is therefore one of the strongest optimization directions in POLFED.
## Why This Matters
Spectral transformation is usually the most time-consuming part of POLFED. Since POLFED is highly memory-bound, reducing memory access in the mapping can speed up the whole algorithm.
Coalesced memory access is also important: contiguous, regular access patterns help runtime systems optimize memory traffic and can improve vectorization behavior (including SIMD on CPUs).
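As a small illustration of why access patterns matter (a generic Julia micro-example, not part of the POLFED API): Julia arrays are column-major, so an inner loop over rows walks contiguous memory, while an inner loop over columns strides through it. Both functions below compute the same sum; only the traversal order, and hence the memory traffic, differs.

```julia
# Column-major traversal: the inner loop (over rows) touches contiguous memory.
function sum_colmajor(A::AbstractMatrix)
    s = zero(eltype(A))
    for j in axes(A, 2), i in axes(A, 1)  # row index i varies fastest
        s += A[i, j]
    end
    return s
end

# Row-major traversal: the inner loop strides by one column length per step.
function sum_rowmajor(A::AbstractMatrix)
    s = zero(eltype(A))
    for i in axes(A, 1), j in axes(A, 2)  # column index j varies fastest
        s += A[i, j]
    end
    return s
end

A = rand(1_000, 1_000)
sum_colmajor(A) ≈ sum_rowmajor(A)  # same result, different cache behavior
```

On large matrices, `sum_colmajor` is typically noticeably faster purely because of its access pattern; the same principle applies to custom mappings.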
## Configuration
This path is most useful when your Hamiltonian has structure you can exploit. For the beginner Quantum Sun model, there is usually no trivial custom mapping that gives large gains. In the XXZ model the structure is richer, so more optimizations are available. See Custom Mapping and GPU Implementation.
```julia
mapping = MappingConfig(
    parallel_strategy = MulColsParallel(),
    f!_rescaled = f_rescaled!,
    Emin = Emin,
    Emax = Emax,
)
vals, vecs, report = polfed(f!, x0, howmany, target;
    produce_report = true,
    mapping = mapping,
)
display_report(report)
```

Here we use `MappingConfig` with `MulColsParallel`, then inspect the `Report` via `display_report`.
## Why Naive Rescaling Can Be Expensive
A naive rescaled mapping can be written as:
```julia
f!_rescaled_bad = (Y, X) -> begin
    f!(Y, X)
    @. Y *= 1 / spread
    @. Y -= (center / spread) * X
end
```

This adds extra vector passes and therefore extra memory traffic. A direct, model-specific rescaled mapping is usually better.
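One inexpensive improvement is to fuse the two rescaling passes into a single broadcast, so `Y` and `X` are each read once and `Y` written once after the mapping. The sketch below uses a toy dense `f!` and placeholder `center`/`spread` values purely for illustration; in practice these come from your model and spectral bounds.

```julia
using LinearAlgebra

# Toy stand-ins (hypothetical; in POLFED, f!, center, and spread come
# from your Hamiltonian and its estimated spectral bounds):
A = randn(100, 100); A = (A + A') / 2
f! = (Y, X) -> mul!(Y, A, X)
center, spread = 0.5, 2.0

# Fused rescaling: the affine transform is applied in ONE broadcast pass,
# instead of the two separate passes in f!_rescaled_bad.
f!_rescaled_fused = (Y, X) -> begin
    f!(Y, X)                                   # Y = A * X
    @. Y = Y / spread - (center / spread) * X  # one read of Y and X, one write of Y
    return Y
end
```

This still pays one extra pass over the vectors, but halves the extra traffic of the naive version; a mapping that applies the rescaling inside `f!` itself avoids even that.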
Even a simpler intermediate step can help:
- build a copied, rescaled matrix `mat_rescaled`,
- use `mul!(Y, mat_rescaled, X)` as your `f!_rescaled` mapping in `MappingConfig`.
This often gives a noticeable speedup versus repeatedly applying `f!_rescaled_bad`.
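The steps above can be sketched as follows, assuming you have an explicit (here sparse) Hamiltonian `mat` and spectral-bound estimates `Emin`/`Emax`; the matrix, bounds, and the `center`/`spread` convention below are illustrative assumptions, not POLFED internals.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical setup: an explicit sparse symmetric matrix and placeholder bounds.
mat = sprandn(1_000, 1_000, 1e-3); mat = (mat + mat') / 2
Emin, Emax = -3.0, 5.0              # use your actual spectral estimates
center = (Emax + Emin) / 2
spread = (Emax - Emin) / 2

# Build the rescaled operator ONCE, outside the iteration loop,
# so each application is a single sparse mat-vec with no extra passes.
mat_rescaled = (mat - center * I) / spread

# A single mul! serves as the rescaled mapping.
f!_rescaled = (Y, X) -> mul!(Y, mat_rescaled, X)
```

The rescaling cost is paid once at setup instead of on every application, at the price of storing a second matrix.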
This page was generated using Literate.jl.