A small followup to the last post:

- Sreepathi Pai pointed out that Z3 isn't the fastest solver when it comes to
floats; CVC5 is faster
but cannot optimize. However, OptiMathSAT,
while non-free,
*detronizes*Z3 on this problem; the minimization time goes down from eight hours to five minutes! (Pai also pointed to sat-smt.codes, if you want to read 650+ pages about SAT/SMT techniques.) - I discovered that the desired range is not [0,16384) but [0,16384]; that is, you need to include 16384. This makes for slightly different optimal polynomials.

With that in mind, here are the final, bestest coefficients given the aforementioned constraints:

- Cosine: 1.0f + (x*x) * (-4.564684e-9f + (x*x) * 3.1372154e-18f); maxerr=7.3719024e-4
- Sine: x * (9.584001e-5f + (x*x) * (-1.4590816e-13f + (x*x) * 6.0535898e-23f)); maxerr=8.0764293e-5

The original LUT approach has maxerr 2.8767669e-4 (since it chops off the lowest two bits; presumably some rounding would be better), so the sine approach actually beats it in accuracy.