A small followup to the last post:
- Sreepathi Pai pointed out that Z3 isn't the fastest solver when it comes to floats; CVC5 is faster but cannot optimize. However, OptiMathSAT, while non-free, detronizes Z3 on this problem; the minimization time goes down from eight hours to five minutes! (Pai also pointed to sat-smt.codes, if you want to read 650+ pages about SAT/SMT techniques.)
- I discovered that the desired range is not [0,16384) but [0,16384]; that is, you need to include 16384. This makes for slightly different optimal polynomials.
With that in mind, here are the final, bestest coefficients given the aforementioned constraints:
- Cosine: 1.0f + (x*x) * (-4.564684e-9f + (x*x) * 3.1372154e-18f); maxerr=7.3719024e-4
- Sine: x * (9.584001e-5f + (x*x) * (-1.4590816e-13f + (x*x) * 6.0535898e-23f)); maxerr=8.0764293e-5
The original LUT approach has maxerr 2.8767669e-4 (since it chops off the lowest two bits; presumably some rounding would be better), so the sine approach actually beats it in accuracy.