Fix a bug where combined fp16 weights would be horribly wrong.
The weights were apparently always returned as float and then cast
to fp16_int_t without any proper fp32-to-fp16 conversion. On top of
that, sum_sq_error was calculated from the correct value, not the
incorrectly cast one, so it did not reflect the breakage.
It's a small miracle the unit tests didn't catch this; they only did
once I started introducing small errors for another reason.
Most real-world testing seems to have hit fp32, so this wasn't
caught there either.
Also make fp16_int_t a struct so that it is not implicitly
convertible to/from numeric types, so that this can never happen again.
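
For illustration, a minimal sketch of the failure mode and of the
struct guard. It assumes fp16_int_t was previously a plain 16-bit
integer typedef; the struct layout and the fp32_to_fp16_sketch() and
broken_cast() helpers are hypothetical stand-ins, not the code in this
tree, and the conversion is deliberately simplified.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the real type: wrapping the raw bits in a
// struct means there is no implicit conversion to or from numeric types.
struct fp16_int_t {
    uint16_t val;
};

// These hold at compile time, so "fp16_int_t w = some_float;" no longer builds.
static_assert(!std::is_convertible<float, fp16_int_t>::value,
              "float must not silently become fp16_int_t");
static_assert(!std::is_convertible<fp16_int_t, float>::value,
              "fp16_int_t must not silently become float");

// What a numeric cast does when the fp16 type is just an integer:
// it truncates the value instead of re-encoding it as fp16 bits.
uint16_t broken_cast(float x)
{
    return static_cast<uint16_t>(x);
}

// Simplified fp32 -> fp16 conversion, for illustration only.
// It truncates the mantissa (no rounding), flushes zero and values too
// small for a normal fp16 to signed zero, and maps everything too large
// (including inf and NaN) to infinity.
fp16_int_t fp32_to_fp16_sketch(float x)
{
    uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));

    uint32_t sign = (bits >> 16) & 0x8000u;
    int32_t exponent = static_cast<int32_t>((bits >> 23) & 0xffu) - 127 + 15;
    uint32_t mantissa = (bits >> 13) & 0x3ffu;

    uint32_t out;
    if (exponent <= 0) {
        out = sign;
    } else if (exponent >= 31) {
        out = sign | 0x7c00u;
    } else {
        out = sign | (static_cast<uint32_t>(exponent) << 10) | mantissa;
    }

    fp16_int_t ret;
    ret.val = static_cast<uint16_t>(out);
    return ret;
}
```

With a weight of 0.5, the plain cast yields the bit pattern 0x0000
(fp16 zero), while the conversion yields 0x3800, which is 0.5 in fp16.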