Nios II is well suited to customization and to the addition of new instructions.
It is a 32-bit architecture, so adding a 2xF16 SIMD unit is natural: two 16-bit floats fit exactly in one 32-bit register.
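As a rough illustration of the 2xF16 layout, the sketch below packs two F16 bit patterns into one 32-bit word and extracts them again. The names `f16_bits`, `pack_2xf16` and `unpack_f16` are hypothetical, and placing lane 0 in the low half is an assumption; the actual lane order of the custom unit is not stated here.

```c
#include <stdint.h>

/* Raw IEEE 754 binary16 bit pattern; no arithmetic is performed on it here. */
typedef uint16_t f16_bits;

/* Pack two F16 bit patterns into one 32-bit register word.
   Assumption: lane 0 occupies the low 16 bits, lane 1 the high 16 bits. */
static inline uint32_t pack_2xf16(f16_bits lane0, f16_bits lane1)
{
    return (uint32_t)lane0 | ((uint32_t)lane1 << 16);
}

/* Extract one lane (0 or 1) from a packed 32-bit word. */
static inline f16_bits unpack_f16(uint32_t word, unsigned lane)
{
    return (f16_bits)(word >> (16u * (lane & 1u)));
}
```

For example, packing the F16 encodings of 1.0 (`0x3C00`) and 2.0 (`0x4000`) gives the word `0x40003C00`, which a 2xF16 instruction could then process in a single operation.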
Tensilica Xtensa Processor
The added SIMD instruction set follows the Altivec instruction set: C code coming from a PowerPC G4 (74xx) or G5 (970) compiles directly.
Since an F16 value is half the size of a 32-bit float, the parallelism of the architecture is 8 instead of 4, leading to a speedup of 2.
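The arithmetic behind these figures can be sketched as follows, assuming a 128-bit vector register (the Altivec register width, consistent with 4 lanes of 32-bit floats). The helper names `simd_lanes` and `vector_ops` are hypothetical, and the speedup is an upper bound that ignores memory and conversion overheads.

```c
#include <stddef.h>

/* Number of elements that fit in one SIMD register. */
static inline unsigned simd_lanes(unsigned vector_bits, unsigned elem_bits)
{
    return vector_bits / elem_bits;
}

/* Number of vector operations needed to process n elements, rounded up. */
static inline size_t vector_ops(size_t n, unsigned lanes)
{
    return (n + lanes - 1) / lanes;
}
```

With 128-bit registers, `simd_lanes(128, 32)` is 4 and `simd_lanes(128, 16)` is 8, so processing 1024 elements takes 256 vector operations in 32-bit floating point and 128 in F16, which is the factor of 2.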