write_data function which was writing to STMESP data registers was
starting by writing words and tail was written using byte access.
However, RISCV core does not support unaligned access and on Cortex-M33
even if supported it is faster to do aligned access. Reworked
write_data to start first by writing data using byte or half word
access until data pointer is word aligned, then word access is used
and finally tail is written using byte or half word access.
Signed-off-by: Krzysztof Chruściński <krzysztof.chruscinski@nordicsemi.no>