arm64: optimize memcpy_{from,to}io() and memset_io()
Optimize memcpy_{from,to}io() and memset_io() by transferring in 64 bit
as much as possible with minimized barrier usage. This simplest
optimization brings faster throughput compare to current byte-by-byte read
and write with barrier in the loop. Code's skeleton is taken from the
powerpc.
Link: http://lkml.kernel.org/p/20141020133304.GH23751@e104818-lin.cambridge.arm.com
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Trilok Soni <tsoni@codeaurora.org>
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>