Hardware Crypto Poking - OpenSSH?

Posted by Kurlon 
January 03, 2025 08:30AM
I've started poking at getting Kirkwood hardware crypto playing ball again with OpenSSH. Digging around I found two major showstoppers with the current versions:

- Privilege Separation
- Digests

On Priv Sep, the answer in the past was to disable it in OpenSSH, which seems wrong given how many safeties it provides. Testing with af_alg (it's in the kernel, and OpenSSL supports it out of the box), I got the usual failure after turning it on: ssh'ing in would die before getting to a login prompt, and sshd would log:

sshd-session[18711]: fatal: ssh_sandbox_violation: unexpected system call (arch:0x40000028,syscall:281 @ 0x769fd18c) [preauth]

OK, so that's a start. Syscall 281 on ARM EABI is socket(). After some digging, I ended up adding the following to OpenSSH's sandbox-seccomp-filter.c:

#endif
#ifdef __NR_socketcall
	SC_ALLOW_ARG(__NR_socketcall, 0, SYS_SHUTDOWN),
	SC_DENY(__NR_socketcall, EACCES),
#endif

/* Kurlon testing afalg */
#ifdef __NR_socket
	SC_ALLOW(__NR_socket),
#endif

#if defined(__NR_ioctl) && defined(__s390__)

This satisfies the seccomp syscall filter. I need to narrow the rule, though: the domain and type are fixed for OpenSSL in this case, so the next step is to test restricting it by argument, similar to what is done for __NR_socketcall.

On the digests issue: in my openssl.cnf, instead of enabling all algorithms, I enabled only the ciphers, no hashes.

# Turn on AF_ALG hardware crypto
openssl_conf = openssl_def

[openssl_def]
engines = openssl_engines

[openssl_engines]
afalg = af_alg_engine

[af_alg_engine]
#default_algorithms = ALL
default_algorithms = =aes-128-cbc aes-192-cbc aes-256-cbc des-cbc des-ede3-cbc

To prefer hardware crypto, I've set this in both sshd_config and ssh_config:

Ciphers ^aes128-cbc

So far, this has me able to ssh in. I still need to set up a box that advertises aes so I can verify ssh'ing out. It's possible to use a custom openssl.cnf just for OpenSSH, so digests could still be done in hardware elsewhere, though based on benchmark data I've seen I don't know that they really offer much improvement over software on this platform. There is also discussion that hardware hashes are broken in other ways in OpenSSL, so it may be best to ignore them. Once I get the sandbox rule refined, I'll take it to OpenSSH upstream to see if that portion can be merged into mainline.

Performance-wise, af_alg is known to be slower than cryptodev, burning sys time on context switches, etc. Doing an scp transfer of a 2 GB file of pure random data, with hardware aes128-cbc I got 6.5MB/s, CPU walled on my GoFlex. With software I got 6.4MB/s, CPU walled? Cryptodev appears to be an evolutionary dead end as far as the OpenSSL team is concerned, so I'm not sure whether that's worth chasing again.

Additional numbers:
aes256-cbc - 5.7MB/s software vs 5.7MB/s hardware

root@gfn:/etc/ssh# time openssl speed -evp aes-256-cbc -engine afalg -elapsed
Engine "afalg" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing AES-256-CBC ops for 3s on 16 size blocks: 19687 AES-256-CBC ops in 3.00s
Doing AES-256-CBC ops for 3s on 64 size blocks: 19478 AES-256-CBC ops in 3.00s
Doing AES-256-CBC ops for 3s on 256 size blocks: 19197 AES-256-CBC ops in 3.00s
Doing AES-256-CBC ops for 3s on 1024 size blocks: 16497 AES-256-CBC ops in 3.00s
Doing AES-256-CBC ops for 3s on 8192 size blocks: 7417 AES-256-CBC ops in 3.00s
Doing AES-256-CBC ops for 3s on 16384 size blocks: 4633 AES-256-CBC ops in 3.00s
version: 3.3.2
built on: Sun Oct 27 14:19:50 2024 UTC
options: bn(64,32)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -fzero-call-used-regs=used-gpr -Wa,--noexecstack -g -O2 -Werror=implicit-function-declaration -ffile-prefix-map=/build/reproducible-path/openssl-3.3.2=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DZSTD -DNDEBUG -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 -Wdate-time -D_FORTIFY_SOURCE=2
CPUINFO: OPENSSL_armcap=0x0
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-256-CBC        105.00k      415.53k     1638.14k     5630.98k    20253.35k    25302.36k

real	0m18.466s
user	0m0.817s
sys	0m10.254s

root@gfn:/etc/ssh# time openssl speed -evp aes-256-cbc -elapsed
You have chosen to measure elapsed time instead of user CPU time.
Doing AES-256-CBC ops for 3s on 16 size blocks: 1592995 AES-256-CBC ops in 3.00s
Doing AES-256-CBC ops for 3s on 64 size blocks: 549828 AES-256-CBC ops in 3.00s
Doing AES-256-CBC ops for 3s on 256 size blocks: 128594 AES-256-CBC ops in 3.00s
Doing AES-256-CBC ops for 3s on 1024 size blocks: 35895 AES-256-CBC ops in 3.00s
Doing AES-256-CBC ops for 3s on 8192 size blocks: 4758 AES-256-CBC ops in 3.00s
Doing AES-256-CBC ops for 3s on 16384 size blocks: 2378 AES-256-CBC ops in 3.00s
version: 3.3.2
built on: Sun Oct 27 14:19:50 2024 UTC
options: bn(64,32)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -fzero-call-used-regs=used-gpr -Wa,--noexecstack -g -O2 -Werror=implicit-function-declaration -ffile-prefix-map=/build/reproducible-path/openssl-3.3.2=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DZSTD -DNDEBUG -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 -Wdate-time -D_FORTIFY_SOURCE=2
CPUINFO: OPENSSL_armcap=0x0
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-256-CBC       8495.97k    11729.66k    10973.35k    12252.16k    12992.51k    12987.05k

real	0m18.098s
user	0m17.217s
sys	0m0.079s

The gains may not be worth the chase using af_alg?



Edited 4 time(s). Last edit at 01/03/2025 09:12AM by Kurlon.
Re: Hardware Crypto Poking - OpenSSH?
January 03, 2025 08:03PM
--- ../../openssh-orig/openssh-9.9p1/sandbox-seccomp-filter.c	2024-09-19 18:20:48.000000000 -0400
+++ sandbox-seccomp-filter.c	2025-01-03 18:20:19.803149104 -0500
@@ -402,6 +402,12 @@
 	SC_ALLOW_ARG(__NR_socketcall, 0, SYS_SHUTDOWN),
 	SC_DENY(__NR_socketcall, EACCES),
 #endif
+
+	/* Kurlon testing afalg */
+#ifdef __NR_socket
+	SC_ALLOW_ARG(__NR_socket, 0, AF_ALG),
+#endif
+
 #if defined(__NR_ioctl) && defined(__s390__)
 	/* Allow ioctls for ICA crypto card on s390 */
 	SC_ALLOW_ARG(__NR_ioctl, 1, Z90STAT_STATUS_MASK),

I've sent this to the OpenSSH team to see if they'll adopt it for mainline.