I dunno. Those vim users get a muscular left pinky from mashing Escape like a rat on a cocaine dispenser.
I'm pretty sure that a typical emacs user has more chorded Control-key presses than a typical vim user does Escape key presses.
But I'm not at all sure that that's true of actually toggling the respective keys up and down, and if you figure that that's most of the physical work... Like, if I hit C-x C-s and then C-x C-f in emacs, I'm not actually releasing the Control key between the four chorded keypresses, so that's one press and one release of Control. A vim user is gonna maybe smack Escape once to go from insert mode to normal mode, then do their :w|e or whatever. So that'd be the same number of Escape and Control key toggles: one each.
A step (sasl-step object) is an abstraction of authentication “step” which holds the response value and the next entry point for the authentication process (the latter is not accessible).
Looks like it's for when you're using an emacs IRC client or mail client or XMPP client, and it handles part of the authentication.
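If it helps, here's a rough toy model in Python of what that quoted "step" description seems to be getting at: an object that exposes the response value, while the pointer to whatever runs next stays hidden and is only used by the function that advances the exchange. To be clear, everything here (`Step`, `next_step`, `plain_response`, the credentials) is made up for illustration; it is not the sasl.el API and I haven't checked how that library actually stores things.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Toy model only: hypothetical names, not the real sasl.el API.
StepFn = Callable[[Optional[bytes]], bytes]


@dataclass
class Step:
    data: bytes                      # the response value the caller sends to the server
    _index: int = field(repr=False)  # which handler produced it; private by convention


def next_step(handlers: List[StepFn], step: Optional[Step],
              challenge: Optional[bytes] = None) -> Optional[Step]:
    """Run the handler after the one that produced `step`.

    Pass step=None to start the exchange. Returns None when no handlers remain.
    """
    index = 0 if step is None else step._index + 1
    if index >= len(handlers):
        return None
    return Step(handlers[index](challenge), index)


def plain_response(_challenge: Optional[bytes]) -> bytes:
    # Hypothetical single-step mechanism in the style of PLAIN:
    # empty authzid, NUL, user name, NUL, password.
    return b"\x00alice\x00hunter2"


handlers = [plain_response]
first = next_step(handlers, None)   # first (and only) step of this mechanism
done = next_step(handlers, first)   # nothing left to run
```

So the client code only ever reads `first.data` and hands the step back to `next_step`; it never needs to know what comes next, which is presumably what "the latter is not accessible" means.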