k5start: Why should you manage Kerberos tickets outside of the application?
Why do you need k5start?
1. A server-type process needs Kerberos credentials, and possibly also derived credentials in order to operate. A keytab is available for obtaining credentials. A tool is needed to obtain and re-obtain credentials using the keytab at the intervals necessary to ensure that credentials are always available for the process.
2. A long-running user job, perhaps a large computation, needs credentials (as above) in order to operate. The user can obtain credentials with a renewable lifetime long enough for the job. A tool is needed to renew credentials at the intervals necessary to ensure that credentials are available until the job ends or the renewable lifetime expires.
So now you know: that tool is k5start.
What is k5start?
k5start is a modified version of kinit that can use keytabs to authenticate, can run as a daemon and wake up periodically to refresh a ticket, and can run single commands with their own authentication credentials and refresh those credentials until the command exits.
k5start can be used as an alternative to kinit, but it is primarily intended for programs that need to obtain Kerberos credentials from a keytab.
- Long-running job issue
If you have some Hadoop background, you have probably come across long-running jobs that stop once their Kerberos ticket expires, unless the application code itself contains logic to renew the ticket.
Another known problem arises when you use the --proxy-user argument with spark-submit, which lets you run a Spark job as a different user than the one whose keytab you have. The Kerberos tickets can't be refreshed with this option, so a long-running job (for example, a streaming job) fails as soon as the ticket expires.
With k5start we can get rid of these problems.
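As a rough sketch of that idea, k5start can wrap the spark-submit call so that the ticket cache on the submitting host keeps being refreshed while the job runs. This assumes the job actually reads that ticket cache (for example, a driver running in client mode), which depends on your Spark version and cluster setup. The keytab path, principal, proxied user, class, and jar are hypothetical placeholders, and the run helper is only there so the command can be reviewed before it is executed.

```shell
#!/bin/sh
# Sketch only: keytab, principal, proxied user, class and jar are hypothetical.
KEYTAB="/etc/security/keytabs/svc-spark.keytab"
PRINCIPAL="svc-spark@EXAMPLE.COM"

# Print the command when DRY_RUN=1 (the default here), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

submit_with_k5start() {
  # -f: keytab, -K 10: wake up every 10 minutes, -l 10h: 10 hour ticket lifetime.
  # "--" ends k5start's own option parsing so spark-submit's flags pass through.
  # Everything after the principal is the command k5start runs and keeps
  # supplied with fresh credentials until it exits.
  run k5start -f "$KEYTAB" -K 10 -l 10h -- "$PRINCIPAL" \
    spark-submit --master yarn --deploy-mode client \
    --proxy-user alice \
    --class com.example.StreamingJob /opt/jobs/streaming-job.jar
}

submit_with_k5start
```

Run as-is, this only prints the command line. Setting DRY_RUN=0 in the environment executes it for real.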
That said, the recommended way of submitting jobs is always to use the --principal and --keytab options. This allows Spark both to keep the tickets updated for long-running applications and to seamlessly distribute the keytab to any nodes that happen to need it (i.e. if they are spinning up an executor).
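For comparison, a submission that hands the keytab to Spark might look like the sketch below. The principal, keytab path, class name, and jar are hypothetical placeholders. Only --principal and --keytab are the point here, and the rest of the command line will differ for your job.

```shell
#!/bin/sh
# Sketch only: principal, keytab path, class and jar are hypothetical placeholders.
PRINCIPAL="etl-user@EXAMPLE.COM"
KEYTAB="/etc/security/keytabs/etl-user.keytab"

# Print the command when DRY_RUN=1 (the default here), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

submit_with_keytab() {
  # --principal/--keytab give Spark the credentials so it can log in again itself
  run spark-submit --master yarn --deploy-mode cluster \
    --principal "$PRINCIPAL" --keytab "$KEYTAB" \
    --class com.example.StreamingJob /opt/jobs/streaming-job.jar
}

submit_with_keytab
```

With this form no external ticket refresher is needed, which is why it is preferred whenever you hold the keytab of the user the job should run as.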
How to get k5start?
- Add the EPEL repo to yum (the package below is for RHEL/CentOS 6; run as root):
$ yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
- Install kstart:
$ yum install kstart.x86_64
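After installing, it can be worth confirming that the k5start binary is actually on your PATH. The small check below is a sketch. The have_cmd helper is a name made up for this example and is not part of kstart.

```shell
#!/bin/sh
# Return 0 if the named program is found on PATH, non-zero otherwise.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd k5start; then
  echo "k5start found at $(command -v k5start)"
else
  echo "k5start not found; check that the kstart package installed correctly"
fi
```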
How to use k5start? (Examples)
- Use the /etc/krb5.keytab keytab to obtain a ticket-granting ticket for the principal host/example.com, putting the ticket cache in /tmp/service.tkt. The lifetime is 10 hours, and the program wakes up every 10 minutes to check whether the ticket is about to expire.
$ k5start -k /tmp/service.tkt -f /etc/krb5.keytab -K 10 -l 10h \
host/example.com
- Do the same as above, but using the default ticket cache and run the command /usr/local/bin/auth-backup. k5start will continue running until the command finishes.
$ k5start -f /etc/krb5.keytab -K 10 -l 10h host/example.com \
/usr/local/bin/auth-backup
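Both examples above keep k5start in the foreground. To run it as a daemon instead, the -b option detaches it into the background and -p records its process ID in a file, so it can be stopped later. The sketch below reuses the keytab, cache, and principal from the first example. The PID file location is a hypothetical choice, and the run helper only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: the PID file location is a hypothetical choice.
KEYTAB="/etc/krb5.keytab"
CACHE="/tmp/service.tkt"
PIDFILE="/var/run/k5start-service.pid"
PRINCIPAL="host/example.com"

# Print the command when DRY_RUN=1 (the default here), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

start_ticket_daemon() {
  # -b: detach and run in the background; -p: write k5start's PID to a file
  run k5start -b -p "$PIDFILE" -k "$CACHE" -f "$KEYTAB" -K 10 -l 10h "$PRINCIPAL"
}

stop_ticket_daemon() {
  # Stop the background k5start using the PID it recorded, if there is one
  if [ -f "$PIDFILE" ]; then
    run kill "$(cat "$PIDFILE")"
  else
    echo "no PID file at $PIDFILE"
  fi
}

start_ticket_daemon
```

Once the daemon is running, klist -c /tmp/service.tkt should show the ticket it maintains.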
Conclusion:
In this article, we learned how k5start keeps Kerberos tickets fresh outside the application, so that long-running jobs do not fail when a ticket expires. Please leave a comment and share which other scenarios you use k5start for, or any alternatives you use.