The core function of authentication is to identify users of the portal (in a federated way) so that we can base access on their identity.
There are three major conceptual components: identity, accounts, and sessions. They come together in the following stages:
- Root identity determination: determine the identity, often via delegation.
- Sessions: persist the identity in the web application securely, without a new identity determination on each request (users should not have to log in via a third-party service every time).
- Account (aka profile): store related account/profile information in our application rather than in the third-party identity system, e.g. email, name, and other preferences.
  - This is usually auto-created at first identification.
  - In the limited case this can be seen as a cache of information from the identity system (e.g. your email).
  - Often, however, it holds richer, app-specific information that the application generates (relevant for personalization).
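The relationship between the three components can be sketched in a few lines of Python. This is a minimal illustration, not the portal's actual data model: the class names, the in-memory `ACCOUNTS` and `SESSIONS` stores, and the `sign_in` and `current_account` helpers are all hypothetical.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Identity:
    """Who the identity system says the user is."""
    provider: str  # e.g. "github" or "password"
    subject: str   # the provider's stable identifier for the user


@dataclass
class Account:
    """App-side profile, kept in our application rather than in the identity system."""
    identity: Identity
    email: str
    name: str = ""
    preferences: dict[str, str] = field(default_factory=dict)


@dataclass
class Session:
    """Persists an already determined identity across requests."""
    session_id: str
    identity: Identity
    expires_at: float


# Hypothetical in-memory stores; a real deployment would use a database.
ACCOUNTS: dict[Identity, Account] = {}
SESSIONS: dict[str, Session] = {}


def sign_in(identity: Identity, email: str, ttl_seconds: int = 3600) -> Session:
    """Auto-create the account at first identification, then open a session."""
    ACCOUNTS.setdefault(identity, Account(identity=identity, email=email))
    session = Session(secrets.token_urlsafe(32), identity, time.time() + ttl_seconds)
    SESSIONS[session.session_id] = session
    return session


def current_account(session_id: str) -> Optional[Account]:
    """Resolve a request's session to an account without a new identity determination."""
    session = SESSIONS.get(session_id)
    if session is None or session.expires_at < time.time():
        return None
    return ACCOUNTS[session.identity]
```

Calling `sign_in` twice with the same identity keeps the first account and only opens a second session, which mirrors the "auto-created at first identification" behaviour described above.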
Identity determination can be done in multiple ways. In this article we consider the following three options, which we believe are widely used:
- Password authentication - traditional username and password pair
- Single Sign-on (SSO) via protocols such as OAuth, SAML, OpenID Connect
- One-time password (OTP) via email or SMS (aka passwordless connection)
This is the traditional way of authenticating users. When signing up, the user provides at least a username and password pair, which is then stored in a database for future authentication. Normally, additional information such as email address and full name is also requested at registration.
Examples of password authentication in popular services:
- GitHub - https://github.com/join
- GitLab - https://gitlab.com/users/sign_up
- NPM - https://www.npmjs.com/signup
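As an illustration of the sign-up and sign-in steps, here is a minimal sketch using only the Python standard library. It assumes the password is stored as a salted PBKDF2 hash rather than in plain text; the `USERS` store, the function names, and the iteration count are hypothetical and chosen for illustration only.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # illustrative value, not a recommendation

# Hypothetical in-memory store: username -> (salt, password hash).
USERS: dict[str, tuple[bytes, bytes]] = {}


def _hash_password(password: str, salt: bytes) -> bytes:
    """Derive a hash from the password so the plain text is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(username: str, password: str) -> None:
    """Sign-up: store a fresh random salt and the derived hash."""
    if username in USERS:
        raise ValueError("username already taken")
    salt = secrets.token_bytes(16)
    USERS[username] = (salt, _hash_password(password, salt))


def authenticate(username: str, password: str) -> bool:
    """Sign-in: re-derive the hash and compare it in constant time."""
    record = USERS.get(username)
    if record is None:
        return False
    salt, expected = record
    return hmac.compare_digest(_hash_password(password, salt), expected)
```

After `register("alice", "s3cret")`, the call `authenticate("alice", "s3cret")` returns `True`, while a wrong password or an unknown username returns `False`.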
SSO delegates the identity determination process to a third-party service. Normally, popular social network services are used, e.g. Google, Facebook, or Twitter. SSO can be implemented using the OAuth or SAML protocols. In addition, there is the OpenID Connect protocol, which is an extension of OAuth 2.0.
- OAuth 2.0 and OpenID Connect: JSON based, with OpenID Connect ID tokens issued as JWTs
- SAML: XML based, with a SOAP binding
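To make the delegation step concrete, the sketch below shows the start and end of an OAuth 2.0 authorization code flow: building the redirect URL to the provider and reading the code back from the callback. The `AUTHORIZE_URL` endpoint, the client ID, and the function names are placeholders, not a real provider's values; exchanging the code for tokens is left out.

```python
import hmac
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical provider endpoint; each real provider documents its own.
AUTHORIZE_URL = "https://provider.example/oauth/authorize"


def build_authorization_url(
    client_id: str, redirect_uri: str, scope: str = "openid email"
) -> tuple[str, str]:
    """Return the URL to redirect the user to, plus the state to remember."""
    state = secrets.token_urlsafe(16)  # random value that ties the callback to this request
    query = urlencode(
        {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": scope,
            "state": state,
        }
    )
    return f"{AUTHORIZE_URL}?{query}", state


def read_callback(callback_url: str, expected_state: str) -> str:
    """Check the state on the provider's redirect and return the authorization code."""
    params = parse_qs(urlparse(callback_url).query)
    returned_state = params.get("state", [""])[0]
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        raise ValueError("state mismatch")
    if "code" not in params:
        raise ValueError("no authorization code in callback")
    return params["code"][0]
```

The application would keep `state` in the user's session between the two calls, then exchange the returned code for tokens at the provider's token endpoint in a server-to-server request.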
List of OAuth providers:
Examples of SSO in popular projects:
Also known as a dynamic password, a one-time password (OTP) addresses some limitations of the traditional password authentication method, since each code is valid for a single login only. Usually, one-time passwords are delivered via email or SMS.
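A one-time password flow can be sketched as two steps: issue a short-lived code, then verify it exactly once. The example below is a simplified illustration; the `PENDING` store, the six-digit format, and the five-minute lifetime are assumptions, and the actual delivery by email or SMS is not shown.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

# Hypothetical in-memory store: email -> (hash of the code, expiry timestamp).
PENDING: dict[str, tuple[bytes, float]] = {}


def issue_code(email: str, ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Generate a six-digit code and remember only its hash and expiry."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"
    PENDING[email] = (hashlib.sha256(code.encode()).digest(), now + ttl_seconds)
    return code  # handed to the email/SMS sender, not shown in the web page


def verify_code(email: str, code: str, now: Optional[float] = None) -> bool:
    """Accept the code once, and only before it expires."""
    now = time.time() if now is None else now
    record = PENDING.pop(email, None)  # removed on the first attempt, right or wrong
    if record is None:
        return False
    code_hash, expires_at = record
    if now > expires_at:
        return False
    return hmac.compare_digest(hashlib.sha256(code.encode()).digest(), code_hash)
```

Removing the pending code on every attempt means a wrong guess forces the user to request a new code; that is a deliberate simplification here to limit guessing, and a real system might allow a small number of retries instead.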
- Storage of user profile information (email, fullname, gravatar etc.)
- Retrieving user profile information via API
- Updating profile
- Deleting profile
- Log out: de-persisting the session
- Invalidating all sessions, e.g. after a security issue
- Sessions outside of browsers
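The session operations listed above can be illustrated with a small server-side session store. This is a sketch under the assumption that sessions are opaque random IDs mapped to user IDs on the server; the `SESSIONS` dictionary and all function names are hypothetical.

```python
import secrets
from typing import Optional

# Hypothetical in-memory store: session id -> user id.
SESSIONS: dict[str, str] = {}


def create_session(user_id: str) -> str:
    """Persist the identity: hand out an unguessable session id."""
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = user_id
    return session_id


def user_for(session_id: str) -> Optional[str]:
    """Look up the user for a request, or None if the session is gone."""
    return SESSIONS.get(session_id)


def log_out(session_id: str) -> None:
    """De-persist a single session."""
    SESSIONS.pop(session_id, None)


def invalidate_all(user_id: Optional[str] = None) -> int:
    """Drop every session of one user, or of all users when user_id is None."""
    doomed = [sid for sid, uid in SESSIONS.items() if user_id is None or uid == user_id]
    for sid in doomed:
        del SESSIONS[sid]
    return len(doomed)
```

Because the session ID is just an opaque string, the same store could also serve clients outside the browser, for example a CLI that sends the ID in a request header instead of a cookie; whether to do that or to issue separate API tokens is a design choice this sketch does not settle.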
When a user signs in, I want to know her/his identity so that I can limit access and editing based on who she/he is.
When a user visits the data portal for the first time, I want to provide him/her with an easy, quick way to register so that more people use the data portal.
When I visit the data portal for the first time, I want to sign up using my existing social network account so that I don't need to remember yet another set of credentials.
When I'm using the CLI app (or anything else outside the browser), I want to be able to log in so that I can work from the terminal (e.g. have write access for editing datasets).
In CKAN Classic, we have basic CKAN authentication. Below is how the registration page looks:
Registration flow in CKAN Classic:
We can extend basic CKAN authentication with:
- OAuth - see below
- SAML - https://extensions.ckan.org/extension/saml2/
CKAN Classic can also be used as an OAuth client:
- https://github.com/conwetlab/ckanext-oauth2 - this is the only one that's maintained.
- https://github.com/etalab/ckanext-oauth2 - outdated, the one above is based on this.
- https://github.com/okfn/ckanext-oauth - last commit 9 years ago.
- https://github.com/ckan/ckanext-oauth2waad - Windows Azure Active Directory specific and outdated.
How it works:
We have considered some popular and/or modern identity management solutions that we could implement in CKAN 3:
Shortlist based on scores from the spreadsheet above:
All projects from the shortlist can be considered for a project. It is worth trying each of them to find out what works best for your project's needs. Testing out Auth0 should be straightforward and take less than an hour. AuthN and Ory/Kratos require building Docker images and running them locally, but overall this should not be time-consuming.
In datahub.io we have implemented SSO via Google/GitHub. Below is a sequence diagram showing the auth flow with datopian/auth plus a frontend Express app (similar to the CKAN 3 frontend):
How does this conceptual framework map to an evolution of CKAN 2 to CKAN 3?
Sequence diagram of login process:
Kratos to Hydra in CKAN Classic:
- Does CKAN Classic allow us to store arbitrary account information (are there "extras")?
- How would we avoid having to support identity persistence, delegation, etc. in both the NG frontend and the Classic Admin UI?
- Can we share cookies (e.g. by using subdomains)?
- How are login, identity determination, etc. done, at least for the frontend, in DataHub.io?
- Should the account UI really be in the NG frontend rather than the Classic Admin UI?
- How can we handle "invite a user" to my org setup … (it's basically post-processing after sign-up …)?
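On the cookie-sharing question, one option worth exploring is a session cookie scoped to a shared parent domain, so that both UIs, served from subdomains, receive it. The sketch below only builds such a `Set-Cookie` value with Python's standard library; the cookie name `session`, the function name, and the domain in the usage note are assumptions, and it does not show how either CKAN component would validate the cookie.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str, parent_domain: str) -> str:
    """Build a Set-Cookie value that browsers also send to subdomains of parent_domain."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["domain"] = parent_domain  # covers the parent domain and its subdomains
    cookie["session"]["path"] = "/"
    cookie["session"]["secure"] = True      # only sent over HTTPS
    cookie["session"]["httponly"] = True    # not readable from JavaScript
    cookie["session"]["samesite"] = "Lax"
    return cookie["session"].OutputString()
```

For example, `session_cookie_header("abc123", "portal.example")` yields a value that a browser would send to both `admin.portal.example` and `www.portal.example` (hypothetical hostnames).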
When a user visits the data portal, I want to provide multiple options for him/her to sign up so that I have more users registered and using the data portal.
When a user needs to change his/her profile info, I want to make sure it is possible, so that I have the up-to-date information about users.
When my personal info (email etc.) changes, I want to edit it in my profile so that I provide up-to-date information about myself and receive messages (e.g. notifications) properly.
When I decide to stop using the data portal, I want to be able to delete my account so that my personal details aren't stored in a service that I no longer need.