A Data Management System (DMS) is a framework for building a variety of data solutions, such as Data Portals, Data Catalogs and Data Lakes (or Data Meshes). We have developed two DMS stacks that share a set of underlying core components:
- CKAN: the open source data management system we created in 2007 and that we continue to develop and maintain. The main information on CKAN is at https://ckan.org/. Here we have some specific notes on how we develop and deploy CKAN as well as our thoughts on the next generation of CKAN (v3).
- DataHub: a simpler version of CKAN focused on the SaaS platform at DataHub.io. DataHub and CKAN v3 share many of the same core components.
You can use a DMS to build many kinds of specific solutions:
- Data Portals are gateways to data. That gateway can be big or small, open or restricted. For example, data.gov is open to everyone, whilst an enterprise "intra" data portal is restricted to its personnel.
- Data Catalog: see https://ckan.org/
- Metadata manager: see Publishing
- Data Lake: you can use a DMS to rapidly create a data lake using existing infrastructure. For example, using the DMS' catalog and storage gateway with existing cloud storage and data processing capabilities.
- Data Engineering: you can use components of the DMS to rapidly create, orchestrate and supply data pipelines.
A DMS has a variety of features. This section provides an overview and links to specific feature pages that include details of how they work in CKAN and CKAN v3 / DataHub.
There are many ways to break down features, and this is just one framing. We are considering others, so if you have thoughts, please get in touch.
- Discovering and showcasing data (catalog and presenting)
- Views on data, including visualizing and previewing data as well as Data Explorers and Dashboards
- Publishing data
- Data API (DataStore)
- Permissions and Authentication
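To make the discovery and Data API features above more concrete, here is a minimal sketch of how a client might address them through CKAN's Action API (version 3). It only builds request URLs and does not contact any server. The portal address `https://ckan.example.org`, the helper name `action_url` and the `RESOURCE_ID` placeholder are assumptions for illustration. `package_search` and `datastore_search` are standard CKAN actions, though `datastore_search` is only available where the DataStore extension is enabled.

```python
from urllib.parse import urlencode


def action_url(base_url: str, action: str, **params) -> str:
    """Build a GET URL for a CKAN Action API (v3) call.

    `action_url` is a hypothetical helper, not part of CKAN itself.
    """
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    query = urlencode(params)  # keeps the keyword-argument order
    return f"{url}?{query}" if query else url


# Hypothetical portal address, used only for illustration.
BASE = "https://ckan.example.org"

# Discovering data: full-text search over the catalog.
search_url = action_url(BASE, "package_search", q="climate", rows=5)

# Data API: fetch rows of a tabular resource from the DataStore.
# "RESOURCE_ID" is a placeholder for a real resource identifier.
datastore_url = action_url(BASE, "datastore_search", resource_id="RESOURCE_ID", limit=10)

print(search_url)
print(datastore_url)
```

The resulting URLs can be fetched with any HTTP client. Restricted portals would typically also require an API token in the request headers, which this sketch leaves out.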
A DMS has the following key components:
- Data Flows and Factory
The Frictionless approach to data. See https://frictionlessdata.io/
Our team created this whilst at Open Knowledge Foundation and continue to co-steward it.
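As a rough illustration of the Frictionless approach, the sketch below assembles a minimal Data Package descriptor, the small JSON document (conventionally saved as `datapackage.json`) that describes a dataset and its files. The helper name `make_data_package`, the dataset name and the file path are assumptions made up for this example. The descriptor keys (`name`, `resources`, `path`, `schema`, `fields`, `type`) follow the Data Package and Table Schema specifications.

```python
import json


def make_data_package(name, resource_name, csv_path, fields):
    """Return a minimal Data Package descriptor as a plain dict.

    `fields` is a list of (column_name, type) pairs, where the types are
    Table Schema types such as "string", "integer" or "number".
    This helper is hypothetical and not part of any Frictionless library.
    """
    return {
        "name": name,
        "resources": [
            {
                "name": resource_name,
                "path": csv_path,
                "format": "csv",
                "schema": {
                    "fields": [{"name": n, "type": t} for n, t in fields]
                },
            }
        ],
    }


# Illustrative dataset: the names and path are invented for this example.
package = make_data_package(
    name="gdp",
    resource_name="gdp-by-year",
    csv_path="data/gdp.csv",
    fields=[("country", "string"), ("year", "integer"), ("value", "number")],
)

# The descriptor is plain JSON, so it can be stored next to the data file.
descriptor_json = json.dumps(package, indent=2)
print(descriptor_json)
```

Because the descriptor is just JSON kept alongside the data, the same description can be read by publishing tools, catalogs and pipelines without a shared database.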
Service Reliability Engineering (SRE) and Developer Experience (DX) for our CKAN cluster technology.