A more decoupled CKAN: making CKAN easier to customize, maintain and extend

Currently, CKAN implicitly has four distinct components:

  • Frontend – a component for all the "read" functionality e.g. displaying datasets, organisations etc; searching for datasets.
  • Admin UI – for creating and editing datasets etc.
  • API (logic layer) – the API for storing metadata and data and carrying out all the other operations needed by the Frontend and Admin UI
  • DB – the database, i.e. PostgreSQL (for the purposes of this discussion we also count Solr and Redis as part of the "DB").

Because the DB is usually a separate service, we might not strictly consider it part of CKAN as a project (hence the dashed line in the diagram).

Today CKAN is a monolithic application

At present, only the DB is explicitly separated as a distinct service. All the other components live in one big webapp/service, “CKAN core”:

CKAN could decouple

CKAN could "decouple", with each major subcomponent becoming a separate, distinct service. This can happen incrementally and in an evolutionary manner: for example, we can start with the Frontend, then do the Admin UI, and so on.
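To make the idea of separate services concrete, here is a minimal sketch of how incoming requests might be dispatched once the components are split. The `/api/` prefix matches how CKAN exposes its action API today; the `/admin/` prefix and the service names are purely hypothetical and chosen for illustration.

```python
# Illustrative routing table for a decoupled deployment.
# ASSUMPTION: the "/admin/" prefix and all service names are hypothetical.
ROUTES = [
    ("/api/", "api"),         # existing CKAN core, serving the action API
    ("/admin/", "admin-ui"),  # hypothetical home of the Admin UI
    ("/", "frontend"),        # everything else goes to the read Frontend
]


def service_for(path: str) -> str:
    """Return the name of the service that should handle a request path."""
    # Routes are checked in order, so more specific prefixes come first.
    for prefix, service in ROUTES:
        if path.startswith(prefix):
            return service
    raise ValueError(f"not an absolute path: {path!r}")
```

In practice this dispatching would more likely live in a reverse proxy than in application code; the point is only that each component sits behind its own boundary and can be swapped out independently.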

ASIDE: why separate the frontend from the “admin interface”? We think separating the two is a good idea for two reasons:

  • User Experience wise: separation is natural in terms of user experience both in data management systems and in other tools. For example, in most modern CMSs such as WordPress the UX flow for viewing blog posts is clearly distinct from the UX flow for editing and managing posts.
  • Developer Experience wise: these are distinct domains, so being able to work on them separately is useful, e.g. theming is a key aspect of developing the read frontend but largely irrelevant to the Admin UI.

Step 1: split frontend out into a separate service

Step 1: frontend split out to a separate service:

Step 2: split Admin UI out as a separate service 

Step 2: Split the Admin UI out as a separate service/app. Note: for the Admin UI it would make sense to make this a JavaScript single page application (SPA). This could then be embedded into the Frontend. However, from the developer (and UX) perspective it would still be separate even if it was deployed "embedded".

End result: three services

This would leave the current "core" of CKAN as the API service – no changes would be needed there and the end result would be:

What are the Benefits?

Decoupling in this way brings several major benefits:

  • Cleaner separation of concerns
  • Ability to develop different components with the appropriate technologies (e.g. API in Python, Frontend in JS)
  • Ability to have multi-speed development and deployment e.g. release an improvement to the frontend without having to release an update of the whole system.

You can read more about the benefits in https://tech.datopian.com/ckan-v3/

A Starting Point: the Frontend Component

The Frontend component would be a separate web application or service. It would retrieve its "data" from the existing CKAN action API (here "data" means metadata for datasets or other content).
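As a rough illustration of what "retrieving data from the action API" involves, the sketch below builds an action URL and unwraps the response envelope (`success` / `result` / `error`) that the action API returns. It is written in Python for brevity; the helper names, the exception class and the base URL passed in by the caller are assumptions made for this example, not part of CKAN or of any existing frontend.

```python
import json
from typing import Any
from urllib.parse import urlencode
from urllib.request import urlopen


class CkanActionError(Exception):
    """Hypothetical error type: the API answered with success == false."""


def action_url(base_url: str, action: str, **params: str) -> str:
    """Build a URL such as <base>/api/3/action/package_show?id=..."""
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    return f"{url}?{urlencode(params)}" if params else url


def unwrap_result(payload: dict) -> Any:
    """Return the 'result' part of an action API response envelope."""
    if not payload.get("success"):
        raise CkanActionError(payload.get("error"))
    return payload["result"]


def get_dataset(base_url: str, dataset_id: str) -> Any:
    """Fetch the metadata for one dataset via the package_show action."""
    with urlopen(action_url(base_url, "package_show", id=dataset_id)) as resp:
        return unwrap_result(json.load(resp))
```

A frontend page handler would call something like `get_dataset(base_url, "my-dataset")` and pass the returned metadata to its templates; nothing on the CKAN side needs to change for this to work.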

Because the Frontend is a separate application, you can build it with whatever language or tooling you need. In particular, you can use a modern frontend web stack. We would propose using a React-based framework like Next.js combined with server-side rendering on Node.js (to keep SEO).
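To show why server-side rendering matters for SEO, here is a minimal sketch of the idea: the server turns dataset metadata into complete HTML before responding, so crawlers see the content without running any JavaScript. It is written in Python purely for brevity (the proposal itself suggests Next.js), the function name is hypothetical, and the input is assumed to be a dataset dictionary with the `name`, `title`, `notes` and `resources` fields that CKAN's `package_show` action returns.

```python
from html import escape


def render_dataset_page(dataset: dict) -> str:
    """Render a complete HTML page for one dataset on the server."""
    # Fall back to the URL-style name when no human-readable title is set.
    title = escape(dataset.get("title") or dataset["name"])
    notes = escape(dataset.get("notes") or "")
    # One list item per resource (file or link) attached to the dataset.
    items = "".join(
        f"<li>{escape(res.get('name') or res.get('url') or '')}</li>"
        for res in dataset.get("resources", [])
    )
    return (
        "<!doctype html>"
        f"<html><head><title>{title}</title></head>"
        f"<body><h1>{title}</h1><p>{notes}</p><ul>{items}</ul></body></html>"
    )
```

A framework such as Next.js does the equivalent with React components, and then lets the same components take over in the browser for interactivity.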

This new frontend is usable with CKAN as it is today – you could just use the new frontend and no longer make available the old CKAN frontend. It also fits with a fully decoupled CKAN of the future.

Benefits include:

  • Modern webstack
  • Build unified experience: build sites combining data catalog and CMS content with ease.
  • Ease of entry and deployment
  • Faster/more responsive

We have a reasonably advanced implementation of a Next.js-based system available at https://github.com/datopian/portaljs

You can read more about the idea of a decoupled frontend in this RFC: https://github.com/ckan/ideas/blob/master/rfcs/0005-decoupled-frontend.md

Colophon

This overview was originally written in May 2020 and shared on the CKAN tech team calls. The RFC was also drafted at that time.