On the previous page, we went over the basics of schema design. We discussed the importance of defining the nouns, adjectives, and verbs in the domain to help determine the basis for tables, columns, and the relationships between them. Over the next few pages, we’ll dive into common relationships such as “has_one”, “has_many”, and “many to many” and how to reason about them in both relational and graph databases. We’ll also look at how to define the relationship and elevate it to a first class citizen in our applications.
On the previous page, we used a basic example where a person has many addresses. This is great, but it’s not all that interesting. Instead, let’s build the foundation for a content management system or CMS. If you’ve ever used Wordpress, Drupal, Joomla or one of a million others, you should understand the concepts pretty well.
Let’s start by defining the core nouns. In its most basic form, a CMS must have web pages; however, there’s really no point in using a CMS unless you’re going to allow it to have templates. Otherwise, you’re better off just creating a bunch of HTML files and call it a day.
We’re keeping this super simple at this point. The only things that are critical are the fact that a web page has content, and a template has a header and a footer. We also want to allow each template to be used for many pages. Before it goes live, we’d probably also want name fields, statuses, etc but we’ll skip those for now.
The nouns are therefore web page and template. In a relational database, these will become tables. In a graph database, they become nodes. The web page has the adjective “content”, and the template has the adjectives “header” and “footer”. Therefore the tables might be created like this:
CREATE TABLE templates
(
id SERIAL NOT NULL
CONSTRAINT templates_pkey
PRIMARY KEY,
header TEXT,
footer TEXT
);
CREATE TABLE web_pages
(
id SERIAL NOT NULL
CONSTRAINT web_pages_pkey
PRIMARY KEY,
content TEXT
);
A template can have many web pages and a web page can have only one template. Or they can be newly created and not have a relationship defined yet.
In my experience, the “has_many” relationship is the most commonly used in any relational system. In graph databases, because the relationship is often more important than the data itself, the “many to many” relationship is the most common. We’ll get into this in just a minute when we talk about defining relationships.
In a “has_many” relationship, the table on the “many” side will contain a reference to the table on the “one” side of the relationship.
Going back to our example, the web pages table will get a column called template_id that will relate to the id column in the templates table. With that in mind, our CREATE TABLE
statements will look like this:
CREATE TABLE templates
(
id SERIAL NOT NULL
CONSTRAINT templates_pkey
PRIMARY KEY,
header TEXT,
footer TEXT
);
CREATE TABLE web_pages
(
id SERIAL NOT NULL
CONSTRAINT web_pages_pkey
PRIMARY KEY,
template_id INT
CONSTRAINT fk_template_web_pages
REFERENCES templates(id),
content TEXT
);
With the reference in place, we can have many web pages that are assigned to the same template, which is exactly what we want. This way, we can update our template when we want to add another link in the header and it will change for all of the pages assigned to it.
Several years ago, I built a SaaS based CMS from scratch. We made it possible for any customer to simply log in and edit their websites. Many different websites would use the same IP, hit the same proxy, and store their data in the same database. Because of this, the number of web pages in the database grew quickly.
The app determined which site to display based on the URL, and it knew which page to display based on the route in the URL. So let’s expand our schema to include these 2 concepts. We’ll add a sites table, its foreign reference, and a column called “slug” on the web pages table. Since spaces and other symbols aren’t super pretty on a URL (who wants to see a bunch of %20s), a slug is text that has had nonstandard characters converted to an underscore. Therefore, “My Puppy Pictures”, would turn into “my_puppy_pictures”.
So it’s easy to follow along, I’ll include the entire CREATE TABLE
statements. That way, if you’re skipping around, you won’t miss something along the way.
CREATE TABLE sites
(
id SERIAL NOT NULL
CONSTRAINT sites_pkey
PRIMARY KEY,
url VARCHAR(255)
);
CREATE TABLE web_pages
(
id SERIAL NOT NULL
CONSTRAINT web_pages_pkey
PRIMARY KEY,
template_id INT
CONSTRAINT fk_template_web_pages
REFERENCES templates(id),
site_id INT
CONSTRAINT fk_site_web_pages
REFERENCES templates(id),
slug VARCHAR(255) NOT NULL,
content TEXT
);
On the next page, we’ll shift gears and talk about “Has One” relationships.