Appendix: Database Vocabulary

There are a few words that anyone wanting to understand databases should know.

  1. Cardinality, in regards to SQL, refers to the number of unique elements in a given set, whether that’s a column or predicate. An example of low cardinality is a boolean field where the only options are true and false. A high cardinality example would be an auto incrementing primary key, where every value is unique.1

    To complicate matters, the mathematical definition of cardinality is the total number of elements in a set rather than the unique elements.2 Depending on context, one or the other meaning may be implied. For example, Postgres has a mathematical function called cardinality that returns the total count in an array,3 but people often refer to an index as having high or low cardinality based on the number of unique elements in it.

  2. Selectivity The percentage of rows returned after the WHERE clause vs. the number of rows that would be returned without the WHERE clause. So if the table has 10,000 rows, and the query returns 100 rows after applying the filter, it would have a selectivity of 1%.

  3. Indexes vs. Indices are interchangeable and both refer to an index plurality.

  4. Postgres vs. PostgreSQL are interchangeable and both refer to the popular relational database server.

  5. Predicate is an attribute of a node in a graph database or a filter in SQL’s WHERE clause.

  6. Tuple is a finite ordered list of elements. It’s a data type in programming languages such as Elixir, and it’s interchangeable with a row in a relational database. For example, a tuple with 2 elements is a “double”, 3 elements is a “triple”, 4 elements is a “quadruple”, etc. They all basically end in “uple”.

  1. Cardinality - SQL statements 

  2. Cardinality - Mathematics 

  3. Array Functions