SQL Keys in Depth

by Admin · January 2, 2018


The internet is full of dogmatic commandments for choosing and using keys in relational databases. At times it verges on a holy war: should you use natural or artificial keys? Auto-incrementing integers, UUIDs?

After wading through sixty-four articles, skimming sections of five books, and asking questions on IRC and StackOverflow I think I've put the pieces together and have a recommendation to harmonize the various camps. Many arguments about keys boil down to false dichotomies and failures to acknowledge other points of view.


Let's break the problem down, and then put it all together at the end. First, what is a key?

What are keys, really?

For the moment forget about primary keys; we're interested in the more general idea. A key is a column or set of columns that together have no duplicate values across rows. The columns must also be irreducibly unique, meaning no subset of the columns has this uniqueness.

For example, consider a table for counting cards in a card game:

CREATE TABLE cards_seen (
  suit text,
  face text
);

If we're tracking a single deck (no repeated cards), then the combination of suit and face is unique. We don't want to list the same suit and face twice in the table because that would be redundant. If a card is in the table we've seen it, otherwise we haven't.

We can and should ask the database to enforce this constraint by adding:

CREATE TABLE cards_seen (
  suit text,
  face text,

  UNIQUE (suit, face)
);

Neither suit nor face is unique by itself. We can see multiple cards with the same suit, or with the same face. Because (suit, face) is unique, but the individual columns are not, we say that the combination is irreducible, and (suit, face) is a key.

Expanding the situation a little, if we were tracking multiple packs of cards, we could include a new field to record the number of times a card was seen:

CREATE TABLE cards_seen (
  suit text,
  face text,
  seen int
);

Although the triple (suit, face, seen) should be unique, it's not a key because the subset (suit, face) should be unique too. That's because two rows having the same suit and face but different values for seen would be contradictory facts. So (suit, face) is a key and there are no other keys for this table.
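As a minimal sketch of how the (suit, face) key keeps the counter consistent, the following uses a hypothetical table cards_seen_demo (not part of the original examples) and PostgreSQL's INSERT ... ON CONFLICT, available since version 9.5. Running the INSERT twice for the same card should leave a single row whose counter was incremented, rather than two contradictory rows.

```sql
-- Hypothetical demo table: same shape as cards_seen, with the key declared
CREATE TABLE cards_seen_demo (
  suit text,
  face text,
  seen int,

  UNIQUE (suit, face)
);

-- The first sighting inserts a row; a repeat sighting of the same card
-- hits the (suit, face) key and bumps the counter instead
INSERT INTO cards_seen_demo (suit, face, seen)
VALUES ('hearts', 'queen', 1)
ON CONFLICT (suit, face)
DO UPDATE SET seen = cards_seen_demo.seen + 1;
```

Without the UNIQUE constraint the ON CONFLICT clause would have nothing to infer the conflict from, and PostgreSQL would reject the statement.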

Uniqueness constraints

In PostgreSQL the preferred way to add a unique constraint is to declare it directly as we did. The use of indexes to enforce unique constraints could be considered an implementation detail that should not be accessed directly. There's no need to manually create indexes on columns already declared unique; doing so would just duplicate the automatically-created index.

A table can have more than one key; it's no problem. In fact we should declare all keys so the database can enforce them.

Here are two examples of tables which have multiple keys.

-- Three keys
CREATE TABLE tax_brackets (
  min_income  numeric(8,2),
  max_income  numeric(8,2),
  tax_percent numeric(3,1),

  UNIQUE(min_income),
  UNIQUE(max_income),
  UNIQUE(tax_percent)
);

-- Two keys
CREATE TABLE flight_roster (
  departure timestamptz,
  gate text,
  pilot text,

  UNIQUE(departure, gate),
  UNIQUE(departure, pilot)
);

For brevity the examples so far are missing other constraints that in reality they should enforce. For instance cards should not be seen a negative number of times. Also we should forbid NULL values for most columns considered so far (except for the max_income column for tax brackets, where NULL could represent infinity).
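A sketch of what the fuller declarations might look like follows. The table names with the _strict suffix are hypothetical, and the range checks on tax_percent and max_income are illustrative assumptions rather than rules stated in the text.

```sql
-- Hypothetical stricter variant of cards_seen
CREATE TABLE cards_seen_strict (
  suit text NOT NULL,
  face text NOT NULL,
  seen int  NOT NULL DEFAULT 0 CHECK (seen >= 0),  -- never a negative count

  UNIQUE (suit, face)
);

-- Hypothetical stricter variant of tax_brackets
CREATE TABLE tax_brackets_strict (
  min_income  numeric(8,2) NOT NULL,
  max_income  numeric(8,2),            -- NULL stands for "no upper bound"
  tax_percent numeric(3,1) NOT NULL
    CHECK (tax_percent BETWEEN 0 AND 100),          -- assumed range

  CHECK (max_income IS NULL OR max_income > min_income),  -- assumed rule

  UNIQUE (min_income),
  UNIQUE (max_income),
  UNIQUE (tax_percent)
);
```

Note that a UNIQUE constraint alone does not forbid NULLs, which is why the NOT NULL declarations matter for the key columns.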

The curious case of primary keys

What we simply called keys in the previous section are traditionally called candidate keys. "Candidate" is terminology which implies that the keys are all competing for the esteemed position of "primary key," with the remainder relegated to "alternate keys."

It took a while for SQL implementations to clear up misunderstandings about keys and the relational model. The earliest database products fixated on what was then a low-level notion of primary key.

Primary keys were required by these database systems to identify the physical location of a row on sequential access media. Joe Celko explains:

The term "key" meant the sort-key of a file that was required to do any reasonable processing with the original sequential file systems. A deck of punch cards is read in one and only one order; you cannot "back space" a card reader. The first magnetic tape drives mimicked that design and did not allow bi-directional access. In fact, the original Sybase SQL Server required that cursors had to "re-wind" a table to the start and move forward to read a prior row.

Modern SQL is supposed to abstract from the physical representation. Tables model relations, and should not expose an implicit order of their rows. However even today SQL Server creates a clustered index by default for primary keys, physically ordering rows in the old tradition.

In most databases primary keys have survived as a vestige, and these days merely provide some conveniences rather than reflecting or determining physical layout. For instance declaring a primary key includes a NOT NULL constraint automatically, and defines the default foreign key target in a PostgreSQL table. Primary keys also give a hint that their columns are preferred for joins.

A primary key does not prevent declaring other keys. In fact a table with keys will function fine when none is designated as primary. Lightning will not come out of the sky and strike you.
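The conveniences mentioned above can be seen in a small sketch. The countries and cities tables are hypothetical, chosen only for illustration.

```sql
-- Hypothetical tables illustrating what PRIMARY KEY adds over UNIQUE
CREATE TABLE countries (
  iso_code char(2) PRIMARY KEY,  -- implies both UNIQUE and NOT NULL
  name     text NOT NULL,

  UNIQUE (name)                  -- an alternate key, declared as well
);

CREATE TABLE cities (
  name         text NOT NULL,
  -- No column list after REFERENCES: the primary key of countries
  -- is used as the default foreign key target
  country_code char(2) NOT NULL REFERENCES countries,

  -- This table has a key but designates none as primary; it works fine
  UNIQUE (name, country_code)
);
```

A foreign key could equally target the alternate key by naming it explicitly, as in REFERENCES countries (name).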

Finding natural keys


The keys we've seen so far are "natural," so called because they are properties of the thing modeled, properties which have intrinsic interest even if nobody was trying to make a key.

The first thing to keep in mind when examining a table for possible natural keys is not to overthink it. StackExchange user sqlvogel advises:

Some people seem to struggle with the choice of "natural" key attributes because they hypothesize situations where a particular key might not be unique in some given population. This misses the point. The point of a key is to impose a business rule that attributes must and will be unique for the population of data within a particular table at any given point in time. The table always represents data in a particular and hopefully well-understood context (the "business domain" AKA "domain of discourse"). It's the intention/requirement to apply a uniqueness constraint within that domain that matters.

The rule of thumb is to add a key constraint when a column is unique for the values at hand and will remain so in reasonable scenarios. You can always remove the constraint if necessary. (If this worries you, we'll talk about key stability later.)

For example, a database of hobbyist club members might include uniqueness on the two columns first_name, last_name of members. Duplicates are likely unintentional at this scale, and if necessary the constraint can be dropped. Until an actual conflict occurs the key is a reasonable precaution.
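A sketch of that precaution, and of relaxing it later, might look as follows. The table, the joined_on column, and the constraint name are hypothetical.

```sql
-- Hypothetical club membership table with a named key constraint
CREATE TABLE club_members (
  first_name text NOT NULL,
  last_name  text NOT NULL,
  joined_on  date NOT NULL DEFAULT current_date,

  CONSTRAINT club_members_name_key UNIQUE (first_name, last_name)
);

-- Later, if two members really do share a name, the rule can be dropped
ALTER TABLE club_members DROP CONSTRAINT club_members_name_key;
```

Naming the constraint explicitly is a design choice that makes the later ALTER TABLE independent of PostgreSQL's generated constraint names.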

As a database grows to cover more ambitious material, finding natural keys can be more difficult. The data we store is a simplification of external reality, and doesn't capture some of the things that otherwise distinguish objects in the world, such as their changing position over time. In the absence of some kind of item code, what distinguishes two cans of soda or boxes of cereal other than their position in space or small deviations in weight or packaging?

This is why standards bodies create and apply distinguishing marks to products. Vehicles have Vehicle Identification Numbers (VIN) stamped onto them, books have ISBNs printed inside, commodity food packaging has UPC. You might argue that these numbers don't seem natural. Why am I calling them natural keys?

The naturalness or artificiality of unique properties in a database is relative to the outside world. A key which was artificial at birth in some standards body or government agency becomes natural to us because it's generally agreed upon in the world at large, and/or imprinted on objects.

There are many industry, public and international standards for a wide variety of things including currencies, languages, financial instruments, chemicals and medical diagnoses.

Here are some values that often work as natural keys:

ISO 3166 country codes
ISO 639 language codes
ISO 4217 currency codes
ISIN stock symbols
UPC/EAN, VIN, GTIN, ISBN
login names
email addresses
room numbers
MAC address on a network
(lat,lon) for points on the earth

As advised, declare keys when possible and reasonable, even multiple keys per table. But keep in mind there are some caveats with the suggestions above.

Not everyone has an email address. That may be fine for the intended use of the database. Also people do change their email addresses occasionally. (More about key stability later.)
ISIN stock symbols do change over time; for example the symbols GOOG and GOOGL do not precisely represent the reorganization of the company from Google to Alphabet. They can sometimes be confusing, e.g. TWTR vs TWTRQ. Some investors incorrectly bought the latter during Twitter's IPO.
Social security numbers are specific to American citizens, have privacy implications, and are reused after death. Also people can get new ones after identity theft. Finally the same number can identify both a person and a business tax id.
Postal zip codes are a poor choice for cities. Multiple cities share a zip, and there are often multiple zips in a city.

Artificial keys


Given that a key is a column with unique values in each row, one way to create one is to cheat and throw made-up unique values into each row. Artificial keys are just that: an invented code used for referring to facts or objects.

Crucially this code derives from the database itself and is not known by anybody other than users of the database. That's what distinguishes artificial keys from standards-based natural keys.

Whereas natural keys' advantage is preventing table row duplication or contradiction, artificial keys are useful because they make it easy for people or other systems to refer to a row, and also improve lookup and join speed by avoiding string (or multi-column) key comparisons.

Surrogates

People also use artificial keys as an anchor so that no matter how business rules and columns change, the same row can always be identified the same way. An artificial key used for this purpose is called a "surrogate key" and requires special care. We'll consider surrogates later.

Non-surrogate artificial keys are useful for referring to a row from outside a database. Whether exposed in a URL, affixed to an invoice, spoken over the phone, redeemed at a checkout counter or printed on a license plate, an artificial key succinctly identifies a fact or object. (A license plate is a natural key to you and me, but designed as an artificial key by the Dept of Motor Vehicles.)

Artificial keys should be chosen with communication media in mind, to minimize typos and errors. Consider whether the key must be pronounced, read in print, sent via SMS, read in handwriting, entered into a pinpad, or embedded in a URL. Additionally some artificial keys like credit card numbers contain a checksum so that if certain errors occur they can at least be detected.

Examples:

US license plates have rules about ambiguous characters such as O vs 0.
Hospitals and pharmacies are especially careful, considering doctors' handwriting. See misidentification of alphanumeric symbols for many examples.
Texting a confirmation code? Be careful to stay within the GSM 03.38 character set.
Unlike Base64 encoding for arbitrary byte data, Base32 uses a restricted set of symbols which can be conveniently used by humans and processed by old computer systems which recognize only restricted character sets.
Proquints are readable, spellable, pronounceable ids. They're PRO-nounceable QUINT-uplets of alternating unambiguous consonants and vowels.

Be advised that whenever you expose an artificial key to the world people have strange reasons to request custom special values. Look no further than vanity license plates or the scheme for making pronounceable ids that became the infamous automated curse generator.

Sticking with numeric keys can work, although even then people have numerical taboos like omitting the thirteenth floor. But numbers work well in many places like URLs, pinpads, and for handwritten legibility if the recipient knows the key is composed of all digits. Although proquints carry more information density per spoken syllable, numbers are pretty good.

One thing to avoid in publicly exposed numbers is sequential order. It allows people to probe for resources (/videos/1.mpeg, /videos/2.mpeg, etc.) and also leaks cardinality information. Add a Feistel cipher on top of a sequence. This hides the ordering while maintaining uniqueness.

The PostgreSQL pseudo encrypt wiki gives an example cipher function:

CREATE OR REPLACE FUNCTION pseudo_encrypt(VALUE int) returns int AS $$
DECLARE
l1 int;
l2 int;
r1 int;
r2 int;
i int:=0;
BEGIN
 -- split the 32-bit input into two 16-bit halves
 l1:= (VALUE >> 16) & 65535;
 r1:= VALUE & 65535;
 -- three Feistel rounds
 WHILE i < 3 LOOP
   l2 := r1;
   r2 := l1 # ((((1366 * r1 + 150889) % 714025) / 714025.0) * 32767)::int;
   l1 := l2;
   r1 := r2;
   i := i + 1;
 END LOOP;
 -- recombine with the halves swapped
 RETURN ((r1 << 16) + l1);
END;
$$ LANGUAGE plpgsql strict immutable;

This particular function is its own inverse (i.e. pseudo_encrypt(pseudo_encrypt(x)) = x). Using the function verbatim is a form of security through obscurity, and if someone guesses you used Feistel from the PostgreSQL wiki then it's easy to apply the inverse and recover the original sequence. However you can change it up by supplying your own function in place of (((1366 * r1 + 150889) % 714025) / 714025.0). Your function must produce a value between 0 and 1. You could simply tweak the numbers in the preceding expression.

Here's how to use pseudo_encrypt:

CREATE SEQUENCE my_table_seq;

CREATE TABLE my_table (
  short_id int NOT NULL
    DEFAULT pseudo_encrypt(
      nextval('my_table_seq')::int
    ),
  -- other columns …

  UNIQUE (short_id)
);

This solution stores the randomized values in the short_id column. If the database needs to support high data ingestion rates you might instead choose to store the incrementing sequence itself in the table and convert it for display on demand with pseudo_encrypt. Indexing randomized values can lead to write amplification as we'll see later.
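One way to sketch the store-the-sequence alternative is a view that applies pseudo_encrypt on the way out. The names my_events, my_events_seq, my_events_public and the payload column are hypothetical, and the sketch assumes the pseudo_encrypt function shown earlier has been created. The lookup relies on the function being its own inverse.

```sql
CREATE SEQUENCE my_events_seq;

-- Store the plain incrementing value; it indexes compactly and in order
CREATE TABLE my_events (
  seq_id  int  NOT NULL DEFAULT nextval('my_events_seq')::int,
  payload text NOT NULL,

  UNIQUE (seq_id)
);

-- Expose only the obscured value to the outside
CREATE VIEW my_events_public AS
SELECT pseudo_encrypt(seq_id) AS short_id, payload
FROM my_events;

-- Look up a short_id received from outside (1234567 is a placeholder).
-- Applying pseudo_encrypt to the short_id recovers seq_id, so the
-- comparison can use the index on seq_id.
SELECT payload
FROM my_events
WHERE seq_id = pseudo_encrypt(1234567);
```

Filtering the view on short_id directly would also work, but would compute pseudo_encrypt for every row instead of using the index.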

The previous example used a regular sized integer for the short_id column. To use bigint we'd need a different Feistel function like XTEA.

Another way to obscure an integer sequence is by transforming it into short strings. Try the pg_hashids extension:

CREATE EXTENSION pg_hashids;

CREATE SEQUENCE my_table_seq;

CREATE TABLE my_table (
  short_id text NOT NULL
    DEFAULT id_encode(
      nextval('my_table_seq'),
      'long string as table-specific salt'
    ),
  -- other columns …

  UNIQUE (short_id)
);

INSERT INTO my_table VALUES
  (DEFAULT), (DEFAULT), (DEFAULT);

SELECT * FROM my_table;
/*
┌──────────┐
│ short_id │
├──────────┤
│ R4       │
│ ya       │
│ Ll       │
└──────────┘
*/

Again it would be more performant to store the integers themselves in the table and convert on demand, but benchmark it and see if you really need to add the complexity.

With a clear understanding of artificial and natural keys we can see that the natural-vs-artificial debate is a false dichotomy. Artificial and natural keys are not mutually exclusive! The same table can have both. In fact a table with an artificial key ought to enforce a natural key as well, except in the few cases where no natural key exists, such as a table of coupon codes:

-- A rare table: no natural key candidates to
-- declare alongside the artificial "code" key

CREATE TABLE coupons (
  code text NOT NULL,
  amount numeric(5,2) NOT NULL,
  redeemed boolean NOT NULL DEFAULT false,

  UNIQUE (code)
);

Having an artificial key and not declaring natural keys when they exist leaves the latter unprotected:

CREATE TABLE cars (
  car_id bigserial NOT NULL,
  vin varchar(17) NOT NULL,
  year int NOT NULL,

  UNIQUE (car_id)
  -- should have added
  -- UNIQUE (vin)
);

-- This unfortunately succeeds
INSERT INTO cars (vin, year) VALUES
  ('1FTJW36F2TEA03179', 1996),
  ('1FTJW36F2TEA03179', 1997);

The only argument against declaring additional keys is that each will entail another unique index, which can increase the cost of table writes. It depends on how much you value data correctness, but it's probably worth declaring the keys.

It's worth declaring multiple artificial keys too if they exist. For instance, a business has Applicants and Employees. Every employee was once an applicant and refers to Applicants by its applicant id, which should be a key of Employees as well. Another example would be setting employee id and login name as two keys on Employees.
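The Applicants and Employees scenario could be sketched like this. The column names and types are hypothetical; the point is only that every artificial key gets its own UNIQUE declaration.

```sql
-- Hypothetical schema: several artificial keys on one table
CREATE TABLE applicants (
  applicant_id bigint NOT NULL,
  full_name    text   NOT NULL,

  UNIQUE (applicant_id)
);

CREATE TABLE employees (
  employee_id  bigint NOT NULL,
  applicant_id bigint NOT NULL REFERENCES applicants (applicant_id),
  login_name   text   NOT NULL,

  UNIQUE (employee_id),   -- artificial key issued by HR
  UNIQUE (applicant_id),  -- each applicant becomes at most one employee
  UNIQUE (login_name)     -- artificial key issued by IT
);
```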

Surrogate Keys


As mentioned above, an important kind of artificial key is called a surrogate key. It's not meant to be succinct or shareable like other artificial keys; it's meant as an internal placeholder that identifies a row forevermore. It's used in SQL and joins but not explicitly referenced by an application.

If you know about PostgreSQL system columns you might think of surrogates as almost a database implementation detail, like a ctid except never changing. A surrogate value should be chosen once for each row and then never changed.

Surrogates make great targets for foreign keys, and the foreign keys should be marked as ON UPDATE RESTRICT to help enforce the immutability of the surrogate. On the other hand, foreign keys to publicly shared keys should be marked ON UPDATE CASCADE, which allows maximum flexibility to change those keys' values. (The cascading update runs with the same isolation level as its surrounding transaction, so don't worry about concurrency problems; the database can handle that if you choose a strict isolation level.)
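The two foreign key policies can be sketched side by side. All table and column names here are hypothetical, and the sketch assumes the uuid-ossp extension is available for generating surrogate values.

```sql
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE customers (
  customer_id  uuid NOT NULL DEFAULT uuid_generate_v1(),  -- surrogate, internal only
  customer_ref text NOT NULL,                             -- shared with the outside

  UNIQUE (customer_id),
  UNIQUE (customer_ref)
);

-- Internal reference to the surrogate: updates to it are refused
CREATE TABLE invoices (
  invoice_id  uuid NOT NULL DEFAULT uuid_generate_v1(),
  customer_id uuid NOT NULL
    REFERENCES customers (customer_id) ON UPDATE RESTRICT,

  UNIQUE (invoice_id)
);

-- Reference to the public key: updates to it propagate
CREATE TABLE support_tickets (
  customer_ref text NOT NULL
    REFERENCES customers (customer_ref) ON UPDATE CASCADE,
  subject      text NOT NULL
);

-- Renaming the public reference also rewrites matching support_tickets rows
UPDATE customers SET customer_ref = 'C-2002' WHERE customer_ref = 'C-1001';
```

Note that ON UPDATE RESTRICT only blocks a change to a surrogate that some invoices row actually references; it helps enforce immutability but does not guarantee it for unreferenced rows.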

Don't "naturalize" surrogate keys. As soon as you display the value of a surrogate key to your end users, or worse yet allow users to work with the value (perhaps through search), you have effectively given the key business meaning. The exposed key in our database could then be considered a natural key in someone else's.

Forcing external systems to use other artificial keys explicitly designed for sharing allows us to change the exposed keys as needed to accommodate changing requirements, while joining and maintaining referential integrity with surrogates internally.

Auto-incrementing bigint

The most common choice for surrogate keys is the auto-incrementing "bigserial" column, aka the IDENTITY column. (In fact, PostgreSQL 10 now supports the IDENTITY construct like Oracle does; see CREATE TABLE.)
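For comparison, here is a sketch of the two spellings on PostgreSQL 10 or later. The widgets tables are hypothetical.

```sql
-- Traditional PostgreSQL spelling: a bigint column fed by an implicit sequence
CREATE TABLE widgets_serial (
  widget_id bigserial NOT NULL,
  name      text      NOT NULL,

  UNIQUE (widget_id)
);

-- SQL-standard spelling, supported since PostgreSQL 10
CREATE TABLE widgets_identity (
  widget_id bigint GENERATED ALWAYS AS IDENTITY,
  name      text   NOT NULL,

  UNIQUE (widget_id)
);

INSERT INTO widgets_identity (name) VALUES ('sprocket');

-- With GENERATED ALWAYS, a user-supplied value is rejected unless the
-- statement says OVERRIDING SYSTEM VALUE:
-- INSERT INTO widgets_identity (widget_id, name) VALUES (42, 'gear');
```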

However I think an auto-incrementing int is often the wrong choice for surrogate keys. My opinion is unusual, so let me explain.

Disadvantages of serial keys:

Starting all sequences at 1 and counting up means that rows from different tables will share many of the same key values. This isn't ideal. Disjoint sets of keys per table would be preferable so that queries can't accidentally mix up the constants in a join, say, and still return results. (As a workaround you could arrange each sequence as multiples of distinct prime numbers to maintain disjointness, but that's painstaking.)
Calling nextval() on sequences introduces shared schema state that scales less easily across nodes in the modern distributed SQL systems being developed.
Merging data from another database that also uses bigserial keys will conflict with existing data. The sequential values simply aren't unique across systems.
Philosophically the incrementing number harkens back to old systems with an implicit row order. If you want to order rows, do it explicitly with a timestamp column or something meaningful in the data itself. Doing otherwise violates first normal form.
(Weak reason, but) these small ids are tempting to expose.

UUID

Let's consider another option: using larger integers (128 bits) generated in a randomized pattern. Algorithms for generating such universally unique identifiers (UUIDs) are extremely unlikely to ever generate the same value twice, even when run simultaneously on different CPUs.

UUIDs seem like the natural choice for surrogate keys, right? If you want to label rows uniquely then nothing beats a unique label!

So why doesn't everyone use them in PostgreSQL? There are some bogus reasons and one legitimate reason. Also there's a workaround for the legitimate reason, and I'll show benchmarks to illustrate.

First the bogus reasons. Some people think UUIDs are strings because of the traditional dashed hexadecimal representation: 5bd68e64-ff52-4f54-ace4-3cd9161c8b7f. In fact some databases don't have a compact (128-bit) uuid type, but PostgreSQL does. It's the size of two bigints, and that's not an appreciable overhead when compared with the bulk of other data in the database.

Another invalid criticism of UUIDs is that they're unwieldy: who wants to say them, type them, or look at them? That's valid for an exposed artificial key, but (by design) nobody will ever see a surrogate UUID. Maybe a developer running SQL commands in psql to debug the system might deal with the UUIDs, but that's about it. The developer could also refer to rows with friendlier keys if those are defined.

The real problem with UUIDs is that highly randomized values cause write amplification due to full-page writes in the write-ahead log (WAL). This means worse performance when inserting rows. It really depends on the algorithm used for generating UUIDs, though.

Let's measure the write amplification. The fault lies with old filesystems, really. When PostgreSQL writes data to disk it does so by modifying a "page" on disk. If a computer loses power at a critical time then most filesystems may still report the write as successful before the data is safe on disk. If PostgreSQL naively considered the action complete then the database would be corrupted when the system comes back online.

PostgreSQL cannot trust that most OS/filesystem/disk configurations will enforce atomicity, so the database saves the entire contents of a newly modified disk page (the first time it is changed after a checkpoint) to the write-ahead log for use in recovery after a possible crash. Indexing highly random values like UUIDs tends to touch many different disk pages, which means writing the full page size (typically 4k or 8k) to the WAL for each insertion. That's called a full-page write (FPW).

Some UUID generation algorithms (like Twitter's "snowflake" or uuid_generate_v1() in the uuid-ossp extension for PostgreSQL) produce monotonically increasing values per machine. This consolidates writes into fewer disk pages, reducing FPW.

We can measure the FPW penalty for different UUID generation algorithms and also examine WAL statistics. To do this I set up the following benchmark.

EC2 instance running ami-aa2ea6d0
- Ubuntu Server 16.04 LTS (HVM)
- EBS General Purpose (SSD)
- c3.xlarge
- vCPU: 4
- RAM GiB: 7.5
- Disk GB: 2 x 40 (SSD)
PostgreSQL built from source
- https://ftp.postgresql.org/pub/source/v10.1/postgresql-10.1.tar.gz
- ./configure --with-uuid=ossp CFLAGS="-O3"
Default database configuration except:
- max_wal_size='10GB';
- checkpoint_timeout='2h';
- synchronous_commit='off';

Schema:

CREATE EXTENSION "uuid-ossp";
CREATE EXTENSION pgcrypto;

CREATE TABLE u_v1 ( u uuid PRIMARY KEY );
CREATE TABLE u_crypto ( u uuid PRIMARY KEY );

Before inserting UUIDs into each table, I found the current position of the write-ahead log.

SELECT pg_walfile_name(pg_current_wal_lsn());

/* For example,

     pg_walfile_name
--------------------------
 000000010000000000000001
*/

I used that position to get statistics about WAL usage after benchmarking. This gets the stats from events happening after a starting position:

pg_waldump --stats 000000010000000000000001
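For a rougher measurement that stays inside psql, the total volume of WAL generated by a statement can be estimated by subtracting log positions. This is a sketch, not the method used for the statistics below: it reports total bytes only, with no breakdown by record type or full-page images. It assumes PostgreSQL 10 function names, the u_v1 table from the schema above, and psql's \gset meta-command; the row count of 1000 is arbitrary.

```sql
-- Remember the WAL insert position before the workload (psql variable)
SELECT pg_current_wal_insert_lsn() AS start_lsn \gset

INSERT INTO u_v1
SELECT uuid_generate_v1()
FROM generate_series(1, 1000);

-- Bytes of WAL generated since start_lsn, including any from
-- concurrent activity on the same server
SELECT pg_size_pretty(
  pg_wal_lsn_diff(pg_current_wal_insert_lsn(), :'start_lsn')
) AS wal_generated;
```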

I ran tests for three scenarios:

Inserting UUIDs generated by gen_random_uuid() (provided by pgcrypto)
Inserting from uuid_generate_v1() (provided by uuid-ossp, https://www.postgresql.org/docs/10/static/uuid-ossp.html)
Inserting from gen_random_uuid() again, but with full_page_writes='off' in the db configuration. This shows how fast it would be without FPW amplification.

For each of these scenarios I started from an empty table and inserted 2^20 UUIDs. I repeated the insertions over sixteen rounds, timing each to see how the performance changed with more data in the table.

-- for example, I ran this 16 times with timing in psql

INSERT INTO u_crypto (
  SELECT gen_random_uuid()
  FROM generate_series(1, 1024*1024)
);

Here are the speed results:

[Graph of UUID insertion speed]

Here are the WAL statistics for each method:

gen_random_uuid()

Type         N      (%)   Record size      (%)   FPI size      (%)
----         -      ---   -----------      ---   --------      ---
XLOG       260 (  0.15)         13139 (  0.09)     484420 ( 30.94)
Heap2      765 (  0.45)        265926 (  1.77)     376832 ( 24.07)
Heap     79423 ( 46.55)       6657121 ( 44.20)     299776 ( 19.14)
Btree    89354 ( 52.37)       7959710 ( 52.85)     404832 ( 25.85)

uuid_generate_v1()

Type         N      (%)   Record size      (%)   FPI size      (%)
----         -      ---   -----------      ---   --------      ---
XLOG         0 (  0.00)             0 (  0.00)          0 (  0.00)
Heap2        0 (  0.00)             0 (  0.00)          0 (  0.00)
Heap    104326 ( 49.88)       7407146 ( 44.56)          0 (  0.00)
Btree   104816 ( 50.12)       9215394 ( 55.44)          0 (  0.00)

gen_random_uuid() with fpw=off

Type         N      (%)   Record size      (%)   FPI size      (%)
----         -      ---   -----------      ---   --------      ---
XLOG         4 (  0.00)           291 (  0.00)         64 (  0.84)
Heap2        0 (  0.00)             0 (  0.00)          0 (  0.00)
Heap    107778 ( 49.88)       7654268 ( 46.08)          0 (  0.00)
Btree   108260 ( 50.11)       8956097 ( 53.91)       7556 ( 99.16)

This confirms that gen_random_uuid generates a lot of WAL activity as full-page images (FPI) and the other methods do not. The last one, of course, because I prevented the database from doing so. Suppressing FPW is not a method you want to use for real though, unless you're very sure of your filesystem and disk configuration. This article suggests that ZFS may be safe for disabling FPW, but use caution.

The clear winner in my benchmark is uuid_generate_v1(). It's fast and doesn't seem to slow down as rows accumulate. Also the uuid-ossp extension is pre-installed on hosted databases like RDS and Citus Cloud, so the function will be available with no extra work.

The docs warn about uuid_generate_v1 that

This involves the MAC address of the computer and a time stamp. Note that UUIDs of this kind reveal the identity of the computer that created the identifier and the time at which it did so, which might make it unsuitable for certain security-sensitive applications.

However I don't think this is a problem for us because the surrogate key won't be exposed. The library does offer a uuid_generate_v1mc() that obscures the MAC address if you're still concerned.

Summary and Recommendation

Now that we know the different kinds of keys and their uses, here's my recommendation for choosing them in your own database.

For each table:

Identify and declare all natural keys.
Create a surrogate key named tablename_id (for example bar_id for a table bar) of type uuid with default value uuid_generate_v1(). You can even mark it as a primary key if you like. Including the table name in this id makes joins simpler. It's JOIN foo USING (bar_id) vs JOIN foo ON (foo.bar_id = bar.id). Do not expose this key to clients or anywhere outside the database.
For "join tables" declare all foreign key columns as a single composite primary key.
Add an artificial key if desired for use in a URL or anywhere else you want to share a reference to a row. Use a Feistel cipher or pg_hashids to hide auto-incrementing integers.
Mark foreign keys to surrogate UUIDs as ON UPDATE RESTRICT and to external artificial keys as ON UPDATE CASCADE. Use your own judgement for natural keys.

This approach provides internal key stability while acknowledging and protecting natural keys. It also decouples externally-facing artificial keys from either. With the right understanding we can move beyond a "primary key" fixation and appreciate the different roles that keys play.
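Put together, the recommendation might look like the following sketch. The book, author and book_author tables and their columns are hypothetical. It assumes the uuid-ossp extension and the pseudo_encrypt function shown earlier are installed.

```sql
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE SEQUENCE book_public_seq;

CREATE TABLE book (
  -- surrogate key: internal only, never shown to clients
  book_id   uuid PRIMARY KEY DEFAULT uuid_generate_v1(),
  isbn      varchar(13) NOT NULL,
  title     text        NOT NULL,
  -- artificial key for URLs: an obscured sequence value
  public_id int NOT NULL
    DEFAULT pseudo_encrypt(nextval('book_public_seq')::int),

  UNIQUE (isbn),       -- natural key
  UNIQUE (public_id)
);

CREATE TABLE author (
  author_id uuid PRIMARY KEY DEFAULT uuid_generate_v1(),
  email     text NOT NULL,

  UNIQUE (email)       -- natural key
);

-- Join table: the foreign key columns form one composite primary key
CREATE TABLE book_author (
  book_id   uuid NOT NULL REFERENCES book   ON UPDATE RESTRICT,
  author_id uuid NOT NULL REFERENCES author ON UPDATE RESTRICT,

  PRIMARY KEY (book_id, author_id)
);

-- Table-qualified surrogate names keep the joins short
SELECT b.title, a.email
FROM book b
JOIN book_author USING (book_id)
JOIN author a USING (author_id);
```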
