What is a Primary Key?
The primary key is something that uniquely identifies a row/record of data. It can also be multiple columns, which is called a composite.
Ability to Change
Because the primary key is often used for foreign references, it should be as stable as possible. All data in the database is mutable, providing someone is connecting with an account that has appropriate privileges. This is why databases provide the ability to define CASCADE ON DELETE and CASCADE ON UPDATE--to sync referential dependencies without having to disable constraints.
Natural or Artifical/Surrogate?
Ideally, you want a natural key. A natural key is existing data that uniquely identifies the entity you are modeling. For example, the abbreviations of US states is a good natural key because the abbreviation is consistent and everyone knows them:
US_STATE_PRIMARY_KEY US_STATE
--------------------------
AL Alabama
AK Alaska
AZ Arizona
AR Arkansas
CA California
Don't try too hard to find a natural key. They seldom exist. It's unlikely that a US State name would change, but it is plausible.
Realistically, primary keys will typically be artificial (often generated by database functionality). These are typically numbers or GUIDs, and they're considered artificial because on their own - there's nothing to relate their value to the information they uniquely identify. A sales receipt is always numbered, because there's nothing natural about it and it's also for auditing - gaps in the receipt numbers raise suspicions. To demonstrate how arbitrary numbering is, here's the US state table but using an integer for the primary key column, US_STATE_CODE:
US_STATE_PRIMARY_KEY US_STATE
--------------------------
100 Alabama
101 Alaska
102 Arizona
103 Arkansas
104 California
There's no requirement to start the value at one; some shops use this as a security measure to thwart SQL injection. The value is sequential based on the alphabetic ordering of the State name, but that can't be guaranteed. But unlike the natural key, if the state name changed - only one column would have to be updated.
Single Column vs Composite
Ideally one column will be the primary key, but make the decision based on the data at hand--do not combine columns just for the sake of having a single column. If you do shoehorn data together, use a character to separate the data easily (though operations to do this won't be able to take advantage of an index if present).
Performance
From a performance perspective, integers are best because they offer a decent range of values and the number of bytes used is small when you compare to VARCHAR of five or more characters.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…