SQL – Unique index on two columns plus separate index on each one


I don't know much about database optimization, but I'm trying to understand this case.

Say I have the following table:

state_id integer
name varchar(32)
slug varchar(32)

Now, say I want to perform queries like this:

SELECT * FROM cities WHERE state_id = 123 AND slug = 'some_city'
SELECT * FROM cities WHERE state_id = 123

If I want the "slug" for a city to be unique within its particular state, I'd add a unique index on (state_id, slug).
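For reference, here's roughly how I'd define it (the table name cities is taken from my queries above, and the index name is just my own choice):

```sql
CREATE TABLE cities (
    state_id integer     NOT NULL,
    name     varchar(32) NOT NULL,
    slug     varchar(32) NOT NULL
);

-- Enforces that each slug appears at most once per state.
CREATE UNIQUE INDEX cities_state_slug_idx ON cities (state_id, slug);
```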

Is that index enough? Or should I also add another on state_id so the second query is optimized? Or does the second query automatically use the unique index?

I'm working with PostgreSQL, but I imagine this case is simple enough that most DBMSs handle it similarly.

Also, I know this surely doesn't make a difference on small tables, but my example is a simplified one. Think of tables with 200k+ rows.


Best Solution

A single unique index on (state_id, slug) should be sufficient: because state_id is the leading column of the composite index, PostgreSQL can use that same index for queries that filter on state_id alone, so a separate index on state_id would be redundant. To be sure, of course, you'll need to run EXPLAIN and/or EXPLAIN ANALYZE (perhaps with the help of something like http://explain.depesz.com/), but ultimately which indexes are appropriate depends very closely on what kind of queries you will be running. Remember, indexes make SELECTs faster and INSERTs, UPDATEs, and DELETEs slower, so ideally you want only as many indexes as are actually necessary.
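To check which plan a query actually gets, prefix it with EXPLAIN (or EXPLAIN ANALYZE to also execute it and report real row counts and timings). A minimal sketch, using the table and column names from the question:

```sql
-- Show the plan without running the query:
EXPLAIN
SELECT * FROM cities WHERE state_id = 123;

-- Run the query and report actual rows and timings:
EXPLAIN ANALYZE
SELECT * FROM cities WHERE state_id = 123 AND slug = 'some_city';
```

With enough rows in the table, the first query can be served by a scan of the composite unique index, since state_id is its leading column.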

Also, PostgreSQL has a smart query optimizer: it will choose radically different plans for queries on small tables and on huge ones. If the table is small, it will just do a sequential scan and not bother with any indexes, since the overhead of using them is higher than brute-force reading the whole table. This changes to a different plan once the table size passes a threshold, and may change again as the table grows further, or if you change your SELECT, or....
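You can watch this happen by filling a test table with synthetic rows and re-running EXPLAIN as it grows. A hypothetical sketch (the 50-state modulus and row count are just illustrative, and ANALYZE keeps planner statistics fresh after the bulk load):

```sql
-- Populate with synthetic data; adjust the row count to taste.
INSERT INTO cities (state_id, name, slug)
SELECT i % 50, 'City ' || i, 'city-' || i
FROM generate_series(1, 200000) AS i;

-- Refresh planner statistics after a bulk load.
ANALYZE cities;

-- With only a handful of rows this would likely show a Seq Scan;
-- at 200k rows, expect an index or bitmap scan instead.
EXPLAIN SELECT * FROM cities WHERE state_id = 42;
```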

Summary: you can't trust the results of EXPLAIN and EXPLAIN ANALYZE on datasets much smaller than, or different from, your actual data. Make it work, then make it fast later (if you need to).