Tuesday, September 22, 2009

MySQL: LEFT Join with Constraints

I just fixed a bug in a colleague's select statement. And I think it makes a good example. So here it is:

You have two tables foo and bar. You are performing different types of analysis on the items in foo, and writing that analysis to bar. Foo has a primary key "id". And bar has three columns: "foo_id", "analysis" and "version". The bar table contains the following entries.

1, blah, 3
1, erg, 2
2, erg, 2

This means that item 1 in foo has been analyzed for blah and erg versions 3 and 2 respectively. Item 2 has been analyzed for only erg version 2.

Now, lets say there are three items in foo and you want a query to pull back all of the items that need analysis for blah version 3. The right answer is items 2, and 3.

You may start with:
select * from foo LEFT JOIN bar.foo_id = foo.id

This returns 4 rows:
1, blah, 3
1, erg, 2
2, erg, 2
3, null, null

From this resultant table there is no good way to select foo.id's that have not been processed by the right version of blah. However, if we add a join constraint to our select we can. The select becomes:
select * from foo LEFT JOIN bar.foo_id = foo.id AND bar.analysis = 'blah'


This returns a beautiful table:
1, blah, 3
2, null, null
3, null, null

Essentially the left join takes place with only a subset of bar that meets the join constraint.

NOTE: This is entirely different from the following:
select * from foo LEFT JOIN bar.foo_id = foo.id WHERE bar.analysis = 'blah'


That query will only return the single row. To manipulate the original join you would probably end up using a "NOT EXISTS IN" kind of construct. Yuck.

The JOIN with a constraint presents a nice clean simple solution that lets us figure out which foo items still need analysis.

No comments: