You have two tables foo and bar. You are performing different types of analysis on the items in foo, and writing that analysis to bar. Foo has a primary key "id". And bar has three columns: "foo_id", "analysis" and "version". The bar table contains the following entries.
1, blah, 3
1, erg, 2
2, erg, 2
This means that item 1 in foo has been analyzed for blah and erg versions 3 and 2 respectively. Item 2 has been analyzed for only erg version 2.
Now, lets say there are three items in foo and you want a query to pull back all of the items that need analysis for blah version 3. The right answer is items 2, and 3.
You may start with:
select * from foo LEFT JOIN bar.foo_id = foo.id
This returns 4 rows:
1, blah, 3
1, erg, 2
2, erg, 2
3, null, null
From this resultant table there is no good way to select foo.id's that have not been processed by the right version of blah. However, if we add a join constraint to our select we can. The select becomes:
select * from foo LEFT JOIN bar.foo_id = foo.id AND bar.analysis = 'blah'
This returns a beautiful table:
1, blah, 3
2, null, null
3, null, null
Essentially the left join takes place with only a subset of bar that meets the join constraint.
NOTE: This is entirely different from the following:
select * from foo LEFT JOIN bar.foo_id = foo.id WHERE bar.analysis = 'blah'
That query will only return the single row. To manipulate the original join you would probably end up using a "NOT EXISTS IN" kind of construct. Yuck.
The JOIN with a constraint presents a nice clean simple solution that lets us figure out which foo items still need analysis.
No comments:
Post a Comment