Types of JOINs
06/15/2001Welcome to our continuing saga about SQL JOIN statements. We've spent the past few columns looking at the syntax and behavior of the inner join, the most commonly used type of join. This week, we'll take a look at what's going on behind the scenes when you perform a join and use that as a launching point to discuss the many different types of join statements that are supported by SQL.
How does a join work?
One of the fundamental challenges in understanding SQL is becoming comfortable with thinking about data in terms of mathematical sets and relational algebra. This is the similar to the challenge procedural programmers face when making the transition to object-oriented languages -- things are just simply different and the old rules simply don't apply. Joins force you to think in a set-oriented way. That is one of the reasons why they are one of the most difficult parts of SQL to learn.
So what actually happens when a join is executed? We can start with the simplest possible join -- the "cross join" (or Cartesian product). If we have two database tables consisting of information about CDs and musical artists:
|
Artists
|
|
| ArtistID | ArtistName |
| 1 | Peter Gabriel |
| 2 | Bruce Hornsby |
| 3 | Lyle Lovett |
| 4 | Beach Boys |
|
CDs
|
|||
| CDID | ArtistID | Title | Year |
| 1 | 1 | So | 1984 |
| 2 | 1 | Us | 1992 |
| 3 | 2 | The Way It Is | 1986 |
| 4 | 2 | Scenes from the Southside | 1990 |
| 5 | 1 | Security | 1990 |
| 6 | 3 | Joshua Judges Ruth | 1992 |
| 7 | 4 | Pet Sounds | 1966 |
A join simply multiplies the two tables together into a new virtual table. There are four members of the Artists table and seven members in the CDs table which will result in 28 (!) rows in the result. You can try this using the following syntax,
SELECT * FROM Artists, CDs
and you should see a result that looks like the following table:
| CDID | CDs.ArtistID | Artists.ArtistID | ArtistName | Title | Year |
| 1 | 1 | 1 | Peter Gabriel | So | 1984 |
| 1 | 1 | 2 | Bruce Hornsby | So | 1984 |
| 1 | 1 | 3 | Lyle Lovett | So | 1984 |
| 1 | 1 | 4 | Beach Boys | So | 1984 |
| 2 | 1 | 1 | Peter Gabriel | Us | 1992 |
| 2 | 1 | 2 | Bruce Hornsby | Us | 1992 |
| 2 | 1 | 3 | Lyle Lovett | Us | 1992 |
| 2 | 1 | 4 | Beach Boys | Us | 1992 |
| 3 | 2 | 1 | Peter Gabriel | The Way It Is | 1986 |
| 3 | 2 | 2 | Bruce Hornsby | The Way It Is | 1986 |
| 3 | 2 | 3 | Lyle Lovett | The Way It Is | 1986 |
| 3 | 2 | 4 | Beach Boys | The Way It Is | 1986 |
| 4 | 2 | 1 | Peter Gabriel | Scenes from the Southside | 1990 |
| 4 | 2 | 2 | Bruce Hornsby | Scenes from the Southside | 1990 |
| 4 | 2 | 3 | Lyle Lovett | Scenes from the Southside | 1990 |
| 4 | 2 | 4 | Beach Boys | Scenes from the Southside | 1990 |
| 5 | 1 | 1 | Peter Gabriel | Security | 1990 |
| 5 | 1 | 2 | Bruce Hornsby | Security | 1990 |
| 5 | 1 | 3 | Lyle Lovett | Security | 1990 |
| 5 | 1 | 4 | Beach Boys | Security | 1990 |
| 6 | 3 | 1 | Peter Gabriel | Joshua Judges Ruth | 1992 |
| 6 | 3 | 2 | Bruce Hornsby | Joshua Judges Ruth | 1992 |
| 6 | 3 | 3 | Lyle Lovett | Joshua Judges Ruth | 1992 |
| 6 | 3 | 4 | Beach Boys | Joshua Judges Ruth | 1992 |
| 7 | 4 | 1 | Peter Gabriel | Pet Sounds | 1966 |
| 7 | 4 | 2 | Bruce Hornsby | Pet Sounds | 1966 |
| 7 | 4 | 3 | Lyle Lovett | Pet Sounds | 1966 |
| 7 | 4 | 4 | Beach Boys | Pet Sounds | 1966 |
This table is typically filtered using the WHERE clause, for example
SELECT * FROM Artists, CDs WHERE Artists.ArtistID=CDs.ArtistID
which leads us to another nugget of SQL wisdom
SQL Wisdom #6) Using a cross join is almost always a bad idea
A cross join will typically bring your database to its knees since the amount of work increases as a multiple of the number of rows -- this does not scale linearly!
|
Also in aboutSQL: |
Other types of JOINs
We've covered two types of joins so far, but there are many more. To whet you appetite for the upcoming columns, the major types of joins we'll cover include
- CROSS JOIN (Cartesian product) is the simplest join, which we covered today;
- INNER JOIN (sometimes called the "EQUI-JOIN") where tables are combined based on a common column;
- OUTER JOIN which involves combining all rows of one table with only matching rows from the other table and next week's topic; and the
- SELF JOIN which is a table joined to itself.
The world of joins is another one of those pleasant corners of the SQL world where there is a lot of differentiation between database platforms as far as specific syntax and even which types of joins are supported. We'll cover the big picture for each type of join, but will only point out an whopping differences between the platforms. It is crucial when you are working with any complicated join, particularly one that involves multiple tables or nested joins, that you check you DBMS documentation to make sure that your approach is supported.
Next steps
This week we took a look at the cross join statement, the fundamental underpinning of the SQL join. Next week, we'll focus on the various "outer joins." Until then, feel free to contact me with comments and questions.
You must be logged in to the O'Reilly Network to post a talkback.
Showing messages 1 through 8 of 8.
-
hi
2007-09-18 00:05:12 lover17 [Reply | View]
oh it so great i learn all the meanings here! i really like this now i always open this site...thanks alot........
-
Please don't delete these tutorials!
2002-12-04 05:49:58 anonymous2 [Reply | View]
I use them frequently as reference!
-
When To Use It?
2002-03-06 21:32:02 abombss [Reply | View]
We recently had a problem where we needed to bring back a recordset with a row for every day ( between a range ) and every employee, regardless if the employee had data for that date. In order to accomplish this I used a cartesian product. Half the staff thought I was crazy and was breaking every rule of SQL. I knidly pointed out that in certain situations, like this one, it was the easiest way to solve the problem, and the Cross Join is the most basic of all joins.
I would be interested in hearing some other situations where a cross join was used.
Cheers
-
aboutSQL
2001-06-18 20:08:15 mhauser [Reply | View]
Hi John,
Thank you for writing these articles! I first learned SQL 4 years ago and I haven't used it very much since. I thought I had forgotten everything, but your articles really bring it back. And your explainations are much clearer than the professor's lectures(He was from Hong Kong and has quite an accent).
Now if you could just explain normalizing a table, my re-education would be complete!
Thanks again,
Marilyn Hauser
-
aboutSQL
2006-06-29 14:35:12 Gooser [Reply | View]
Best reference EVER* on Normalization is this:
http://www.marcrettig.com/poster/rettigNormalizationPoster.pdf
v/r
Gooser
* The use of 'EVER' suggests that I am omnipotent and omniscient, which I am not. But you ARE sure to enjoy this reference. There is some other good material on Marc's site, which you may like to read.





