This post will explore how to find exclusive row relationships — rows that may be grouped together by some non-unique id that identifies some kind of grouping of those rows. For example, let’s say that we are managing a team of players, and that each team has to be unique. In other words, players A and B form Team 1, but players A,B,C form Team 2. Searching for existing teams with players A and B should only return Team 1, searching for teams with players A,B,C should return only Team 2, and searching for B,C returns an empty set. How to do this in SQL?
Let’s look at the table:
mysql> SELECT * FROM team_player_link; +---------+-----------+ | team_id | player_id | +---------+-----------+ | 1 | 100 | | 1 | 101 | | 1 | 102 | | 2 | 200 | | 3 | 300 | | 4 | 100 | | 4 | 101 | +---------+-----------+
As you can see, players 100 and 101 are actually playing for two teams, team 1 and team 4. BUT, if we search for an exclusive team these two players 100 and 101 are in, we should only get back team 4 and NOT team 1.
My starting point was here but this is actually the inverse of what we want — however, I didn’t think to use the IF statement the way he did (clever!).
Here is the solution:
SELECT team_id, player_id, SUM(IF(player_id=100 OR player_id=101, 1, 0)) AS s, COUNT(player_id) AS cnt FROM team_player_link GROUP BY team_id HAVING s=2 AND cnt=s;
What is going on here? I’ll explain it conceptually first. We look at all rows containing a unique team ID (GROUP BY team_id), and find player_ids that match either 100 or 101 (player_id=100 or player_id=101), assign 1 to each of those (IF statement) and sum all the 1s. But, if the sum is less than the number of players on the team (HAVING cnt=s AND s=2), we return nothing.
How does the code work? First, the IF part returns 1 for all player_ids that match our OR query, and the SUM sums that up.
Grouping by team_id ensures that we only do the sum for all the same team_ids.
Before we do the query from PHP we already know the team size (we are looking for a certain amount of ids, so the team size must be that big), so
HAVING s=2 ensures that we return only teams whose SUM is equal to 2 team members. But this doesn’t ensure that the team has ONLY 2 members, it just ensures that the players are on that team.
This is why we have the final HAVING condition: cnt=s. This ensures that the count of the players in a team is exactly the same as the number of players we searched for.