Find duplicates from the same table and constrain them from another table in sql

advertisements

Oh, my title is not the best one and as English is not my main language maybe someone can fix that instead of downvoting if they've understood the issue here.

Basically i have two tables - tourneyplayers and results. Tourneyplayers is like a side table which gathers together tournament information across multiple tables - results, tournaments, players etc. I want to check duplicates from the results table over column day1_best, from single tournament and return all the tourneyplayers who have duplicates.

Tourneyplayers contain rows:

Tourneyplayers
tp_id | resultid | tourneyid
1     | 2        | 91
2     | 21       | 91
3     | 29       | 91
4     | 1        | 91
5     | 3        | 92

Results contains rows:

Results:
r_id | day1_best
1    | 3
2    | 1
3    | 4
..   | ..
21   | 1
..   | ..
29   | 2

Now tourney with id = 91 has in total 4 results, with id's 1,2,21 and 29. I want to return values which have duplicates, so currently the result would be

Result
tp_id | resultid | day1_best
1     | 2        | 1
2     | 21       | 1

I tried writing something like this:

SELECT *
FROM tourneyplayers
WHERE resultid
IN (

SELECT r1.r_id
FROM results AS r1
INNER JOIN results AS r2 ON ( r1.day1_best = r2.day1_best )
AND (
r1.r_id <> r2.r_id
)
)
AND tourneyid =91

But in addition to values which had the same day1_best it chose two more which did not have the same. How could i improve my SQL or rewrite it?


First you JOIN both tables, so you know how the data looks like.

SELECT *
FROM tourney_players t
JOIN results r
  ON t.`resultid` = r.`r_id`;

Then using the same query you GROUP to see what tourneyid, day1_best combination has multiple rows

SELECT `tourneyid`, `day1_best`, count(*) as total
FROM tourney_players t
JOIN results r
  ON t.`resultid` = r.`r_id`
GROUP BY `tourneyid`, `day1_best`;

Finally you use the base JOIN and perform a LEFT JOIN to see what rows has a match and show only those rows.

SELECT t.`tp_id`, r.`r_id`, r.`day1_best`
FROM tourney_players t
JOIN results r
  ON t.`resultid` = r.`r_id`
LEFT JOIN (SELECT `tourneyid`, `day1_best`, count(*) as total
           FROM tourney_players t
           JOIN results r
             ON t.`resultid` = r.`r_id`
           GROUP BY `tourneyid`, `day1_best`
           HAVING count(*) > 1) as filter
  ON t.`tourneyid` = filter.`tourneyid`
 AND r.`day1_best` = filter.`day1_best`
WHERE filter.`tourneyid` IS NOT NULL;

SQL DEMO

OUTPUT