How to return the duplicate data on multiple columns in SQL

  • How-Tos FAQs
  • December 17, 2018
Supercharge Your Snowflake SQL
with Datameer's
Data Transformation

We can return the duplicate data from a table on multiple columns in SQL using the GROUP BY and HAVING clause.

Let us consider the ‘orders’ table below.

CREATE TABLE orders (
       order_id INT,
       customer_name VARCHAR(100),
       order_date DATETIME,
       total_orders INT
);

INSERT INTO  orders
SELECT 1,'Jack', '2020-02-03', 4 UNION ALL
SELECT 2, 'Rose', '2020-01-09', 19 UNION ALL
SELECT 3,'Jack', '2020-02-03', 40 UNION ALL
SELECT 3,'John', '2020-02-03', 45 UNION ALL
SELECT 4, 'Rose', '2020-01-09', 17 ;

We can first list all the duplicate rows using the below query.

SELECT customer_name, order_date
FROM  orders
GROUP BY customer_name, order_date
HAVING COUNT(1) > 1

-- Output

# customer_name           order_date
--------------------------------------------------
Jack                      2020-02-03 00:00:00
Rose                      2020-01-09 00:00:00

Now using the above query, we can list individual duplicate rows using either IN or EXISTS operates. 

-- Returns duplicate data using IN operator
SELECT *
FROM orders
WHERE  (customer_name, order_date) IN (
       SELECT customer_name, order_date
       FROM  orders
       GROUP BY customer_name, order_date
       HAVING COUNT(1) > 1
);

-- Returns duplicate data using EXISTS operator
SELECT *
FROM orders o
WHERE  EXISTS (
       SELECT customer_name, order_date
       FROM  orders o1
    WHERE o.customer_name = o1.customer_name
              AND o.order_date = o1.order_date
        GROUP BY customer_name, order_date
        HAVING COUNT(1) > 1
);

-- Output of both above queries is exactly same

# order_id  customer_name  order_date              total_orders

------------------------------------------------------------------

1           Jack           2020-02-03 00:00:00     4
2           Rose           2020-01-09 00:00:00     19
3           Jack           2020-02-03 00:00:00     40
4           Rose           2020-01-09 00:00:00     17

Up Next:

Read How to calculate percentage in SQL?