How to return the duplicate data on multiple columns in SQL

How-Tos FAQs
December 17, 2018

Supercharge Your Snowflake SQL with Datameer's Data Transformation

We can return the duplicate data from a table on multiple columns in SQL using the GROUP BY and HAVING clause.

Let us consider the ‘orders’ table below.

CREATE TABLE orders (
       order_id INT,
       customer_name VARCHAR(100),
       order_date DATETIME,
       total_orders INT
);

INSERT INTO  orders
SELECT 1,'Jack', '2020-02-03', 4 UNION ALL
SELECT 2, 'Rose', '2020-01-09', 19 UNION ALL
SELECT 3,'Jack', '2020-02-03', 40 UNION ALL
SELECT 3,'John', '2020-02-03', 45 UNION ALL
SELECT 4, 'Rose', '2020-01-09', 17 ;

We can first list all the duplicate rows using the below query.

SELECT customer_name, order_date
FROM  orders
GROUP BY customer_name, order_date
HAVING COUNT(1) > 1

-- Output

# customer_name           order_date
--------------------------------------------------
Jack                      2020-02-03 00:00:00
Rose                      2020-01-09 00:00:00

Now using the above query, we can list individual duplicate rows using either IN or EXISTS operates.

-- Returns duplicate data using IN operator
SELECT *
FROM orders
WHERE  (customer_name, order_date) IN (
       SELECT customer_name, order_date
       FROM  orders
       GROUP BY customer_name, order_date
       HAVING COUNT(1) > 1
);

-- Returns duplicate data using EXISTS operator
SELECT *
FROM orders o
WHERE  EXISTS (
       SELECT customer_name, order_date
       FROM  orders o1
    WHERE o.customer_name = o1.customer_name
              AND o.order_date = o1.order_date
        GROUP BY customer_name, order_date
        HAVING COUNT(1) > 1
);

-- Output of both above queries is exactly same

# order_id  customer_name  order_date              total_orders

------------------------------------------------------------------

1           Jack           2020-02-03 00:00:00     4
2           Rose           2020-01-09 00:00:00     19
3           Jack           2020-02-03 00:00:00     40
4           Rose           2020-01-09 00:00:00     17

Up Next:

Read How to calculate percentage in SQL?

How to return the duplicate data on multiple columns in SQL

Related Posts

How to find the duplicate data in a table in SQL?

In SQL, how to limit the number of rows after ordering it...

How to select the first row of each GROUP BY in SQL in Da...

Product

Company

Resources

Sign up for our newsletter

Follow us on