Plan du site  
pixel
pixel

Articles - Étudiants SUPINFO

The fundamentals of SQL : part 3

Par Maxime FEVRIER Publié le 09/03/2019 à 19:16:17 Noter cet article:
(0 votes)
En attente de relecture par le comité de lecture

Introduction

To follow up on a series of articles on the same subject, we will continue to see SQL commands so that you can have a good foundation on this topic. This is why, in this article I will present you two other commands that will be very useful for you to do data manipulation on your tables. Among other things, they will allow you to group several results of the same table to no longer have duplicates, as well as to make aggregation functions on your data.

At first we will see the GROUP BY command will allow you to group data as the name indicate, with different aggregation functions. Subsequently, we will see the HAVING command allowing you to filter the data of a same table in order to have only the values that interest us. The latter is a complementary command because it is very often associated with the first one we will see. That's why I decided to group them in only one article.

Example table

So that you can have a concrete example on the use of the commands that we will see later, we will use the table below for all the demonstrations :

Table 1. Table 'customers'

id customer email city price date_of_purchase
1 Michel michel@contact.com Paris 120 20-01-2017
2 Valentin valentin@contact.com Troyes 500 12-02-2017
3 Nicolas nicolas@contact.com Nice 200 18-02-2017
4 Valentin valentin@contact.com Troyes 940 15-03-2017
5 Kevin kevin@contact.com Bordeaux 110 21-03-2017
6 Valentin valentin@contact.com Troyes 175 23-03-2017
7 Michel michel@contact.com Paris 123 24-03-2017
8 Jean jean@contact.com Marseille 754 08-07-2017
9 Nicolas nicolas@contact.com Nice 289 27-08-2017
10 Kevin kevin@contact.com Bordeaux 183 19-09-2017
11 Valentin valentin@contact.com Troyes 938 14-11-2017
12 Nicolas nicolas@contact.com Nice 192 17-12-2017

If however you wish to create this table, I put you below, the command to create this one with the addition of the data. Of course you can customize this one and the information it contains by adding or removing informations.

Creating the example table

CREATE TABLE IF NOT EXISTS customers (

id INT AUTO_INCREMENT PRIMARY KEY,

customer VARCHAR(100) NOT NULL,

email VARCHAR(255), city VARCHAR(255),

price VARCHAR(255),

date_of_purchase VARCHAR(255)

);

Adding data

INSERT INTO customers (customer, email, city, price, date_of_purchase)

VALUES

('Michel', 'michel@contact.com', 'Paris', '120', '20-01-2017'),

('Valentin', 'valentin@contact.com', 'Troyes', '500', '12-02-2017'),

('Nicolas', 'nicolas@contact.com', 'Nice', '200', '18-02-2017'),

('Valentin', 'valentin@contact.com', 'Troyes', '940', '15-03-2017'),

('Kevin', 'kevin@contact.com', 'Bordeaux', '110', '21-03-2017'),

('Valentin', 'valentin@contact.com', 'Troyes', '175', '23-03-2017'),

('Michel', 'michel@contact.com', 'Paris', '123', '24-03-2017'),

('Jean', 'jean@contact.com', 'Marseille', '754', '08-07-2017'),

('Nicolas', 'nicolas@contact.com', 'Nice', '289', '27-08-2017'),

('Kevin', 'kevin@contact.com', 'Bordeaux', '183', '19-09-2017'),

('Valentin', 'valentin@contact.com', 'Troyes', '938', '14-11-2017'),

('Nicolas', 'nicolas@contact.com', 'Nice', '192', '17-12-2017');

The GROUP BY command

As indicated in the introduction, the GROUP BY command will allow you to group several results in one and the same table without having duplicate informations. Using the example table above, we can see that there are several customers who have placed orders on different dates. We will therefore perform some operations on this table so that you can see what is capable this command.

SELECT column1, column2, ... FROM table_name GROUP BY column1;

To begin with, if we want to see a little more clearer by distinguishing all our customers, without having a duplicate, then we will use the following command :

Table 2. SELECT customer,email,city FROM customers GROUP BY customer;

customer email city
Jean jean@contact.com Marseille
Kevin kevin@contact.com Bordeaux
Michel michel@contact.com Paris
Nicolas nicolas@contact.com Nice
Valentin valentin@contact.com Troyes

As you can see, the above order will allow us to see who our customers are with their respective information. To do this, it was sufficient to use the SELECT command to select the desired columns from the "customers" table. Finally, we used the GROUP BY command followed by the name of the column containing our different customers (the 'customer' column) to get the result above.

SELECT column1, ... , function(column2) FROM table_name GROUP BY column1;

Now that we have our different customers, it can be interesting to do different operations to find out which client spent the least, who spent the most, and so on. This is what we will see thanks to the various aggregation functions below. They will allow you to calculate different things that could be useful.

The MIN() function

The MIN() function allows us to retrieve the smallest value. In the example below, we recover the cheapest purchase made for each clients.

Table 3. SELECT customer,email,city,date_of_purchase, MIN(price) FROM customers GROUP BY customer;

customer email city date_of_purchase MIN(price)
Jean jean@contact.com Marseille 08-07-2017 754
Kevin kevin@contact.com Bordeaux 21-03-2017 110
Michel michel@contact.com Paris 20-01-2017 120
Nicolas nicolas@contact.com Nice 17-12-2017 192
Valentin valentin@contact.com Troyes 23-03-2017 175


The MAX() function

The MAX() function allows us to retrieve the largest value. In the example below, we recover the most expensive purchase made for each clients.

Table 4. SELECT customer,email,city,date_of_purchase, MAX(price) FROM customers GROUP BY customer;

customer email city date_of_purchase MAX(price)
Jean jean@contact.com Marseille 08-07-2017 754
Kevin kevin@contact.com Bordeaux 19-09-2017 183
Michel michel@contact.com Paris 24-03-2017 123
Nicolas nicolas@contact.com Nice 27-08-2017 289
Valentin valentin@contact.com Troyes 15-03-2017 940


The SUM() function

The SUM() function allows us to add values. In the example below, this will allow us to get the total spent for each clients.

Table 5. SELECT customer,email,city, SUM(price) FROM customers GROUP BY customer;

customer email city SUM(price)
Jean jean@contact.com Marseille 754
Kevin kevin@contact.com Bordeaux 293
Michel michel@contact.com Paris 243
Nicolas nicolas@contact.com Nice 681
Valentin valentin@contact.com Troyes 2553


The AVG() function

The AVG() function allows us to obtain an average over several values. In the example below, this will allow us to have the average spent for each clients.

Table 6. SELECT customer,email,city, AVG(price) FROM customers GROUP BY customer;

customer email city AVG(price)
Jean jean@contact.com Marseille 754
Kevin kevin@contact.com Bordeaux 146.5
Michel michel@contact.com Paris 121.5
Nicolas nicolas@contact.com Nice 227
Valentin valentin@contact.com Troyes 638.25


The COUNT() function

The COUNT() function allows us to count the number of values present in the table. In the example below, this will allow us to have the total number of orders made for each clients.

Table 7. SELECT customer,email,city, COUNT(price) FROM customers GROUP BY customer;

customer email city COUNT(price)
Jean jean@contact.com Marseille 1
Kevin kevin@contact.com Bordeaux 2
Michel michel@contact.com Paris 2
Nicolas nicolas@contact.com Nice 3
Valentin valentin@contact.com Troyes 4


The HAVING command

In addition to all that we have seen previously, it is possible to add a condition to only have the values that interest us. The HAVING command allows us to select only certain data from a table by applying a filter on it. The great difference with the WHERE command is that it is possible to use this command with the different functions of aggregations that we could see a little earlier in this article.

SELECT column1, ... , function(column2) FROM table_name GROUP BY column1 HAVING function(column2) > value;

Table 8. SELECT customer,email,city, MAX(price) FROM customers GROUP BY customer HAVING MAX(price) > 200;

customer email city MAX(price)
Jean jean@contact.com Marseille 754
Nicolas nicolas@contact.com Nice 289
Valentin valentin@contact.com Troyes 940


As we can see from the example above, the HAVING command allowed us to have a filter on the table. Now, we only display the values greater than 200, unlike the table we saw a little earlier in the article where we would display all the values.

Table 9. SELECT customer,email,city, SUM(price) FROM customers GROUP BY customer HAVING SUM(price) > 300;

customer email city SUM(price)
Jean jean@contact.com Marseille 754
Nicolas nicolas@contact.com Nice 681
Valentin valentin@contact.com Troyes 2553


As with the previous command, here, we only retrieve values greater than 300 on the table representing the total spend for each clients.

Conclusion

In this article, we were able to see how to group several identical pieces of information from the same table so that we do not have duplicates. In addition to the basic commands we have added aggregation functions so that they return to us different information which could be useful thereafter. Finally, we saw an additional command allowing us to filter the results of a table. This last order is very often associated with the first one so that's why I wanted to present it to you one after the other. I hope this article has allowed you to learn a little more about this language. However, do not hesitate to practice, you learn a lot better by doing it.

A propos de SUPINFO | Contacts & adresses | Enseigner à SUPINFO | Presse | Conditions d'utilisation & Copyright | Respect de la vie privée | Investir
Logo de la société Cisco, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société IBM, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Sun-Oracle, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Apple, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Sybase, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Novell, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Intel, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Accenture, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société SAP, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Prometric, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Toeic, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo du IT Academy Program par Microsoft, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management

SUPINFO International University
Ecole d'Informatique - IT School
École Supérieure d'Informatique de Paris, leader en France
La Grande Ecole de l'informatique, du numérique et du management
Fondée en 1965, reconnue par l'État. Titre Bac+5 certifié au niveau I.
SUPINFO International University is globally operated by EDUCINVEST Belgium - Avenue Louise, 534 - 1050 Brussels