I’m a Data Science enthusiast and one of the main things I deal with is Data. A lot of it.
With more than 2.5 exabytes of data generated every day, it comes as no surprise that this data needs to be stored somewhere and accessed when required.
This article presents a hackable cheatsheet to get you up and running with SQL quickly!
What is SQL?
SQL stands for Structured Query Language. It is a language for relational database management systems. SQL is used today to store, retrieve and manipulate data within relational databases.
Here’s what a basic relational database looks like.
Using SQL, we can interact with the database by writing queries.
Here’s what an example query looks like
SELECT * FROM customers;
Using this SELECT statement, the query selects all the data from all the columns in the customer’s table and returns data like so:
The asterisk wildcard character (*) refers to “all” and selects all the rows and columns. We can replace it with specific column names instead — here only those columns will be returned by the query
SELECT FirstName, LastName FROM customers;
Adding a WHERE clause allows you to filter what gets returned:
SELECT * FROM customers WHERE age >= 30 ORDER BY age ASC;
This query returns all data from the products table with an age value of greater than 30.
The use of ORDER BY keyword just means the results will be ordered using the age column from the lowest value to the highest
Using the INSERT INTO statement, we can add new data to a table. Here’s a basic example adding a new user to the customers table:-
INSERT INTO customers(FirstName, LastName, address, email)VALUES (‘Jason’, ‘Dsouza’, ‘McLaren Vale, South Australia’, ‘test@fakeGmail.com’);
Of course, these examples demonstrate only a very small selection of what the SQL language is capable of.
Why Learn SQL?
We live in the age of Big Data, where data is used extensively to find insights and inform strategy, marketing, advertising and a plethora of other operations. Big businesses like Google, Amazon, AirBnb utilize large, relational databases as a basis of improving customer experience. Understanding SQL is a great skill to have not only for data scientists and analysts but for everyone.
How do you think that you suddenly got a Youtube ad on shoes when just a few minutes ago, you were Googling your favourite shoes? That’s SQL (or a form of SQL) at work!
SQL vs MySQL
Before we move on, I just want to clarify an often-confused topic — the difference between SQL and MySQL. As it turns out, they aren’t the same thing!
SQL is a Language
MySQL is a System to implement SQL
SQL outlines syntax that allows you to write queries that manage relational databases.
MySQL is a database system that runs on a server. It allows you to write queries using SQL syntax to manage MySQL databases.
In addition to MySQL, there are other systems that implement SQL. Some of the more popular ones include:
- Oracle Database
- Microsoft SQL Server
For most of the cases, MySQL is the preferred choice of a database management system. Many popular Content Management Systems (like Wordpress) use MySQL by default, so using MySQL to manage those applications may be beneficial. In order to use MySQL, you’ll need to install it on your system:
The recommended way to install MySQL on Windows is by using the MSI installer from the MySQL website.
This resource will guide you with the installation process:
How do I Run MySQL on Windows? | Liquid Web
If you’re using a Windows-based server to host your content, you may using Microsoft’s database server product, MSSQL…
On macOS, installing MySQL too involves downloading an installer.
This resource will guide you through the installation process:
MySQL :: MySQL and OS X :: 2 Installing MySQL on macOS Using Native Packages
The package is located inside a disk image (.dmg) file that you first need to mount by double-clicking its icon in the…
With MySQL now installed on your system, I recommend that you use an SQL management application to make managing your databases a much easier process.
There are lots of apps to choose from which largely do the same job, so it’s down to your own personal preference on which one to use:
- MySQL Workbench developed by Oracle
- phpMyAdmin (operates in the web browser)
- HeidiSQL (Recommended for Windows)
- Sequel Pro (Recommended for macOS)
When you’re ready to start writing your own SQL queries, consider importing dummy data rather than creating your own database.
Here are some dummy databases that you are available for download free of charge.
The Icing on the Cake — the Cheatsheet!
A collection of keywords used in SQL statements, a description, and where appropriate an example. Some of the more advanced keywords have their own dedicated section.
Where MySQL is mentioned next to an example, this means this example is only applicable to MySQL databases (as opposed to any other database system).
Comments allow you to explain sections of your SQL statements, without being executed directly.
In SQL, there are 2 types of comments, single line and multiline.
Single Line Comments
Single line comments start with ‘- -’. Any text after these 2 characters to the end of the line will be ignored.
-- This part is ignoredSELECT * FROM customers;
Multiline comments start with /* and end with */. They stretch across multiple lines until the closing characters have been found.
/*This is a multiline comment
It can span across multiple lines*/SELECT * FROM customers;/*This is another comment. You can even put code within a comment to prevent its executionSELECT * FROM icecreams;*/
Data Types in MySQL
When creating a new table or editing an existing one, you must specify the type of data that each column accepts.
Could Machine Learning and NLP Have Predicted Oil’s Crash? The Answer is YES. | Data Driven…
On April 20th, 2020, futures for WTI (Crude oil’s U.S. benchmark) reached for the first time in history negative…
In the example below, data passed to the id column must be an int (integer), while the FirstName column has a VARCHAR data type with a maximum of 255 characters.
CREATE TABLE customers(id int,FirstName varchar(255));
1. String Data Types
2. Numeric Data Types
3. Date/Time Data Types
1. Arithmetic Operators
+ -- Add
– -- Subtract
* -- Multiply
/ -- Divide
% -- Modulus
2. Bitwise Operators
& -- Bitwise AND
| -- Bitwise OR
^ -- Bitwise exclusive OR
3. Comparison Operators
= -- Equal to
> -- Greater than
< -- Less than
>= -- Greater than or equal to
<= -- Less than or equal to
<> -- Not equal to
4. Compound Operators
+= -- Add equals
-= -- Subtract equals
*= -- Multiply equals
/= -- Divide equals
%= -- Modulo equals
&= -- Bitwise AND equals
^-= -- Bitwise exclusive equals
|*= -- Bitwise OR equals
1. String Functions
2. Numeric Functions
3. Date Functions
4. Miscellaneous Functions
In SQL, Wildcards are special characters used with the LIKE and NOT LIKE keywords. This allows us to search for data with sophisticated patterns rather efficiently.
% -- Equates to zero or more characters.Example: Find all customers with surnames ending in ‘ory’.SELECT * FROM customersWHERE surname LIKE '%ory';_ -- Equates to any single character.Example: Find all customers living in cities beginning with any 3 characters, followed by ‘vale’.SELECT * FROM customersWHERE city LIKE '_ _ _vale';[charlist]Equates to any single character in the list.Example: Find all customers with first names beginning with J, K or T.SELECT * FROM customersWHERE first_name LIKE '[jkt]%';
In relational databases, there is a concept of primary and foreign keys. In SQL tables, these are included as constraints, where a table can have a primary key, a foreign key, or both.
1. Primary Key
A primary key allows each record in a table to be uniquely identified. There can only be one primary key per table, and you can assign this constraint to any single or combination of columns. However, this means each value within this column(s) must be unique.
Typically in a table, the ID column is a primary key, and is usually paired with the AUTO_INCREMENT keyword. This means the value increases automatically as and when new records are created.
Create a new table and setting the primary key to the ID column.
CREATE TABLE customers (id int NOT NULL AUTO_INCREMENT,FirstName varchar(255),Last Name varchar(255) NOT NULL,address varchar(255),email varchar(255),PRIMARY KEY (id));
2. Foreign Key
A foreign key can be applied to one column or many and is used to link 2 tables together in a relational database.
The table containing the foreign key is called the child key,
The table containing the referenced (or candidate) key is called the parent table.
This essentially means that the column data is shared between 2 tables, as a foreign key also prevents invalid data from being inserted which isn’t also present in the parent table.
Create a new table and turn any column that references IDs in other tables into foreign keys.
CREATE TABLE orders (id int NOT NULL,user_id int,product_id int,PRIMARY KEY (id),FOREIGN KEY (user_id) REFERENCES users(id),FOREIGN KEY (product_id) REFERENCES products(id));
Indexes are attributes that can be assigned to columns that are frequently searched against to make data retrieval a quicker and more efficient process.
CREATE INDEX -- Creates an index named ‘idx_test’ on the first_name and surname columns of the users table. In this instance, duplicate values are allowed.CREATE INDEX idx_testON users (first_name, surname);CREATE UNIQUE INDEX -- The same as the above, but no duplicate values.CREATE UNIQUE INDEX idx_testON users (first_name, surname);DROP INDEX -- Removes an index.ALTER TABLE usersDROP INDEX idx_test;
In SQL, a JOIN clause is used to return a result which combines data from multiple tables, based on a common column which is featured in both of them
There are a number of different joins available for you to use:-
- Inner Join (Default): Returns any records which have matching values in both tables.
- Left Join: Returns all of the records from the first table, along with any matching records from the second table.
- Right Join: Returns all of the records from the second table, along with any matching records from the first.
- Full Join: Returns all records from both tables when there is a match.
A common way of visualising how joins work is like this:
SELECT orders.id, users.FirstName, users.Surname, products.name as ‘product name’FROM ordersINNER JOIN users on orders.user_id = users.idINNER JOIN products on orders.product_id = products.id;
A view is essentially an SQL results set that gets stored in the database under a label, so you can return to it later without having to rerun the query. These are especially useful when you have a costly SQL query which may be needed a number of times, so instead of running it over and over to generate the same results set, you can just do it once and save it as a view.
To create a view, you can do so like this:
CREATE VIEW priority_users ASSELECT * FROM usersWHERE country = ‘United Kingdom’;
Then in future, if you need to access the stored result set, you can do so like this:
SELECT * FROM [priority_users];
With the CREATE OR REPLACE command, a view can be updated.
CREATE OR REPLACE VIEW [priority_users] ASSELECT * FROM usersWHERE country = ‘United Kingdom’ OR country=’USA’;
To delete a view, simply use the DROP VIEW command.
DROP VIEW priority_users;
The majority of websites and applications use relational databases in some way or the other and this makes SQL extremely valuable to know as it allows you to create more complex, functional systems.
If you have any suggestions, feel free to leave them in the comments below.