top of page
Swapna Lakhpati

SQL for Data Analysis-Part 1

Data is the most important resource for today’s companies. But, like other resources, it can’t be used to create business insights in its raw form; it has to be processed and structured for analytics before it creates real value. Data analytics is the process of getting a company’s data into a format that allows it to be usable in creating business insights and recommending actions.

SQL for data analysis refers to the database querying language’s ability to interact with multiple databases at once, as well as its use of relational databases. SQL is one of the most commonly used and flexible languages, as it combines a surprisingly accessible learning curve with a complex depth that lets users create advanced tools and dashboards for data analytics.

Another way many use SQL data analytics is by integrating them directly into other frameworks, offering additional functionality and communication abilities without having to build entire structures from scratch. Indeed, SQL analysis can be used within languages like Python, Scala, and Hadoop, three of the most popular currently in use for data science along with big data management and manipulation.

Integration of SQL server with Analytical tools like Tableau, PowerBI, Jupyter notebook is easily possible.

SQL is short for Structured Query Language. Originally, it used to be called SEQUEL (Structured English Query Language) and was used for storing and manipulating data in databases. Today SQL is used to perform all types of data operations in relational database management systems (RDBMS).

SQL is a powerful language where we can perform a wide range of operations:

  • execute queries

  • fetch data

  • insert, update, and delete records in a database (DML operations)

  • create new objects in a database (DDL operations)

  • set permissions on tables, procedures, functions, and views

  • and much, much more...

SQL comprises 5 sublanguages:

  • Data Definition Language (DDL): Used to define the Database schema.

  • Data Manipulation Language (DML): Used to modify the database.

  • Data Query Language (DQL): Used for performing queries on the data.

  • Data Control Language (DCL): Used to grant and revoke user authority on the Database.

  • Transaction Control Language (TCL): Used to manage Database transactions.


Data Types in SQL



A. Data Definition Language (DDL)

By using DDL we can change the structure of our tables. Since all the command of DDL are auto committed it permanently saves all the changes in the database.

1. CREATE

Let's suppose you want to create a table with information about customer ,below code allows you to create a new database or table.

CREATE TABLE Persons (

PersonID int,

LastName varchar(255),

FirstName varchar(255),

Address varchar(255),

City varchar(255)

);

To create temporary table from existing table .Here I am Using DVD rental dataset.

CREATE TABLE TestTable1 AS

SELECT title,rental_rate

FROM film

order by rental_rate desc

limit 10;


2. ALTER

Now let’s say you forgot to add phone number to table . The next command is used to modify (add, drop, rename, etc.) the structure of the data in your database, it should be noted the data will remain unchanged.


Alter table Persons

add Phone_Number text;

If you want to change the name of new column, "Phone_Number" you can write below query .

though our column name was Phone_Number, Postgres also accept phone_number. As it is case insensitive.

PostgreSQL case insensitive is defined as searching with considering as the SQL select queries and the regular expression in PostgreSQL.

ALTER TABLE Persons

RENAME phone_number TO contact_number;


3. DROP

So you’ve decided that the table you created no longer serves a purpose and you want to delete it. The DROP command deletes a database or table. Before running this command you should be aware that it will delete your whole table, including all data, indexes, and more, so make sure you are certain before you run.

--To delete a specific column we can combine the ALTER TABLE command with the DROP command.

ALTER TABLE Persons

DROP COLUMN contact_number;

--To delete complete table

Drop Table Persons;


4. TRUNCATE

This command is used to remove all data entries from a table in a database while keeping the table and structure in place.

TRUNCATE TABLE Persons;

B. Data Manipulation Language (DML)

By using DML we can modify, retrieve, delete and update the data in our database.

1. INSERT

Suppose we want to add to our database, to add this new record we use the INSERT INTO command. This command allows you to add one or more rows. Here we are adding two rows.

I didn't executed "Drop "and "Truncate" commands previously that's why I am using the same table other wise you can change the table name and proceed for inserting values.

If you run the below queries multiple time , same data will be added over and over.

Insert into Persons (PersonID,LastName,FirstName,Address,City,contact_number1)

Values(1,'lee','Tom','Luxury Villa','Charlotte',+15516645454);

Insert into Persons (PersonID,LastName,FirstName,Address,City,contact_number1)

values(2,'white','Jo','Rose Villa','Edison',+15556667777);


2. DELETE

The new listings pulls out last minute so we need to remove the data from our table. To remove data we can simply use the DELETE command based on conditions specified with the WHERE command.

(To Execute this query I created the same table again and inserted values. You just need to run the code again)

Delete from Persons

where PersonID=2

3. UPDATE

Imagine you need to update data in your table because Tom moved to other place. The UPDATE command allows you to do this based on conditions specified after the WHERE command.

We can update multiple columns in SQL using the UPDATE command.

Syntax: UPDATE table_name SET column1 = value1, column2 = value2, ...


UPDATE Persons

set address = 'RoseLand street',City = 'Prinston'

WHERE PersonID = 1



C. Transaction Control Language (TCL)

TCL commands are used to maintain consistency of our databases and for management of transaction made by DML commands. We can only use TCL commands with DML commands like INSERT, DELETE and UPDATE.


A Transaction is a set of SQL statements that are executed on the data stored in DBMS. Whenever any transaction is made these transactions are temporarily happen in database. So to make the changes permanent, we use TCL commands.

The TCL commands are:

1.COMMIT

This command is used to save the data permanently.

Whenever we perform any of the DML command like -INSERT, DELETE or UPDATE, these can be rollback if the data is not stored permanently. So in order to be at the safer side COMMIT command is used.

commit;

2. ROLLBACK

This command is used to get the data or restore the data to the last savepoint or last committed state. If due to some reasons the data inserted, deleted or updated is not correct, you can rollback the data to a particular savepoint or if savepoint is not done, then to the last committed state.

ROLLBACK;

3.SAVEPOINT

This command is used to save the data at a particular point temporarily, so that whenever needed can be rollback to that particular point.

SAVEPOINT savepoint_name;


D. Data Control Language (DCL)

By using DCL we can permit a user to access, modify or work on the different privileges in order to control the database.

1. GRANT

Suppose you hire a freelancer to update your student database with new information. By using the GRANT command you can give the user access to database objects such as tables, views or the database itself. The below example gives the user named ‘User_Renee’ SELECT and UPDATE access on the student table.

Note-To execute this command you need to add this user to server.

GRANT SELECT, UPDATE ON student TO user_Renee;

2. REVOKE

After the freelancer has completed her work you now want to remove the users permission. You can do so by using the REVOKE command.

REVOKE SELECT, UPDATE ON student FROM user_Renee;

E. Data Query Language (DQL)

By using DQL we can fetch data from the database.

1. SELECT

You use the SELECT command almost every time you query data with SQL. It allows you to define what data you want your query to return. By using the SELECT command with an asterisk () all of the columns in the table student* are returned.

select * from student;

select Marks from student;

Note: Data analyst barely need DCL,TCL,DDL,DML commands. But DDL,DML are basic commands we need to learn.

Remaining part Aggregate Functions, Filtering Data, Combining Data will be continued in next blog posts.

Thank you...


113 views

Recent Posts

See All
bottom of page