Open Forem

Cover image for SQL joins and window functions
Kelvin Vosky
Kelvin Vosky

Posted on

SQL joins and window functions

Joins and Window Functions

Sql joins

When working with relational databases, data is usually in different tables .Join functions help in combining 2 or more tables based on a related column between them usually a primary key.
Types of Joins

  1. Inner Join
  2. Left Join
  3. Right Join
  4. Full join
  5. Self Join

Inner Join
Combines rows that have matching values in both tables. Inner joins only show values that are not null.
Example: Suppose we have a customers table with columns (customer_id, first_name, last_name, membership_status) and a sales table with columns (sale_id, customer_id, product_id, sale_date, total_amount). To find customers who made sales and their total purchase amounts:

inner join

This returns only customers with matching sales records.

Left Join
A left join Returns all rows from the left table, and only the matched rows from the right table. If there's no match the right table returns null.
Example: Using the customers and sales tables, to list all customers and their sales, even if they have no sales:

left join
_Returns all customers. For those without sales, sale_id and total_amount show NULL.

Right Join
Opposite of the left join, Returns all rows from the right table and matching rows from the left table. If no match, left table columns show NULL.
Example: Using the sales and inventory tables (inventory has product_id, stock_quantity), to list all sales and their stock, even if some sales have no current inventory match:

Right join
This includes all inventory records. For unmatched sales, stock_quantity shows NULL.

Full Join(Outer)
Returns all rows from both tables, with NULLs in places where there is no match. It combines Left and Right Joins.
Example: Using the products table (product_id, product_name) and sales table, to list all products and sales, including unsold products and sales without product matches:

full join

This includes all products and all sales, with NULLs for unmatched rows.(returns everything including null values from both tables)
Self Join
Self join combines a table on itself using aliases.
SQL aliases are used to give a column or a table a temporary name.
Aliases are used to make column names more readable OR Easy to remember.
The syntax for a self join table would be
SELECT column_list
FROM table alias1
INNER JOIN table alias2
ON alias1.common_column = alias2.common_column;

Example Example: In the customers table, to find pairs of customers with the same membership_status:

Self join

This pairs customers sharing the same status without duplicates.
we also have natural joins and cross joins but they aren't commonly used

Window functions
SQL window functions allow performing calculations across a set of rows that are related to the current row, without collapsing the result into a single value. They are commonly used for tasks like
Running totals (e.g., cumulative sales over time).
Rankings (e.g., top customers by spending).
Moving averages (e.g., average stock over months).
Year-over-year comparisons.
Window functions fall into three main categories: aggregate, ranking, and value
1)Aggregate window functions
Window functions are used with aggregate functions eg, sum(),avg(),Count,Max and Min,
Example; To find the Average stock quantity across all products

window fxns

2. Ranking Window Functions
These assign ranks or positions: ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), PERCENT_RANK(), CUME_DIST().

ROW_NUMBER(): Unique sequential number per row in the window (no ties).

Example: Number of sales per customer by date.

Row no

RANK(): Rank with gaps for ties (e.g., 1, 2, 2, 4).

Example: Rank products by total sales (join sales and products).

Rank
DENSE_RANK() :Return product ranks without gaps for ties, (i.e 1,2,2,3,4,4)
NTILE(n): Divides rows into n brackets (e.g., quartiles).or percentage groups
Example: Divide customers into 4 groups by registration date.

n-tile
PERCENT_RANK() and CUME_DIST(): Percentile-based ranks (0-1 scale).
Example: Percent rank of sales amounts.

Percent rank of sale amount

Give percentages ranks with highest as 1 and lowest as zero with reference to price
3. Value (Analytic) Window Functions
Used to return values from other rows
Includes LEAD(), LAG(), FIRST_VALUE(), LAST_VALUE(), NTH_VALUE().
1)LEAD(column, offset): Value from next row (offset=1 default).
Shows the next value from a column ,,
Example: Compare each sale to the next for a customer.

lead function
for my data the customers did not have subsequent sales , lead would have brought the next sale associated with the customer in the event another sale would have been made from the same customer.
2)LAG(column, offset): Value from previous row.
This function allows you to access the quantity of the previous row for
each customer.
LEAD() can be used to compare a customer's orders across different time periods, such as
comparing the quantity of products purchased this month with the next month.
LAG() is useful when analyzing trends or behavior changes over time, like comparing how much a customer ordered in the previous period to their current order.

Which window function do you think is useful for your sql needs,let's meet and connect in the comments

Top comments (0)