Data plays a vital role in the working of any application currently in use. How we store, process, and retrieve data solely determines the usefulness and efficiency of the application.
Data management is not just storing and retrieving data on demand. It paves the way for the following functions that drive today’s internet.
Data persistence
Big data analytics
Building machine learning models
Predictive analysis
User behavioural analysis for targeted ads
These are just a subset of what we can do with efficient data management.
In this week’s post, we’ll discuss the two most common types of databases, SQL and NoSQL.
SQL databases
SQL databases, also called Relational Databases, store data in organised tabular formats consisting of rows and columns.
Relational Database Management Systems (RDBMS) have a predefined schema of columns (attributes) and rows (records) and have a strict structure.
SQL (Structured Query Language) is the core of relational databases as it is used to communicate with them.
Relationship between multiple tables is established using Foreign Keys (FK), with which we can join multiple tables while querying.
SQL databases are vertically scaled. Vertical scaling refers to adding more power to the existing database server (CPU/Memory/Speed).
We’ll take an example of a simple SQL database storing user data.
Let us analyse all the components shown in the above SQL database.
We have 2 tables — USER table and ADDRESS Table.
The USER table has 5 columns (attributes) — id (Primary key), first_name, last_name, address_id, favorite_food.
The ADDRESS table has 5 columns — address_id (Primary key), line1, line2, city, zipcode.
To avoid over populating the USER table with multiple columns, I’m separating the address from the user to a separate table and mapping it using the address_id (Foreign Key).
Benefits of SQL databases
Scalability:
SQL databases provide excellent performance for relational data. It can be vertically scaled if required.
Horizontally scaling SQL databases might be a challenging, and it might require downtime.Performance:
SQL queries are quick and efficient. Data insertion, deletion, retrieval happens in no time.
The querying process can further be increased by adding indexes to data fields to improve table joins. We’ll discuss this topic in detail on a separate post.Security:
SQL databases are ACID compliant and reinforces data integrity during insert operations.ACID stands for Atomicity, Consistency, Isolation, Durability.
Disadvantages of SQL
Querying might be slower if the data size becomes huge.
Adding a new column to an existing table might be tedious and requires table locking.
Horizontal scaling requires data structure changes and additional engineering efforts.
Where can we use SQL databases?
Choosing a database for your application depends on the developer and the problems that the application solves.
Relational databases are used in the following cases:
When data consistency is vital and relational complexity is high.
Financial transaction applications where data integrity is key.
E-commerce inventory applications that require increased data consistency and transactional rollbacks.
Ticket booking applications where database locking and concurrency is required.
Examples of SQL databases
Some of the popular SQL databases are
MySQL
PostgreSQL
OracleDB
MariaDB
MSSQL
NoSQL databases
NoSQL databases are non-relational databases that uses different data models for storing, managing, and accessing data.
The different types of data models are
Document-oriented:
Data is stored, managed, and retrieved in JSON, BSON, or XML format.
Key-value:
Data is represented as a collection of key-value pairs where keys are unique strings.Graph:
Data is represented as nodes and edges where nodes are data points and edges are the relationship between nodes.
Column-based:
Data stored in tabular format with flexible columns. This means that the data can vary from row to row in the same table.
NoSQL databases are not limited to a table structure and provide flexible data storage functionalities.
NoSQL databases allow the storage of unstructured data such as texts, photos, videos, PDF files, and other formats. Data querying is simple and efficient.
I’ll take the example of user data and represent the same using document-oriented NoSQL DB.
Let us analyse all the components shown in the above collection.
This is the USERS collection, where each user is added as a document with a unique ID.
Note that the address field is nested within each record itself.
Document-oriented NoSQL databases can have nested documents supporting multi-level data storage.
With this structure we will not have to perform joins to retrieve data. If I require the user information for “Dwight”, I can query using the unique ID, and I will get the entire document in the result.
Benefits of NoSQL databases
Scalability:
NoSQL DBs are horizontally scaled. This means if a server has reached its limit, we add another server.
Each server will share data between them which will reduce the number of requests per second in each server.
Performance:
NoSQL databases are known for high availability and performance. They are distributed which reduces the load on servers allowing huge number of users to access the data simultaneously.
Quick updates:
Preparations for initial deployment is almost zero with NoSQL DBs. Changing the data structure of a collection is quick and locking is not required, unlike relational databases.
Disadvantages
Security:
NoSQL databases provide weak security and are not ACID compliant like relational databases. Even if ACID properties are incorporated, they’ll be specific for a data partition.
Dynamic Schema:
Although this is an advantage, a dynamic schema might cause some confusion while development. Each document might have its own structure containing different fields.
This has to be maintained properly to avoid confusion.
Where can you use NoSQL databases?
NoSQL databases can be used in applications where high availability and performance is mandatory, with some lesser data consistency.
Large e-commerce applications handling millions of requests might go with NoSQL databases due to its high performance.
Data stored for metrics and analysis can use NoSQL databases.
Examples of NoSQL databases
MongoDB
Cassandra
Redis
ElasticSearch
Which is better? SQL or NoSQL?
The statistics shown above shows the popularity of the databases determined by StackOverFlow’s survey in 2021.
We cannot determine which is better among these. It depends on the problem solved by the database and how the stored data will be used in the future.
However, it is essential to know the properties of both types of databases so you can choose depending on your requirement.
This is why database modeling is essential before implementing a database in your applications. This will provide a better perspective of how you want to store your data and what data retrieval tactics would be required in the future. Check out my previous post discussing what database modeling is.
Resources for you to dive deeper
NoSQL Data models: https://www.mongodb.com/scale/types-of-nosql-databases
Choose DB type based on your need: https://www.mend.io/resources/blog/when-to-consider-a-nosql-vs-relational-database/
Here’s Gaurav Sen’s video where he explains NoSQL databases and the difference. For more information about system design and database architecture, follow him. His videos are amazing!
If you’re a developer, you would have come across the term “Micro-service architecture”.
What does that mean? How is it different from the other types of architecture? When should I opt for the micro-service architecture?
I’ll cover these topics in next week’s post.
Until next time!