Hi there! With the new look of my blog, I decided to talk about a different subject. Let’s jump straight into today’s subject without any small talk. 😛
In the world of computer systems, a large amount of data is generated every day. A significant share of that data is handled by Relational Database Management Systems (RDBMS), the predominant technology for storing structured data.
There are different types of data:
- Structured Data : Information with a high degree of organization that is readily searchable by simple, straightforward search operations.
- Unstructured Data : Information that is neither organized in a pre-defined manner nor based on a pre-defined data model.
- Semi-Structured Data : A cross between the two above. It has some structure, but lacks a strict data model.
As I said, structured data is handled by RDBMS, so what happens to unstructured and semi-structured data? What if your data requirements are not clear at the beginning and you are dealing with massive amounts of rapidly growing unstructured data (time series data such as IoT and device data, user and session data, chat, messaging and log data)?
RDBMS are not designed to manage these types of data efficiently, and this is where NoSQL comes on stage, with the ability to handle huge amounts of such data.
What is NoSQL?
NoSQL stands for “Not only SQL”. It is an approach to database systems that departs from the traditional RDBMS model. NoSQL databases target the requirements of next-generation data-intensive applications: performance, scalability and flexibility. NoSQL is very useful for storing unstructured data, which is growing more rapidly than structured data.
How does NoSQL achieve those?
Instead of tables as in RDBMS, many NoSQL databases are document-oriented. Unstructured data is stored in a single document that can be found easily, and it does not have to be broken up into predefined fields as in RDBMS. Besides unstructured data, NoSQL can also handle structured and semi-structured data.
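To make the table vs. document idea a bit more concrete, here is a small sketch in plain Python. It does not use any real database; the order, the customer name and the field names are all made up for illustration. The same order is stored once as rows in two "tables" and once as a single document.

```python
import json

# Relational shape: the same order split across two tables, linked by order_id.
# (All names and values here are hypothetical sample data.)
orders_table = [
    {"order_id": 1, "customer": "Alice"},
]
order_items_table = [
    {"order_id": 1, "product": "keyboard", "qty": 1},
    {"order_id": 1, "product": "mouse", "qty": 2},
]

# Document shape: everything about the order lives in one self-contained record.
order_document = {
    "_id": 1,
    "customer": "Alice",
    "items": [
        {"product": "keyboard", "qty": 1},
        {"product": "mouse", "qty": 2},
    ],
    # A field that only this document has; no schema change is needed.
    "gift_note": "Happy birthday!",
}


def items_from_tables(order_id):
    """Reassemble an order's items the relational way (a manual 'join')."""
    return [
        {"product": row["product"], "qty": row["qty"]}
        for row in order_items_table
        if row["order_id"] == order_id
    ]


def items_from_document(doc):
    """Read the items straight out of the document."""
    return doc["items"]


if __name__ == "__main__":
    # Documents map naturally to JSON, which is how many document stores expose them.
    print(json.dumps(order_document, indent=2))
```

Both functions return the same items, but the document version needs no join and can carry an extra field like `gift_note` without touching any schema.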
Developers working with object-oriented programming languages such as Java, PHP, C# and Python often use APIs such as JPA, Hibernate and LINQ, which let them run queries without writing SQL by hand. Many NoSQL databases store data in a form that is close to the objects in application code, which makes them easy to use and flexible, so they can sidestep that mapping layer.
NoSQL databases typically scale horizontally (by adding more servers) rather than vertically. This is usually more cost-effective than an RDBMS, which traditionally scales vertically by moving to bigger servers with more powerful hardware.
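Horizontal scaling usually means splitting the data across several servers (often called sharding). The sketch below is only a toy model of that idea, assuming each "server" is an in-memory dict; `shard_for` and `ShardedStore` are names I made up for this post, not part of any real database API.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard number with a stable hash.

    Python's built-in hash() for strings is salted per process,
    so a hashlib digest is used to keep the mapping repeatable.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """A toy key-value store spread over several in-memory 'servers'."""

    def __init__(self, num_shards: int):
        # Each dict stands in for one server in the cluster.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(num_shards=3)
    for i in range(9):
        store.put(f"user:{i}", {"id": i})
    # Show how many keys ended up on each 'server'.
    print([len(shard) for shard in store.shards])
```

Note that simple modulo sharding like this moves most keys around when the number of shards changes, which is why distributed databases generally use smarter schemes such as consistent hashing.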
Types of NoSQL Databases
- Key-Value Stores :
The simplest NoSQL option, which is very useful for speeding up an application’s reads and writes and for processing non-transactional data. Stored values can be any type of binary object (text, video, JSON, etc.) and are accessed via a key.
Eg: Redis, Riak, Azure Table Storage, DynamoDB
- Document Stores :
This builds on the key-value concept but is more complex. Each value is a single document that stores all data related to a specific key. Each document has its own unique key, which is used to retrieve it. Frequently queried fields in the document can be indexed to provide fast retrieval without knowing the key. Documents can have the same structure or different structures.
Eg: MongoDB, CouchDB
- Wide Column Stores :
Stores data in columns rather than rows and groups columns of related data together. A query can retrieve related data in a single operation because only the columns associated with the query are read. In a row-oriented RDBMS, the same data could be spread across different places on disk, requiring multiple operations to retrieve it.
Eg: BigTable, Cassandra, SimpleDB
- Graph Stores :
Uses graph structures to store, map and query relationships. Adjacent elements are linked to each other directly, without using an index. This is the most complex of the four models.
Eg: Neo4j, OrientDB
Think of a situation where your server-side application is built to be fast and seamless, but the RDBMS behind it becomes the bottleneck. A NoSQL database that fits the workload can help avoid that bottleneck.
Think about an AI or a machine learning model. What will happen if you use an RDBMS instead of NoSQL?
When it comes to Big Data, NoSQL is the hero, doing the things that traditional RDBMS struggle with.
A NoSQL database places few limits on the types of data you can store. You can store different types together and add new ones as you need. With a document-based database you store the data in one place without having to define its types up front.
In cloud computing, NoSQL can be a very cost-effective approach. Some NoSQL databases are increasingly being adopted for high availability; they use a masterless architecture that automatically distributes data evenly across multiple resources. By automatically replicating across multiple data centers, distributed NoSQL databases can reduce latency and deliver a good application experience wherever users are located. NoSQL databases like Cassandra are designed to scale across multiple data centers without any hassle.
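Here is a rough sketch of what "masterless" distribution with replication can look like, assuming a simple hash ring where every key is copied to the next few nodes on the ring. It is a simplified illustration (no virtual nodes, no consistency levels), not Cassandra's actual implementation, and `Ring`, `Cluster` and the node names are made up.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Repeatable hash used to place nodes and keys on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Places nodes on a hash ring; no node is a master."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self.ring = sorted((stable_hash(n), n) for n in nodes)
        self.positions = [pos for pos, _ in self.ring]

    def nodes_for(self, key):
        """Walk clockwise from the key's position and pick `replicas` nodes."""
        start = bisect.bisect(self.positions, stable_hash(key)) % len(self.ring)
        chosen = []
        for i in range(len(self.ring)):
            node = self.ring[(start + i) % len(self.ring)][1]
            if node not in chosen:
                chosen.append(node)
            if len(chosen) == self.replicas:
                break
        return chosen


class Cluster:
    """Toy cluster: every write goes to all replicas of the key."""

    def __init__(self, nodes, replicas=2):
        self.ring = Ring(nodes, replicas)
        self.storage = {n: {} for n in nodes}
        self.down = set()  # nodes currently unavailable

    def put(self, key, value):
        for node in self.ring.nodes_for(key):
            self.storage[node][key] = value

    def get(self, key):
        # Any live replica can answer the read.
        for node in self.ring.nodes_for(key):
            if node not in self.down:
                return self.storage[node].get(key)
        raise RuntimeError("all replicas for this key are down")


if __name__ == "__main__":
    cluster = Cluster(["dc1-a", "dc1-b", "dc2-a"], replicas=2)
    cluster.put("user:1", {"name": "Alice"})
    first_replica = cluster.ring.nodes_for("user:1")[0]
    cluster.down.add(first_replica)  # simulate a node failure
    print(cluster.get("user:1"))     # still served by the other replica
```

Because each key lives on more than one node and no node is special, losing a single node does not make the data unavailable, which is the high-availability property described above.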
BUT, there are some disadvantages too.
A major problem with NoSQL is the lack of reporting tools, whereas SQL databases have a wide variety of reporting tools available.
The community is relatively new compared with the SQL community, but it is growing rapidly.
NoSQL databases are playing a major role in the database landscape. With advantages such as low cost, performance, high availability and scalability, they are a great option for companies working with big data. But NoSQL is still a young technology and does not have the standards that SQL databases like MySQL have. At the end of the day, the choice between SQL and NoSQL depends on the complexity, variety and volume of the data you are going to work with.