Cassandra is an open-source, distributed, wide-column NoSQL database management system. It is designed to handle huge amounts of data across many commodity servers and offers high availability with no single point of failure. The system is written in Java and is developed under the Apache Software Foundation.
Avinash Lakshman and Prashant Malik originally developed Cassandra at Facebook to power the inbox search feature. Facebook released Cassandra as an open-source project on Google Code in July 2008. In March 2009 it became an Apache Incubator project, and in February 2010 it graduated to a top-level Apache project. Cassandra has since been widely adopted because of its technical characteristics.
Apache Cassandra is used to manage huge amounts of structured data distributed all over the world. It provides a highly available service with no single point of failure. Here are a few points to note about Apache Cassandra:
It is scalable, fault-tolerant, and consistent.
It is a column-oriented database.
Its distributed design is based on Amazon's Dynamo, and its data model is based on Google's Bigtable.
It was created at Facebook and differs considerably from traditional relational database management systems.
Cassandra uses a Dynamo-style replication model with no single point of failure, and adds a more powerful "column family" data model on top of it. Cassandra is used by a number of well-known companies, including Facebook, Twitter, Cisco, Rackspace, eBay, Netflix and many others.
The design goal of Cassandra is to handle large, data-intensive workloads that span many nodes with no single point of failure. Cassandra uses a peer-to-peer distributed system across its nodes, and data is distributed among all nodes in the cluster.
All nodes in a Cassandra cluster play the same role. Each node is independent and, at the same time, interconnected with the other nodes. Every node in the cluster can accept read and write requests, regardless of where the data is actually located in the cluster. When a node fails, read and write requests can be served by other nodes in the network.
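The behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not how Cassandra is implemented: the `ToyCluster` class, its placement rule (a CRC32 hash picks a starting node, and the next nodes in list order hold the copies), and the replication factor of 3 are all hypothetical. Real Cassandra adds consistency levels and repair mechanisms that are left out here.

```python
import zlib


class ToyCluster:
    """Toy model: any live node can coordinate a request for any key."""

    def __init__(self, node_names, replication_factor=3):
        self.order = list(node_names)
        self.storage = {name: {} for name in self.order}  # per-node local data
        self.down = set()                                  # names of failed nodes
        self.rf = min(replication_factor, len(self.order))

    def replicas_for(self, key):
        # Hypothetical placement rule: hash picks a start node, then take
        # the next rf nodes in list order (wrapping around).
        start = zlib.crc32(key.encode("utf-8")) % len(self.order)
        return [self.order[(start + i) % len(self.order)] for i in range(self.rf)]

    def _live_replicas(self, coordinator, key):
        if coordinator in self.down:
            raise ConnectionError(f"{coordinator} is down; contact another node")
        return [n for n in self.replicas_for(key) if n not in self.down]

    def write(self, coordinator, key, value):
        live = self._live_replicas(coordinator, key)
        if not live:
            raise RuntimeError("no live replica for this key")
        for name in live:
            self.storage[name][key] = value
        return len(live)  # number of replicas that stored the value

    def read(self, coordinator, key):
        # Return the value from the first live replica that has it.
        for name in self._live_replicas(coordinator, key):
            if key in self.storage[name]:
                return self.storage[name][key]
        raise KeyError(key)


cluster = ToyCluster(["N1", "N2", "N3", "N4", "N5"])
cluster.write("N1", "user:42", "Ada")                  # N1 coordinates the write
cluster.down.add(cluster.replicas_for("user:42")[0])   # one replica fails

# Any surviving node can still coordinate the read.
survivor = next(n for n in cluster.order if n not in cluster.down)
value = cluster.read(survivor, "user:42")
```

Because the value was written to several replicas, losing one of them does not make the key unreadable, and the client does not need to know which node holds the data.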
The characteristics of Cassandra:
Cassandra has become popular because of its technical characteristics. Here are a few of them:
Easy data distribution –
Cassandra lets you place data wherever you need it by replicating it across multiple data centers.
Example:
Suppose there are five nodes, N1, N2, N3, N4, and N5. A partitioning algorithm determines the token ranges, and data is distributed accordingly. Each node owns a distinct token range, and a row is stored on the node whose range contains the row's token.
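The five-node example can be sketched in a few lines of Python. This is a simplified illustration, not Cassandra's partitioner: the 32-bit token space, the evenly spaced single token per node, and the helper names `token_for`, `build_ring`, and `owner_of` are assumptions made for this sketch. MD5 is used here only because it is in the standard library; Cassandra's default partitioner is based on Murmur3.

```python
import bisect
import hashlib

RING_SIZE = 2**32  # toy token space (assumption for this sketch)


def token_for(key: str) -> int:
    """Map a partition key to a token in [0, RING_SIZE)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(nodes):
    """Give each node one evenly spaced token; returns sorted (token, node) pairs."""
    step = RING_SIZE // len(nodes)
    return [((i + 1) * step - 1, node) for i, node in enumerate(nodes)]


def owner_of(ring, key: str) -> str:
    """Return the node whose token range contains the key's token."""
    tokens = [token for token, _ in ring]
    idx = bisect.bisect_left(tokens, token_for(key))
    # Tokens past the last node's token wrap around to the first node.
    return ring[idx % len(ring)][1]


ring = build_ring(["N1", "N2", "N3", "N4", "N5"])
placement = {key: owner_of(ring, key) for key in ("user:1", "user:2", "user:3")}
```

Each node owns the range that ends at its own token, so the same key always lands on the same node, and a large set of keys spreads across all five nodes.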
Flexible data storage –
Cassandra can handle structured, semi-structured, and unstructured data. It can also accommodate changes to your data structures as your requirements change.
Elastic scalability –
Cassandra is highly scalable: you can add more hardware to accommodate more customers and more data as requirements grow.
Fast writes –
Cassandra was designed to run on low-cost commodity hardware. It performs very fast writes and can store hundreds of terabytes of data without sacrificing read efficiency.
Always-on architecture –
Cassandra has no single point of failure and is continuously available for business-critical applications that cannot afford downtime.
Fast linear-scale performance –
Cassandra scales linearly: throughput increases as you add nodes to the cluster, so it maintains a quick response time as load grows.
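Transaction support –
Cassandra offers some of the Atomicity, Consistency, Isolation, and Durability (ACID) properties of transactions. Writes are atomic and isolated within a single partition, writes are durable, and consistency is tunable per operation. It does not provide relational-style ACID transactions with rollback or locking.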