
Comparing 1010data and Hadoop (and RDBMS)

It is tricky to compare 1010data directly with Hadoop because the comparison is apples to oranges. 1010data is a complete solution, while Hadoop is merely one relatively small component of a solution. Hadoop is a data management engine, whereas a complete solution must also include database development and management, user interfaces, analytics, and so on. To use an analogy, if 1010data is FedEx, Hadoop is a truck motor. The two are unlikely to be viewed as competing options.

A more meaningful line of inquiry is to compare 1010data with a complete Hadoop-based solution, i.e. a system that uses Hadoop as its database engine but also includes all the other components that make up a complete solution. Let's do that, and while we're at it, throw in relational database management systems (RDBMS) as well. Green is good, red is bad, and yellow is somewhere in the middle.

                                          1010data   Hadoop-Based System   RDBMS-Based System [1]
Effort (Implementation and Maintenance)   green      red                   red
Affordability (TCO)                       green      red [2]               red
Time to Value                             green      red                   red
Simplified Data Integration               green      red                   red
Spreadsheet-Like Interface                green      red                   red
Query Speed                               green      red [3]               yellow
Advanced Analytics                        green      red [4]               yellow
Data Mash Ups                             green      red                   red
Data Monetization                         green      red                   red
Scalability                               green      green                 red
24/7 Availability                         green      green                 red
Document Searches                         red        green                 red
Real-Time Updates (OLTP)                  red [5]    green                 green

Notes

  1. Covers software RDBMS, database appliances, in-memory databases, and columnar databases. These alternative technologies have different cost and performance characteristics, but the differences are small relative to this analysis.
  2. In theory Hadoop is open source and therefore free, but most companies will opt to purchase a supported version. More importantly, there is a costly human effort involved and there are significant charges for other system components.
  3. For all three options, and especially Hadoop, performance is highly dependent on the number of nodes. For this chart we assume that the node counts of the three solutions are within, say, an order of magnitude of each other.
  4. With enough time and effort, Hadoop can be made to support advanced analytics; however, the cost is usually prohibitive, ease of use is extremely limited, and performance will always lag behind the other options.
  5. 1010data is an analytical platform meant for analysis and operational reporting. It is not designed for transaction processing.