Technology

Trillion-Row Spreadsheet℠

Business users like to get “hands on” with their data. For small, simple data sets, no tool is more popular for hands-on access to data, analytics, and modeling capabilities than the spreadsheet. For big data and situations where users need to collaborate with others to perform analysis, spreadsheets just don’t work. 

Or at least they didn’t work – until 1010data. The Trillion-Row Spreadsheet℠ lets you perform big data discovery and analytics in a spreadsheet environment. It’s a breath of fresh air for business users who wish their enterprise analytics systems could be just like Excel. Now they are.

 


When working with big data on the Trillion-Row Spreadsheet℠, you can

  • See your actual detailed data and results on the fly
  • Perform unguided, impromptu analysis without restrictions
  • Use advanced visualization to glean insight at a glance
  • Upload and integrate new data whenever you like

Unlike traditional spreadsheets, there’s no limit to the size and complexity of data you can analyze. And when it comes to sharing and collaborating with other users, you’re not on an island. The Trillion-Row Spreadsheet℠ is centrally managed within the 1010data platform – you can allow other users to leverage your work and your calculations so you’re always on the same page with a single version of the truth. 

What do we mean by "spreadsheet"?

Essentially a spreadsheet offers a unique analytical experience that combines three important elements:

  • It is visual:  The user sees the actual detailed data and results.
  • It is interactive:  The user performs incremental operations and gets incremental results. You "run" application programs, you "submit" queries to a database, but in a spreadsheet you don't "run" or "submit" anything.  You do things.  You interact with the data in a much more direct and natural way.
  • It is unrestricted:  The user is allowed to perform unguided, impromptu analysis. Spreadsheets are not designed for particular applications; they are a blank slate that the user can leverage as he or she sees fit.

None of these qualities characterizes the standard data warehouse reporting layer. All three characterize the 1010data Trillion-Row Spreadsheet℠.

Is the Trillion-Row Spreadsheet℠ a desktop application?

While you don't actually install any software on your computer, you do work with it on your desktop or notebook in the same way as you work with more familiar spreadsheets. In our case the spreadsheet appears within the browser.

Is the Trillion-Row Spreadsheet℠ a stand-alone application?

Conventional spreadsheets are stand-alone applications. They are meant as self-contained environments where users do their thing but no one connects into them, i.e. other applications don't directly invoke the spreadsheet's calculation engine or pull data from the spreadsheet. Data warehouses, on the other hand, are exactly the opposite: External applications must connect to them in order for users to gain anything other than low-level access to the data. The Trillion-Row Spreadsheet℠ combines both ideas. From an end-user's perspective, the Trillion-Row Spreadsheet℠ behaves like a stand-alone application, but external applications can also query it in much the same way as they would query a conventional database.

How does the Trillion-Row Spreadsheet℠ enable companies to easily share data and collaborate?

Often, the organization that owns data and the organization that wishes to analyze it are in different businesses. For example, a company may collect various kinds of market or demographic data and the various market participants may wish to use that data; the former is in the business of collecting data while the latter may trade securities. Or a retailer may wish to share its point-of-sale data with its suppliers. Here too they are in different businesses: The retailer sells goods to consumers while the manufacturer makes the goods. So the data owner and the data user to some degree live in different worlds and have different interests and perspectives.

Now suppose that Company A owns data and wishes to make it available to Company B. The first problem is that some sort of system must be built to allow Company B to reach into Company A's database. This is difficult just from a technology "plumbing" perspective. But the really hard part is figuring out how Company B should be allowed to query the data. Neither company can really do this: Company A does not understand the analysis that Company B would like to perform, and Company B does not have access to the data yet. And getting a third party involved just makes things worse.

Enter the Trillion-Row Spreadsheet℠. With 1010data, the company that owns the data simply loads it onto our platform and all users of that data can analyze it however they like. The Cloud eliminates the plumbing problem and the spreadsheet interface solves the "different worlds" problem.

How does data get into the Trillion-Row Spreadsheet℠?

There are multiple ways of loading data onto the 1010data platform. Users can upload a fair amount of data directly through the spreadsheet interface itself. For larger amounts of data, 1010data's PowerLoader application may be used, or 1010data personnel can load and update the data on behalf of the customer.

Please note that, with 1010data, there is no need for complex and time-consuming relational modeling, database design, data cleansing, denormalization or pre-aggregation. Raw data may be loaded directly onto the platform and analyzed within the spreadsheet. This means that the time to delivery is extremely short and there are none of the costs or risks of a data warehouse implementation. It also means that users have access to all the data and may perform new kinds of analyses.

What enables 1010data's Trillion-Row Spreadsheet℠ to handle so much data?

That is of course the big question. And here's the answer: Traditional spreadsheets installed on your PC are limited in several ways.

There is only so much disk space on your computer. Most personal computers simply cannot hold anything close to a trillion rows of data.

There is only so much processing power and memory in your computer. The modern PC is amazingly powerful, but a trillion rows is even more amazingly big.

Spreadsheet software is not robust enough to handle Big Data. Conventional spreadsheets have hard-wired limitations on the number of rows, columns or both, and even if these restrictions were lifted, the software would either crash or be unimaginably slow for large amounts of data.
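To put these three limits in perspective, here is a rough back-of-the-envelope sketch. The 1,048,576-row figure is the per-worksheet limit in Excel 2007 and later; the 100 bytes per row figure is purely an illustrative assumption, not a measurement of any real data set.

```python
# Back-of-the-envelope arithmetic for a trillion-row data set.
TRILLION_ROWS = 10**12
EXCEL_MAX_ROWS = 1_048_576        # per-worksheet row limit in Excel 2007 and later
ASSUMED_BYTES_PER_ROW = 100       # assumption for illustration only


def worksheets_needed(rows, rows_per_sheet=EXCEL_MAX_ROWS):
    """Number of full-size worksheets required to hold `rows` rows."""
    return -(-rows // rows_per_sheet)   # ceiling division


def raw_size_terabytes(rows, bytes_per_row=ASSUMED_BYTES_PER_ROW):
    """Uncompressed size in terabytes (1 TB = 10**12 bytes)."""
    return rows * bytes_per_row / 10**12


if __name__ == "__main__":
    print(worksheets_needed(TRILLION_ROWS))
    print(raw_size_terabytes(TRILLION_ROWS))
```

Under these assumptions, a trillion rows would fill nearly a million maxed-out worksheets and occupy about 100 TB before any compression.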

So how do we overcome these limitations?

In our case the data resides in the Cloud, not on your computer, and the Cloud can be as big as we want it to be. Think about it: when you do a web search, it's as if the Google or Yahoo databases, and the entire contents of the web, were on your computer. You don't really care where the data is physically. All you care about is what you see on your computer, and there is only so much that you can see at a time.
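The idea that you only ever see a small window of a very large table can be sketched in a few lines. This is only an illustration of the general principle, not a description of how the 1010data platform is implemented. The `fetch_window` function and the use of Python's lazy `range` as a stand-in for a remote trillion-row column are both assumptions made for the example.

```python
# A lazy stand-in for a trillion-row column held "somewhere else".
# range() never materializes its values, so this uses almost no memory.
REMOTE_COLUMN = range(10**12)


def fetch_window(column, first_row, row_count):
    """Return only the rows that fit on screen, starting at first_row."""
    return list(column[first_row:first_row + row_count])


if __name__ == "__main__":
    # Jump straight to the middle of the data and look at three rows.
    print(fetch_window(REMOTE_COLUMN, 500_000_000_000, 3))
```

However large the column is, the amount of data that travels to the screen depends only on the size of the window requested.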

The Cloud also solves the processing power and RAM problem. Your PC may have limited power but the servers out in the Cloud can have a lot more.

Finally, what about the software? Well, that's the one we particularly like to talk about. Unlike standard spreadsheet software, our software is incredibly powerful and capable of ripping through hundreds of billions of rows of data in no time. In fact, it is the fastest database software in the world. Period.

Which browsers are supported?

We recommend Google Chrome or Mozilla Firefox. We support the current and previous major releases of Chrome, Firefox, and Safari on a rolling basis. Each time a new version is released, we begin supporting that version and stop supporting the third most recent version. We also support Internet Explorer 8, 9, and 10.
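The rolling policy for Chrome, Firefox, and Safari amounts to a sliding window over the two most recent major releases. Here is a minimal sketch, assuming major releases are identified by integers. The function name and the version numbers in the usage example are hypothetical.

```python
def supported_versions(major_releases, window=2):
    """Return the `window` most recent major releases, oldest first."""
    ordered = sorted(set(major_releases))
    return ordered[-window:]


if __name__ == "__main__":
    releases = [40, 41]                      # hypothetical version numbers
    print(supported_versions(releases))      # current and previous release
    releases.append(42)                      # a new version ships
    print(supported_versions(releases))      # the third most recent drops out
```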