Solving Data Silos: The Modern Solution

As AI takes center stage, data silos are the biggest hurdle that every enterprise architect, CIO, CTO, and business leader wants to solve today.

We need access to high-quality, timely data, which in turn leads to better analytics and straight-through processing of transactions.

So far, a solution to the data silo problem has been elusive.

In this post, I’ll present a new industry solution that is emerging to solve it.

Note: The language in this article is intentionally non-technical so that we don’t get lost in the jargon.

Why Data Silos?

70% of all our budget goes towards mitigating various issues with data silos.

  1. When an application needs to send data to another system, a copy of the data is created at both ends (yes, even with APIs and middleware) because both applications typically have their own databases.
  2. When one company wants to send data to another company, a copy is created because they have different databases.

It’s not feasible to have a single database for everyone.

Naturally, these copies of data (or data silos) are frequently out of sync and delayed.

These issues result in major real-world customer pain points:

  1. Your money is withdrawn but you don’t have the goods/shares yet
  2. Our credit score information takes weeks to update at the credit bureaus
  3. Health insurance claims take weeks or months to clear through the systems, and no one has any idea what was in the claim, what wasn’t, or why
  4. We end up giving our SSN and other private information to hundreds of companies because everyone needs to do KYC on us – even different parts of the same company

A big chunk of our budget goes towards trying to hide, or overcome, these latencies.

In addition, for any AI or analytics project, data from multiple places needs to be aggregated. This is expensive.

Frankly, most of us have learnt to accept and deal with all these limitations, because there has been no real technology solution so far.

How to Solve this Problem?

Other than aggregating everything into cloud data lakes, or creating monolithic applications, there hasn’t really been a solution.

One promising candidate solution was blockchain. It was positioned as a golden source of data that also doesn’t need a middleman.

In some cases it works. The industry has spent a few years working on blockchain to solve these data silo problems.

However, blockchains are not ready for the enterprise because:

  1. They have severe privacy issues (everyone sees everything).
  2. They are not easily interoperable. As a result, if two companies want to work together, they need to use the same blockchain.
  3. The way transactions are committed (consensus by proof of work, proof of stake, etc.) means that transactions are not 100% deterministic.

But now there is a solution that sits one level above our databases and blockchains, and promises to solve these problems.

The Canton Network Solution

This modern solution uses two key components:

  1. Daml – an easy-to-use language to create multi-party workflows and apps
  2. Canton – a distributed transaction controller that works over multiple independent blockchains or other networks

As a result, all parties, whether different companies or different applications, can now have access to the same golden source of data in real time, without giving up privacy.

In fact, a pilot run on Canton with 155 participants from 45 major financial services organizations demonstrated settlement across 22 permissioned blockchains connected on Canton, while maintaining the controls demanded in regulated capital markets.

You can see the architecture of such a solution in this report published on the above pilot.

Here are some of the characteristics of the technology:

Is it a YABN (yet another blockchain network)?

Canton is not a traditional blockchain network. Instead, it’s a combination of an advanced multi-party synchronization protocol and multiple independent networks that are connected together as needed.

This combination ensures that each transaction between the designated parties is securely and correctly processed and there are no data silos.

For example, multiple banks and exchanges (network 1) can share their trade and settlement information in real time with each other, while also being able to atomically interact with another set of banks and exchanges (network 2). 

Their interparty communications happen via Canton. 

It’s similar to how we can all have our own websites, but communication happens over HTTPS.
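To make "atomically interact" a little more concrete, here is a minimal sketch of an all-or-nothing settlement between two toy networks. Everything in it (the Ledger class, the settle_atomically function, the balances) is a hypothetical illustration and not Canton's actual protocol or API. It also runs in a single process, so it skips everything that makes this hard in a real distributed setting.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """A toy stand-in for one independent network's record of balances."""
    name: str
    balances: dict = field(default_factory=dict)

    def can_debit(self, owner: str, amount: int) -> bool:
        return self.balances.get(owner, 0) >= amount

    def transfer(self, sender: str, receiver: str, amount: int) -> None:
        self.balances[sender] = self.balances.get(sender, 0) - amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount


def settle_atomically(cash: Ledger, securities: Ledger,
                      buyer: str, seller: str,
                      price: int, shares: int) -> bool:
    """Move cash and shares together, or move nothing at all."""
    # Check both legs before touching either ledger.
    if not cash.can_debit(buyer, price):
        return False
    if not securities.can_debit(seller, shares):
        return False
    # Both checks passed, so apply both legs.
    cash.transfer(buyer, seller, price)
    securities.transfer(seller, buyer, shares)
    return True


# Hypothetical starting positions on two separate networks.
cash_network = Ledger("network 1", {"Bank A": 100})
securities_network = Ledger("network 2", {"Bank B": 10})

# First trade succeeds: Bank A pays 80 and receives 10 shares.
ok = settle_atomically(cash_network, securities_network,
                       "Bank A", "Bank B", 80, 10)

# Second trade fails (Bank A has only 20 left), so neither ledger changes.
failed = settle_atomically(cash_network, securities_network,
                           "Bank A", "Bank B", 80, 10)
```

The point of the sketch is the failed trade: because both legs are checked before either is applied, there is never a state where the money has moved but the shares have not.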

Can Canton maintain privacy of data between parties?

Each party can transact without having to give up their privacy requirements.

This would mean that if Transaction AB from Bank A is meant for Bank B, then only banks A and B would have access to it.

If a Bank C is on the network, they wouldn’t know the details of what is happening between Bank A and B.

Compare that to a traditional blockchain system, where everyone has to have access to everything because various complex-sounding things (consensus) have to be done to ensure the integrity of the system.

One of the key underlying foundational tenets for Canton is privacy.
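The difference between the two models can be sketched in a few lines. This is a simplified illustration under my own assumptions: the Transaction class, the two view functions, and the second transaction "BC" are hypothetical and are not part of Canton or Daml.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    parties: frozenset  # the parties the transaction is meant for
    details: str


def broadcast_view(transactions, party):
    """'Everyone sees everything' model: the party argument makes no difference."""
    return [tx.tx_id for tx in transactions]


def need_to_know_view(transactions, party):
    """Privacy model described above: a party sees only its own transactions."""
    return [tx.tx_id for tx in transactions if party in tx.parties]


# Transaction AB comes from the example above; BC is added for illustration.
transactions = [
    Transaction("AB", frozenset({"Bank A", "Bank B"}), "A pays B"),
    Transaction("BC", frozenset({"Bank B", "Bank C"}), "B pays C"),
]

view_c_broadcast = broadcast_view(transactions, "Bank C")   # sees AB and BC
view_c_private = need_to_know_view(transactions, "Bank C")  # sees only BC
view_b_private = need_to_know_view(transactions, "Bank B")  # sees AB and BC
```

In the need-to-know model, Bank C never learns that transaction AB exists, while Bank B, which is a party to both transactions, sees both.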

How does it enable access to real time data?

Canton takes all the plumbing of reconciliation and pushes it under the hood.

It’s just like how today you don’t need to worry about how your browser knows where www.manishgrover.com is. You just type it in, and it gets the blogs and assessments for you.

In the same way, Canton ensures that any transaction data is automatically and reliably distributed to the right parties in real time. There are failovers available in case one company’s system is down.

What programming languages do I have to use?

Most of your apps can remain the same. However, when you need to interconnect with another party, you use Daml, a specialized language for creating multi-party workflows.

Daml allows workflow transactions to be written very simply and very efficiently.

All you have to do is create objects that contain the data and the recipients who should have access to them. You send that object over using Daml, and Canton does the rest!

Each party (e.g. a bank or hospital) has a private transaction store that holds its Daml objects. That’s how you can access your historical information without peering into someone else’s data.

Only the objects where you are a recipient are stored in your datastore, and they are kept in sync with the stores of the other recipients to make sure that the data is consistent.
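Here is a rough sketch of that idea in Python rather than Daml, so it should be read as an analogy and not as Daml syntax. The names SharedObject, PartyStore, and ToyNetwork are hypothetical, and the delivery logic is a deliberately naive stand-in for what Canton does under the hood.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SharedObject:
    object_id: str
    recipients: frozenset  # parties who should have access
    data: str


@dataclass
class PartyStore:
    """One party's private store, keyed by object id."""
    party: str
    objects: dict = field(default_factory=dict)


class ToyNetwork:
    """Hypothetical stand-in for the layer that distributes objects."""

    def __init__(self, parties):
        self.stores = {p: PartyStore(p) for p in parties}

    def send(self, obj: SharedObject) -> None:
        # Deliver the object only to the stores of its recipients.
        for party in obj.recipients:
            self.stores[party].objects[obj.object_id] = obj

    def is_consistent(self, object_id: str) -> bool:
        # Every store that holds the object must hold an identical copy.
        copies = [s.objects[object_id] for s in self.stores.values()
                  if object_id in s.objects]
        return all(c == copies[0] for c in copies)


network = ToyNetwork(["Bank A", "Bank B", "Bank C"])
trade = SharedObject("trade-1", frozenset({"Bank A", "Bank B"}), "100 shares")
network.send(trade)

in_a = "trade-1" in network.stores["Bank A"].objects
in_c = "trade-1" in network.stores["Bank C"].objects
consistent = network.is_consistent("trade-1")
```

Bank A and Bank B each end up with the same copy of trade-1 in their own stores, and Bank C's store never receives it.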

Next Steps: A pilot for your use case

Canton is an ideal solution for when you have multi-party transactions that require you to ship data back and forth. It allows you to have a golden source of data without having to sync databases.

Canton’s website says they are focused on capital markets and payments right now, but you can probably use it for any industry domain.

Healthcare, logistics, supply chain, consumer banking, and hospitality come to mind. All of these have multiple companies working with each other and must address this data silo problem.

Head on over to the Canton website and Daml docs to check out the technology.

If you want to brainstorm such a solution, you can also connect with me for an initial no-pressure conversation. I know a thing or two about the technology because I worked at Digital Asset previously (the company that makes Canton and Daml). But if you really want deep technical expertise, then connect with the folks at Canton & Digital Asset.