Fluree, the blockchain-enabled data platform, grew out of trends that make it possible to share and collaborate on verifiable data.
Using a distributed ledger on top of a semantic graph database and the World Wide Web Consortium’s (W3C) Semantic Web standards, Fluree tracks the provenance of data and enables time travel to any point in its history, establishing trust and verifiability as a foundation for collaboration.
“Fluree was an opportunity to address the challenges we saw around data management, especially as it related to some of the new external requirements and pressures around data,” said Fluree co-founder Brian Platz. “We have seen some very interesting technologies emerge over the past five to 10 years, like blockchain, like the W3C semantic web standards, and really saw an opportunity to take some of these technologies and bring them together in a way that solves these emerging issues, and treat the data more strategically within an enterprise.”
Data-First Approach
Platz and serial entrepreneur Flip Filipowski launched the Winston-Salem, North Carolina-based company in 2016.
Fluree is taking a step back from traditional blockchain to make it more generic, then extending it to serve more use cases, Platz said.
Many blockchain technologies focus on decentralized finance use cases, with simple data types like accounts, balances and wallets, he said. Instead of predefining data types and the behavior around the data, Fluree uses a regular database, so while you could track currencies, you could also use it to track invoices, suppliers or identities.
One of its projects is working with the U.S. Department of Education on using blockchain to provide secure sharing of educational credentials. It’s also under contract with the Department of Defense and U.S. Air Force to build a secure data-sharing platform to verify how information enters its systems and by whom. Wake Forest School of Medicine is using Fluree to integrate data from multiple health care devices to speed up critical analysis and decision-making.
Another difference from traditional blockchain, according to Platz, is that Fluree is optimized for data. A typical blockchain is not, so you have to store data off the chain “because it’s so bad at it,” he said.
“We’ve really flipped it around; it’s a very data-first approach to blockchain,” he said. Just as git enabled software developers to collaborate on source code, Fluree is what he calls a “git for data” offering: an immutable ledger that shows how things change over time, with a layer of interoperability and what the company calls SmartFunctions to provide tight control of permissions.
TerminusDB, Dolt and the open source Noms database are among the other offerings with their own “git for data” approach, although without blockchain.
Decoupled Architecture
Fluree has two parts: FlureeDL, an immutable distributed ledger that is resistant to data tampering, and FlureeDB, a semantic graph database optimized to build applications on top of FlureeDL. Each part runs and scales independently.
In both systems, developers use the SmartFunctions logic to enforce custom read/write permissions and rules.
Fluree combines transactions into immutable time-stamped blocks and locks them using advanced cryptography for security and data integrity. A private-public key infrastructure provides access to authorized users. Permissions are embedded on the data tier rather than being stitched together with a web of APIs. They can be configured at the cell level to restrict who can view information according to their security clearance and role.
FlureeDL is a programmable ledger that validates every transaction at several checkpoints. For instance, every transaction is signed with the submitting user’s unique key; every transaction is checked against schema rules; SmartFunctions control permission rights; and each transaction hash is checked against prior transactions for possible duplicates to guard against replay attacks.
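The checkpoints listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Fluree's implementation: the names `SCHEMA`, `sign`, `validate_transaction` and `can_write` are hypothetical, the transaction layout is invented for the example, and an HMAC over a shared secret stands in for the public-key signature a real ledger would verify.

```python
import hashlib
import hmac
import json

# Hypothetical schema: allowed fields and their types per collection.
SCHEMA = {"invoice": {"id": str, "amount": int}}

def tx_hash(tx: dict) -> str:
    """SHA3-256 over a canonical JSON encoding of the transaction."""
    return hashlib.sha3_256(json.dumps(tx, sort_keys=True).encode()).hexdigest()

def sign(tx: dict, key: bytes) -> str:
    """Stand-in signature: an HMAC, NOT a real public-key signature."""
    return hmac.new(key, tx_hash(tx).encode(), hashlib.sha256).hexdigest()

def validate_transaction(tx, signature, key, seen_hashes, can_write):
    """Run each checkpoint in order and return (ok, reason)."""
    # 1. Signature check (assumed stand-in for key-based signing).
    if not hmac.compare_digest(sign(tx, key), signature):
        return False, "bad signature"
    # 2. Schema check: the collection must exist and every field must match.
    fields = SCHEMA.get(tx["collection"])
    if fields is None or any(
        name not in fields or not isinstance(value, fields[name])
        for name, value in tx["data"].items()
    ):
        return False, "schema violation"
    # 3. Permission check, delegated to a caller-supplied rule.
    if not can_write(tx["author"], tx["collection"]):
        return False, "permission denied"
    # 4. Duplicate check against previously accepted transaction hashes.
    h = tx_hash(tx)
    if h in seen_hashes:
        return False, "duplicate transaction"
    seen_hashes.add(h)
    return True, "accepted"
```

Submitting the same signed transaction twice is accepted the first time and rejected as a duplicate the second, because its hash is already in `seen_hashes`.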
Every update is recorded in a block that includes the previous block’s hash and its own SHA3-256 (Secure Hash Algorithm 3) hash.
It offers time travel to any point in history, audit trails and the ability to decentralize part or all of the application data, with programmable rules and Raft or PBFT consensus mechanisms for smart governance.
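To show why linking each block to its predecessor's hash makes tampering detectable, here is a minimal hash-chain sketch using Python's standard `hashlib.sha3_256`. The block layout and the function names `block_hash`, `append_block` and `verify_chain` are assumptions made for illustration; the article does not describe FlureeDL's actual block format.

```python
import hashlib
import json
import time

GENESIS_PREV = "0" * 64  # assumed placeholder for the first block's predecessor

def block_hash(block: dict) -> str:
    """SHA3-256 over every field of the block except its own hash."""
    body = {k: v for k, v in block.items() if k != "hash"}
    return hashlib.sha3_256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def append_block(chain: list, transactions: list) -> dict:
    """Create a time-stamped block linked to the previous block's hash."""
    block = {
        "index": len(chain),
        "timestamp": time.time(),
        "prev_hash": chain[-1]["hash"] if chain else GENESIS_PREV,
        "transactions": transactions,
    }
    block["hash"] = block_hash(block)
    chain.append(block)
    return block

def verify_chain(chain: list) -> bool:
    """Recompute every hash and check every back-link."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else GENESIS_PREV
        if block["prev_hash"] != expected_prev or block["hash"] != block_hash(block):
            return False
    return True
```

Changing any transaction in an earlier block changes that block's recomputed hash, so `verify_chain` fails unless the block and every later block are rewritten.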
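One common way to support this kind of time travel over an append-only ledger is to replay its recorded changes up to the block of interest. The sketch below assumes a hypothetical log of `(block, operation, fact)` entries ordered by block number; `state_as_of` is an illustrative name, and nothing here is taken from Fluree's own query engine.

```python
def state_as_of(log, block):
    """Replay assert/retract entries up to and including `block`.

    `log` is assumed to be ordered by block number; each entry is
    (block_number, "assert" | "retract", fact).
    """
    facts = set()
    for entry_block, op, fact in log:
        if entry_block > block:
            break  # later history is ignored for an as-of query
        if op == "assert":
            facts.add(fact)
        else:
            facts.discard(fact)
    return facts

# Hypothetical history of one invoice across three blocks.
log = [
    (1, "assert", ("inv-1", "status", "open")),
    (2, "retract", ("inv-1", "status", "open")),
    (2, "assert", ("inv-1", "status", "paid")),
    (3, "assert", ("inv-1", "archived", True)),
]
```

Calling `state_as_of(log, 1)` yields the invoice as it stood when it was still open, while `state_as_of(log, 3)` yields its latest state; the log itself is never modified.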
In addition, companies can partition their data on multiple FlureeDL blockchains, each with their own governance rules and consensus properties. FlureeDB can then query every FlureeDL to which it has permission and retrieve results as a unified data set.
FlureeDL outputs immutable block data in W3C standardized RDF format, which enables structured and semi-structured data to be mixed and shared across different applications and databases.
When an application wants to make a change, it submits a transaction to a server in the FlureeDL network, which validates it against the applicable schema and rules, then produces RDF-formatted data that represents the deltas of the database for a valid transaction, along with associated metadata.
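The idea of recording a change as deltas can be illustrated with subject-predicate-object triples in the spirit of RDF. This sketch is an assumption-laden illustration: `compute_deltas` and the attribute names are hypothetical, and it emits plain Python tuples rather than any actual RDF serialization.

```python
def compute_deltas(subject, current, updated):
    """Return the retractions and assertions that turn `current` into `updated`.

    Both arguments map predicate names to values for a single subject.
    """
    deltas = []
    for pred in sorted(set(current) | set(updated)):
        old, new = current.get(pred), updated.get(pred)
        if old == new:
            continue  # unchanged values produce no delta
        if old is not None:
            deltas.append(("retract", (subject, pred, old)))
        if new is not None:
            deltas.append(("assert", (subject, pred, new)))
    return deltas

# Hypothetical update: an open invoice is marked as paid.
before = {"status": "open", "amount": 100}
after = {"status": "paid", "amount": 100, "paidOn": "2021-06-01"}
deltas = compute_deltas("inv-1", before, after)
```

Only the changed predicates appear in `deltas`; the unchanged `amount` is not re-recorded, which keeps each block small while preserving full history.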
Fluree supports SQL, GraphQL and SPARQL, while its own FlureeQL can combine the graph query benefits of SPARQL and graph crawling capabilities of GraphQL.
“You can use any or all of them, you can intermix them to your heart’s content,” Platz said. “What FlureeQL does above any of the other querying interfaces, it’s actually expressed as JSON, which makes it really easy to work with.”
The semantic database FlureeDB natively understands relationships and time, useful for highly analytical aggregations and graph queries.
While most databases require a permission layer be built at the API layer of an application, permission logic in FlureeDB is embedded directly into the data as metadata, eliminating what the company considers an unnecessary integration layer.
Built-in Defense
When Platz talks about giving data the ability to defend itself, he is referring to this granular permissioning and to the data adhering to an established standard.
“Everyone connecting to a Fluree instance has a personalized database basically generated for them. It’s very efficient, happens behind the scenes, but it only has the data in it that they have permission to,” he explained. “And this is what allows end-users to issue any query they want, because they only have data in their kind of virtual database that they can see.
“Normally, you would have to accomplish sharing data with external parties in a permissioned way, by building custom API. … Or you’d have to deploy those on an application server that’s enforcing all of these permissions. So we address this by allowing you to put these permissioning rules into the data tier. By doing this, you actually give your customers more flexibility. And you can do this without having to build a custom API and forcing them to build custom code to your custom API. It’s a better way of protecting data.”
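The “virtual database” idea Platz describes can be approximated as filtering the data by a read rule before any user query runs. The sketch below is only a conceptual model with hypothetical names (`virtual_view`, `query`, `can_read`) and an invented record layout; it says nothing about how Fluree generates these views efficiently.

```python
def virtual_view(records, user, can_read):
    """Keep only the records this user is permitted to see."""
    return [r for r in records if can_read(user, r)]

def query(view, predicate):
    """Run an arbitrary user-supplied filter over the user's own view."""
    return [r for r in view if predicate(r)]

# Hypothetical data set shared between two companies.
records = [
    {"id": "inv-1", "issuer": "acme", "amount": 100},
    {"id": "inv-2", "issuer": "globex", "amount": 250},
    {"id": "inv-3", "issuer": "acme", "amount": 900},
]

def can_read(user, record):
    # Assumed rule: users see only invoices issued by their employer.
    return user["employer"] == record["issuer"]

alice = {"name": "alice", "employer": "acme"}
alice_view = virtual_view(records, alice, can_read)
large = query(alice_view, lambda r: r["amount"] > 50)
```

Because the query only ever runs against `alice_view`, even a query that matches everything cannot return another company's invoice.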
Because the system understands relationships, it can eliminate the need for customization, he said.
“Our locking mechanism can implement logic. We call these SmartFunctions, and they’re stored with the data itself. What makes them more powerful than a typical kind of ‘permissioning’ paradigm is that, because it’s in the data and it’s part of the data, it can leverage the data in the rules. So you can dynamically have the ability to update this invoice in the supply chain system because you work for the company and have this role in the company, and the company is the one that issued the invoice. But you can’t update anyone else’s invoices,” he said.
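The invoice rule in that example can be written as a predicate that reads the data it protects. This is a hedged sketch of the concept only: `can_update_invoice`, the `billing` role and the `db` layout are hypothetical and do not reflect SmartFunction syntax.

```python
def can_update_invoice(user_id, invoice_id, db):
    """Allow an update only when the user's employer issued the invoice
    and the user holds the (assumed) 'billing' role."""
    user = db["users"][user_id]
    invoice = db["invoices"][invoice_id]
    return user["employer"] == invoice["issuer"] and "billing" in user["roles"]

# Hypothetical supply chain data; the rule consults it at decision time.
db = {
    "users": {
        "alice": {"employer": "acme", "roles": ["billing"]},
        "bob": {"employer": "acme", "roles": ["viewer"]},
    },
    "invoices": {
        "inv-1": {"issuer": "acme"},
        "inv-2": {"issuer": "globex"},
    },
}
```

Here `alice` may update `inv-1` but not `inv-2`, and `bob` may update neither; if the employer or role stored in the data changes, the outcome changes without editing the rule.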
He said Fluree can simplify the process of building distributed applications.
“Fluree is natively a database, so you can eliminate the need to store off-chain data somewhere else. It can natively store that information. It also exposes these query interfaces, GraphQL would be a common one to use, which something like React, the frontend web app platform, can talk natively to. So you can get to a situation where you could deploy this permission ledger on Fluree. That’s managing all these data, SmartFunctions, data defending itself, you can actually build a frontend UI that can connect directly to that blockchain. You don’t have to keep anything in sync, you didn’t have to integrate anything, you didn’t have to build any API, and you have a real-time frontend application. So it becomes very easy to build a distributed app or decentralized applications,” he said.
The company has raised $7.5 million. One of its investors, Ray Rothrock, former managing partner of Venrock, said of the company:
“With the advent of Web 3.0 machine-to-machine real-time AI applications, cybersecurity approaches need to move away from only allowing stage-gate access to data to allowing multi-party, real-time access to data while still ensuring its integrity, and Fluree’s blockchain-based data platform does just that.”