Expanso lands $7.5m seed investment to revolutionize distributed data processing

Published: 24-11-2023 09:41:00 | By: Pie Kamau

Expanso, a startup that helps enterprises manage their ever-growing data needs through a distributed approach to big data processing, has raised $7.5 million in seed funding led by General Catalyst and Hetz Ventures, along with Array Ventures. Its platform is powered by the company's open-source software, Bacalhau.

Based in Seattle, Expanso was co-founded by alumni of Google, AWS, and Microsoft. The company will focus on open-source solutions for enterprises, addressing what CEO David Aronchick believes is an enormous but overlooked challenge: "Actually making use of enterprise data".

David Aronchick, CEO, Expanso: "Infrastructure built to meet data where it is, even if distributed around the world, is long overdue. What Expanso is building with Bacalhau is intended to revolutionize the way big data is processed and global compute jobs are executed, while unlocking an entirely new class of applications. We’re excited to partner with General Catalyst, Hetz Ventures, and Array Ventures and use this funding to accelerate the development of Bacalhau and Expanso, and bring it to even more users."

Distributed big data processing can be complex and challenging. One of the biggest challenges is the time and cost of transferring data from the nodes where it is generated to a centralized data lake, which makes it difficult to respond to new data inflows in real time. Further, many platforms, while powerful, require converting existing code to new frameworks just to access the data, let alone extract insights. Distributed big data processing systems are also a rich target for security problems, including leaks of personally identifiable information (PII), regulatory violations, and data breaches.

The open-source software Bacalhau (bacalhau.org), developed and backed by Expanso, is built on the principle of "Compute Over Data": it brings processing jobs to where the data is, rather than moving the data to the cloud first.

Further, with Bacalhau, users can streamline their existing workflows without extensive rewriting by running arbitrary Docker containers and WebAssembly (WASM) images as tasks. The software can run on-premises or inside any cloud, including Amazon Web Services (AWS), Microsoft Azure, Google Cloud, Oracle Cloud, and many more.
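To make the "Compute Over Data" idea concrete, here is a minimal Python sketch. It is a toy illustration only: the site names, readings, and function names are hypothetical and are not part of Bacalhau's API. It simply contrasts moving every raw record to one place with running a small job at each site and moving only the summaries.

```python
# Toy illustration of "compute over data". All names and values here are
# hypothetical; this is NOT Bacalhau's API, just the underlying idea.

# Pretend each key is a separate site holding its own raw readings.
SITES = {
    "seattle": [21.0, 22.5, 23.1, 19.8],
    "lisbon": [25.2, 26.0, 24.7],
    "nairobi": [28.4, 27.9, 29.3, 30.1, 28.8],
}


def summarize_locally(readings):
    """Job that runs where the data lives and returns only a small summary."""
    return {"count": len(readings), "total": sum(readings)}


def centralize_then_compute(sites):
    """Baseline: ship every raw reading to one place, then compute the mean."""
    moved = [value for readings in sites.values() for value in readings]
    return sum(moved) / len(moved), len(moved)


def compute_over_data(sites):
    """Send the job to each site; only the per-site summaries are moved."""
    summaries = [summarize_locally(readings) for readings in sites.values()]
    total = sum(s["total"] for s in summaries)
    count = sum(s["count"] for s in summaries)
    return total / count, len(summaries)


if __name__ == "__main__":
    central_mean, raw_items_moved = centralize_then_compute(SITES)
    local_mean, summaries_moved = compute_over_data(SITES)
    print(f"centralized: mean={central_mean:.2f}, items moved={raw_items_moved}")
    print(f"compute over data: mean={local_mean:.2f}, items moved={summaries_moved}")
```

Both approaches arrive at the same overall mean, but the second moves three small summaries instead of twelve raw readings. In a real deployment the per-site job would presumably be packaged as a container or WASM task rather than a local function call.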

Quentin Clark, Managing Director, General Catalyst: "Expanso brings compute to the data, enabling businesses to operate securely at their operational pace and maximize the utility of valuable data. In less than a year, Dave and his team of exceptional technologists and entrepreneurs, have achieved significant milestones, with the platform now in use with various sectors, including some of the world's largest defense organizations. We are proud to support Expanso as they work to enhance the impact of distributed data for businesses worldwide." 

Developers can keep using the tools they already know and enjoy, such as Python, R, and DuckDB, with almost no changes. Nearly anything that can be containerized can run on the Bacalhau network.

Jordan Tigani, CEO and Co-founder, MotherDuck: "A missing part of the modern data stack is the ability to process data where it is being created rather than have to centralize everything first. Bacalhau fills in that missing link, allowing large numbers of remote workers to use DuckDB to filter, summarize, and transform data at the edge before communicating results to MotherDuck in the cloud."

Bacalhau offers a free demo network, which has been live for nearly six months. Since launch, the network has handled more than 1.5 million jobs for design partners such as the University of Maryland, BOINC, New Atlantis Foundation, and many more.

www.expanso.io