What made me fall in love with Snowplow was that it was unopinionated, gave access to raw event data, and was truly open source. Back in 2013, that changed everything for me. I couldn’t look at GA the same way again.
Over the years, analytics moved into SQL warehouses, driven by cheaper CPU/storage, dbt, and a push for reproducibility and transparency. I saw the need for a democratized Snowplow pipeline and launched a hosted version in 2019.
In January 2024, Snowplow changed its license to the Snowplow Limited Use License Agreement (SLULA), effectively ending open-source Snowplow by restricting production use. When that happened, I realized the spirit of open data and open architecture was gone.
A week later, I forked it because I wanted to keep the idea alive.
OpenSnowcat keeps the original collector and enricher under Apache 2.0 and stays fully compatible with existing Snowplow pipelines. We maintain it with regular patches, performance optimizations, and integrations with modern tools like Warpstream Bento for event processing/routing.
The goal is simple: keep open analytics open.
Would love to hear how others in the community think we can preserve openness in data infrastructure as “open source” becomes increasingly commercialized.
That's it. I should have posted here earlier, but now felt right.