Balancing Openness and Profit: Zyte’s Journey in Open-Source Culture
A Conversation on Building Open-Source DNA in a Market-Driven World
I’ve worked for seven companies in fifteen years, and I spent half of that time at Zyte. The remote-first culture here feels natural, and no other place has attracted flocks of genuine and competent people as well as Zyte. I couldn’t put my finger on why until recently. The secret, it turns out, is Zyte’s open-source DNA.
But in a tech landscape that often prizes competitive advantage and proprietary software, can open-source values survive? How can companies keep these ideals alive while staying profitable and competitive?
To explore these questions, I spoke with Zyte CEO and Co-Founder, Shane Evans, along with Zyte’s core open-source custodians Mikhail Khorobov and Adrian Chavez. They shared how Zyte maintains this balance, blending transparency and technical stewardship while strategically choosing where to remain proprietary. From Zyte’s start with Scrapy in 2010 to its enterprise solutions today, their insights offer a roadmap for companies navigating this nuanced territory.
What does it mean to have an ‘open-source culture’?
We may think we know what open source is, but it goes beyond free code—it’s a framework for creating technology in a collaborative, open, and diverse way.
Observing open-source culture, we see five defining characteristics:
Asynchronous electronic collaboration: Open-source projects live almost entirely online, with forums, version control, and messaging boards enabling contributors across time zones to unite their efforts.
Distributed and decentralized structure: Power is decentralized; open-source projects are often grassroots, with contributors bringing diverse perspectives and skills.
Community-driven development: Unlike proprietary software, open-source projects evolve through community feedback and collective decision-making. This often leads to software that better serves the needs of its users, who more often than not are also its builders.
Meritocratic recognition: Respect is earned; contributors gain authority based on expertise and commitment.
Iterative improvement: Open-source software improves continuously, as anyone can propose improvements or fix issues directly in the codebase, enhancing responsiveness to user needs.
How Zyte embodies open-source values
At Zyte, these values are not abstract ideals but active principles infused in everything from product development to community engagement.
Product development
Zyte’s open-source journey began with Scrapy in 2010, and the company has maintained and expanded its commitment to the open-source web data extraction community ever since. Today, Zyte serves as the main steward of three GitHub organizations, collectively managing over 240 public repositories (42 public repositories under zytedata, 177 under scrapinghub, and 26 under scrapy). This community spirit shines through in Zyte’s collaborative efforts—some repositories are even maintained by non-Zyte contributors. For instance, the python-crfsuite library is primarily managed by DataMade, while the scrapyd project sees active contributions from Open Contracting Partnership.
For Mikhail, the lead open-source custodian at Zyte, one of the key technical decisions when open-sourcing tools is to design against vendor lock-in for Zyte customers. People can use Scrapy without any of Zyte’s products. People who need to host their Scrapy spiders can use scrapyd without having to use Scrapy Cloud. All the code behind the AI spiders is released as open source so that developers can run these spiders on their own hardware and customize them according to their needs. “We do it for various practical reasons, including better maintenance and easier reuse, but also we do it because it’s good and we can do it,” Mikhail concluded.
Beyond maintaining projects, Zyte also shares core tools and components used in the codebase of its products, such as price-parser, clean-html, and perhaps the best known of them all, dateparser. The scope of this ecosystem is so extensive that some projects aren’t even known to every Zytan. In one recent example, Adrian was delighted to discover the “web-snap” project when Mikhail proposed it as one of the top projects to be pinned at the zytedata organization.
Product-led growth
For Zyte, open-source isn’t just a philosophy; it’s a catalyst for growth. When developers adopt Scrapy, they’re not just using a tool—they’re beginning a journey with Zyte. Having firsthand experience with Zyte’s technical capabilities, developers are more likely to seek out advanced solutions from Zyte when they face scaling challenges.
This natural progression from open-source adoption to enterprise engagement is one of the key factors contributing to Zyte’s product-led growth. By letting users engage with Zyte’s tools freely, Zyte builds trust, encouraging users to explore Zyte’s commercial offerings without traditional sales pressure.
Responsiveness to regulation and change
Open-source practices give Zyte the agility to adapt quickly to changing regulations and anti-scraping measures, ensuring tools remain compliant and effective.
Shane underscores that compliance is not an afterthought. Regulatory and ethical standards are integral to Zyte’s products, which “bulletproofs” Zyte against legal and reputational risks. This proactive stance benefits Zyte, its clients, and the broader community by setting a standard for ethical and compliant data practices.
Talent attraction and retention
Zyte’s roots in open source naturally draw talent from its community. Many early contributors to Scrapy transitioned into employees, bringing both mission alignment and technical expertise. This culture continues to attract engineers who value transparency and merit-based recognition, fostering an environment that encourages experimentation and cross-functional collaborations.
We have a contingent of employees who have been at the company for 10+ years!
Community building beyond code
Beyond its open-source tools, Zyte fosters a robust community through platforms like Discord. Zyte’s Extract Data community hosts more than 9,000 contributors who connect, share ideas, and solve web data extraction challenges.
The online community extends to real-world events, with Zyte sponsoring, speaking at, and engaging in key industry gatherings.
To further support this ecosystem, Zyte launched Extract Summit, an annual in-person event where experts, enthusiasts, and, yes, even competitors come together to discuss trends, share practices, and showcase innovations.
Yet, Zyte’s approach to community building remains rooted in open-source values. As Shane puts it, “It’s about engaging with your community to solve real problems, not just for clout or PR points.”
This authentic approach helps Zyte avoid the pitfalls of superficial engagement and reinforces the trust Zyte has built within its community.
Strategic decisions: Why not open everything?
As Shane notes, open source is more than releasing code—it’s about building a community and encouraging active contributions.
Over time, Zyte has evolved, and with that evolution came the choice to keep some offerings proprietary. “Not all projects suit this model,” Shane says.
He acknowledges that while open-source products can support profitable businesses through premium services, extra features, and enterprise support—as demonstrated by WordPress—deciding what to open-source at Zyte is a careful balance.
For example, Zyte’s early proxy management solution, as well as its current advanced API service, have value beyond the code alone; they rely on robust infrastructure and ongoing operational management. These services demand intensive upkeep that doesn’t easily translate to community contributions.
Shane identifies two main factors that guide Zyte’s decisions to keep some products proprietary:
Infrastructure and complexity: Infrastructure-heavy projects may not lend themselves well to open source models. Maintaining such complex systems in an open-source model isn’t just impractical—it’s commercially unsustainable. Unlike broader-reaching projects like Kubernetes, which solved a core problem that attracted wide community support, Zyte’s specialized tools address niche needs that don’t gather the same critical mass.
Strategic considerations: Web scraping operates in a world of constant adaptation. Techniques can be countered when made public, accelerating the “cat-and-mouse” cycle. For Zyte, this means balancing the community benefits of open source with the need to preserve its competitive edge.
“We open-sourced scrapyd, an alternative to Scrapy Cloud,” Shane reflects, “but it didn’t receive much external contribution. It showed us the difficulty of maintaining an open-source project without substantial traction or engagement.”
Similarly, other open-source projects like Splash and Portia are still available but no longer actively maintained as Zyte’s strategy shifted toward other robust, low-code visual scraping tools.
This calculated approach ensures Zyte can focus on maintaining meaningful, sustainable open-source projects without overextending resources.
Preparing for a transition: key steps and strategic questions
If your strategy includes building on an open-source foundation or you're considering open-sourcing parts of your solutions, make sure to address these two prerequisites before starting the journey:
Be clear and transparent about the why behind the decision. A well-articulated purpose and strategy will help you align stakeholders and rally them towards a shared vision.
Make sure you understand your product’s key value propositions and the dynamics around it. This knowledge helps you identify what aspects are best suited for open-sourcing and which should remain proprietary.
Here is a set of questions to guide your decision:
Conclusion
In a tech world where open-source ideals frequently collide with commercial realities, Zyte’s journey demonstrates a more nuanced approach. Balancing open access with strategic business decisions, they’ve built a culture where open-source thrives alongside profitability.
Recent tensions between open-source and proprietary models, from the legal saga between WordPress and WP Engine to OpenAI’s transition to “not-so-Open-AI” in the past year or so, have highlighted an industry shift. They showcase how challenging it is to maintain open-source commitments over time.
As companies explore their place in this spectrum, Zyte’s approach—keeping foundational tools open while innovating proprietary solutions—offers a viable alternative, where open-source principles coexist with competitive strength.