AI For Public Good: Open-Source Is Not Enough


Stories about companies investing heavily in corporate social responsibility programs are nothing new. Tech companies are no exception: they strive to embrace the altruistic potential of their employees, contribute to the community, and build trust with their customer base. Open-source has become a big part of corporate social responsibility and beyond. While some companies market their open-source activity to build trust with users, others center their R&D processes and product strategy entirely around open-source. Why is that so? What are the benefits of going open-source? What are the risks, and what does all this have to do with AI?

Well, what exactly is open-source? Open-source software (OSS) is a category of software in which source code is open to the public and is licensed in a way that allows everyone to study, change, and distribute the software to anyone and for any purpose. The main principle of OSS is peer production of source code, blueprints, and documentation which is done openly in public.

The concept of freely sharing technological innovation existed long before computers. Examples include scientific sharing in Philosophical Transactions and the cross-licensing agreements of the early automotive industry, as described in James Flink's book The Car Culture.

In the early era of computing, most software was produced by researchers working in collaboration. Once the US Commission on New Technological Uses of Copyrighted Works decided that software should be copyrightable, computer vendors and software companies began charging for licenses. For those who owned resources and IP, selling licenses to use their proprietary source code promised big returns. The free software movement was founded largely to counter this trend toward proprietization.

The most commonly mentioned benefit of open-source software is cost. At first glance, it may seem that open-source is a land flowing with milk and honey. One can use and build on top of an existing codebase for free, without paying for any additional license. Of course, the license fee is only part of the total cost, so let’s look at the other features of the open-source model to understand what makes it so interesting for the business community and for public good AI applications.

The top 3 benefits of open source are:

  1. Crowdsourced reliability and security
  2. Freedom and flexibility
  3. Open innovation

Crowdsourced reliability and security

Sharing inspires excellence in development. If you know that your peers will be looking, you’ll do your best. Open-source software security is a concern for many; however, when a project’s code is open, more ethical hackers have the opportunity to review that code and make it more secure. As a result, building AI algorithms on high-quality open components gives you a better chance of producing better AI-driven products.

Freedom and flexibility

An open-source approach allows AI algorithms to be built and customized from the ground up. It also means that multiple software vendors can participate in product development if needed. Products based on proprietary code do not offer the same flexibility and may lead to development bottlenecks and increased fees.

Open innovation

When people work on AI algorithms because they are intrinsically motivated to do so (not just because they’re getting paid), their personal drive to contribute their best ideas is much higher. This drive is often what inspires volunteer communities to develop disruptive features, and to do so at a pace that a commercial vendor would find hard to match.

Crowdsourcing approaches are known to boost innovation in resource-intensive industries with high competition, including AI-driven verticals. Now, what are the risks, and is open-source “enough”?

First, we need to remember that AI products based on the open-source development model carry the same set of risks as any open-source-based product: (1) Security: open-source products are vulnerable to malicious users, because access to the codebase also makes it easier to find and exploit the product’s vulnerabilities; (2) Usability: this is not true of all open-source software, but some solutions might not be as user-friendly as their commercial counterparts; (3) Liability: since open-source is typically developed by multiple contributors, responsibility is scattered, and it might be quite difficult to hold anyone liable.

Second, the code is only one element of the product. Today many AI algorithms and their code implementations are already open and even free to use. However, it’s not the code alone that makes an AI product what it is. Someone who wants to replicate an AI product from its open-sourced algorithm will not be able to do so without extensive computing power and the datasets needed to train the model. In this case, open-sourcing the code alone does not necessarily resolve the asymmetry between those who possess datasets and computational resources and those who don’t.

Open-source is a beautiful culture, but is it futile for AI-driven products? No. It needs a new definition. For the economy to prosper and to bring benefits to the underprivileged, we need to make sure that the open-source concept is redefined for the AI-driven world. In this world, we may benefit greatly from openly documenting our practices for both developers and consumers. The developer community can use and reuse common knowledge and therefore “stand on the shoulders of giants.” The consumer community, following best practices from other industries, should be able to access information about how these systems are built and how automated decisions are made. Open Access. Open Data. Open Source. Open Ethics.


Text by Nikita Lukianets

Picture adapted from MORAN on Unsplash