The penumbra of open source: projects outside of centralized platforms are longer maintained, more academic and more collaborative

Milo Z. Trujillo, Laurent Hébert-Dufresne & James Bagrow 

EPJ Data Science volume 11, Article number: 31 (2022)

GitHub has become the central online platform for much of open source, hosting most open source code repositories. With this popularity, the public digital traces of GitHub are now a valuable means to study teamwork and collaboration. In many ways, however, GitHub is a convenience sample, and may not be representative of open source development off the platform. Here we develop a novel, extensive sample of public open source project repositories outside of centralized platforms. We characterize these projects along a number of dimensions and compare them to a time-matched sample of corresponding GitHub projects. Compared with the GitHub projects, our sample projects tend to have more collaborators, be maintained for longer periods, and be more focused on academic and scientific problems.

