Don't judge a project by its GitHub stars alone

Open source is now universally accepted and employed by developers and companies across the world.  This rise in popularity, though, has raised many questions about what exactly the new world of open source looks like.

  • What are the most popular open source languages?  

  • Which packages have had the greatest adoption?  

  • How many packages are actively being used?

As we started to ask questions like these, we realized we needed to simplify our questions a little bit.  When it comes down to it, do we even have ways of getting reliable answers to these questions?

Last week, I took a look at the Libraries.io dependent repositories count, alluding to its origins as a solution to questions about open source package usage.  Today, I’d like to dive a little deeper into why this metric is so important, and why it’s hard to judge actual usage with other common stats.  


The need for decaying usage metrics

Three predominant means of assessing the popularity of an open source project are GitHub stars, GitHub forks, and package downloads. The most common of these is the number of GitHub stars.

Millions of packages have been starred on GitHub, but this doesn’t quite help us understand overall usage.  For example, a user can star a package for many reasons: some do it as a reminder to come back later (like a bookmark), some treat it as a Facebook Like, and others may do it to curate a list of their favorite packages.  If anything, stars do the best job showing us the amount of traffic or attention a package receives on GitHub.  But none of this tells us if the package is actually being used.

What’s more, GitHub stars also have a nondecreasing problem.  By that I mean that the number of stars a package or repo has will effectively only ever increase or, at best, stagnate.

In theory, a GitHub user could un-star a package, but that seldom happens in practice.  This causes stars to represent the community’s feeling at some point in time (specifically the past) more than current preferences.

Another commonly accepted measure for popularity is the number of forks a package has. The thinking is that a developer must be interested in the work if they create their own local copy of the code—assuming they even end up using the local copy.

Yet forks share the nondecreasing problem with stars (and with downloads!): once counted, they are never subtracted.  Because of this, a package that was once super-popular but has since fallen by the wayside will still appear popular by each of these counts.  Furthermore, most open source software is consumed through ecosystem-specific package managers, not directly via GitHub forks.

This means that, in reality, stars, forks, and downloads act in similar ways: they show a nondecreasing metric of popularity that fails to account for negative changes in the community’s tastes.
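To make the contrast concrete, here is a minimal sketch of a time-decayed popularity score.  The data here is entirely made up (the dates, counts, and half-life are illustrative assumptions, not real star histories), but it shows how weighting each star event by its age lets a metric shrink as attention fades, which a raw count never can:

```python
from datetime import datetime

def decayed_score(event_dates, now, half_life_days=365):
    """Sum of star events, each weighted by exponential decay so that
    an event from one half-life ago counts half as much as one today."""
    score = 0.0
    for d in event_dates:
        age_days = (now - d).days
        score += 0.5 ** (age_days / half_life_days)
    return score

now = datetime(2018, 6, 1)
# A package starred heavily three years ago vs. one starred recently:
old_hit = [datetime(2015, 6, 1)] * 1000
rising = [datetime(2018, 3, 1)] * 400

# Raw counts favor the old hit (1000 vs. 400), but the decayed score
# ranks the rising package higher, reflecting where attention is now.
print(len(old_hit), len(rising))
print(decayed_score(old_hit, now), decayed_score(rising, now))
```

Under these assumed numbers, the old hit keeps a larger raw count while the rising package earns the larger decayed score; that reversal is exactly the signal a nondecreasing metric cannot provide.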

What’s more, because these metrics never decrease, they open themselves up to gaming and manipulation, whether through outright bots or through promotional tactics aimed at inflating the numbers.

Understanding what stars miss

It was with all of this in mind that Libraries.io created their dependent repositories count: to understand which packages are most actively being used, and which are gaining or losing favor.  But how different are the lists of top packages by dependent repositories and GitHub stars?  And what kinds of packages tend to be over- or underrepresented by a metric such as stars?

To attempt to understand this relationship, I took the top ~2,500 GitHub packages ranked by their stars count and joined them with their dependent repositories count. Some top repos were omitted because they aren’t packages to build on top of (for example, the most starred repository on GitHub is freeCodeCamp—containing the codebase and curriculum of freeCodeCamp—which is a wonderful service, but not something most developers use on a daily basis).

Additionally, it can be difficult to compare dependent repositories across different ecosystems—JavaScript and NPM packages tend to have far more connections than, say, Rust and Cargo packages. So I compared the number of dependent repositories in a given package’s language ecosystem against the number of stars that package has, then divided by the total number of packages in that language.  

This resulted in two final metrics: dependent repositories per package versus stars per package in each language.  Through this comparison, we can find the most over- and underappreciated packages, regardless of language, by looking at the difference between their rankings by stars per package and by dependent repositories per package:
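The normalization and ranking described above can be sketched as follows.  All the numbers here are made up for illustration (they are not the real Libraries.io data), and the package names and ecosystem sizes are placeholders:

```python
# Toy data: each package carries its star count, dependent-repository
# count, and language ecosystem. All figures are invented.
packages = [
    {"name": "pkg-a", "lang": "JavaScript", "stars": 2000,  "dependents": 90000},
    {"name": "pkg-b", "lang": "JavaScript", "stars": 30000, "dependents": 1200},
    {"name": "pkg-c", "lang": "Rust",       "stars": 8000,  "dependents": 900},
]
# Assumed total package counts per ecosystem (also invented).
ecosystem_size = {"JavaScript": 700000, "Rust": 15000}

# Normalize both metrics by the size of each package's ecosystem.
for p in packages:
    n = ecosystem_size[p["lang"]]
    p["stars_per_pkg"] = p["stars"] / n
    p["deps_per_pkg"] = p["dependents"] / n

# Rank on each normalized metric (index 0 = highest).
by_stars = sorted(packages, key=lambda p: -p["stars_per_pkg"])
by_deps = sorted(packages, key=lambda p: -p["deps_per_pkg"])
for p in packages:
    # Positive gap: ranked better by usage than by stars ("underappreciated").
    p["gap"] = by_stars.index(p) - by_deps.index(p)

underappreciated = max(packages, key=lambda p: p["gap"])["name"]
print(underappreciated)
```

With these toy numbers, the widely depended-upon but lightly starred JavaScript package surfaces as the most underappreciated, mirroring the pattern in the table below.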

Packages misrepresented by stars

Rank  Underappreciated by stars        Overappreciated by stars
1     gruntjs/grunt-contrib-uglify     minimaxir/big-list-of-naughty-strings
2     isaacs/minimatch                 twbs/ratchet
3     hapijs/boom                      simple-icons/simple-icons
4     dankogai/js-base64               localstack/localstack
5     nodeca/pako                      mattt/Surge
6     ashtuchkin/iconv-lite            ogham/exa
7     sindresorhus/gulp-imagemin       primer/primer
8     hueniverse/hawk                  pNre/ExSwift
9     fb55/htmlparser2                 redis/hiredis
10    estools/escodegen                garnele007/SwiftOCR
Libraries.io has already done some work to understand which crucial packages are going unnoticed with their Unseen Infrastructure research project, and one conclusion echoed in the table above is that the scope and usage of JavaScript have not been adequately captured by GitHub stars: each of the ten most underappreciated projects is written in JavaScript.  This likely has to do with the modularity of the JavaScript (and specifically NPM) ecosystem.

As for the most overappreciated packages by stars?  Well, the overwhelming commonality is that many of these have seen little to no development or new releases in the past two years (hiredis, big-list-of-naughty-strings, Surge, ratchet, ExSwift).

This speaks to exactly why dependent repositories count is an important metric: these packages that were once popular have seen little activity and are thus little used today.  

Looking at their stars count would lead you to believe that these packages are still popular, and perhaps actively maintained and updated, which presents a risk to any potential user.  It’s these forms of risks that make decaying attention metrics crucial to understanding what’s going into your open source stack.

This is one of the aspects of open source we’re exploring in more detail here at Tidelift.  If you are interested in learning more, consider signing up for our mailing list or following us on Twitter.

Keenan Szulik