
A few weeks ago, Ben Podgursky, a software engineer at Liveramp, described how he used Git commit metadata plus the Rapleaf API to build aggregate demographic profiles for popular GitHub organizations.



Ben Podgursky:  I was also interested in slicing the data somewhat differently, breaking down demographics per programming language instead of per organization.  Stereotypes about developers of various languages abound, but I was curious how these lined up with reality.  The easiest place to start was age, income, and gender breakdowns per language. Given the data I’d already collected, this wasn’t too challenging:

  • For each repository, I used GitHub’s estimate of its language composition.  For example, GitHub estimates this project at 75% Java.
  • For each language, I aggregated incomes for all developers who have contributed to a project which is at least 50% that language (by the above measure).
  • I filtered for languages with > 100 available income data points.
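
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code used for the analysis: the function name, the shape of `repos` and `incomes`, and the choice to count each developer once per language are all assumptions made for the example.

```python
from collections import defaultdict


def average_income_by_language(repos, incomes, min_share=0.5, min_points=100):
    """Average household income per language (hypothetical helper).

    repos:   iterable of dicts with "languages" (language -> share, 0..1)
             and "contributors" (iterable of developer ids)
    incomes: dict mapping developer id -> household income; developers
             without income data are simply absent
    """
    # Collect the developers who contributed to any repository that is
    # at least `min_share` of a given language.
    devs_by_language = defaultdict(set)
    for repo in repos:
        for language, share in repo["languages"].items():
            if share >= min_share:
                devs_by_language[language].update(repo["contributors"])

    results = {}
    for language, devs in devs_by_language.items():
        # Assumption: each developer counts once per language.
        points = [incomes[d] for d in devs if d in incomes]
        # Keep only languages with more than `min_points` data points.
        if len(points) > min_points:
            results[language] = (sum(points) / len(points), len(points))
    return results


# Toy data, with the data-point threshold lowered so something survives.
repos = [
    {"languages": {"Java": 0.75, "Shell": 0.25}, "contributors": ["a", "b"]},
    {"languages": {"Java": 0.5, "Python": 0.5}, "contributors": ["b", "c"]},
]
incomes = {"a": 100000.0, "b": 80000.0}
result = average_income_by_language(repos, incomes, min_points=1)
print(result)  # {'Java': (90000.0, 2)}
```

In the toy run, Python is dropped because only one of its contributors has income data, which is the same effect the 100-data-point filter has on sparsely covered languages.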

NOTE: DaVinci Coders is a Boulder, Colorado-based school for prospective coders, teaching Ruby on Rails. According to Thomas Frey, Founder of DaVinci Coders, “The demand for Ruby on Rails experts is very strong, and a great way to enter this increasingly lucrative profession.”

Here are the results for income, sorted from lowest average household income to highest:

Language       Average Household Income ($)   Data Points
Puppet          87,589.29                      112
Haskell         89,973.82                      191
PHP             94,031.19                      978
CoffeeScript    94,890.80                      435
VimL            94,967.11                      532
Shell           96,930.54                      979
Lua             96,930.69                      101
Erlang          97,306.55                      168
Clojure         97,500.00                      269
Python          97,578.87                     2314
JavaScript      97,598.75                     3443
Emacs Lisp      97,774.65                      355
C#              97,823.31                      665
Ruby            98,238.74                     3242
C++             99,147.93                      845
CSS             99,881.40                      527
Perl           100,295.45                      990
C              100,766.51                     2120
Go             101,158.01                      231
Scala          101,460.91                      243
ColdFusion     101,536.70                      109
Objective-C    101,801.60                      562
Groovy         102,650.86                      116
Java           103,179.39                     1402
XSLT           106,199.19                      123
ActionScript   108,119.47                      113

Here’s the same data in chart form:

Most of the language rankings were roughly in line with my expectations, to the extent I had any:

  • Haskell is a very academic language, and academia is not known for generous salaries
  • PHP is a very accessible language, and it makes sense that casual / younger / lower paid programmers can easily contribute
  • On the high end of the spectrum, Java and ActionScript are used heavily in enterprise software, and enterprise software is certainly known to pay well

On the other hand, I’m unfamiliar with some of the other languages on the high/low ends like XSLT, Puppet, and CoffeeScript.  Any ideas on why these languages ranked higher or lower than average?

Some caveats before drawing too many conclusions from the data here:

  • These are all open-source projects, which may not accurately represent compensation among closed-source developers
  • Rapleaf data does not have total income coverage, and the sample may be biased
  • I have not corrected for any other skew (age, gender, etc.)
  • I haven’t crawled all repositories on GitHub, so the users for whom I have data may not be a representative sample

That said, even though the absolute numbers may be biased, I think this is a good starting point when comparing relative compensation between languages.

Let me know any thoughts or suggestions about the methodology or the results.  I’ll follow up soon with age and gender breakdowns per language in a similar fashion.

Photo credit: Tommy Ready

Via bpodgursky