R vs Python: Why are we still talking about this?

7 March 2023

I was prompted to write this piece after engaging with Zach Wilson’s post on LinkedIn about the age-old R vs Python debate. I’m honestly not sure why we’re still talking about this.

Data as an industry has gone through an explosive overhaul in the past decade, driven in part by the normalisation of mass data collection techniques and the ultra-famous, ultra-successful adopters of edge technologies that are now grinding their way into normalised, everyday use.

A side effect of this spectacular explosion has been the reframing of “data science”. In my experience, data scientists historically were essentially OP data nutjobs with a penchant for business acumen and communication. The sort of guys who had ludicrous free rein to dig into and pursue more or less anything that seemed interesting. Today, we’ve rebranded the title to essentially mean “analysts who code”, but more than that, we’re operating in a world where “data scientist” means something different at almost every company.

With so many definitions in play, there’s plenty of scope for organisations to explain why R or Python is their tool of choice. Yet, in spite of the many comparisons, R and Python simply occupy different technical spaces.

Python, more broadly, competes with other major programming languages like Java and C# (to whatever degree), with data as a small subset of its overall capability. R, on the other hand, perhaps muddies the water by having an IDE, a code-driven interface, and a series of very impressive open-source contributions that extend it far beyond its original use case. But it is fundamentally a statistical software package, much more akin to SAS or SPSS than to other major programming languages. R does analysis, and it’s great at it.

Now, the “analysts who code” distinction is an important one — because therein lies the logic for choosing one or the other.

Major tech firms that already have a massive software engineering footprint might choose to work entirely in Python, since there’s inherent comfort in working with Linux, and existing technical capability offers economies of scope and/or scale. Those firms can perhaps repurpose existing staff to complete data tasks; after all, there’s a good chance the data teams are embedded in the same technical function.

Then there are the other guys: companies that are not inherently technical and perhaps rely on a light, under-resourced technical team to maintain their stack of operational software (CRMs, etc.). It IS common for those businesses to have data or business analysts, and that is the market for R. It provides a platform for advanced analyses without heavy technical requirements, allowing individuals to extend their capabilities without needing heavier supporting technical knowledge.

Of course, some organisations might have a need for both, some might have neither, but ultimately, there’s no need for Python and R to be identified as competing solutions — they’re simply different, popular solutions for data applications and workloads.