Chien-Lan Hsueh 2022-06-09

My R Programming Experience

In this blog, I will share my experience with R and my thoughts about R comparing to other programming languages I have came across.

Background and My Learning Experience

I have been using Fortran and C/C++ to solve equations and process data since my undergraduate years. They are powerful enough doing both algebraically and numerical calculations but not great in generating a nice report and sharing works with other collaborators. This forces me to use some commercial software like Mathematica, Matlab and Maple. Very often I found I have to use multiple tools (both programming languages and GUI software) to complete a task and spend a lot of time to make them work seamlessly.

When working in the Silicon Valley, I started to use JMP and Excel with VBA because these two are the most commonly used in most process and material R&D department. JMP Scripting Language and spreadsheets with VBA macro are easy to pick up for most of the engineers and scientists. The work can be done in a reproducible way and the reports can be generated in a nice looking PowerPoint and Word format. The only drawback is they are not capable with big-sized data. Also a JMP license is expensive and it is not practical to expect all of your customers and vendors have it. Additionally, those basic statistic tools and my statistics knowledge are not sufficient to do my jobs.

Why I like R

That’s when I started to alternative solutions and R came into my life. Not only it is open source but also it has a huge community. There are tons of quality packages developed by professionals and researchers in all different fields. The best thing is, most of time, they are well documented. As a SSBB with OCD, you cannot image how important this is to me!!!

On the way I learn it and gradually rewrite my toolkit in R, I discover more good things about R. First, I like the consistency of the language as well as the packages people developed. To compute an average of some data, you have the same interface to call mean() even the data is stored in various objects. In some programming languages, sometime you use a function call and sometimes you use a class method.

Second, I enjoy the vectorization and piping. This not only make codes clean and neat but also increase computing efficiency. BTW, starting from R 4.1, it supports a native pipe operator and you might want to check it out.

Third, the wonder of tidyverse. Marvel is not the only one knowing the secret tip to a huge success by creating an universe.

Fourth but not the least, a data frame is a built-in data type. There are many statistic tools are built-in or provided in base packages. You can complete EDA, testings and modeling tasks right out of a box.

Compared to Other Programming Languages and My Wish List for R

Now it’s time to say something about the disadvantages of using R. First, R is an interpreted computer language and be slow compared to other programming languages like C/C++.

Second, its learning curve is steeper too. It looks intimidate to many of the young colleagues I work with. This makes it difficult to share my works easily but nowadays there are workaround solutions including interactive R Markdown notebooks and Shiny dashboards. There are R GUI available too. Some of them are actively being developed and updated including Rcmdr: R Commander, RKWard, jamovi, JASP, Rattle and BlueSky Statistics.

Third, grass is greener in the other side. In the last decade, there are new ML algorithms being developed everyday and many of them are written in other popular programming language like Python. Quite often, at the beginning, there is little or none supports in R. Their community is very creative on inventing novel tools and applications too. For example, Dr. Grant Sanderson (aka 3Blue1Brown) has created Manim engine, originally designed for making educational math videos, to produce animations using programming codes. This makes the process of making a video more precisely, reusable and reproducible.

Fourth, related to the example above, R is not a general purpose language. It would be nice if we can do more beyond statistics and data analysis, but maybe we shouldn’t be so greedy. After all, that is its strength and what it is designed for.

If I can wish for one thing to be added into the current R, I would propose to have list comprehension like Python has. This together with R’s strengths like vectorization and piping can really empower R to next level in the era of big data and machine learning.

Example R Markdown Output

plot(iris) 


<
Previous Post
Build a Blog site with GitHub Page
>
Next Post
Some Thoughts Aboout Using API