Guidelines for writing good R code
These guidelines are recommendations, not obligations. Many of the principles help you work and collaborate more efficiently in R. Feel free to add your own recommendations or remarks in the discussion section below.
General rules
- Use a proper editor (e.g. RStudio)
Style
- Stick to a style guide (e.g. Style guide · Advanced R)
- Do not use too many blank lines to separate code chunks. One is usually enough, maybe two where a major new section begins
- Give the sections of your R code descriptive headlines. If a headline comment ends with at least four trailing dashes (-), equal signs (=), or pound signs (#), RStudio lets you collapse the whole section (see the RStudio documentation on code sections)
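As a small illustration of collapsible section headlines, the sketch below uses made-up data (`dat` and `mean_value` are hypothetical names). Each comment line that ends in four or more dashes, equal signs, or pound signs should appear as a foldable section in RStudio.

```r
# Load data ----
# Hypothetical example data; in a real script this might come from a file.
dat <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))

# Summarise ====
mean_value <- mean(dat$value)

# Report ####
print(mean_value)
```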
Principles
- Write good documentation:
  - The code should be readable and understandable for people outside the project
  - You can write documentation directly in the code, as comments
  - External documentation on how to use the code is also helpful
- Write readable code:
  - Sometimes a for loop is better than an apply-family function if it makes the code more readable (and runtime is not an important factor) → shorter code is not automatically better
  - Use proper indentation (there is a button for automatic indentation in RStudio)
  - Use names instead of numbers to access columns in data
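The two readability points about column names and loops can be sketched as follows. The data frame and all variable names are invented for illustration. Whether the loop or the apply-family version reads better is a judgment call that depends on the reader and the task.

```r
# Hypothetical example data.
dat <- data.frame(height_cm = c(170, 182, 165), weight_kg = c(65, 80, 54))

# Harder to read: columns accessed by position.
bmi_by_position <- dat[, 2] / (dat[, 1] / 100)^2

# Easier to read: columns accessed by name.
bmi_by_name <- dat$weight_kg / (dat$height_cm / 100)^2

# A compact apply-family version of "range width per column" ...
col_ranges_apply <- sapply(dat, function(x) max(x) - min(x))

# ... and an explicit for loop that computes the same thing step by step.
col_ranges_loop <- numeric(ncol(dat))
names(col_ranges_loop) <- names(dat)
for (col in names(dat)) {
  col_ranges_loop[col] <- max(dat[[col]]) - min(dat[[col]])
}
```

Accessing columns by name also keeps the code working if the column order of the data changes.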
- Do not repeat code (DRY principle):
  - Write functions
  - Write “for” loops
  - Write R packages for functions that you use in many places
    - Use packages such as devtools and roxygen2 to create and maintain R packages easily
  - You may break this guideline while prototyping, but revisit it during code review
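A minimal sketch of the DRY idea, assuming a made-up rescaling task: the repeated formula is moved into one function (`rescale01`, a hypothetical name) and a loop applies it to every column. The `#'` comment block follows the roxygen2 comment style, so the function could later be moved into a package with its documentation attached.

```r
# Hypothetical example data.
dat <- data.frame(a = c(1, 5, 9), b = c(10, 20, 40))

# Repetitive version: the rescaling formula is typed once per column.
a_scaled <- (dat$a - min(dat$a)) / (max(dat$a) - min(dat$a))
b_scaled <- (dat$b - min(dat$b)) / (max(dat$b) - min(dat$b))

#' Rescale a numeric vector to the range [0, 1]
#'
#' @param x A numeric vector with at least two distinct values.
#' @return A numeric vector of the same length as `x`.
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# DRY version: the formula lives in one place and a loop applies it.
scaled <- dat
for (col in names(dat)) {
  scaled[[col]] <- rescale01(dat[[col]])
}
```

If the formula ever needs to change, only `rescale01` has to be edited.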
- Do not let code files grow too long:
  - Put things (code, functions) into other files and “source” them (e.g. if you always use the same functions/libraries at the beginning)
  - Use a proper folder structure, where you can put files into subfolders
    - “Projects” in RStudio can help with this
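Splitting code across files might look like the sketch below. The helper file is written to a temporary location only so the example runs on its own. In a real project it would be a regular file in a subfolder (for instance a path like `R/helpers.R`, which is an assumption, not a required layout). The function name `standard_error` is also hypothetical.

```r
# Stand-in for a separate helper file kept in a project subfolder.
helper_file <- tempfile(fileext = ".R")
writeLines(
  c(
    "# Shared helper functions",
    "standard_error <- function(x) sd(x) / sqrt(length(x))"
  ),
  helper_file
)

# In the main script, a single line loads the shared helpers.
source(helper_file)

se <- standard_error(c(2, 4, 4, 4, 5, 5, 7, 9))
```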
- Do not write R code that takes too long to run:
  - Are there faster functions for the task?
  - Can you rewrite the code (e.g. with apply, data.table)?
  - Are you really doing only what is necessary?
  - Can it run on several CPU cores?
  - Can you move the slow code to C++ to make it faster (e.g. with Rcpp)?
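One of the questions above, rewriting the code, can be illustrated with a deliberately slow loop and a vectorised replacement. This is only a sketch of a single option (it does not cover data.table, multiple cores, or Rcpp). The function names are hypothetical and the measured times will differ between machines.

```r
n <- 2e4
x <- seq_len(n)

# Slow pattern: the result vector is grown one element at a time,
# so R has to copy it again and again.
squares_loop <- function(x) {
  out <- c()
  for (value in x) {
    out <- c(out, value^2)
  }
  out
}

# Faster: arithmetic operators in R are already vectorised.
squares_vectorised <- function(x) x^2

# system.time() gives a rough timing for each variant.
time_loop <- system.time(res_loop <- squares_loop(x))
time_vec <- system.time(res_vec <- squares_vectorised(x))

print(time_loop["elapsed"])
print(time_vec["elapsed"])
```

Timing both variants before and after a rewrite also checks that the change is worth the loss or gain in readability.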
- Remove commented-out code unless you are sure you want to use it later. It makes the code much harder to read and bloats it. Set yourself a deadline after which you will remove the commented-out code
- Write some tests (if necessary) to check your data:
  - Use stop(), stopifnot() or warning()
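A minimal sketch of such data checks, assuming a data frame with the hypothetical columns `id` and `value`. The function name `check_measurements` and the specific rules are made up for illustration. Which conditions should stop the script and which should only warn depends on the project.

```r
check_measurements <- function(dat) {
  # stopifnot() aborts with an error as soon as one condition is not TRUE.
  stopifnot(
    is.data.frame(dat),
    c("id", "value") %in% names(dat)
  )

  # stop() aborts with a custom error message.
  if (anyDuplicated(dat$id) > 0) {
    stop("Duplicated ids found in column 'id'.")
  }

  # warning() reports a problem but lets the script continue.
  if (anyNA(dat$value)) {
    warning("Column 'value' contains missing values.")
  }

  invisible(dat)
}

# Passes all checks silently.
good <- data.frame(id = 1:3, value = c(1.2, 3.4, 5.6))
check_measurements(good)
```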
Version Control System
Use a version control system (VCS) such as Git (or SVN), in combination with e.g. GitHub/GitLab if it is a bigger project.
- Tasks of a VCS:
  - Logging of changes: you can trace at any time who changed what and when
  - Restoration of old code states
  - Archiving of the individual states
  - Code review (for oneself, e.g. at the end of the day when committing)
- Advantages regarding collaboration:
  - Coordination of joint access by several developers to the files
  - Simultaneous development of several development branches
  - Code review for others
  - Project management is possible on GitHub/GitLab
References
- A very good general book for intermediate R users: Advanced R (Hadley Wickham)
- Some guidelines regarding coding and typography: Guidelines for Statistical Projects: Coding and Typography (Marius Hofert, Ulf Schepsmeier)
- A book about how to write good Shiny apps: Mastering Shiny (Hadley Wickham)
Written on February 2, 2022