Chapter 14 Secure your work

14.1 Using Version Control with git

“Friends don’t let friends work on a coding project without version control.” You might have heard this before, without really considering what this means. Or maybe you’re convinced about this saying, but haven’t had the opportunity to use git, GitHub or Gitlab for versioning your applications. If so, now is the time for a workflow change!

14.1.1 Why Version Control?

Have you ever experience a piece of code disappearing? Or the unsolvable problem of integrating changes when several people have been working on the same piece of code? Or the inability to find back something you’ve written a while back?

If so, you might have been craving for Version Control (now shortened VC). In this chapter, we’ll be focusing on git, but you should be aware that other VC system exist, but they are less popular than git. Git was designed to handle collaboration on code projects20 where potentially a lot of people have to interact and make changes to the codebase. Git might feel a little bit daunting at first, and even seasoned developers still misuse it, or don’t understand it completely. But don’t give up: the benefits from learning it really outweigh the (apparent) complexity.

There are many advantages to VC, and here is a non-exhaustive list:

  • You can get back in time. With a VC system like git, every change is recorded (well, every committed change), meaning that you can potentially get back in time to a previous version of a project. Crucial if you accidentally made changes that break your application, or if you deleted a feature you thought you’d never need.

  • Several people can work on the same file. Git relies on a system of branches. Within this branch pattern, there is one main branch, called “Master”, which contains the stable, main version of the code-base. By “forking” this branch (or any other branch actually), developers will have a copy of the base branch, where they can safely work on changing (and breaking) things, without impacting the origin branch. This allows to try things in a safe environment, without touching what is working.

  • You can safely track changes. Every time a developer records something to git, changes are listed. In other words, you can see what changes are made to a specific file in your code base.

  • It centralizes the code base. You can use git locally, but its strength also relies on the ability to synchronize your local project with one on a server. This also means that several people can synchronize with this server and collaborate on a project. That way, changes on a branch on a server can be downloaded (it’s called pull in git terminology) by all the members of the team.

14.1.2 Git basics: commit - push - pull

These are the three main actions you’ll be performing in git, and if you just need to learn the minimum to get started, they are the three essential ones. commit

A commit is a photography of a codebase at a given moment in time. Every commits is associated with two things: a sha1, which is a unique reference in the history and that allows you to identify this precise state when you need to get back in time, and a message, which is a piece of text that describes the commit21. Note that message are mandatory: you can’t commit without them. Don’t overlook these messages: they might seem like a constraint at first but they are a life save when you need to understand the history of a project.

There is no strict rule about what and when to commit. Just keep in mind that commits are what allow you to get back in time, so a commit is a complete state of your code base to which it would make sense to get back to. push

Once you’ve get a set of commits ready, you’re ready to push it to the server. In other word, you’ll permanently record these commits (so the series of changes) to the server.

Making a push implies two things:

  • Other people in the team will be able to retrieve the changes you’ve made

  • These changes will be recorded permanently in the project history22. pull

Once changes have be recorded in the main server, everybody synchronized with the project can pull the commits to their local project.

14.1.3 About branches

Branches are git way to organize work and ideas, and notably when several people are collaborating on the same project (which might be the case when building large web applications with R).

How does it work? When your start a project, you’re in the main branch, which is called the “Master”. In a perfect world, you never work directly on this branch: it should always contain a working, deployable version of the application. One

Other branches are to be thought as work areas, where developers fix bugs or add features. The modifications made in these dev branches will then be transferred (directly or indirectly) to the master branch.

// TODO image about branching

In practice, you might want to use a workflow where each branch is designed to fix a small issue or implement a feature, so that it’s easier to separate each small part of the work.

14.1.4 Issues

If you are working with a remote tool like GitHub or GitLab, there’s a good change you’ll be using issues. Issues are “notes” that can be used to track a bug or to suggest a feature. This tools is crucial when it comes to project management: they are the perfect spot for organizing and discussing ideas, but also to have an overview of what has been done, what is currently done and what’s left to be done. Issue can also be used as a discussion medium with beta testers, clients or sponsors.

One other valuable feature of issues is that they can be referenced inside commits. In other words, when you send code to the centralized server, you can link this code to one or more issues.

14.2 Git integration

14.2.1 With RStudio

Git is very well integrated to the Rstudio IDE, and while working on your app using git can be as simple as clicking on a button from time to time. If you are using RStudio, you’ll find a pull/push button, a stage & commit interface, a tool for visualizing diff in files. Everything you need to get started is there.

14.2.2 As part of a larger world

Git is not reserved for team work: even if you are working alone on a project, using git is definitely worth the effort. Using git, and particularly issues, helps you organize your train of thoughts, especially upfront when you need to plan what you’ll be doing.

And of course, remember that git isn’t reserved to Shiny Applications: it can be used for any other R related projects, and at the end of the day for any code related projects, making it a valuable skill to have in your toolbox, whatever language you’ll be working with in 10 years!

14.2.3 About git-flow


14.2.4 Further readings on git

Git can be used in different ways and different approaches exist. The comprehensiveness of the different possible approaches is beyond the scope of this book, and other resources exist as well. We invite you to follow these different links:

14.3 CI and testing

Testing is central for making your application survive in the long run. The {testthat} package can be used to test the “business logic” side of your app, while the application features can be tested with packages like {shinytest}, or software like Katalon.

// TO DO: more info about the tools + link to resources.

  1. It was first developed by Linus Torvalds, the very same man behind Linux

  2. For example: “Added a graph in the analysis tab”, or “Fixed the docx export bug”

  3. Well, it’s technically possible to erase things from the server history, but it’s dangerous to do so and generally accepted as a bad practice.

ThinkR Website