Top 50 R Interview Questions and Answers

  • Date: 19th November, 2019
  • By: Prwatech

How can you load a .csv file in R?

Loading a .csv file in R is quite easy.
All you need to do is call the read.csv() function with the path of the file. It returns the contents as a data frame.
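
A minimal sketch of the call, assuming a small made-up file: the path, column names, and values below are hypothetical and exist only so the example runs on its own.

```r
# Write a tiny CSV to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")   # hypothetical file location
writeLines(c("name,score", "Asha,91", "Ben,78"), csv_path)

# read.csv() assumes a header row and comma separators by default.
scores <- read.csv(csv_path, stringsAsFactors = FALSE)

str(scores)          # a data frame with 2 rows and 2 columns
mean(scores$score)   # 84.5
```

In real use you would pass the path of your own file in place of csv_path.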



What are the different components of the grammar of graphics?

1. Data layer
2. Aesthetics layer
3. Geometry layer
4. Facet layer
5. Statistics layer
6. Coordinate layer
7. Themes layer


What is Rmarkdown? What is the use of it?

R Markdown is a reporting format for R, provided by the rmarkdown package. With the help of R Markdown, you can create high-quality reports that combine your R code, its output, and narrative text.
The output format of R Markdown can be:

1. HTML
2. PDF
3. Word


Name some packages in R that can be used for data imputation?

1. mice
2. Amelia
3. missForest
4. Hmisc
5. mi
6. imputeR


Name some functions available in the “dplyr” package.

1. filter
2. select
3. mutate
4. arrange
5. count


Tell me something about Shiny?

Answer) Shiny is an R package that makes it easy to build interactive web apps straight from R. You can host standalone apps on a webpage, embed them in R Markdown documents, or build dashboards. You can also extend your Shiny apps with CSS themes, htmlwidgets, and JavaScript actions.


What packages are used for data mining in R?

Some packages used for data mining in R:

1. data.table - provides fast reading of large files.
2. rpart and caret - for machine learning models.
3. arules - for association rule learning.
4. ggplot2 - provides various data visualization plots.
5. tm - to perform text mining.
6. forecast - provides functions for time series analysis.


What do you know about the rattle package in R?

Answer) Rattle is a popular GUI for data mining using R. It presents statistical and visual summaries of data, transforms data so that it can be readily modeled, builds both unsupervised and supervised machine learning models from the data, presents the performance of models graphically, and scores new datasets for deployment into production. A key feature is that all of your interactions through the graphical user interface are captured as an R script that can be readily executed in R independently of the Rattle interface.


Name some functions which can be used for debugging in R?


1. traceback()
2. debug()
3. browser()
4. trace()
5. recover()


What is R?

Answer) This should be an easy one for data science job applicants. R is an open-source language and environment for statistical computing and analysis, or for our purposes, data science.


Can you write and explain some of the most common syntaxes in R?

Answer) Again, this is an easy but crucial one to nail. For the most part, this can be demonstrated through any other code you might write for other R interview questions, but sometimes it is asked as a standalone question. Some of the most frequently used basic syntax in R includes:
# : as in many other languages, # introduces a comment. The R interpreter ignores everything from # to the end of the line, so comments can make code more readable by reminding future readers what a block of code is intended to do.
“” : quotes operate as one might expect; they denote a character string in R. Single quotes work the same way.
<- : one of the quirks of R is that the conventional assignment operator is <- rather than the more familiar =. The = operator also assigns at the top level, but <- is the idiomatic choice, so it is good to show that you know it if the question comes up.
\ : the backslash, or reverse virgule, is the escape character in R. Inside a string it changes how the next character is interpreted, so \" is a literal double quote and \n is a newline.
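
The four pieces of syntax above can be seen together in a few lines. This is only an illustrative sketch; the variable names and strings are made up.

```r
# A comment: everything after '#' on this line is ignored by R.
greeting <- "Hello"              # '<-' assigns; double quotes make a string
quoted <- "She said \"hi\""      # backslash escapes the inner double quotes
two_lines <- "first\nsecond"     # '\n' is an escaped newline character

cat(quoted, "\n")      # prints: She said "hi"
cat(two_lines, "\n")   # prints the text on two lines
nchar(quoted)          # 13: the backslashes are not part of the string
```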


What are some advantages of R?

Answer) It’s important to be familiar with the advantages and disadvantages of certain languages and ecosystems. R is no exception.


Its open-source nature. This qualifies as both an advantage and disadvantage for various reasons, but being open source means it’s widely accessible, free to use, and extensible.
Its package ecosystem. The built-in functionality available via R packages means you don’t have to spend a ton of time reinventing the wheel as a data scientist.
Its graphical and statistical aptitude. By many people’s accounts, R’s graphing capabilities are unmatched.


What are the disadvantages of R?

Answer) Just as you should know what R does well, you should understand its failings.
Memory and performance. In comparison to Python, R is often said to be the lesser language in terms of memory and performance. This is disputable, and many think it’s no longer relevant as 64-bit systems dominate the marketplace.

Open-source. Being open-source has its disadvantages as well as its advantages. For one, there’s no governing body managing R, so there’s no single source for support or quality control. This also means that sometimes the packages developed for R are not the highest quality.
Security. R was not built with security in mind, so it must rely on external resources to fill these gaps.





Write code to accomplish a task?

Answer) In just about any interview for a position that involves coding, companies will ask you to accomplish a specific task by actually writing code. Facebook and Google both do as much. Because it’s difficult to predict what task an interviewer will set, just be prepared to write “whiteboard code” on the fly.


What are the different data types/objects in R?

Answer) This is another good opportunity to show that you know R and are not winging it. Unlike statically typed languages such as C, R doesn’t ask users to declare a data type when assigning a variable. Instead, every value in R is an R object. When you assign to a variable, you bind it to an object, and that object’s type determines the type of the variable. The most commonly used data objects include:

1. Vectors
2. Matrices
3. Lists
4. Arrays
5. Factors
6. Data frames
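
As a rough illustration, each of the six objects listed above can be created in one line. The names and values here are made up for the example.

```r
v  <- c(1.5, 2.5, 3.5)                           # vector: one type, one dimension
m  <- matrix(1:6, nrow = 2)                      # matrix: one type, two dimensions
a  <- array(1:24, dim = c(2, 3, 4))              # array: one type, any number of dimensions
l  <- list(id = 1L, tags = c("a", "b"))          # list: elements may differ in type
f  <- factor(c("low", "high", "low"))            # factor: categorical values
df <- data.frame(x = 1:3, y = c("a", "b", "c"))  # data frame: columns of equal length

# class() reports what kind of object each name is bound to.
sapply(list(v = v, m = m, a = a, l = l, f = f, df = df),
       function(obj) class(obj)[1])
```

Note that no type was declared anywhere: each variable takes its type from the object assigned to it.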


What are the objects you use most frequently?

Answer) This question is meant to gather a sense of your experiences in R. Simply think about some recent work you’ve done in R and explain the data objects you use most often. If you use arrays frequently, explain why and how you’ve used them.


Why use R?

Answer) This is a variant of the “advantages of R” question. Reasons to use R include its open-source nature and the fact that it’s a versatile tool for statistical plotting, analysis, and portrayal. Don’t be afraid to give some personal reasons as well. Maybe you simply love the assignment operator in R or feel that it’s more elegant than other languages—but always remember to explicate. You should be answering follow-up questions before they’re even asked.


What are some of your favorite functions in R?

Answer) As a user of R, you should be able to come up with some functions on the spot and describe them. Functions that save time and, as a result, money will always be something an interviewer likes to hear about.


What is a factor variable, and why would you use one?

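Answer) A factor is R’s data type for categorical variables. It can be built from either numeric or character values, and it stores each distinct value once as a level, with the observations held as integer codes that point to those levels. The most salient reason to use a factor is that R’s statistical modeling functions recognize it as categorical and handle it accordingly. Another reason is that the integer codes can be more memory efficient than repeated strings.
Simply use the factor() function to create a factor variable.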
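
A small sketch of factor() in use, with made-up size labels. The levels argument is optional; it is used here to fix the order of the categories.

```r
sizes <- c("small", "large", "medium", "small", "large")

# levels= fixes the set and order of categories.
size_factor <- factor(sizes, levels = c("small", "medium", "large"))

levels(size_factor)       # "small" "medium" "large"
as.integer(size_factor)   # 1 3 2 1 3  (the underlying integer codes)
table(size_factor)        # counts per level: small 2, medium 1, large 2
```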


Which data object in R is used to store and process categorical data?

Answer) Factor objects are used to store and process categorical data in R.


How do you get the name of the current working directory in R?

Answer) The command getwd() gives the current working directory in the R environment.

What makes a valid variable name in R?

Answer) A valid variable name consists of letters, numbers, and the dot or underscore characters. It must start with a letter, or with a dot that is not followed by a number. Reserved words such as if and TRUE cannot be used as names.
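
One way to check these rules is with base R's make.names(), which rewrites a string into a syntactically valid name. The sketch below treats a candidate as valid only if make.names() leaves it unchanged; the candidate names are made up.

```r
candidates <- c("total_sales", ".hidden", "x2", "2x", ".2x", "_tmp", "if")

# make.names() rewrites a string into a syntactically valid name,
# so a string is already valid only if make.names() leaves it unchanged.
is_valid <- candidates == make.names(candidates)

data.frame(name = candidates, valid = is_valid)
# total_sales, .hidden and x2 are valid;
# 2x, .2x, _tmp and the reserved word "if" are not
```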


What is the main difference between an Array and a matrix?

Answer) A matrix is always two-dimensional: it has only rows and columns. An array can have any number of dimensions, and a matrix is simply the two-dimensional case. For example, a 3x3x2 array can be viewed as 2 matrices, each of dimension 3×3.
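
The 3x3x2 example can be sketched as follows, using the integers 1 to 18 as placeholder data.

```r
m <- matrix(1:9, nrow = 3)            # always two dimensions
a <- array(1:18, dim = c(3, 3, 2))    # three dimensions: 3 rows, 3 columns, 2 slices

dim(m)                # 3 3
dim(a)                # 3 3 2
a[, , 2]              # the second 3x3 slice, holding the values 10 to 18
is.matrix(a[, , 2])   # TRUE: a single slice is an ordinary matrix
```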


What is the recycling of elements in a vector? Give an example.

Answer) When two vectors of different lengths are involved in an operation, the elements of the shorter vector are reused to complete the operation. This is called element recycling. Example: if v1 <- c(4,1,0,6) and v2 <- c(2,4), then v1*v2 gives (8,4,0,24). The elements 2 and 4 are repeated. R issues a warning if the longer length is not a multiple of the shorter one.
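
The same example as runnable code, with the recycling also written out explicitly using rep() for comparison:

```r
v1 <- c(4, 1, 0, 6)
v2 <- c(2, 4)

# v2 is recycled to c(2, 4, 2, 4) before the element-wise multiplication.
v1 * v2                                  # 8 4 0 24
v1 * rep(v2, length.out = length(v1))    # the same calculation written out explicitly
```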


What is a lazy function evaluation in R?

Answer) Lazy evaluation means that a function argument is evaluated only when it is actually used inside the body of the function. If the body never refers to the argument, its expression is never evaluated.
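
A short way to demonstrate this is to pass an expression that would fail if it were ever evaluated. The function name double_first is hypothetical.

```r
# 'unused' is never referenced in the body, so its expression is never evaluated.
double_first <- function(x, unused) {
  x * 2
}

# stop() would raise an error if it were evaluated; here it never runs.
result <- double_first(5, stop("this argument is never evaluated"))
result   # 10
```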


Name R packages that are used to read XML files?

Answer) The package named “XML” is used to read and process the XML files.


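Can we update and delete any of the elements in a list?

Answer) Yes. An element can be updated by assigning a new value to it, and deleted by assigning NULL to it.


What is the general expression to create a matrix in R?

Answer) The general expression to create a matrix in R is matrix(data, nrow, ncol, byrow, dimnames).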
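
A brief sketch covering both points: changing and removing list elements, and a matrix() call that uses every argument in the signature above. All names and values are made up.

```r
info <- list(name = "Ana", age = 30, city = "Pune")   # made-up values

info$age <- 31             # update an element by assigning to it
info[["city"]] <- NULL     # assigning NULL deletes the element
names(info)                # "name" "age"

# matrix(data, nrow, ncol, byrow, dimnames)
m <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))
m["r2", "a"]               # 4, because byrow = TRUE fills the first row with 1, 2, 3
```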


What is the reshaping of data in R?

Answer) In R the data objects can be converted from one form to another. For example, we can create a data frame by merging many lists. This involves a series of R commands to bring the data into the new format. This is called data reshaping.


What does unlist() do?

Answer) It converts a list to a vector.
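
For instance, with a small made-up nested list, unlist() flattens every level and builds the names from the nesting:

```r
nested <- list(a = 1, b = list(c = 2, d = 3))

flat <- unlist(nested)
flat            # named numeric vector: a = 1, b.c = 2, b.d = 3
is.list(flat)   # FALSE
```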


How do you convert the data in a JSON file to a data frame?

Answer) Using the function


What is the use of apply() in R?

Answer) It applies the same function to each row or each column of a matrix (more generally, over the margins of an array). For example, finding the mean of every row.
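
A minimal sketch on a 2x3 placeholder matrix. The second argument, MARGIN, selects rows (1) or columns (2).

```r
m <- matrix(1:6, nrow = 2)   # rows are (1, 3, 5) and (2, 4, 6)

row_means <- apply(m, 1, mean)   # MARGIN = 1 works row by row: 3 4
col_sums  <- apply(m, 2, sum)    # MARGIN = 2 works column by column: 3 7 11
```

For these two particular cases, rowMeans() and colSums() are faster built-in shortcuts.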


How to find the help page on missing values?

Answer) ?NA

How do you get the standard deviation for a vector x?

Answer) sd(x, na.rm=TRUE)


How do you set the path for the current working directory in R?

Answer) setwd("Path")


What is the difference between “%%” and “%/%”?

Answer) “%%” gives the remainder of the division of the first vector by the second, while “%/%” gives the integer quotient of that division.
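
A quick illustration with arbitrary numbers. The last line shows how the two results fit together: quotient times divisor plus remainder gives back the original value.

```r
17 %% 5     # 2, the remainder
17 %/% 5    # 3, the integer quotient

# Both operators are vectorised, and together they reconstruct the original values.
x <- c(17, 20, 9)
y <- c(5, 6, 2)
rebuilt <- (x %/% y) * y + (x %% y)
rebuilt     # 17 20 9
```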


What does max.col(x) do?

Answer) It finds, for each row of the matrix x, the index of the column that holds the maximum value. (The function is max.col(); base R has no col.max().)
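
Base R has no function called col.max(); the behaviour described matches max.col(). A sketch on a small made-up matrix, with ties.method set explicitly because the default breaks ties at random:

```r
m <- matrix(c(1, 9, 4,
              7, 2, 5), nrow = 2, byrow = TRUE)

# For each row, the index of the column holding the largest value.
best_col <- max.col(m, ties.method = "first")
best_col   # 2 1
```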


Give the command to create a histogram.

Answer) hist()


How do you remove a vector from the R workspace?

Answer) rm(x)


List the data sets available in package “MASS”

Answer) data(package = "MASS")


List the data sets available in all available packages.

Answer) data(package = .packages(all.available = TRUE))


What is the use of the command – install.packages(file.choose(), repos=NULL)?

Answer) It is used to install an R package from a local file by browsing to the file and selecting it.


What is the use of the “next” statement in R?

Ans) The “next” statement in R programming language is useful when we want to skip the current iteration of a loop without terminating it.
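
For example, a loop that collects only the odd numbers from 1 to 6 can use next to skip the even ones. This is just an illustrative sketch.

```r
odd_values <- c()
for (i in 1:6) {
  if (i %% 2 == 0) next     # skip even numbers, but keep looping
  odd_values <- c(odd_values, i)
}
odd_values   # 1 3 5
```

Unlike break, next does not end the loop; it only moves on to the following iteration.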


Two vectors X and Y are defined as follows: X <- c(3, 2, 4) and Y <- c(1, 2). What will be the output of vector Z that is defined as Z <- X*Y?

Answer) In R, when two vectors have different lengths, the elements of the shorter vector are recycled until every element of the longer vector has been multiplied. Here Y is treated as c(1, 2, 1), and R warns that the longer length is not a multiple of the shorter one.
The output of the above code will be:
Z = (3, 4, 4)


R language has several packages for solving a particular problem. How do you make a decision on which one is the best to use?

Answer) The CRAN package ecosystem has well over 10,000 packages, so several will often address the same problem. The best way for beginners to answer this question is to mention that they would look for a package that follows good software development principles. The next thing would be to look for user reviews and find out if other data scientists or analysts have been able to solve a similar problem.


Explain the significance of transpose in R language

Answer) The transpose function t() swaps the rows and columns of a matrix or data frame, and is the simplest way to reshape data before analysis.
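
A small sketch with a made-up 2x3 matrix shows the effect: rows become columns, and the dimension names move with them.

```r
wide <- matrix(1:6, nrow = 2,
               dimnames = list(c("a", "b"), c("x", "y", "z")))

long <- t(wide)
dim(wide)   # 2 3
dim(long)   # 3 2
long        # rows are now x, y, z and columns are a, b
```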


What are the with() and by() functions used for?

Answer) The with() function evaluates an expression using the columns of a given dataset, and the by() function applies a function to each level of a factor (or combination of factors).
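
Both can be tried on mtcars, a data set that ships with R. The choice of columns here is only for illustration.

```r
# mtcars ships with base R (the datasets package).
# Inside with(), columns are referenced without the mtcars$ prefix.
four_cyl_mean <- with(mtcars, mean(mpg[cyl == 4]))

# Mean mpg for each level of cyl (4, 6 and 8 cylinders).
mpg_by_cyl <- by(mtcars$mpg, mtcars$cyl, mean)
mpg_by_cyl
```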


The dplyr package is used to speed up data frame management code. Which package can be integrated with dplyr for large fast tables?

Answer) data.table
