Ready4R (2025-02-09): A gRadual Introduction to Unit Testing
Where I try to reframe unit testing as something approachable.
Welcome to the Weekly Ready for R mailing list! If you need Ready for R course info, it's here. Past newsletters are available here.
Reminder: API workshop for Ukraine on 3/13
On March 13, I will be giving a short workshop on using the {httr2} package for requesting data with APIs. This is a benefit to support Ukrainians. You can participate by donating 20 euros or more at the link below.
Do the words “Web API” sound intimidating to you? This talk is a gentle introduction to what Web APIs are and how to get data out of them using the {httr2}, {jsonlite}, and {tidyjson} packages. You'll learn how to request data from an endpoint and get the data out. We'll do this using an API that gives us facts about cats. By the end of this talk, web APIs will seem much less intimidating and you will be empowered to access data from them.
More information: https://sites.google.com/view/dariia-mykhailyshyna/main/r-workshops-for-ukraine?authuser=0#h.hngu50v1j9mb
Concepts of Unit Testing
I thought I'd write a gentle introduction to Unit Testing. It can seem very daunting, and I know a lot of us feel guilty for not doing it. Hadley Wickham / Jenny Bryan / et al. talk about the motivations for unit testing better than I can.
I want to change your mindset from "I should unit test" to "I can unit test, and it will make my life easier".
We all do some informal testing of functions when we're writing them. For example, say I wrote a function called square:
square <- function(x) {
  x * x
}
I'd probably check my square function with some fake data like this to make sure it's doing what I think it should do:
square(c(1,2,3))
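Here is a small sketch of what that informal check is really doing in my head: comparing the output to values I worked out by hand. The variable names result and expected are mine, just for illustration, and this uses only base R.

```r
square <- function(x) {
  x * x
}

# What the function actually returns
result <- square(c(1, 2, 3))

# What I expect, worked out by hand
expected <- c(1, 4, 9)

# identical() is TRUE only when the two vectors match exactly
identical(result, expected)
```

Writing the expected values down next to the call is the small step that turns "eyeballing the output" into something a computer can check for us.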
So how do we take these informal tests and add them to a testing framework such as {testthat}?
The heart of {testthat} is its expectations: what do you expect will happen when a user supplies a given input? What about weird inputs? Is the function going to throw an error, or is it going to return a message?
For example, what would happen if we used a character vector as an input? Our expectation in this case would be that the function throws an error. So, for our informal test:
square(c("A", "B", "C"))
We can write this as an expectation using expect_error():
expect_error(square(c("A", "B", "C")))
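{testthat} has expectations for outcomes other than errors, too. The sketch below shows a few of them; square_verbose() is a hypothetical variant I made up only to have something that emits a message, and it assumes {testthat} is installed.

```r
library(testthat)

square <- function(x) {
  x * x
}

# Ordinary input: compare against values worked out by hand
expect_equal(square(c(1, 2, 3)), c(1, 4, 9))

# Hypothetical variant that announces what it is doing (illustration only)
square_verbose <- function(x) {
  message("Squaring ", length(x), " value(s)")
  x * x
}

# expect_message() passes when the code emits a message
expect_message(square_verbose(2))

# expect_warning() passes when the code raises a warning;
# as.numeric("A") warns that NAs were introduced by coercion
expect_warning(as.numeric("A"))
```

Each of these runs quietly when the expectation is met and complains when it is not, which is the behavior we want from a test.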
We've just written a unit test! At a simple level, what {testthat} does is bundle these expectations into larger sets of tests, and it gives you a dashboard of the tests that fail or pass each time you run it.
So, we will include our expect_error() expectation in the larger unit test:
test_that("square handles unusual inputs", {
  expect_error(square(c("A", "B", "C")))  # try with a character vector
  expect_true(is.na(square(NA)))          # and with a single NA value
})
The test bundles our expect_error() with another expectation: the second one checks that an NA input returns NA.
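If we run this code, we get no failures (depending on your {testthat} version, you may see a short "Test passed" message), because the function did what we expected: the character input threw an error and the NA input returned NA. That means the test has passed. This is a good thing. When we rerun the test after changing the code, a pass means our modifications did not break the behavior we tested. It's only when a test fails that we need to pay attention.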
What we call a testing suite is a set of test_that() statements that we can run every time we modify our code. This is especially important when we refactor code, that is, when we rewrite functions to make them easier to maintain or to improve performance.
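To make the refactoring point concrete, here is a minimal sketch. Suppose I rewrite square() to use the ^ operator instead of x * x (a hypothetical refactor, chosen only for illustration). Rerunning the same expectations tells me whether the behavior I cared about survived the rewrite. This assumes {testthat} is installed.

```r
library(testthat)

# Hypothetical refactor: same intended behavior, different implementation
square <- function(x) {
  x^2
}

test_that("square still behaves the same after refactoring", {
  expect_equal(square(c(1, 2, 3)), c(1, 4, 9))  # ordinary numeric input
  expect_error(square(c("A", "B", "C")))        # character input still errors
  expect_true(is.na(square(NA)))                # NA in, NA out
})
```

If one of these expectations failed after the rewrite, I would know right away that the refactor changed something I did not intend to change.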
I haven't talked about the actual {testthat} framework we need to initialize in our package directory. I'll leave that for next time.
Never Abandon the Principles of Machine Learning
Many of you have heard about DeepSeek and the 14x efficiency gain in training their Large Language Model (although their LLM appears to take more energy when you ask it questions, also known as inference).
Part of why their approach works is that they went back to first principles of Machine Learning; instead of doing bulk learning, they chose a reinforcement learning strategy, and only activate relevant portions of the network for certain topics.
There's something to be said about being scrappy - not having unlimited resources forces you to try new things. I think a lot of my favorite scientists have this scrappiness to their work.
Thank You!
Thanks for reading this far. As you can see, I try to cover a variety of topics, both technical and social. Thank you to j.bryk for requesting that I write about Unit Testing. I'll continue writing about this topic for now.
If there's a topic you want me to write about, don't hesitate to leave a comment or reply to this email.
Best, Ted