If you'd like to follow along, please make sure you have the following packages installed:
install.packages(c("tidyverse", "esvis", "devtools", "roxygen2", "usethis"))
What is an R function?
In R, anything that carries out an operation is a function, including + and <-
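As a small illustration of + and <- being functions, the sketch below calls both in prefix form using backticks. It uses base R only; the variable names are made up for the example.

```r
# Infix operators are ordinary functions and can be called with backticks
sum_infix  <- 1 + 2
sum_prefix <- `+`(1, 2)
identical(sum_infix, sum_prefix)

# Assignment is a function call too: this binds 10 to y
`<-`(y, 10)
y

# Both are function objects
is.function(`+`)
is.function(`<-`)
```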
What are the components of a function?
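One way to explore this question is to pull a function apart with base R's formals(), body(), and environment(). The function add_one below is hypothetical and exists only for illustration.

```r
# A small hypothetical function used only for illustration
add_one <- function(x, by = 1) {
  x + by
}

formals(add_one)      # the arguments: x (no default) and by (default 1)
body(add_one)         # the code inside the function
environment(add_one)  # where the function looks up variables it does not define
```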
You should consider writing a function whenever you’ve copied and pasted a block of code more than twice (i.e. you now have three copies of the same code).
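A minimal sketch of that rule of thumb: instead of pasting the same rescaling code once per column, write it once and apply it. The function rescale01 and the data frame are hypothetical examples, not part of the talk.

```r
# Hypothetical example: the rescaling logic is written once instead of three times
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

df <- data.frame(a = c(1, 5, 10), b = c(2, 4, 6), c = c(0, 50, 100))

# Apply the same function to every column, keeping the data frame structure
df[] <- lapply(df, rescale01)
df
```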
Once you've written more than one function, you may want to bundle them. There are two general ways to do this:
source?
Write a package
Bundling functions into a package is not that hard!
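For comparison, here is a rough sketch of the source approach: keep the functions in a file and evaluate that file in each session. The file name and the two helper functions (se, cv) are hypothetical; a temporary file stands in for something like functions.R.

```r
# Write two small (hypothetical) functions to a file, as you might keep in functions.R
fun_file <- file.path(tempdir(), "functions.R")
writeLines(c(
  "se <- function(x) sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))",
  "cv <- function(x) sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)"
), fun_file)

# source() evaluates the file, defining both functions in the global environment
source(fun_file)

se(c(1, 2, 3, NA))
cv(c(1, 2, 3, NA))
```

This works, but nothing documents, tests, or versions the functions, which is what a package adds.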
Standardized mean differences
Assumes reasonably normal distributions (so the mean is a good indicator of central tendency)
Differences in means may not reflect differences at all points on the scale if the variances differ
Substantive interest may also lie with differences at other points in the distribution.
library(tidyverse)
common_var <- tibble(low = rnorm(1000, 10, 1),
                     high = rnorm(1000, 12, 1),
                     var = "common")
diff_var <- tibble(low = rnorm(1000, 10, 1),
                   high = rnorm(1000, 12, 2),
                   var = "diff")
d <- bind_rows(common_var, diff_var)
head(d)
## # A tibble: 6 x 3
##     low  high var   
##   <dbl> <dbl> <chr> 
## 1 10.4   11.4 common
## 2  9.48  10.7 common
## 3 11.7   10.4 common
## 4  8.97  11.0 common
## 5  9.96  12.1 common
## 6  8.76  12.1 common
d <- d %>%
  gather(group, value, -var)
d
## # A tibble: 4,000 x 3
##    var    group value
##    <chr>  <chr> <dbl>
##  1 common low   10.4 
##  2 common low    9.48
##  3 common low   11.7 
##  4 common low    8.97
##  5 common low    9.96
##  6 common low    8.76
##  7 common low   10.1 
##  8 common low   11.1 
##  9 common low   11.9 
## 10 common low    9.50
## # ... with 3,990 more rows
theme_set(theme_minimal())
ggplot(d, aes(value, color = group)) +
  geom_density(lwd = 1.5) +
  facet_wrap(~var)
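n bins (based on percentiles), with the effect size for bin i computed as

$$d_{[i]} = \frac{\bar{X}_{foc[i]} - \bar{X}_{ref[i]}}{\sqrt{\dfrac{(n_{foc} - 1)Var_{foc} + (n_{ref} - 1)Var_{ref}}{n_{foc} + n_{ref} - 2}}}$$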
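The binned effect size can be sketched in base R as follows. This is not the {esvis} implementation: binned_d is a hypothetical helper, and it assumes each group is cut at its own percentiles and that the denominator is the pooled standard deviation of the full distributions, as the formula suggests.

```r
# Hypothetical helper (not part of {esvis}): standardized mean difference per bin
binned_d <- function(foc, ref, n_bins = 3) {
  # Pooled SD over the full distributions (the denominator of the formula)
  pooled_sd <- sqrt(
    ((length(foc) - 1) * var(foc) + (length(ref) - 1) * var(ref)) /
      (length(foc) + length(ref) - 2)
  )

  # Assumption: each group is split at its own percentiles
  probs   <- seq(0, 1, length.out = n_bins + 1)
  foc_bin <- cut(foc, quantile(foc, probs), include.lowest = TRUE, labels = FALSE)
  ref_bin <- cut(ref, quantile(ref, probs), include.lowest = TRUE, labels = FALSE)

  # Difference in bin means, scaled by the pooled SD
  as.numeric(tapply(foc, foc_bin, mean) - tapply(ref, ref_bin, mean)) / pooled_sd
}

set.seed(1)
low  <- rnorm(1000, 10, 1)
high <- rnorm(1000, 12, 1)
binned_d(low, high)
```

With equal variances the three values should be similar to one another; with unequal variances they drift apart across bins, which is the pattern the binned approach is meant to reveal.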
common <- filter(d, var == "common")
diff <- filter(d, var == "diff")
library(esvis)
qtile_es(value ~ group, common)
##   ref_group foc_group low_qtile high_qtile midpoint        es         se
## 1      high       low      0.00       0.33    0.165 -2.060092 0.09645691
## 2      high       low      0.33       0.66    0.495 -2.072788 0.09651680
## 3      high       low      0.66       0.99    0.825 -2.044473 0.09605817
qtile_es(value ~ group, diff)
##   ref_group foc_group low_qtile high_qtile midpoint         es         se
## 1      high       low      0.00       0.33    0.165 -0.6429559 0.07995721
## 2      high       low      0.33       0.66    0.495 -1.3213209 0.08592584
## 3      high       low      0.66       0.99    0.825 -1.9278210 0.09421322
binned_plot(value ~ group, common)
binned_plot(value ~ group, diff)
Use tools throughout (which we'll talk about momentarily) to help automate many of the steps, and make the whole thing less painful
1a) check that no one had the same idea 😇
— Maëlle Salmon 🐟 (@ma_salmon) April 10, 2018
And some further recommendations/good advice
We surely won't get through all the steps tonight. In my mind, the best resources are:
For a really quick but really good intro, see Hilary Parker's blog post
We're going to write a package today! Let's keep it really simple...
Write a function that describes a vector x: n, mean, and sd. Let's also have it report on the number of missing observations.
describe <- function(x) {
  n     <- as.integer(length(na.omit(x)))  # Count number of valid cases
  nmiss <- as.integer(sum(is.na(x)))       # Count the number of missing
  mn    <- mean(x, na.rm = TRUE)           # Calculate mean
  stdev <- sd(x, na.rm = TRUE)             # Standard deviation
  out <- tibble::tibble(n_valid   = n,     # Bundle it all
                        n_missing = nmiss,
                        mean      = mn,
                        sd        = stdev)
  out                                      # Return the tibble
}
describe(rnorm(100))
## # A tibble: 1 x 4
##   n_valid n_missing   mean    sd
##     <int>     <int>  <dbl> <dbl>
## 1     100         0 0.0203  1.10
describe(c(rnorm(1000, 10, 4), rep(NA, 27)))
## # A tibble: 1 x 4
##   n_valid n_missing  mean    sd
##     <int>     <int> <dbl> <dbl>
## 1    1000        27  10.0  4.20
Package skeleton: usethis::create_package
New R script for the function: usethis::use_r
Document the function with {roxygen2} special comments
Generate the documentation files: devtools::document
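Put together, the steps might look like the sketch below. It assumes {usethis} is installed and builds the package in a temporary directory; the package name practice matches the DESCRIPTION shown later, and the path is only an example.

```r
# Assumes {usethis} is installed; the temporary path is just for illustration
pkg_path <- file.path(tempdir(), "practice")

# Create the skeleton (DESCRIPTION, NAMESPACE, R/) without opening a new session
usethis::create_package(pkg_path, open = FALSE)

# Make the new package the active project so later usethis calls target it
usethis::proj_set(pkg_path)

# Create R/describe.R, where the function and its roxygen2 comments will live
usethis::use_r("describe", open = FALSE)

# After writing the function and its comments, build man/ and NAMESPACE with:
# devtools::document(pkg_path)

list.files(pkg_path, recursive = TRUE)
```

In an interactive session you would normally leave open at its default and work inside the newly opened project instead of calling proj_set yourself.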
Typical arguments
@param: Describe the formal arguments. State the argument name and then describe it.
#' @param x Vector to describe
@return: What does the function return
#' @return A tibble with descriptive data
@example or more commonly @examples: Provide examples of the use of your function.
@export: Export your function
If you don't include @export, your function will be internal, meaning others can't access it easily.
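Applied to describe, the tags above might be combined as in this sketch. The title line and the example call are assumptions added for illustration; the @param and @return text comes from the slides.

```r
#' Describe a numeric vector
#'
#' @param x Vector to describe
#' @return A tibble with descriptive data
#' @examples
#' describe(c(rnorm(100), NA))
#' @export
describe <- function(x) {
  n     <- as.integer(length(na.omit(x)))  # valid cases
  nmiss <- as.integer(sum(is.na(x)))       # missing cases
  mn    <- mean(x, na.rm = TRUE)
  stdev <- sd(x, na.rm = TRUE)
  tibble::tibble(n_valid = n, n_missing = nmiss, mean = mn, sd = stdev)
}
```

The #' lines are ordinary comments to R, so the function still runs as usual; {roxygen2} reads them when devtools::document is called.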
.gitignore: Files to ignore for git commits, with some pre-populated entries
NAMESPACE: Created by {roxygen2}. Don't edit it. If you need to, trash it and it will be regenerated.
DESCRIPTION: Describes your package (more on next slide)
man/: The documentation files. Created by {roxygen2}. Don't edit.
DESCRIPTION
Metadata about the package. Default fields for our package are
Package: practice
Version: 0.0.0.9000
Title: What the Package Does (One Line, Title Case)
Description: What the package does (one paragraph).
Authors@R: person("First", "Last", email = "first.last@example.com", role = c("aut", "cre"))
License: What license is it under?
Encoding: UTF-8
LazyData: true
ByteCompile: true
RoxygenNote: 6.0.1
This is where the information for citation(package = "practice") will come from.
Some advice: edit within RStudio, or a good text editor like Sublime Text. "Fancy" quotes and similar characters can screw this up.
The ‘Package’, ‘Version’, ‘License’, ‘Description’, ‘Title’, ‘Author’, and ‘Maintainer’ fields are mandatory, all other fields are optional. - Writing R Extensions
Some optional fields include
DESCRIPTION for {esvis}
Package: esvis
Type: Package
Title: Visualization and Estimation of Effect Sizes
Version: 0.1.0.9000
Authors@R: person("Daniel", "Anderson", email = "daniela@uoregon.edu", role = c("aut", "cre"))
Description: A variety of methods are provided to estimate and visualize distributional differences in terms of effect sizes. Particular emphasis is upon evaluating differences between two or more distributions across the entire scale, rather than at a single point (e.g., differences in means). For example, Probability-Probability (PP) plots display the difference between two or more distributions, matched by their empirical CDFs (see Ho and Reardon, 2012; <doi:10.3102/1076998611411918>), allowing for examinations of where on the scale distributional differences are largest or smallest. The area under the PP curve (AUC) is an effect-size metric, corresponding to the probability that a randomly selected observation from the x-axis distribution will have a higher value than a randomly selected observation from the y-axis distribution. Binned effect size plots are also available, in which the distributions are split into bins (set by the user) and separate effect sizes (Cohen's d) are produced for each bin - again providing a means to evaluate the consistency (or lack thereof) of the difference between two or more distributions at different points on the scale. Evaluation of empirical CDFs is also provided, with built-in arguments for providing annotations to help evaluate distributional differences at specific points (e.g., semi-transparent shading). All function take a consistent argument structure. Calculation of specific effect sizes is also possible.
The following effect sizes are estimable: (a) Cohen's d, (b) Hedges' g, (c) percentage above a cut, (d) transformed (normalized) percentage above a cut, (e) area under the PP curve, and (f) the V statistic (see Ho, 2009; <doi:10.3102/1076998609332755>), which essentially transforms the area under the curve to standard deviation units. By default, effect sizes are calculated for all possible pairwise comparisons, but a reference group (distribution) can be specified.
DESCRIPTION for {esvis} (continued)
Depends: R (>= 3.1)
Imports: sfsmisc
URL: https://github.com/DJAnderson07/esvis
BugReports: https://github.com/DJAnderson07/esvis/issues
License: MIT + file LICENSE
LazyData: true
RoxygenNote: 6.0.1
Suggests: testthat, viridisLite
usethis::use_mit_license("First and Last Name")
Our function depends on the tibble function within the {tibble} package.
Declare the dependency with usethis::use_package.
Create package-level documentation with usethis::use_package_doc, and import the function there with
#' @importFrom pkg fun_name
Once a function is imported, you no longer need the pkg:: prefix (tibble::tibble becomes just plain old tibble). The likelihood of conflicts is also reduced, so long as you don't import the full package.
What does it mean to write tests?
Why write tests?
How do you write tests?
usethis::use_testthat sets up the infrastructure
testthat::expect_equal(), testthat::expect_warning(), testthat::expect_error()
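As a rough idea of what such a test could look like for describe, the sketch below checks the counts, mean, and standard deviation on a small known input. It assumes {testthat} and {tibble} are installed; in a package, the test_that block would live in a file such as tests/testthat/test-describe.R (a hypothetical name), and describe would come from the package rather than being redefined.

```r
library(testthat)

# Redefined here only so the sketch is self-contained
describe <- function(x) {
  n     <- as.integer(length(na.omit(x)))
  nmiss <- as.integer(sum(is.na(x)))
  mn    <- mean(x, na.rm = TRUE)
  stdev <- sd(x, na.rm = TRUE)
  tibble::tibble(n_valid = n, n_missing = nmiss, mean = mn, sd = stdev)
}

test_that("describe() summarizes valid and missing cases", {
  out <- describe(c(1, 2, 3, NA))
  expect_equal(out$n_valid, 3L)    # three non-missing values
  expect_equal(out$n_missing, 1L)  # one NA
  expect_equal(out$mean, 2)
  expect_equal(out$sd, 1)
})
```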
We'll skip over testing for today, because we just don't have time to cover everything. A few good resources:
Use devtools::check() to run the same checks CRAN will run on your R package.
Use devtools::build_win() to run the checks on CRAN computers (in devtools 2.0 and later, this is devtools::check_win_devel()).
The first time, you'll likely get errors. Be patient. It will probably be frustrating, but ultimately worth the effort.
Create a README with usethis::use_readme_rmd.
Try to get your code coverage up above 80%.
Automate wherever possible ({devtools} and {usethis} help a lot with this)
Use the {goodpractice} package to help your package code be more robust, specifically with goodpractice::gp(). It will give you lots of good ideas.
Host on GitHub, and capitalize on integration with other systems (all free, but require registering for an account)
usethis::use_git, followed by usethis::use_github.
For this to work, you’ll need to set a GITHUB_PAT environment variable in your ~/.Renviron. Follow Jenny Bryan’s instructions, and use usethis::edit_r_environ() to easily access the right file for editing.
Note: I haven't played around with this much. Standard git procedures will work too.
README: usethis::use_readme_rmd.
Continuous integration: usethis::use_travis and usethis::use_appveyor to get started.
You can test your code coverage each time you push a new commit by using codecov. Initialize with usethis::use_coverage(). The overall setup process is pretty similar to Travis CI/Appveyor.
Easily see what is/is not covered by tests!