diff --git a/.Rbuildignore b/.Rbuildignore index 8846e86..26e9799 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -12,3 +12,4 @@ .document.Rout .install_dev_deps.Rout ^\.github$ +^cran-comments\.md$ diff --git a/DESCRIPTION b/DESCRIPTION index bc22c66..66cb56f 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,8 +1,8 @@ Package: cpr Title: Control Polygon Reduction -Version: 0.2.3.9005 +Version: 0.3.0 Authors@R: - c(person("Peter", "DeWitt", email = "dewittpe@gmail.com", role = c("aut", "cre")), + c(person("Peter", "DeWitt", email = "dewittpe@gmail.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0002-6391-0795")), person("Samantha", "MaWhinney", email = "sam.mawhinney@ucdenver.edu", role = c("ths")), person("Nichole", "Carlson", email = "nichole.carlson@ucdenver.edu", role = c("ths"))) Description: Implementation of the Control Polygon Reduction and Control Net @@ -12,14 +12,15 @@ Depends: License: GPL (>= 2) Encoding: UTF-8 URL: https://github.com/dewittpe/cpr/ +Language: en-us LazyData: true Imports: - ggplot2 (>= 2.2.0), + ggplot2 (>= 3.0.0), lme4, plot3D, - Rcpp (>= 0.12.13), + Rcpp (>= 1.0.11), scales -LinkingTo: Rcpp (>= 0.12.13), +LinkingTo: Rcpp (>= 1.0.11), RcppArmadillo Suggests: covr, @@ -27,5 +28,5 @@ Suggests: qwraps2, rmarkdown, rgl -RoxygenNote: 7.2.1 +RoxygenNote: 7.2.3 VignetteBuilder: knitr diff --git a/Makefile b/Makefile index c91dcd8..7f7d500 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,6 @@ CRAN = "https://cran.rstudio.com" SRC = $(wildcard $(PKG_ROOT)/src/*.cpp) RFILES = $(wildcard $(PKG_ROOT)/R/*.R) -EXAMPLES = $(wildcard $(PKG_ROOT)/examples/*.R) TESTS = $(wildcard $(PKG_ROOT)/tests/testthat/*.R) VIGNETTES = $(wildcard $(PKG_ROOT)/vignettes/*) RAWDATAR = $(wildcard $(PKG_ROOT)/data-raw/*.R) @@ -22,7 +21,7 @@ all: $(PKG_NAME)_$(PKG_VERSION).tar.gz -e "options(warn = 2)" @touch $@ -.document.Rout: $(RFILES) $(SRC) $(EXAMPLES) $(RAWDATAR) $(VIGNETTES) $(PKG_ROOT)/DESCRIPTION +.document.Rout: $(RFILES) $(SRC) $(RAWDATAR) $(VIGNETTES) $(PKG_ROOT)/DESCRIPTION if [ -e "$(PKG_ROOT)/data-raw/Makefile" ]; then $(MAKE) -C $(PKG_ROOT)/data-raw/; else echo "Nothing to do"; fi Rscript --vanilla --quiet -e "options(repo = c('$(CRAN)', '$(BIOC)'))" \ -e "options(warn = 2)" \ @@ -44,4 +43,6 @@ clean: $(RM) -rf $(PKG_NAME).Rcheck $(RM) -f .document.Rout $(RM) -f .install_dev_deps.Rout + $(RM) -f src/*.o + $(RM) -f src/*.so diff --git a/NAMESPACE b/NAMESPACE index dca7cda..1f3cd23 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -7,6 +7,8 @@ S3method(cn,formula) S3method(cnr,cpr_cn) S3method(cp,cpr_bs) S3method(cp,formula) +S3method(cp_value,cpr_cp) +S3method(cp_value,default) S3method(cpr,cpr_cp) S3method(get_spline,cpr_cn) S3method(get_spline,cpr_cp) diff --git a/NEWS.md b/NEWS.md index 5192f16..c123059 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,21 +1,20 @@ -# Version 0.2.3.9005 +# Version 0.3.0 ## New Examples * `cpr` has examples -## Vignettes -* The package overview has been changed from .Rmd to .Rnw and will be a draft of - a JSS submission. -* A new vignette focused on just B-splines has been started. - ## Other Changes -* Depends on Rcpp >= 0.12.11 to handle registering native routines. -* Moves rgl from `Imports` to `Enhances` (re #36) +* Depends on Rcpp >= 0.12.11 (actually moved to >= 1.0.11) to handle registering native routines. +* Moves rgl from `Imports` to `Suggests` (re #36) * Refactoring base code to eliminate the use of dplyr, tidyr, tibble, etc. Focus on base R methods to reduce install dependencies and improve long term stability of the package. * Require R > 3.5.0 * Stop using testthat for testing +* Remove use of the tidyr, dplyr +* Improve documentation +* Minor bug fixes +* Replace use of now deprecated `ggplot2::aes_string` # Version 0.2.3 First public release. @@ -48,7 +47,7 @@ Documentation improvements. * `is.` a collection of `is.cpr_cp`, `is.cpr_bs`, ... functions added. * The dataset `spdg` has been added to the package. -## Non-User Visible Changes +## Other Changes * removed a redundant `build_tensor` definition # Version 0.2.0 @@ -75,12 +74,12 @@ Continued development should be focused on bug fixes and minor enhancements. sequence and numeric values in `plot.cpr_bs` (#18) ## Bug Fixes -* `from` and `to` args for `plot.cpr_cpr` fixed (#14) +* `from` and `to` arguments for `plot.cpr_cpr` fixed (#14) * correct construction of missing `iknots` argument in `btensor` * `keep` is correctly handled in the `cnr` call. * `show_xi` correctly handled in the `plot.cpr_cp` call. -## Non-User visible changes +## Non visible changes * non-exported function `knot_expr` created to help with plotting the knot locations in `cpr:::plot.cpr_bs`. @@ -91,7 +90,7 @@ Continued development should be focused on bug fixes and minor enhancements. When plotting multiple control polygons and splines, this option will make it easier to view the spline functions. -# Non-visible changes: +## Non visible changes * Extended testing scripts. # Version 0.1.0 @@ -118,7 +117,7 @@ series. The aim for version 0.2.0 will be to have a very similar API for ## End User non-visible changes: * Added the not-to-be-exported function `generate_cp_data` -* Redesign of the deboor.cpp file so that the bsplines are accessible. The +* Redesign of the `deboor.cpp` file so that the `bsplines` are accessible. The prior design only allowed access to the basis, the current design allows access to the generic B-splines. @@ -131,8 +130,8 @@ Biometric Society, Student paper competition. The conference will be held 10 - 16 July 2016 in Victoria, British Columbia, Canada. ## Bug Fixes -* Corrected the attr calls within `cpr` after adjusting the attributes being set - on a `cpr_cp`. +* Corrected the attributes calls within `cpr` after adjusting the attributes + being set on a `cpr_cp`. * `plot.cpr_bs` correctly displays the indices for the knot sequence. diff --git a/R/RcppExports.R b/R/RcppExports.R index 60b5d11..0b3fcd9 100644 --- a/R/RcppExports.R +++ b/R/RcppExports.R @@ -2,7 +2,7 @@ # Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393 #' Knot Insertion, Removal, and Reinsertion -#' +#' #' Functions for the insertion, removal, and reinsertion of internal knots for #' B-splines. #' @@ -24,7 +24,7 @@ #' in knot insertion matrix. #' #' Examples for the \code{refine_ordinate}, \code{coarsen_ordinate}, and -#' \code{hat_ordinate} are best shown in the vignette, +#' \code{hat_ordinate} are best shown in the vignette, #' \code{vignette("cpr-pkg", package = "cpr")}. #' #' \code{iknot_weights} returns a vector with the 'importance weight' of each @@ -94,15 +94,16 @@ diag_only <- function(A, B) { #' #' Determine the rank (number of linearly independent columns) of a matrix. #' -#' Implimentation via the Armadillo C++ linear algrebra library. The function -#' returns the rank of the matix \code{x}. The computation is based on the -#' singular value decomposition of the matrix; a std::runtime_error excetion +#' Implementation via the Armadillo C++ linear algebra library. The function +#' returns the rank of the matrix \code{x}. The computation is based on the +#' singular value decomposition of the matrix; a std::runtime_error exception #' will be thrown if the decomposition fails. Any singular values less than -#' the tolerance are treated as zeros. The tolerance is max(m, n) * max_sv * -#' datum::eps, where m is the number of rows of x, n is the number of columns -#' of x, max_sv is the maximal singular value of x, and datum::eps is the -#' difference between 1 and the least value greater than 1 that is -#' representable. +#' the tolerance are treated as zeros. The tolerance is +#' \code{max(m, n) * max_sv * arma::datum::eps}, where \code{m} is the number +#' of rows of \code{x}, \code{n} is the number of columns of \code{x}, +#' \code{max_sv} is the maximal singular value of \code{x}, and +#' \code{arma::datum::eps} is the difference between 1 and the least value +#' greater than 1 that is representable. #' #' @param x a numeric matrix #' diff --git a/R/bsplines.R b/R/bsplines.R index 2b65f4b..70159cd 100644 --- a/R/bsplines.R +++ b/R/bsplines.R @@ -120,6 +120,8 @@ print.cpr_bs <- function(x, n = 6L, ...) { #' plot(bmat, show_xi = FALSE, show_x = TRUE) #' plot(bmat, show_xi = TRUE, show_x = FALSE) ## Default #' plot(bmat, show_xi = FALSE, show_x = FALSE) +#' plot(bmat, show_xi = FALSE, show_x = FALSE) +#' plot(bmat, show_xi = FALSE, show_x = FALSE, color = FALSE) #' @method plot cpr_bs #' @export plot.cpr_bs <- function(x, ..., show_xi = TRUE, show_x = FALSE, color = TRUE, digits = 2, n = 100) { @@ -131,22 +133,22 @@ plot.cpr_bs <- function(x, ..., show_xi = TRUE, show_x = FALSE, color = TRUE, di names(plot_data) <- c("value", "spline") plot_data <- cbind(plot_data, data.frame(x = rep(xvec, times = ncol(bmat)))) levels(plot_data$spline) <- sub("V", "B", levels(plot_data$spline)) - levels(plot_data$spline) <- sub("(\\d+)", + levels(plot_data$spline) <- sub("(\\d+)", paste0("[list(\\1,k==", attr(x, "order"), ",bold(xi))](x)"), levels(plot_data$spline)) g <- ggplot2::ggplot(plot_data) + ggplot2::theme_bw() + - ggplot2::aes_string(x = "x", y = "value") + + eval(substitute(ggplot2::aes(x = X, y = Y), list(X = as.name("x"), Y = as.name("value")))) + ggplot2::geom_line() + ggplot2::theme(axis.title = ggplot2::element_blank()) if (color) { - g <- g + ggplot2::aes_string(color = "spline") + g <- g + eval(substitute(ggplot2::aes(color = GRP), list(GRP = as.name("spline")))) g <- g + ggplot2::scale_color_discrete(labels = scales::parse_format()) } else { - g <- g + ggplot2::aes_string(group = "spline") + g <- g + eval(substitute(ggplot2::aes(group = GRP), list(GRP = as.name("spline")))) } if (show_xi | show_x) { diff --git a/R/build_tensor.R b/R/build_tensor.R index d106ef1..7139705 100644 --- a/R/build_tensor.R +++ b/R/build_tensor.R @@ -5,10 +5,25 @@ #' @param x a matrix, or list of numeric matrices, build the tensor product #' @param ... additional numeric matrices to build the tensor product #' +#' @seealso +#' \code{vignette("cpr-pkg", package = "cpr")} for details on tensor products. +#' #' @return #' A matrix #' -#' @example examples/build_tensor.R +#' @examples +#' +#' A <- matrix(1:4, nrow = 10, ncol = 20) +#' B <- matrix(1:6, nrow = 10, ncol = 6) +#' +#' # Two ways of building the same tensor product +#' tensor1 <- build_tensor(A, B) +#' tensor2 <- build_tensor(list(A, B)) +#' all.equal(tensor1, tensor2) +#' +#' # a three matrix tensor product +#' tensor3 <- build_tensor(A, B, B) +#' str(tensor3) #' #' @export build_tensor <- function(x, ...) { diff --git a/R/cnr_diagnostics.R b/R/cnr_diagnostics.R index 649da32..dce51a9 100644 --- a/R/cnr_diagnostics.R +++ b/R/cnr_diagnostics.R @@ -8,7 +8,7 @@ #' @export #' @param x a \code{cpr_cnr} object #' @param type type of diagnostic plot. -#' \code{"loglik"} for the loglikihood by degrees of freedom, +#' \code{"loglik"} for the log likelihood by degrees of freedom, #' \code{"rmse"} for root mean squared residuals by model index #' @param from the first index of \code{x} to plot #' @param to the last index of \code{x} to plot @@ -30,7 +30,7 @@ plot.cpr_cnr <- function(x, type = "rmse", from = 1, to, ...) { ggplot2::ggplot(subset(s, (s$index >= from) & (s$index <= to))) + ggplot2::theme_bw() + - ggplot2::aes_string(x = "index", y = type) + + eval(substitute(ggplot2::aes(x = X, y = Y), list(X = as.name("index"), Y = as.name(type)))) + ggplot2::geom_point() + ggplot2::geom_line() + ggplot2::xlab("Index") diff --git a/R/cp.R b/R/cp.R index 378d8e1..7a27b98 100644 --- a/R/cp.R +++ b/R/cp.R @@ -35,11 +35,11 @@ #' #' # via formula #' dat <- data.frame(x = xvec, y = sin((xvec - 2)/pi) + 1.4 * cos(xvec/pi)) -#' cp3 <- cp(y ~ cpr::bsplines(x) + 0, data = dat) +#' cp3 <- cp(y ~ cpr::bsplines(x), data = dat) #' #' # plot the control polygon, spline and target data. #' plot(cp3, show_spline = TRUE) + -#' ggplot2::geom_line(mapping = ggplot2::aes_string(x = "x", y = "y"), +#' ggplot2::geom_line(mapping = ggplot2::aes(x = x, y = y), #' data = dat, linetype = 2, color = "red") #' #' @export @@ -158,7 +158,7 @@ summary.cpr_cp <- function(object, wiggle = FALSE, integrate.args = list(), ...) wggl <- try(do.call(wiggle.cpr_cp, c(list(object = object), integrate.args)), silent = TRUE) - if (class(wggl) == "integrate") { + if (inherits(x = wggl, what = "integrate")) { out$wiggle <- as.numeric(wggl$value) attr(out$wiggle, "abs.error") <- wggl$abs.error attr(out$wiggle, "subdivisions") <- wggl$subdivisions @@ -182,7 +182,7 @@ summary.cpr_cp <- function(object, wiggle = FALSE, integrate.args = list(), ...) #' \code{\link[ggplot2]{geom_rug}} to show the location of the knots in the #' respective control polygons. #' @param color Boolean (default FALSE) if more than one \code{cpr_cp} object is -#' to be plotted, set this value to TRUE to have the graphic in color (linetypes +#' to be plotted, set this value to TRUE to have the graphic in color (line types #' will be used regardless of the color setting). #' @param n the number of data points to use for plotting the spline #' @@ -195,10 +195,10 @@ plot.cpr_cp <- function(x, ..., show_cp = TRUE, show_spline = FALSE, show_xi = T spline_data <- lapply(list(x, ...), function(xx) { b <- xx$bknots - bmat <- cpr::bsplines(seq(b[1], b[2], length = n), - iknots = xx$iknots, - bknots = b, - order = xx$order) + bmat <- bsplines(seq(b[1], b[2], length = n), + iknots = xx$iknots, + bknots = b, + order = xx$order) data.frame(x = seq(b[1], b[2], length = n), y = as.numeric(bmat %*% xx$cp$theta)) }) @@ -230,7 +230,7 @@ plot.cpr_cp <- function(x, ..., show_cp = TRUE, show_spline = FALSE, show_xi = T base_plot <- ggplot2::ggplot(plot_data) + ggplot2::theme_bw() + - ggplot2::aes_string(x = "x", y = "y") + + eval(substitute(ggplot2::aes(x = X, y = Y), list(X = as.name("x"), Y = as.name("y")))) + ggplot2::theme(axis.title = ggplot2::element_blank()) if (show_xi) { @@ -255,14 +255,14 @@ plot.cpr_cp <- function(x, ..., show_cp = TRUE, show_spline = FALSE, show_xi = T if (length(cps) > 1) { base_plot <- base_plot + - ggplot2::aes_string(linetype = "row") + + eval(substitute(ggplot2::aes(linetype = LTY), list(LTY = as.name("row")))) + ggplot2::theme(legend.title = ggplot2::element_blank()) } if (color) { base_plot <- base_plot + - ggplot2::aes_string(color = "row") + + eval(substitute(ggplot2::aes(color = CLR), list(CLR = as.name("row")))) + ggplot2::theme(legend.title = ggplot2::element_blank()) } diff --git a/R/cp_diagnostics.R b/R/cp_diagnostics.R index 41bcd9d..a126033 100644 --- a/R/cp_diagnostics.R +++ b/R/cp_diagnostics.R @@ -4,23 +4,24 @@ #' #' @return #' \code{cp_value} returns the ordinate on the control polygon line segment for -#' the abscissae \code{x} given. \code{x} could be a control vertex or on a +#' the abscissa \code{x} given. \code{x} could be a control vertex or on a #' line segment defined by two control vertices of the control polygon #' provided. #' #' \code{cp_diff} returns the absolute vertical distance between the control #' vertices of cp1 to the control polygon cp2. #' -#' @export -#' @rdname cp_diagnostics -#' @param x absicissa at which to determine the ordinate on control polygon cp +#' @param x abscissa at which to determine the ordinate on control polygon cp #' @param obj a cpr_cp object or \code{data.frame} where the first column is the #' abscissa and the second column is the ordinate for the control polygon vertices. +#' +#' @export +#' @rdname cp_diagnostics cp_value <- function(obj, x) { UseMethod("cp_value") } -#' @method cp_value cpr_cp +#' @export cp_value.cpr_cp <- function(obj, x) { xi_star <- obj$xi_star theta <- obj$theta @@ -30,7 +31,7 @@ cp_value.cpr_cp <- function(obj, x) { unname((theta[idx] - theta[idx - 1L]) / (xi_star[idx] - xi_star[idx - 1L]) * (x - xi_star[idx]) + theta[idx]) } -#' @method cp_value default +#' @export cp_value.default <- function(obj, x) { xi_star <- obj[[1]] theta <- obj[[2]] diff --git a/R/cpr-package.R b/R/cpr-package.R index 7b8a695..a90f7a1 100644 --- a/R/cpr-package.R +++ b/R/cpr-package.R @@ -2,10 +2,14 @@ #' #' @description #' The cpr package implements the control polygon reduction and control net -#' reduction methods for finding parsimonious B-spline regression models. +#' reduction methods for finding parsimonious B-spline regression models as +#' described in DeWitt (2017). #' -#' This package was part of Peter DeWitt's Ph.D. dissertation which was overseen -#' by his advisors Samantha MaWhinney and Nichole Carlson. +#' @references +#' DeWitt, Peter Edward. \dQuote{Parsimonious B-spline regression models via +#' control polygon and control net reduction for identifying factors explaining +#' variation in daily hormone profiles during the menopausal transition.} +#' (2017). #' #' @useDynLib cpr, .registration = TRUE #' @importFrom Rcpp sourceCpp diff --git a/R/cpr.R b/R/cpr.R index a3d045a..aab37a0 100644 --- a/R/cpr.R +++ b/R/cpr.R @@ -22,7 +22,78 @@ #' @param progress show a progress bar. #' @param ... not currently used #' -#' @example examples/cpr.R +#' @examples +#' ############################################################################# +#' # Example 1: find a model for log10(pdg) = f(day) from the spdg data set +#' \dontrun{ +#' # need the lme4 package to fit a mixed effect model +#' require(lme4) +#' +#' # construct the initial control polygon. Forth order spline with fifty +#' # internal knots. Remember degrees of freedom equal the polynomial order +#' # plus number of internal knots. +#' init_cp <- cp(log10(pdg) ~ bsplines(day, df = 54) + (1|id), +#' data = spdg, method = lme4::lmer) +#' cpr_run <- cpr(init_cp) +#' plot(cpr_run, color = TRUE) +#' plot(cpr_run, type = "rmse", to = 10) +#' +#' # preferable model is in index 4 +#' preferable_cp <- cpr_run[[4]] +#' } +#' +#' ############################################################################# +#' # Example 2: logistic regression +#' # simulate a binary response Pr(y = 1 | x) = p(x) +#' p <- function(x) { 0.65 * sin(x * 0.70) + 0.3 * cos(x * 4.2) } +#' +#' set.seed(42) +#' x <- runif(2500, 0.00, 4.5) +#' sim_data <- data.frame(x = x, y = rbinom(2500, 1, p(x))) +#' +#' # Define the initial control polygon +#' init_cp <- cp(formula = y ~ bsplines(x, df = 54, bknots = c(0, 4.5)), +#' family = binomial(), +#' method = glm, +#' data = sim_data) +#' +#' # run CPR, preferable model is in index 7 +#' cpr_run <- cpr(init_cp) +#' plot(cpr_run, color = TRUE, type = "rmse", to = 15) +#' plot(cpr_run, color = TRUE, from = 11, to = 15, show_spline = TRUE) +#' +#' # plot the fitted spline and the true p(x) +#' sim_data$pred_select_p <- +#' plogis(predict(cpr_run[[7]], newdata = sim_data)$pred) +#' +#' ggplot2::ggplot(sim_data) + +#' ggplot2::theme_bw() + +#' ggplot2::aes(x = x) + +#' ggplot2::geom_point(mapping = ggplot2::aes(y = y), alpha = 0.1) + +#' ggplot2::geom_line(mapping = ggplot2::aes(y = pred_select_p, color = "pred_select_p")) + +#' ggplot2::stat_function(fun = p, mapping = ggplot2::aes(color = 'p(x)')) +#' +#' # compare to gam and a binned average +#' sim_data$x2 <- round(sim_data$x, digits = 1) +#' bin_average <- +#' lapply(split(sim_data, sim_data$x2), function(x) { +#' data.frame(x = x$x2[1], y = mean(x$y)) +#' }) +#' bin_average <- do.call(rbind, bin_average) +#' +#' ggplot2::ggplot(sim_data) + +#' ggplot2::theme_bw() + +#' ggplot2::aes(x = x) + +#' ggplot2::stat_function(fun = p, mapping = ggplot2::aes(color = 'p(x)')) + +#' ggplot2::geom_line(mapping = ggplot2::aes(y = pred_select_p, color = "pred_select_p")) + +#' ggplot2::stat_smooth(mapping = ggplot2::aes(y = y, color = "gam"), +#' method = "gam", +#' formula = y ~ s(x, bs = "cs"), +#' se = FALSE, +#' n = 1000) + +#' ggplot2::geom_line(data = bin_average +#' , mapping = ggplot2::aes(y = y, color = "bin_average")) +#' #' #' @export cpr <- function(x, keep = -1, p = 2, progress = interactive(), ...) { diff --git a/R/cpr_diagnostics.R b/R/cpr_diagnostics.R index cb6f542..2ddaabd 100644 --- a/R/cpr_diagnostics.R +++ b/R/cpr_diagnostics.R @@ -8,11 +8,11 @@ #' @export #' @param x a \code{cpr_cpr} object #' @param type type of diagnostic plot. \code{"cps"} for control polygons, -#' \code{"loglik"} for the loglikihood by degrees of freedom, +#' \code{"loglik"} for the log likelihood by degrees of freedom, #' \code{"rmse"} for root mean squared residuals by degrees of freedom #' @param from the first index of \code{x} to plot #' @param to the last index of \code{x} to plot -#' @param ... args passed to \code{cpr::plot.cpr_cp} +#' @param ... arguments passed to \code{cpr::plot.cpr_cp} #' @seealso \code{\link{plot.cpr_cp}} plot.cpr_cpr <- function(x, type = "cps", from = 1, to, ...) { @@ -46,7 +46,7 @@ plot.cpr_cpr <- function(x, type = "cps", from = 1, to, ...) { ggplot2::ggplot(subset(s, (s$index >= from) & (s$index <= to))) + ggplot2::theme_bw() + - ggplot2::aes_string(x = "index", y = type) + + eval(substitute(ggplot2::aes(x = X, y = Y), list(X = as.name("index"), Y = as.name(type)))) + ggplot2::geom_point() + ggplot2::geom_line() + ggplot2::xlab("Index") diff --git a/R/extract_cpr_bsplines.R b/R/extract_cpr_bsplines.R index d20c01d..bb1cd4e 100644 --- a/R/extract_cpr_bsplines.R +++ b/R/extract_cpr_bsplines.R @@ -1,6 +1,6 @@ #' Extract the bspline or btensor call from a formula #' -#' Non-exported function. Might be depericated. +#' Non-exported function. Might be depreciated. #' #' @param form a formula extract_cpr_bsplines <- function(form) { diff --git a/R/generate_cp_formula_data.R b/R/generate_cp_formula_data.R index 1fb2149..9ada0e1 100644 --- a/R/generate_cp_formula_data.R +++ b/R/generate_cp_formula_data.R @@ -15,7 +15,8 @@ #' factor, then the model matrix will again be rank deficient as a column for #' all levels of the factor will be generated. We need to replace the intercept #' column of the model matrix with the bspline. This also needs to be done for -#' a variety of possible model calls, lm, lmer, etc. +#' a variety of possible model calls, \code{\link[stats]{lm}}, +#' \code{\link{lme4}{lmer}}, etc. #' #' By returning an explicit \code{formula} and \code{data.frame} for use in the #' fit, we hope to reduce memory use and increase the speed of the cpr method. diff --git a/R/get_spline.R b/R/get_spline.R index 48c9617..d00f3a6 100644 --- a/R/get_spline.R +++ b/R/get_spline.R @@ -102,7 +102,31 @@ get_spline.cpr_cn <- function(x, margin = 1, at, n = 100) { #' #' @seealso \code{\link{get_spline}} #' -#' @example examples/get_surface.R +#' @examples +#' ## Extract the control net and surface from a cpr_cn object. +#' a_cn <- cn(pdg ~ btensor(list(day, age), df = list(15, 3), order = list(3, 2)), +#' data = spdg) +#' +#' cn_and_surface <- get_surface(a_cn, n = 50) +#' str(cn_and_surface, max.level = 2) +#' +#' par(mfrow = c(1, 2)) +#' with(cn_and_surface$cn, +#' plot3D::persp3D(unique(Var1), +#' unique(Var2), +#' matrix(z, +#' nrow = length(unique(Var1)), +#' ncol = length(unique(Var2))), +#' main = "Control Net") +#' ) +#' with(cn_and_surface$surface, +#' plot3D::persp3D(unique(Var1), +#' unique(Var2), +#' matrix(z, +#' nrow = length(unique(Var1)), +#' ncol = length(unique(Var2))), +#' main = "Surface") +#' ) #' #' @export get_surface <- function(x, margin = 1:2, at, n = 100) { diff --git a/R/loglikelihood.R b/R/loglikelihood.R index fec6d7f..b63967f 100644 --- a/R/loglikelihood.R +++ b/R/loglikelihood.R @@ -1,22 +1,22 @@ #' Determine the (quasi) Log Likelihood for a regression object. #' -#' Return, via stats::logLik or a custom S3 method, the (quasi) log likelihood -#' of a regression object. +#' Return, via \code{\link[stats]{logLik}} or a custom S3 method, the (quasi) +#' log likelihood of a regression object. #' #' This function is used by \code{cpr::cpr} and \code{cpr::cnr} to determine the #' (quasi) log likelihood returned in the \code{cpr_cpr} and \code{cpr_cnr} #' objects. #' -#' Generally this function defaults to stats::logLik. Therefore, if an S3 +#' Generally this function defaults to \code{\link[stats]{logLik}}. Therefore, if an S3 #' method for determining the (quasi) log likelihood exists in the workspace #' everything should work. If an S3 method does not exist you should define #' one. #' #' See \code{methods(loglikelihood)} for a list of the provided methods. The -#' default method uses \code{stats::logLik}. +#' default method uses \code{\link[stats]{logLik}}. #' #' @param x a regression fit object -#' @param ... passed through to \code{stats::logLik} +#' @param ... passed through to \code{\link[stats]{logLik}} #' #' @return the numeric value of the (quasi) log likelihood. #' diff --git a/R/plot.cpr_cn.R b/R/plot.cpr_cn.R index c3ed8ed..4c03f0e 100644 --- a/R/plot.cpr_cn.R +++ b/R/plot.cpr_cn.R @@ -6,6 +6,8 @@ #' surface, or both, for a \code{cpr_cn} objects. The three-dimensional plots #' are generated by either \code{\link[plot3D]{persp3D}} form the \code{plot3D} #' package or \code{\link[rgl]{persp3d}} from the \code{rgl} package. +#' \code{rgl} graphics may or may not work on your system depending on support +#' for OpenGL. #' #' Building complex and customized graphics might be easier for you if you use #' \code{\link{get_surface}} to generate the needed data for plotting. See diff --git a/R/spdg.R b/R/spdg.R index f89a060..d930d09 100644 --- a/R/spdg.R +++ b/R/spdg.R @@ -1,9 +1,10 @@ -#' Simulated Prognanediol-glucuronid (PDG) Data +#' Simulated Pregnanediol glucuronide (PDG) Data #' #' A Simulated data set based on the Study of Women's Health Across the Nation #' (SWAN) Daily Hormone Study (DHS). #' -#' Prognanediol-glucuronid (PDG) is the urine metabolite of progesterone. This +#' +#' Pregnanediol glucuronide (PDG) is the urine metabolite of progesterone. This #' data set was simulated to have similar #' characteristics to a subset of the SWAN DHS data. The SWAN DHS data was the #' motivating data set for the method development that lead to the \code{cpr} @@ -39,5 +40,3 @@ #' set please visit \url{https://github.com/dewittpe/cpr} and look at the #' scripts in the data-raw directory. "spdg" - - diff --git a/R/trimmed_quantile.R b/R/trimmed_quantile.R index ca2da7f..a5cd669 100644 --- a/R/trimmed_quantile.R +++ b/R/trimmed_quantile.R @@ -1,8 +1,10 @@ #' Trimmed Quantiles #' -#' For data X = x1, x2, ..., xn, with order statistics x(1), x(2), ..., x(r) return -#' the quantiles for a trimmed data set, e.g., X \ {x(1), x(r)} (trim = 1), or -#' X \ {x(1), x(2), x(r-1), x(r)} (trim = 2). +#' For data \eqn{X = x_1, x_2, \ldots, x_n}{X = x1, x2, ..., xn}, with order +#' statistics \eqn{x_{(1)}, x_{(2)}, \ldots, x_{(r)}}{x(1), x(2), ..., x(r)} +#' return the quantiles for a trimmed data set, e.g., +#' \eqn{\boldsymbol{X} \backslash \{x_{(1)}, x_{(r)}\}}{X \ {x(1), x(r)}} (trim = 1), or +#' \eqn{\boldsymbol{X} \backslash \{x_{(1)}, x_{(2)}, x_{(r-1)}, x_{(r)}\}}{X \ {x(1), x(2), x(r-1), x(r)}} (trim = 2). #' #' @param x a numeric vector #' @param trim defaults to 1, omitting the min and the max diff --git a/R/update_bsplines.R b/R/update_bsplines.R index 9dcac91..d044121 100644 --- a/R/update_bsplines.R +++ b/R/update_bsplines.R @@ -11,7 +11,47 @@ #' @seealso \code{\link[stats]{update}}, \code{\link{bsplines}}, #' \code{\link{btensor}} #' -#' @example examples/update_bsplines.R +#' @examples +#' ########################### Updating a cpr_bs object ########################### +#' # construct a B-spline basis +#' bmat <- bsplines(seq(1, 10, length = 15), df = 5, order = 3) +#' +#' # look at the structure of the basis +#' str(bmat) +#' +#' # change the order +#' str(update_bsplines(bmat, order = 4)) +#' +#' # change the order and the degrees of freedom +#' str(update_bsplines(bmat, df = 12, order = 4)) +#' +#' ########################### Updating a cpr_bt object ########################### +#' # construct a tensor product +#' tpmat <- btensor(list(x1 = seq(0, 1, length = 10), x2 = seq(0, 1, length = 10)), +#' df = list(4, 5)) +#' +#' tpmat +#' +#' # update the degrees of freedom +#' update_btensor(tpmat, df = list(6, 7)) +#' +#' ####### Updating bsplines or btensor on the right and side of a formula ######## +#' +#' f1 <- y ~ bsplines(x, df = 14) + var1 + var2 +#' f2 <- y ~ btensor(x = list(x1, x2), df = list(50, 31), order = list(3, 5)) + var1 + var2 +#' +#' update_bsplines(f1, df = 13, order = 5) +#' update_btensor(f2, df = list(13, 24), order = list(3, 8)) +#' +#' ########################### Updating a cpr_cp object ########################### +#' data(spdg, package = "cpr") +#' init_cp <- cp(pdg ~ bsplines(day, df = 30) + age + ttm, data = spdg) +#' updt_cp <- update_bsplines(init_cp, df = 5) +#' +#' ########################### Updating a cpr_cn object ########################### +#' init_cn <- cn(pdg ~ btensor(list(day, age), df = list(30, 4)) + ttm, data = spdg) +#' updt_cn <- update_btensor(init_cn, df = list(30, 2), order = list(3, 2)) +#' #' #' @export #' @rdname update_bsplines diff --git a/R/zzz.R b/R/zzz.R new file mode 100644 index 0000000..a7b599d --- /dev/null +++ b/R/zzz.R @@ -0,0 +1,3 @@ +.onUnload <- function(libpath) { + library.dynam.unload("cpr", libpath) +} diff --git a/README.md b/README.md index 83cbe5a..3f5aff4 100644 --- a/README.md +++ b/README.md @@ -12,14 +12,14 @@ An R package for implementing the Control Polygon Reduction model selection method. When we are tasked with modeling the functional relationship between a response and a continuous predictor, i.e., y = f(x), CPR allows for quick and efficient searching of a large model space to find B-spline estimates of the -function f(x). +function f(x). CPR extends to multiple dimensions and allows one to find good locations for knots in a tensor product of B-splines. ## Learn More About CPR. -This model selection method was developed as part of Peter DeWitt's PhD -dissertation work. +This model selection method was developed as part of [Peter DeWitt's PhD +dissertation](https://doi.org/10.25677/awnc-b795) work. ### Vignettes @@ -36,7 +36,7 @@ Additional vignettes may also be authored soon. * The CPR method was presented at the 28th International Biometric Conference held July 2016, in Victoria, British Columbia, Canada. The abstract, paper, - and talk had the title: + and talk had the title: "Parsimonious B-splines Regression Models via Control Polygon Reduction." A bibtex entry for the abstract: @@ -60,7 +60,7 @@ Additional vignettes may also be authored soon. (IBS). 2. "Distinguished Oral Presentation" as part of the student paper competition hosted by the Western North American Region - (WNAR) of the IBs. + (WNAR) of the IBS. ## Installing CPR Options for installing CPR: @@ -72,23 +72,22 @@ install.packages("cpr", repos = "https://cran.rstudio.com") ``` 2. Install the developmental version from github. This will require you to have - [devtools](https://github.com/hadley/devtools) installed, and, if you are + [remotes](https://cran.r-project.org/package=remotes) installed, and, if you are using Windows, you'll need [Rtools](https://cran.r-project.org/bin/windows/Rtools/) installed as well. ``` -library(devtools) -install_github("dewittpe/cpr", build_vignettes = TRUE) +remotes::install_github("dewittpe/cpr", build_vignettes = TRUE) ``` -3. Clone the repo and use `GNU make` +3. Clone the repository and use `GNU make` ```bash make install ``` 4. Go to the [release page](https://github.com/dewittpe/cpr/releases) and down - load the tar.gz file of the version you want to install. + load the `cpr_.tar.gz` file of the version you want to install. * Install from the command line @@ -110,7 +109,7 @@ error, or rather message, of the form: font family "sans" not found, using "bitmap" ``` Then there is an easy fix. You need to get the [FreeType 2 font -engine](https://www.freetype.org/). +engine](https://www.freetype.org/). On Debian, you can get the library via: diff --git a/cran-comments.md b/cran-comments.md new file mode 100644 index 0000000..fc91c51 --- /dev/null +++ b/cran-comments.md @@ -0,0 +1,35 @@ +# Version 0.3.0 +- Updating from v0.2.3 to v0.3.0 +- Initial submission 29 November 2023 + +## R CMD check results + +* Github Actions: + * macOS R-4.3.2 + * windows R-4.3.2 + * ubuntu R-4.3.2 + * ubuntu R-devel + * ubuntu R-4.2.3 + +* rhub + +* win-builder.r-project.org + +* Local (macOS Monterey 12.6) + * R 4.3.2 + +All pass with Status OK with one caveat: + +In some cases there was an error thrown as a result of lme4 being built againt +Matrix < v1.6-2 while other packages where built against Matrix >= 1.6-2. +Because lme4 is used in examples, and that was the cause of the error, I have +moved the examples that could error into dontrun sections for this release. + + +## Reverse dependencies + + None + +## Reverse suggests + + None diff --git a/examples/build_tensor.R b/examples/build_tensor.R deleted file mode 100644 index 964fbf2..0000000 --- a/examples/build_tensor.R +++ /dev/null @@ -1,13 +0,0 @@ -# see vignette("cpr-pkg", package = "cpr") for details on tensor products. - -A <- matrix(1:4, nrow = 10, ncol = 20) -B <- matrix(1:6, nrow = 10, ncol = 6) - -# Two ways of building the same tensor product -tensor1 <- build_tensor(A, B) -tensor2 <- build_tensor(list(A, B)) -all.equal(tensor1, tensor2) - -# a three matrix tensor product -tensor3 <- build_tensor(A, B, B) -str(tensor3) diff --git a/examples/cpr.R b/examples/cpr.R deleted file mode 100644 index d9bcc4d..0000000 --- a/examples/cpr.R +++ /dev/null @@ -1,68 +0,0 @@ -############################################################################### -# Example 1: find a model for log10(pdg) = f(day) from the spdg data set -\dontrun{ -# need the lme4 package to fit a mixed effect model -# library(lme4) - -# construct the initial control polygon. Forth order spline with fifty internal -# knots. Remember degrees of freedom equal the polynomial order plus number of -# internal knots. -init_cp <- cp(log10(pdg) ~ bsplines(day, df = 54) + (1|id), - data = spdg, method = lme4::lmer) -cpr_run <- cpr(init_cp) -plot(cpr_run) -plot(cpr_run, type = "rmse", to = 10) - -# preferable model is in index 4 -preferable_cp <- cpr_run[[4]] - -} - -############################################################################### -# Example 2: logistic regression -# simulate a binary response Pr(y = 1 | x) = p(x) -p <- function(x) { 0.65 * sin(x * 0.70) + 0.3 * cos(x * 4.2) } - -set.seed(42) -x <- runif(2500, 0.00, 4.5) -sim_data <- data.frame(x = x, y = rbinom(2500, 1, p(x))) - -# Define the initial control polygon -init_cp <- cp(formula = y ~ bsplines(x, df = 54, bknots = c(0, 4.5)), - family = binomial(), - method = glm, - data = sim_data) - -# run CPR, preferable model is in index 7 -cpr_run <- cpr(init_cp) -plot(cpr_run, color = TRUE, type = "rmse", to = 15) - -# plot the fitted spline and the true p(x) -sim_data$pred_select_p <- plogis(predict(cpr_run[[7]], newdata = sim_data)$pred) - -ggplot2::ggplot(sim_data) + -ggplot2::theme_bw() + -ggplot2::aes(x = x) + -ggplot2::geom_point(mapping = ggplot2::aes(y = y), alpha = 0.1) + -ggplot2::geom_line(mapping = ggplot2::aes(y = pred_select_p, color = "pred_select_p")) + -ggplot2::stat_function(fun = p, mapping = ggplot2::aes(color = 'p(x)')) - -# compare to gam and a binned average -sim_data$x2 <- round(sim_data$x, digits = 1) -bin_average <- - lapply(split(sim_data, sim_data$x2), function(x) { - data.frame(x = x$x2[1], y = mean(x$y)) - }) -bin_average <- do.call(rbind, bin_average) - -ggplot2::ggplot(sim_data) + -ggplot2::theme_bw() + -ggplot2::aes(x = x) + -ggplot2::geom_line(mapping = ggplot2::aes(y = pred_select_p, color = "pred_select_p")) + -ggplot2::stat_smooth(mapping = ggplot2::aes(y = y, color = "gam"), - method = "gam", - formula = y ~ s(x, bs = "cs"), - se = FALSE, - n = 1000) + -ggplot2::geom_line(data = bin_average, mapping = ggplot2::aes(y = y, color = "bin_average")) - diff --git a/examples/get_surface.R b/examples/get_surface.R deleted file mode 100644 index 4f67c2e..0000000 --- a/examples/get_surface.R +++ /dev/null @@ -1,24 +0,0 @@ -## Extract the control net and surface from a cpr_cn object. -a_cn <- cn(pdg ~ btensor(list(day, age), df = list(15, 3), order = list(3, 2)), - data = spdg) - -cn_and_surface <- get_surface(a_cn, n = 50) -str(cn_and_surface, max.level = 2) - -par(mfrow = c(1, 2)) -with(cn_and_surface$cn, - plot3D::persp3D(unique(Var1), - unique(Var2), - matrix(z, - nrow = length(unique(Var1)), - ncol = length(unique(Var2))), - main = "Control Net") - ) -with(cn_and_surface$surface, - plot3D::persp3D(unique(Var1), - unique(Var2), - matrix(z, - nrow = length(unique(Var1)), - ncol = length(unique(Var2))), - main = "Surface") - ) diff --git a/examples/update_bsplines.R b/examples/update_bsplines.R deleted file mode 100644 index 3bf2665..0000000 --- a/examples/update_bsplines.R +++ /dev/null @@ -1,40 +0,0 @@ -########################### Updating a cpr_bs object ########################### -# construct a B-spline basis -bmat <- bsplines(seq(1, 10, length = 15), df = 5, order = 3) - -# look at the structure of the basis -str(bmat) - -# change the order -str(update_bsplines(bmat, order = 4)) - -# change the order and the degrees of freedom -str(update_bsplines(bmat, df = 12, order = 4)) - -########################### Updating a cpr_bt object ########################### -# construct a tensor product -tpmat <- btensor(list(x1 = seq(0, 1, length = 10), x2 = seq(0, 1, length = 10)), - df = list(4, 5)) - -tpmat - -# update the degrees of freedom -update_btensor(tpmat, df = list(6, 7)) - -####### Updating bsplines or btensor on the right and side of a formula ######## - -f1 <- y ~ bsplines(x, df = 14) + var1 + var2 -f2 <- y ~ btensor(x = list(x1, x2), df = list(50, 31), order = list(3, 5)) + var1 + var2 - -update_bsplines(f1, df = 13, order = 5) -update_btensor(f2, df = list(13, 24), order = list(3, 8)) - -########################### Updating a cpr_cp object ########################### -data(spdg, package = "cpr") -init_cp <- cp(pdg ~ bsplines(day, df = 30) + age + ttm, data = spdg) -updt_cp <- update_bsplines(init_cp, df = 5) - -########################### Updating a cpr_cn object ########################### -init_cn <- cn(pdg ~ btensor(list(day, age), df = list(30, 4)) + ttm, data = spdg) -updt_cn <- update_btensor(init_cn, df = list(30, 2), order = list(3, 2)) - diff --git a/inst/WORDLIST b/inst/WORDLIST new file mode 100644 index 0000000..633d19d --- /dev/null +++ b/inst/WORDLIST @@ -0,0 +1,51 @@ +Bezier +Biometric +Boehm +boldsymbol +bibtex +bknots +bspline +btensor +collinear +cnr +CNR +Curtin +de Boor +devtools +DeWitt +DeWitt's +dplyr +DLT +FreeType +follicular +github +glucuronide +IBS +iknots +luteal +Luteal +OpenGL +Paluszny +Prautzsch +Pregnanediol +PDG +Rcpp +representable +rgl +Rmd +Rnw +RStudio +Rtools +Sanderson +Santoro +Springer +testthat +tibble +tidyr +univariable +wiggliness +WNAR +xlim +ylim +zlim +28th diff --git a/man/boehm.Rd b/man/boehm.Rd index 55e7565..6ea5a79 100644 --- a/man/boehm.Rd +++ b/man/boehm.Rd @@ -53,7 +53,7 @@ ordinate vector. The name comes from the the use of a hat matrix based on the in knot insertion matrix. Examples for the \code{refine_ordinate}, \code{coarsen_ordinate}, and -\code{hat_ordinate} are best shown in the vignette, +\code{hat_ordinate} are best shown in the vignette, \code{vignette("cpr-pkg", package = "cpr")}. \code{iknot_weights} returns a vector with the 'importance weight' of each diff --git a/man/build_tensor.Rd b/man/build_tensor.Rd index 6eebda8..2dcd568 100644 --- a/man/build_tensor.Rd +++ b/man/build_tensor.Rd @@ -18,7 +18,6 @@ A matrix Tensor products of Matrices. } \examples{ -# see vignette("cpr-pkg", package = "cpr") for details on tensor products. A <- matrix(1:4, nrow = 10, ncol = 20) B <- matrix(1:6, nrow = 10, ncol = 6) @@ -31,4 +30,8 @@ all.equal(tensor1, tensor2) # a three matrix tensor product tensor3 <- build_tensor(A, B, B) str(tensor3) + +} +\seealso{ +\code{vignette("cpr-pkg", package = "cpr")} for details on tensor products. } diff --git a/man/cnr_diagnostics.Rd b/man/cnr_diagnostics.Rd index 573cb57..6a0a45a 100644 --- a/man/cnr_diagnostics.Rd +++ b/man/cnr_diagnostics.Rd @@ -10,7 +10,7 @@ \item{x}{a \code{cpr_cnr} object} \item{type}{type of diagnostic plot. -\code{"loglik"} for the loglikihood by degrees of freedom, +\code{"loglik"} for the log likelihood by degrees of freedom, \code{"rmse"} for root mean squared residuals by model index} \item{from}{the first index of \code{x} to plot} diff --git a/man/cp.Rd b/man/cp.Rd index 5925c0b..c36f358 100644 --- a/man/cp.Rd +++ b/man/cp.Rd @@ -71,7 +71,7 @@ function?} respective control polygons.} \item{color}{Boolean (default FALSE) if more than one \code{cpr_cp} object is -to be plotted, set this value to TRUE to have the graphic in color (linetypes +to be plotted, set this value to TRUE to have the graphic in color (line types will be used regardless of the color setting).} \item{n}{the number of data points to use for plotting the spline} @@ -110,11 +110,11 @@ plot(cp1, cp2, show_spline = TRUE, color = TRUE) # via formula dat <- data.frame(x = xvec, y = sin((xvec - 2)/pi) + 1.4 * cos(xvec/pi)) -cp3 <- cp(y ~ cpr::bsplines(x) + 0, data = dat) +cp3 <- cp(y ~ cpr::bsplines(x), data = dat) # plot the control polygon, spline and target data. plot(cp3, show_spline = TRUE) + - ggplot2::geom_line(mapping = ggplot2::aes_string(x = "x", y = "y"), + ggplot2::geom_line(mapping = ggplot2::aes(x = x, y = y), data = dat, linetype = 2, color = "red") } diff --git a/man/cp_diagnostics.Rd b/man/cp_diagnostics.Rd index f219390..71089c3 100644 --- a/man/cp_diagnostics.Rd +++ b/man/cp_diagnostics.Rd @@ -13,7 +13,7 @@ cp_diff(cp1, cp2) \item{obj}{a cpr_cp object or \code{data.frame} where the first column is the abscissa and the second column is the ordinate for the control polygon vertices.} -\item{x}{absicissa at which to determine the ordinate on control polygon cp} +\item{x}{abscissa at which to determine the ordinate on control polygon cp} \item{cp1}{a cpr_cp object} @@ -21,7 +21,7 @@ abscissa and the second column is the ordinate for the control polygon vertices. } \value{ \code{cp_value} returns the ordinate on the control polygon line segment for -the abscissae \code{x} given. \code{x} could be a control vertex or on a +the abscissa \code{x} given. \code{x} could be a control vertex or on a line segment defined by two control vertices of the control polygon provided. diff --git a/man/cpr-package.Rd b/man/cpr-package.Rd index a3914b4..74209db 100644 --- a/man/cpr-package.Rd +++ b/man/cpr-package.Rd @@ -6,8 +6,12 @@ \title{cpr: Control Polygon Reduction} \description{ The cpr package implements the control polygon reduction and control net -reduction methods for finding parsimonious B-spline regression models. - -This package was part of Peter DeWitt's Ph.D. dissertation which was overseen -by his advisors Samantha MaWhinney and Nichole Carlson. +reduction methods for finding parsimonious B-spline regression models as +described in DeWitt (2017). +} +\references{ +DeWitt, Peter Edward. \dQuote{Parsimonious B-spline regression models via +control polygon and control net reduction for identifying factors explaining +variation in daily hormone profiles during the menopausal transition.} +(2017). } diff --git a/man/cpr.Rd b/man/cpr.Rd index bd9454e..a7e9ca9 100644 --- a/man/cpr.Rd +++ b/man/cpr.Rd @@ -41,27 +41,26 @@ stored in the first \code{keep + 1} (zero internal knots, one internal knot, limit on the number of stored regression fits is to keep memory usage down. } \examples{ -############################################################################### +############################################################################# # Example 1: find a model for log10(pdg) = f(day) from the spdg data set \dontrun{ # need the lme4 package to fit a mixed effect model -# library(lme4) +require(lme4) -# construct the initial control polygon. Forth order spline with fifty internal -# knots. Remember degrees of freedom equal the polynomial order plus number of -# internal knots. +# construct the initial control polygon. Forth order spline with fifty +# internal knots. Remember degrees of freedom equal the polynomial order +# plus number of internal knots. init_cp <- cp(log10(pdg) ~ bsplines(day, df = 54) + (1|id), data = spdg, method = lme4::lmer) cpr_run <- cpr(init_cp) -plot(cpr_run) +plot(cpr_run, color = TRUE) plot(cpr_run, type = "rmse", to = 10) # preferable model is in index 4 preferable_cp <- cpr_run[[4]] - } -############################################################################### +############################################################################# # Example 2: logistic regression # simulate a binary response Pr(y = 1 | x) = p(x) p <- function(x) { 0.65 * sin(x * 0.70) + 0.3 * cos(x * 4.2) } @@ -79,9 +78,11 @@ init_cp <- cp(formula = y ~ bsplines(x, df = 54, bknots = c(0, 4.5)), # run CPR, preferable model is in index 7 cpr_run <- cpr(init_cp) plot(cpr_run, color = TRUE, type = "rmse", to = 15) +plot(cpr_run, color = TRUE, from = 11, to = 15, show_spline = TRUE) # plot the fitted spline and the true p(x) -sim_data$pred_select_p <- plogis(predict(cpr_run[[7]], newdata = sim_data)$pred) +sim_data$pred_select_p <- + plogis(predict(cpr_run[[7]], newdata = sim_data)$pred) ggplot2::ggplot(sim_data) + ggplot2::theme_bw() + @@ -101,12 +102,15 @@ bin_average <- do.call(rbind, bin_average) ggplot2::ggplot(sim_data) + ggplot2::theme_bw() + ggplot2::aes(x = x) + +ggplot2::stat_function(fun = p, mapping = ggplot2::aes(color = 'p(x)')) + ggplot2::geom_line(mapping = ggplot2::aes(y = pred_select_p, color = "pred_select_p")) + ggplot2::stat_smooth(mapping = ggplot2::aes(y = y, color = "gam"), method = "gam", formula = y ~ s(x, bs = "cs"), se = FALSE, n = 1000) + -ggplot2::geom_line(data = bin_average, mapping = ggplot2::aes(y = y, color = "bin_average")) +ggplot2::geom_line(data = bin_average + , mapping = ggplot2::aes(y = y, color = "bin_average")) + } diff --git a/man/cpr_diagnostics.Rd b/man/cpr_diagnostics.Rd index 989e3fd..7f198a6 100644 --- a/man/cpr_diagnostics.Rd +++ b/man/cpr_diagnostics.Rd @@ -10,14 +10,14 @@ \item{x}{a \code{cpr_cpr} object} \item{type}{type of diagnostic plot. \code{"cps"} for control polygons, -\code{"loglik"} for the loglikihood by degrees of freedom, +\code{"loglik"} for the log likelihood by degrees of freedom, \code{"rmse"} for root mean squared residuals by degrees of freedom} \item{from}{the first index of \code{x} to plot} \item{to}{the last index of \code{x} to plot} -\item{...}{args passed to \code{cpr::plot.cpr_cp}} +\item{...}{arguments passed to \code{cpr::plot.cpr_cp}} } \description{ A collection of function for the inspection and evaluation of the control diff --git a/man/extract_cpr_bsplines.Rd b/man/extract_cpr_bsplines.Rd index cf5fd28..eebff19 100644 --- a/man/extract_cpr_bsplines.Rd +++ b/man/extract_cpr_bsplines.Rd @@ -10,5 +10,5 @@ extract_cpr_bsplines(form) \item{form}{a formula} } \description{ -Non-exported function. Might be depericated. +Non-exported function. Might be depreciated. } diff --git a/man/generate_cp_formula_data.Rd b/man/generate_cp_formula_data.Rd index 69a81a9..2c69fb0 100644 --- a/man/generate_cp_formula_data.Rd +++ b/man/generate_cp_formula_data.Rd @@ -28,7 +28,8 @@ continuous variable. If, however, \code{x2} is a factor, or coerced to a factor, then the model matrix will again be rank deficient as a column for all levels of the factor will be generated. We need to replace the intercept column of the model matrix with the bspline. This also needs to be done for -a variety of possible model calls, lm, lmer, etc. +a variety of possible model calls, \code{\link[stats]{lm}}, +\code{\link{lme4}{lmer}}, etc. By returning an explicit \code{formula} and \code{data.frame} for use in the fit, we hope to reduce memory use and increase the speed of the cpr method. diff --git a/man/get_surface.Rd b/man/get_surface.Rd index ed789cb..c3318ab 100644 --- a/man/get_surface.Rd +++ b/man/get_surface.Rd @@ -48,6 +48,7 @@ with(cn_and_surface$surface, ncol = length(unique(Var2))), main = "Surface") ) + } \seealso{ \code{\link{get_spline}} diff --git a/man/loglikelihood.Rd b/man/loglikelihood.Rd index bcc544f..93de424 100644 --- a/man/loglikelihood.Rd +++ b/man/loglikelihood.Rd @@ -9,27 +9,27 @@ loglikelihood(x, ...) \arguments{ \item{x}{a regression fit object} -\item{...}{passed through to \code{stats::logLik}} +\item{...}{passed through to \code{\link[stats]{logLik}}} } \value{ the numeric value of the (quasi) log likelihood. } \description{ -Return, via stats::logLik or a custom S3 method, the (quasi) log likelihood -of a regression object. +Return, via \code{\link[stats]{logLik}} or a custom S3 method, the (quasi) +log likelihood of a regression object. } \details{ This function is used by \code{cpr::cpr} and \code{cpr::cnr} to determine the (quasi) log likelihood returned in the \code{cpr_cpr} and \code{cpr_cnr} objects. -Generally this function defaults to stats::logLik. Therefore, if an S3 +Generally this function defaults to \code{\link[stats]{logLik}}. Therefore, if an S3 method for determining the (quasi) log likelihood exists in the workspace everything should work. If an S3 method does not exist you should define one. See \code{methods(loglikelihood)} for a list of the provided methods. The -default method uses \code{stats::logLik}. +default method uses \code{\link[stats]{logLik}}. } \seealso{ \code{\link{cpr}} \code{\link{cnr}} \code{\link[stats]{logLik}} diff --git a/man/matrix_rank.Rd b/man/matrix_rank.Rd index eb082ae..bb39556 100644 --- a/man/matrix_rank.Rd +++ b/man/matrix_rank.Rd @@ -16,15 +16,16 @@ the rank of the matrix as a numeric value. Determine the rank (number of linearly independent columns) of a matrix. } \details{ -Implimentation via the Armadillo C++ linear algrebra library. The function -returns the rank of the matix \code{x}. The computation is based on the -singular value decomposition of the matrix; a std::runtime_error excetion +Implementation via the Armadillo C++ linear algebra library. The function +returns the rank of the matrix \code{x}. The computation is based on the +singular value decomposition of the matrix; a std::runtime_error exception will be thrown if the decomposition fails. Any singular values less than -the tolerance are treated as zeros. The tolerance is max(m, n) * max_sv * -datum::eps, where m is the number of rows of x, n is the number of columns -of x, max_sv is the maximal singular value of x, and datum::eps is the -difference between 1 and the least value greater than 1 that is -representable. +the tolerance are treated as zeros. The tolerance is +\code{max(m, n) * max_sv * arma::datum::eps}, where \code{m} is the number +of rows of \code{x}, \code{n} is the number of columns of \code{x}, +\code{max_sv} is the maximal singular value of \code{x}, and +\code{arma::datum::eps} is the difference between 1 and the least value +greater than 1 that is representable. } \examples{ # Check the rank of a matrix diff --git a/man/plot.cpr_bs.Rd b/man/plot.cpr_bs.Rd index a9bd68d..6670650 100644 --- a/man/plot.cpr_bs.Rd +++ b/man/plot.cpr_bs.Rd @@ -32,6 +32,8 @@ plot(bmat, show_xi = TRUE, show_x = TRUE) plot(bmat, show_xi = FALSE, show_x = TRUE) plot(bmat, show_xi = TRUE, show_x = FALSE) ## Default plot(bmat, show_xi = FALSE, show_x = FALSE) +plot(bmat, show_xi = FALSE, show_x = FALSE) +plot(bmat, show_xi = FALSE, show_x = FALSE, color = FALSE) } \seealso{ \code{\link{bsplines}} diff --git a/man/plot.cpr_cn.Rd b/man/plot.cpr_cn.Rd index 67e45ab..93905c5 100644 --- a/man/plot.cpr_cn.Rd +++ b/man/plot.cpr_cn.Rd @@ -56,6 +56,8 @@ This plotting method generates three-dimensional plots of the control net, surface, or both, for a \code{cpr_cn} objects. The three-dimensional plots are generated by either \code{\link[plot3D]{persp3D}} form the \code{plot3D} package or \code{\link[rgl]{persp3d}} from the \code{rgl} package. +\code{rgl} graphics may or may not work on your system depending on support +for OpenGL. Building complex and customized graphics might be easier for you if you use \code{\link{get_surface}} to generate the needed data for plotting. See diff --git a/man/spdg.Rd b/man/spdg.Rd index cccfe1c..d65e791 100644 --- a/man/spdg.Rd +++ b/man/spdg.Rd @@ -3,7 +3,7 @@ \docType{data} \name{spdg} \alias{spdg} -\title{Simulated Prognanediol-glucuronid (PDG) Data} +\title{Simulated Pregnanediol glucuronide (PDG) Data} \format{ a \code{data.frame}. Variables in the data set: @@ -36,7 +36,7 @@ A Simulated data set based on the Study of Women's Health Across the Nation (SWAN) Daily Hormone Study (DHS). } \details{ -Prognanediol-glucuronid (PDG) is the urine metabolite of progesterone. This +Pregnanediol glucuronide (PDG) is the urine metabolite of progesterone. This data set was simulated to have similar characteristics to a subset of the SWAN DHS data. The SWAN DHS data was the motivating data set for the method development that lead to the \code{cpr} diff --git a/man/trimmed_quantile.Rd b/man/trimmed_quantile.Rd index a33022d..fd97fa9 100644 --- a/man/trimmed_quantile.Rd +++ b/man/trimmed_quantile.Rd @@ -17,9 +17,11 @@ values, if false, base the quantiles on all data, after trimming.} \item{...}{other arguments to pass to stats::quantile} } \description{ -For data X = x1, x2, ..., xn, with order statistics x(1), x(2), ..., x(r) return -the quantiles for a trimmed data set, e.g., X \ {x(1), x(r)} (trim = 1), or -X \ {x(1), x(2), x(r-1), x(r)} (trim = 2). +For data \eqn{X = x_1, x_2, \ldots, x_n}{X = x1, x2, ..., xn}, with order +statistics \eqn{x_{(1)}, x_{(2)}, \ldots, x_{(r)}}{x(1), x(2), ..., x(r)} +return the quantiles for a trimmed data set, e.g., +\eqn{\boldsymbol{X} \backslash \{x_{(1)}, x_{(r)}\}}{X \ {x(1), x(r)}} (trim = 1), or +\eqn{\boldsymbol{X} \backslash \{x_{(1)}, x_{(2)}, x_{(r-1)}, x_{(r)}\}}{X \ {x(1), x(2), x(r-1), x(r)}} (trim = 2). } \examples{ trimmed_quantile(1:100, prob = 1:23 / 24, name = FALSE) diff --git a/man/update_bsplines.Rd b/man/update_bsplines.Rd index 1beb3a3..392a3b9 100644 --- a/man/update_bsplines.Rd +++ b/man/update_bsplines.Rd @@ -62,6 +62,7 @@ updt_cp <- update_bsplines(init_cp, df = 5) init_cn <- cn(pdg ~ btensor(list(day, age), df = list(30, 4)) + ttm, data = spdg) updt_cn <- update_btensor(init_cn, df = list(30, 2), order = list(3, 2)) + } \seealso{ \code{\link[stats]{update}}, \code{\link{bsplines}}, diff --git a/src/Makevars b/src/Makevars index 2a4cd15..19d36c8 100644 --- a/src/Makevars +++ b/src/Makevars @@ -1,3 +1,2 @@ PKG_CXXFLAGS = -I../inst/include -DARMA_64BIT_WORD PKG_LIBS = $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS) -CXX_STD = CXX11 diff --git a/src/boehm.cpp b/src/boehm.cpp index 469fbc7..e286056 100644 --- a/src/boehm.cpp +++ b/src/boehm.cpp @@ -22,15 +22,15 @@ */ double omega(double x, unsigned int j, const arma::vec& xi, unsigned int k = 4) { - if (x <= xi(j)) { + if (x <= xi(j)) { return(0); - } - else if (x >= xi(j + k - 1)) { + } + else if (x >= xi(j + k - 1)) { return(1); - } + } else { return((x - xi(j)) / (xi(j + k - 1) - xi(j))); - } + } } /* Function: @@ -44,38 +44,38 @@ double omega(double x, unsigned int j, const arma::vec& xi, unsigned int k = 4) * a matrix */ -arma::mat knot_insertion_matrix(double x, const arma::vec& xi, unsigned int k = 4) { +arma::mat knot_insertion_matrix(double x, const arma::vec& xi, unsigned int k = 4) { double w; int r = xi.n_elem - k; arma::mat kim(r + 1, r, arma::fill::zeros); kim(0, 0) = 1.0; kim(r, r - 1) = 1.0; - for (int i = 1; i < r; ++i) { + for (int i = 1; i < r; ++i) { w = omega(x, i, xi, k); kim(i, i - 1) = 1.0 - w; kim(i, i) = w; } - return(kim); + return(kim); } -// knot_coarsen_matrix: -arma::mat knot_coarsen_matrix(double x, const arma::vec& xi, unsigned int k = 4) { +// knot_coarsen_matrix: +arma::mat knot_coarsen_matrix(double x, const arma::vec& xi, unsigned int k = 4) { arma::mat kim = knot_insertion_matrix(x, xi, k); - + return((kim.t() * kim).i() * kim.t()); } // knot_insertion_hat_matrix is the "hat" matrix built from a W matrix -arma::mat knot_insertion_hat_matrix(double x, const arma::vec& xi, unsigned int k = 4) { +arma::mat knot_insertion_hat_matrix(double x, const arma::vec& xi, unsigned int k = 4) { arma::mat kim = knot_insertion_matrix(x, xi, k); - + return(kim * (kim.t() * kim).i() * kim.t()); } //' Knot Insertion, Removal, and Reinsertion -//' +//' //' Functions for the insertion, removal, and reinsertion of internal knots for //' B-splines. //' @@ -97,7 +97,7 @@ arma::mat knot_insertion_hat_matrix(double x, const arma::vec& xi, unsigned int //' in knot insertion matrix. //' //' Examples for the \code{refine_ordinate}, \code{coarsen_ordinate}, and -//' \code{hat_ordinate} are best shown in the vignette, +//' \code{hat_ordinate} are best shown in the vignette, //' \code{vignette("cpr-pkg", package = "cpr")}. //' //' \code{iknot_weights} returns a vector with the 'importance weight' of each @@ -122,32 +122,32 @@ arma::mat knot_insertion_hat_matrix(double x, const arma::vec& xi, unsigned int //' @export //' @rdname boehm // [[Rcpp::export]] -arma::vec refine_ordinate(double x, const arma::vec& xi, const arma::vec& theta, unsigned int order = 4) { +arma::vec refine_ordinate(double x, const arma::vec& xi, const arma::vec& theta, unsigned int order = 4) { return(knot_insertion_matrix(x, xi, order) * theta); } //' @export //' @rdname boehm // [[Rcpp::export]] -arma::vec coarsen_ordinate(double x, const arma::vec& xi, const arma::vec& theta, unsigned int order = 4) { +arma::vec coarsen_ordinate(double x, const arma::vec& xi, const arma::vec& theta, unsigned int order = 4) { return(knot_coarsen_matrix(x, xi, order) * theta); } //' @export //' @rdname boehm // [[Rcpp::export]] -arma::vec hat_ordinate(double x, const arma::vec& xi, const arma::vec& theta, unsigned int order = 4) { +arma::vec hat_ordinate(double x, const arma::vec& xi, const arma::vec& theta, unsigned int order = 4) { return(knot_insertion_hat_matrix(x, xi, order) * theta); } //' @export //' @rdname boehm // [[Rcpp::export]] -arma::mat insertion_matrix(double x, const arma::vec& xi, unsigned int order = 4) { +arma::mat insertion_matrix(double x, const arma::vec& xi, unsigned int order = 4) { return(knot_insertion_matrix(x, xi, order)); } // [[Rcpp::export]] Rcpp::NumericVector weigh_iknots(const arma::vec& xi, const arma::mat& theta, unsigned int order = 4, unsigned int p = 2) { - int iknots = xi.n_elem - 2 * order; + unsigned int iknots = xi.n_elem - 2 * order; arma::mat xi_mat(xi.n_elem - 1, iknots); arma::vec xi_to_insert = xi(arma::span(order, order + iknots - 1)); @@ -155,20 +155,21 @@ Rcpp::NumericVector weigh_iknots(const arma::vec& xi, const arma::mat& theta, un arma::mat w_mat(iknots, theta.n_cols); arma::vec w_vec(iknots); - int i,j,l; + unsigned int i,j; + int l; - for(j = 0; j < iknots; ++j) { - l = 0; - for(i = 0; i < xi_mat.n_rows; ++i) { + for(j = 0; j < iknots; ++j) { + l = 0; + for(i = 0; i < xi_mat.n_rows; ++i) { if (i == order + j) { ++l; - } + } xi_mat(i, j) = xi(i + l); } } - for(i = 0; i < w_mat.n_cols; ++i) { - for(j = 0; j < iknots; ++j) { + for(i = 0; i < w_mat.n_cols; ++i) { + for(j = 0; j < iknots; ++j) { w_mat(j, i) = arma::norm(theta.col(i) - knot_insertion_hat_matrix(xi_to_insert(j), xi_mat.col(j), order) * theta.col(i), p); } } diff --git a/src/cpr.cpp b/src/cpr.cpp index 322830f..d45def4 100644 --- a/src/cpr.cpp +++ b/src/cpr.cpp @@ -2,50 +2,13 @@ #include #include "cpr.h" -bspline::bspline(){} +/* ************************************************************************** */ +/* bbasis */ -bspline::bspline(arma::vec & x, unsigned int j_, unsigned int order_, arma::vec & knots_){ - j = j_; - order = order_; - knots = knots_; - spline.zeros(x.n_elem); - - for (unsigned int i = 0; i < x.n_elem; ++i) { - if (x(i) >= knots(j) && x(i) <= knots(j + order)) { - spline(i) = B(x(i), j, order); - } - } +bbasis::bbasis() { } -double bspline::w(double x, unsigned int j_, unsigned int k_) { - double w = 0.0; - if (knots(j_ + k_ - 1) != knots(j_)) { - w = (x - knots(j_)) / (knots(j_ + k_ - 1) - knots(j_)); - } - - return(w); -} - -double bspline::B(double x, unsigned int j_, unsigned int k_) { - double rtn; - - if (k_ == 1) { - if ((knots(j_) <= x) && (x < knots(j_ + 1))) { - rtn = 1.0; - } else { - rtn = 0.0; - } - } else { - rtn = w(x, j_, k_) * B(x, j_, k_ - 1) + (1.0 - w(x, j_ + 1, k_)) * B(x, j_ + 1, k_ - 1); - } - - return(rtn); -} - -bbasis::bbasis() { -} - -bbasis::bbasis(arma::vec & x, arma::vec & iknots_, arma::vec & bknots_, unsigned int order_) { +bbasis::bbasis(arma::vec & x, arma::vec & iknots_, arma::vec & bknots_, unsigned int order_) { order = order_; iknots = iknots_; bknots = bknots_; @@ -66,21 +29,67 @@ bbasis::bbasis(arma::vec & x, arma::vec & iknots_, arma::vec & bknots_, unsigned knots(order + i) = iknots(i); } - if (!knots.is_sorted()) { + if (!knots.is_sorted()) { Rcpp::warning("Sorting knots"); knots = arma::sort(knots); } - // define the basis matrix + // define the basis matrix for(j = 0; j < order + iknots.n_elem; ++j) { bmat.col(j) = bspline(x, j, order, knots).spline; } arma::uvec bx = arma::find(x == bknots(1)); arma::uvec jx(bx.n_elem); jx.fill(j - 1); - bmat(bx, jx).fill(1.0); + bmat(bx, jx).fill(1.0); } +/* ************************************************************************** */ +/* bspline */ + +bspline::bspline(){} + +bspline::bspline(arma::vec & x, unsigned int j_, unsigned int order_, arma::vec & knots_){ + j = j_; + order = order_; + knots = knots_; + spline.zeros(x.n_elem); + + for (unsigned int i = 0; i < x.n_elem; ++i) { + if (x(i) >= knots(j) && x(i) <= knots(j + order)) { + spline(i) = B(x(i), j, order); + } + } +} + +double bspline::w(double x, unsigned int j_, unsigned int k_) { + double w = 0.0; + if (knots(j_ + k_ - 1) != knots(j_)) { + w = (x - knots(j_)) / (knots(j_ + k_ - 1) - knots(j_)); + } + + return(w); +} + +double bspline::B(double x, unsigned int j_, unsigned int k_) { + double rtn; + + if (k_ == 1) { + if ((knots(j_) <= x) && (x < knots(j_ + 1))) { + rtn = 1.0; + } else { + rtn = 0.0; + } + } else { + rtn = w(x, j_, k_) * B(x, j_, k_ - 1) + (1.0 - w(x, j_ + 1, k_)) * B(x, j_ + 1, k_ - 1); + } + + return(rtn); +} + +/* ************************************************************************** */ +/* controlpolygon */ + controlpolygon::controlpolygon(bbasis & bmat_, arma::vec & theta_) { bmat = bmat_; theta = theta_; @@ -88,16 +97,24 @@ controlpolygon::controlpolygon(bbasis & bmat_, arma::vec & theta_) { xi_star = greville_sites(bmat.knots, bmat.order); } +/* ************************************************************************** */ +/* greville_sites */ arma::vec greville_sites(arma::vec & xi, unsigned int order) { arma::vec xi_star(xi.n_elem - order); - for (int i = 0; i < xi_star.n_elem; ++i) { + for (unsigned int i = 0; i < xi_star.n_elem; ++i) { xi_star(i) = arma::sum(xi(arma::span(i + 1, i + order - 1))) / double (order - 1); } return xi_star; } +/* ************************************************************************** */ +/* arma2vec */ Rcpp::NumericVector arma2vec(const arma::vec & x) { return Rcpp::NumericVector(x.begin(), x.end()); } + +/* ************************************************************************** */ +/* End of File */ +/* ************************************************************************** */ diff --git a/src/cpr.h b/src/cpr.h index 960c52d..c0893c0 100644 --- a/src/cpr.h +++ b/src/cpr.h @@ -5,22 +5,6 @@ #ifndef CPR_H #define CPR_H -struct bspline { - // member objects - arma::vec knots; // full knot sequence - unsigned int j; // jth spline - unsigned int order; // polynomial order - arma::vec spline; // the jth B-spline - - // constructors - bspline(); - bspline(arma::vec & x, unsigned int j_, unsigned int order_, arma::vec & knots_); - - // member methods, see de Boor (2001) page .... These - double B(double x, unsigned int j_, unsigned int k_); - double w(double x, unsigned int j_, unsigned int k_); -}; - struct bbasis { // member objects unsigned int order; // polynomial order @@ -33,15 +17,22 @@ struct bbasis { // constructors bbasis(); bbasis(arma::vec & x, arma::vec & iknots_, arma::vec & bknots_, unsigned int order_); +}; + +struct bspline { + // member objects + arma::vec knots; // full knot sequence + unsigned int j; // jth spline + unsigned int order; // polynomial order + arma::vec spline; // the jth B-spline + + // constructors + bspline(); + bspline(arma::vec & x, unsigned int j_, unsigned int order_, arma::vec & knots_); - // operators - // bbasis &operator =(const bbasis & b) { - // order = b.order; - // iknots = b.iknots; - // bknots = b.bknots; - // knots = b.knots; - // bmat = b.bmat; - // } + // member methods, see de Boor (2001) page .... These + double B(double x, unsigned int j_, unsigned int k_); + double w(double x, unsigned int j_, unsigned int k_); }; struct controlpolygon { @@ -54,7 +45,7 @@ struct controlpolygon { controlpolygon(bbasis & bmat_, arma::vec & theta_); }; -arma::vec greville_sites(arma::vec & xi, unsigned int order); +arma::vec greville_sites(arma::vec & xi, unsigned int order); Rcpp::NumericVector arma2vec(const arma::vec & x); #endif diff --git a/src/matrix_rank.cpp b/src/matrix_rank.cpp index 51a7478..7ffbacf 100644 --- a/src/matrix_rank.cpp +++ b/src/matrix_rank.cpp @@ -6,15 +6,16 @@ //' //' Determine the rank (number of linearly independent columns) of a matrix. //' -//' Implimentation via the Armadillo C++ linear algrebra library. The function -//' returns the rank of the matix \code{x}. The computation is based on the -//' singular value decomposition of the matrix; a std::runtime_error excetion +//' Implementation via the Armadillo C++ linear algebra library. The function +//' returns the rank of the matrix \code{x}. The computation is based on the +//' singular value decomposition of the matrix; a std::runtime_error exception //' will be thrown if the decomposition fails. Any singular values less than -//' the tolerance are treated as zeros. The tolerance is max(m, n) * max_sv * -//' datum::eps, where m is the number of rows of x, n is the number of columns -//' of x, max_sv is the maximal singular value of x, and datum::eps is the -//' difference between 1 and the least value greater than 1 that is -//' representable. +//' the tolerance are treated as zeros. The tolerance is +//' \code{max(m, n) * max_sv * arma::datum::eps}, where \code{m} is the number +//' of rows of \code{x}, \code{n} is the number of columns of \code{x}, +//' \code{max_sv} is the maximal singular value of \code{x}, and +//' \code{arma::datum::eps} is the difference between 1 and the least value +//' greater than 1 that is representable. //' //' @param x a numeric matrix //' diff --git a/tests/test-bsplines.R b/tests/test-bsplines.R index 4a0c44a..6251de6 100644 --- a/tests/test-bsplines.R +++ b/tests/test-bsplines.R @@ -8,12 +8,12 @@ bmat <- bsplines(1:100, df = 52) print_bmat <- capture.output(bmat) stopifnot(any(grepl("First\\s\\d+\\srows:", print_bmat))) -stopifnot(print_bmat[1] == "Basis matrix dims: [100 x 52]") -stopifnot(print_bmat[2] == "Order: 4") -stopifnot(print_bmat[3] == "Number of internal knots: 48") -stopifnot(print_bmat[4] == "") -stopifnot(print_bmat[5] == "First 6 rows:") -stopifnot(print_bmat[6] == "") +stopifnot(identical(print_bmat[1], "Basis matrix dims: [100 x 52]")) +stopifnot(identical(print_bmat[2], "Order: 4")) +stopifnot(identical(print_bmat[3], "Number of internal knots: 48")) +stopifnot(identical(print_bmat[4], "")) +stopifnot(identical(print_bmat[5], "First 6 rows:")) +stopifnot(identical(print_bmat[6], "")) print_bmat <- capture.output(print(bmat, n = 1000)) stopifnot(!any(grepl("First\\s\\d+\\srows:", print_bmat))) @@ -21,10 +21,13 @@ stopifnot(!any(grepl("First\\s\\d+\\srows:", print_bmat))) ################################################################################ # Equivalent Basis Matrix stopifnot( - all.equal( - current = unclass(bsplines(0:10, iknots = c(2, 2.6, 7.8), bknots = c(0, 10), order = 4)) - , target = unclass(splines::bs(0:10, knots = c(2, 2.6, 7.8), Boundary.knots = c(0, 10), intercept = TRUE)) - , check.attributes = FALSE) + isTRUE( + all.equal( + current = unclass(bsplines(0:10, iknots = c(2, 2.6, 7.8), bknots = c(0, 10), order = 4)) + , target = unclass(splines::bs(0:10, knots = c(2, 2.6, 7.8), Boundary.knots = c(0, 10), intercept = TRUE)) + , check.attributes = FALSE + ) + ) ) ################################################################################ @@ -33,23 +36,26 @@ xvec <- seq(-1, 1, length = 25) iknots <- c(0.34, -0.23) x <- tryCatch(bsplines(xvec, iknots = iknots), warning = function(w) {w}) + stopifnot("testing if warning is given when sorting knots" = - class(x) == c("simpleWarning", "warning", "condition")) -stopifnot("test warning message" = x$message == "Sorting knots") + identical(class(x), c("simpleWarning", "warning", "condition")) + ) + +stopifnot("test warning message" = identical(x$message, "Sorting knots")) -stopifnot("expected bases generated after sorting knots" = +stopifnot("expected bases generated after sorting knots" = isTRUE( all.equal( current = unclass(suppressWarnings(bsplines(xvec, iknots = c(.34, -.23)))) , target = unclass(splines::bs(xvec, knots = iknots, intercept = TRUE)) , check.attributes = FALSE ) + ) ) ################################################################################ x <- tryCatch(bsplines(list(1:10)), error = function(e) { e }) -stopifnot(class(x) == c("simpleError", "error", "condition")) -stopifnot("error if list passed to bsplines" = - x$message == "x is a list. use cpr::btensor instead of cpr::bsplines.") +stopifnot(identical(class(x), c("simpleError", "error", "condition"))) +stopifnot("error if list passed to bsplines" = identical(x$message, "x is a list. use cpr::btensor instead of cpr::bsplines.")) ################################################################################ @@ -58,31 +64,30 @@ stopifnot("error if list passed to bsplines" = xvec <- seq(-1, 1, length = 25) x <- tryCatch(bsplines(xvec, df = 2), warning = function(w) {w}) -stopifnot(class(x) == c("simpleWarning", "warning", "condition")) -stopifnot(x$message == "df being set to order") - +stopifnot(identical(class(x), c("simpleWarning", "warning", "condition"))) +stopifnot(identical(x$message, "df being set to order")) x <- suppressWarnings(bsplines(xvec, df = 2)) y <- bsplines(xvec, df = 4) -stopifnot(all.equal(x, y, check.attributes = FALSE)) +stopifnot(isTRUE(all.equal(x, y, check.attributes = FALSE))) x <- suppressWarnings(bsplines(xvec, df = 4, order = 5)) y <- bsplines(xvec, df = 5, order = 5) z <- bsplines(xvec, order = 5) -stopifnot(all.equal(x, y, check.attributes = FALSE)) -stopifnot(all.equal(x, z, check.attributes = FALSE)) +stopifnot(isTRUE(all.equal(x, y, check.attributes = FALSE))) +stopifnot(isTRUE(all.equal(x, z, check.attributes = FALSE))) x <- tryCatch(bsplines(xvec, iknots = 0.2, df = 6), warning = function(w) {w}) -stopifnot(class(x) == c("simpleWarning", "warning", "condition")) -stopifnot(x$message == "Both iknots and df defined, using iknots") +stopifnot(identical(class(x), c("simpleWarning", "warning", "condition"))) +stopifnot(identical(x$message, "Both iknots and df defined, using iknots")) x <- suppressWarnings(bsplines(xvec, iknots = 0.2, df = 6)) y <- bsplines(xvec, iknots = 0.2) -stopifnot(all.equal(x, y, check.attributes = FALSE)) +stopifnot(isTRUE(all.equal(x, y, check.attributes = FALSE))) x <- bsplines(xvec, df = 7) y <- bsplines(xvec, iknots = trimmed_quantile(xvec, probs = 1:3/4)) -stopifnot(all.equal(x, y, check.attributes = FALSE)) +stopifnot(isTRUE(all.equal(x, y, check.attributes = FALSE))) ################################################################################ @@ -98,15 +103,18 @@ baseD2 <- spline.des(xi, xvec, derivs = rep(2, length(xvec)))$design cprD1 <- bsplineD(xvec, iknots = iknots) cprD2 <- bsplineD(xvec, iknots = iknots, derivative = 2L) -all.equal(cprD1[-100, ], baseD1[-100, ]) -all.equal(cprD2[-100, ], baseD2[-100, ]) +stopifnot(isTRUE( all.equal(cprD1[-100, ], baseD1[-100, ]))) +stopifnot(isTRUE( all.equal(cprD2[-100, ], baseD2[-100, ]))) x <- tryCatch(bsplineD(xvec, derivative = 1.5), error = function(e) {e}) -stopifnot(class(x) == c("simpleError", "error", "condition")) -stopifnot(x$message == "Only first and second derivatives are supported") +stopifnot(identical(class(x), c("simpleError", "error", "condition"))) +stopifnot(identical(x$message, "Only first and second derivatives are supported")) rm(x) x <- tryCatch(bsplineD(xvec, derivative = 3), error = function(e) {e}) -stopifnot(class(x) == c("simpleError", "error", "condition")) -stopifnot(x$message == "Only first and second derivatives are supported") +stopifnot(identical(class(x), c("simpleError", "error", "condition"))) +stopifnot(identical(x$message, "Only first and second derivatives are supported")) +################################################################################ +## End of File ## +################################################################################ diff --git a/tests/test-btensor.R b/tests/test-btensor.R index e61f766..87e7c87 100644 --- a/tests/test-btensor.R +++ b/tests/test-btensor.R @@ -3,16 +3,16 @@ library(cpr) test_warning <- function(x = NULL, msg = NULL) { stopifnot(!is.null(x)) stopifnot(!is.null(msg)) - stopifnot(class(x) == c("simpleWarning", "warning", "condition")) - stopifnot(x$message == msg) + stopifnot(identical(class(x), c("simpleWarning", "warning", "condition"))) + stopifnot(identical(x$message, msg)) invisible(TRUE) } test_error <- function(x, msg = NULL) { stopifnot(!is.null(x)) stopifnot(!is.null(msg)) - stopifnot(class(x) == c("simpleError", "error", "condition")) - stopifnot(x$message == msg) + stopifnot(identical(class(x), c("simpleError", "error", "condition"))) + stopifnot(identical(x$message, msg)) invisible(TRUE) } @@ -32,7 +32,7 @@ mm <- splines::bs(mtcars$hp, knots = c(100, 150), intercept = TRUE) ) -stopifnot(all.equal(mm, unclass(bm), check.attributes = FALSE)) +stopifnot(isTRUE(all.equal(mm, unclass(bm), check.attributes = FALSE))) ################################################################################ @@ -48,7 +48,7 @@ mm <- splines::bs(mtcars$hp, knots = c(100, 150), intercept = TRUE) ) -stopifnot(all.equal(mm, unclass(bm), check.attributes = FALSE)) +stopifnot(isTRUE(all.equal(mm, unclass(bm), check.attributes = FALSE))) ################################################################################ # 3D btensor matrix is constructed as expected @@ -63,7 +63,7 @@ mm <- splines::bs(mtcars$hp, knots = c(100, 150), intercept = TRUE) : splines::bs(mtcars$mpg, knots = c(12.2, 16.3, 21.9), intercept = TRUE)) -stopifnot(all.equal(mm, unclass(bm), check.attributes = FALSE)) +stopifnot(isTRUE(all.equal(mm, unclass(bm), check.attributes = FALSE))) ################################################################################ @@ -73,11 +73,12 @@ bm <- btensor(x = list(mtcars$disp, mtcars$hp, mtcars$mpg), iknots = list(numeric(0), c(100, 150), c(12.2, 16.3, 21.9))) -stopifnot( +stopifnot(isTRUE( all.equal( lapply(attr(bm, "bspline_list"), attr, which = "bknots"), lapply(list(mtcars$disp, mtcars$hp, mtcars$mpg), range) ) + ) ) x <- @@ -95,3 +96,5 @@ x <- tryCatch(btensor(x = list(mtcars$disp, mtcars$hp, mtcars$mpg), bknots = lis test_error(x, "length(bknots) == length(x) is not TRUE") ################################################################################ +## End of File ## +################################################################################ diff --git a/tests/test-influence_of.R b/tests/test-influence_of.R index 6f11b81..885fff8 100644 --- a/tests/test-influence_of.R +++ b/tests/test-influence_of.R @@ -6,8 +6,8 @@ bmat <- bsplines(x = seq(0, 6, length = 500), iknots = c(1.0, 1.5, 2.3, 4.0, 4.5)) theta <- c(1, 0, 3.5, 4.2, 3.7, -0.5, -0.7, 2, 1.5) omit_xi <- influence_of(cp(bmat, theta), c(6, 8)) -stopifnot(all.equal(omit_xi$weight$w[1], 0.5391355923)) #sprintf("%.10f", omit_xi$weight$w[1]) -stopifnot(all.equal(omit_xi$weight$w[2], 0.2775424520)) +stopifnot(isTRUE(all.equal(omit_xi$weight$w[1], 0.5391355923))) #sprintf("%.10f", omit_xi$weight$w[1]) +stopifnot(isTRUE(all.equal(omit_xi$weight$w[2], 0.2775424520))) ################################################################################ # test that influence weights for control polygon is as expected @@ -16,7 +16,7 @@ bmat <- bsplines(x = seq(0, 6, length = 500), theta <- c(1, 0, 3.5, 4.2, 3.7, -0.5, -0.7, 2, 1.5) iw <- influence_weights(cp(bmat, theta)) expected <- structure(list(iknots = c(1, 1.5, 2.3, 4, 4.5), w = c(1.28320371711094, 0.539135592305278, 0.558614618870046, 0.277542452002693, 0.647979474076723)), class = "data.frame", row.names = c(NA, -5L)) -stopifnot(all.equal(iw, expected)) +stopifnot(isTRUE(all.equal(iw, expected))) ################################################################################ # test that influence_weights for control net are as expected @@ -50,6 +50,9 @@ expected <- ) ) -stopifnot(all.equal(iw, expected)) +stopifnot(isTRUE(all.equal(iw, expected))) +################################################################################ +## End of File ## +################################################################################ diff --git a/tests/test-knot_expr.R b/tests/test-knot_expr.R index d3bdaca..3248983 100644 --- a/tests/test-knot_expr.R +++ b/tests/test-knot_expr.R @@ -4,5 +4,8 @@ library(cpr) bmat <- bsplines(mtcars$hp, df = 8) ke <- cpr:::knot_expr.cpr_bs(bmat, digits = 1) -stopifnot(all.equal(ke$xi_expr[[4]], bquote(xi[7]))) +stopifnot(isTRUE(all.equal(ke$xi_expr[[4]], bquote(xi[7])))) +################################################################################ +## End of File ## +################################################################################ diff --git a/tests/test-matrix_rank.R b/tests/test-matrix_rank.R index a0c2044..2e82c23 100644 --- a/tests/test-matrix_rank.R +++ b/tests/test-matrix_rank.R @@ -16,3 +16,7 @@ stopifnot(identical(matrix_rank(bmat), 15)) bmat <- bsplines(seq(0, 1, length = 100), iknots = c(0.001, 0.002)) stopifnot(identical(ncol(bmat), 6L)) stopifnot(identical(matrix_rank(bmat), 5)) + +################################################################################ +## End of File ## +################################################################################ diff --git a/tests/test-recover-know-spline.R b/tests/test-recover-know-spline.R index 3082200..63d084d 100644 --- a/tests/test-recover-know-spline.R +++ b/tests/test-recover-know-spline.R @@ -53,3 +53,6 @@ recover_spline <- function(k = 4L, start_with = 100L, seed, theta_dist_sd = 100, set.seed(42) stopifnot(recover_spline(start_with = 40L, progress = FALSE)$recovered) +################################################################################ +## End of File ## +################################################################################ diff --git a/tests/test-trimmed_quantile.R b/tests/test-trimmed_quantile.R index 1dc8c62..00fda04 100644 --- a/tests/test-trimmed_quantile.R +++ b/tests/test-trimmed_quantile.R @@ -12,11 +12,12 @@ e <- try(trimmed_quantile(1:100, trim = -3.9, prob = 1:23 / 24, name = FALSE), s stopifnot(inherits(e, "try-error")) stopifnot(attr(e, "condition")$message == "(converted from warning) Overruling trim less than 1 with trim = 1L") -stopifnot( +stopifnot(isTRUE( all.equal(suppressWarnings(trimmed_quantile(1:100, trim = 3.9, prob = 1:23 / 24, name = FALSE)) , quantile(4:97, prob = 1:23 / 24) , check.attributes = FALSE ) + ) ) ################################################################################ @@ -27,6 +28,9 @@ ux <- unique(x) xt <- x[!(x %in% range(x))] uxt <- ux[!(ux %in% range(ux))]#-c(which.min(ux), which.max(ux))] -stopifnot(all.equal(trimmed_quantile(x), quantile(uxt))) -stopifnot(all.equal(trimmed_quantile(x, use_unique = FALSE), quantile(xt))) +stopifnot(isTRUE(all.equal(trimmed_quantile(x), quantile(uxt)))) +stopifnot(isTRUE(all.equal(trimmed_quantile(x, use_unique = FALSE), quantile(xt)))) +################################################################################ +## End of File ## +################################################################################ diff --git a/vignettes/bsplines.Rnw b/vignettes/bsplines.Rnw deleted file mode 100644 index 05f32a7..0000000 --- a/vignettes/bsplines.Rnw +++ /dev/null @@ -1,417 +0,0 @@ -%\VignetteEngine{knitr::knitr} -%\VignetteIndexEntry{B-Splines and Control Polygons} -%\VignetteEncoding{UTF-8} - -% ---------------------------------------------------------------------------- % -\documentclass[10pt]{article} -\usepackage{fullpage} -\usepackage[T1]{fontenc} -\usepackage[utf8]{inputenc} -\usepackage{alltt} - -% ---------------------------------------------------------------------------- % -% Float setup -\usepackage{ctable} -\usepackage{floatrow} -\floatsetup[table]{framestyle=colorbox,framefit=yes,heightadjust=all,framearound=all,capposition=top} -\floatsetup[figure]{framestyle=colorbox,framefit=yes,heightadjust=all,framearound=all,capposition=bottom} -\usepackage{subcaption} - -% ---------------------------------------------------------------------------- % -% JSS style commands: \proglang, \pkg, and \code -\makeatletter -\@ifundefined{proglang}{% - \let\proglang=\textsf - \newcommand{\pkg}[1]{{\fontseries{b}\selectfont #1}} - \newcommand\code{\bgroup\@makeother\_\@makeother\~\@makeother\$\@codex} - \def\@codex#1{{\normalfont\ttfamily\hyphenchar\font=-1 #1}\egroup} -} -\makeatother - -% ---------------------------------------------------------------------------- % -% Math Stuff -\usepackage{amsmath,amssymb,amsfonts,mathrsfs} - -% ---------------------------------------------------------------------------- % -% new commands -\newcommand{\ie}{{\itshape i.e.}} -\newcommand{\eg}{{\itshape e.g.}} -\newcommand{\bs}[1]{\boldsymbol{#1}} -\newcommand{\card}[1]{\left|#1\right|} -\newcommand{\CP}{C\!P} % control polygon, or control point sequence -\newcommand{\CN}{C\!N} % control net -\DeclareMathOperator*{\argmin}{arg\,min} -\DeclareMathOperator*{\argmax}{arg\,max} - -% ---------------------------------------------------------------------------- % -% Bibtex -\usepackage{filecontents} -\usepackage[authoryear]{natbib} - - -% ---------------------------------------------------------------------------- % -% Front Matter -\title{B-Splines and Control Polygons} -\author{Peter DeWitt} - -% ---------------------------------------------------------------------------- % -% Begin the document -\begin{document} -\maketitle -\abstract{% - This vignette covers general theory and mathematics of B-Splines and their - corresponding control polygons. The use of the \pkg{cpr} package to build and - work with B-Splines and control polygons is presented to show examples of the - general concepts. Additional details about the control polygon reduction - approach for finding a parsimonious set of knots for B-spline regression model - are presented in \code{vignette(topic = "cpr-pkg", package = "cpr")}. - -<>= -# knitr::opts_chunk$set() -hook_source = knitr::knit_hooks$get('source') -knitr::knit_hooks$set(source = function(x, options) { - txt = hook_source(x, options) - # extend the default source hook - gsub('\\\\textbackslash\\{\\}ref\\\\\\{(.+)\\\\\\}', '\\\\ref{\\1}', txt) -}) -@ - -<<>>= -set.seed(42) -library(cpr) -@ -} - -\section{B-Splines} -\subsection{General Definition} -Many references are available for B-splines. Two very good references are -\citet{deboor2001} and \citet{prautzsch2002}. The goal of this document is to -provide a detailed exploration of B-splines in the context useful for -statistical models. - -B-splines get there name from `basis'-splines. That this, a set of basis -functions define a B-spline basis and space. A B-spline basis, $\bs{B}_{k, -\bs{\xi}} \left(\bs{x}\right),$ is defined by a polynomial -order\footnote{Recall: the order of a polynomial is the degree plus 1, \eg, -$x^n$ is a $n^{th}$ degree, or $(n+1)^{st}$ order, polynomial.} -$k$ and knot sequence $\bs{\xi}$ over the range of a numeric vector $\bs{x}.$ -A traditional knot sequence construction places $k$ fold boundary knots at -$\min\left(\bs{x}\right)$ and $\max\left(\bs{x}\right),$ with an additional $l -\geq 0$ internal knots located within $\text{range}\left(\bs{x}\right).$ -The basis matrix, $\bs{B}_{k, \bs{\xi}} \left( x\right) \in -\mathbb{R}^{\card{\bs{x}} \times \left(k + l\right)}$ is of the form -\begin{equation} - \label{eq:basis_matrix} - \bs{B}_{k, \bs{\xi}} \left(x \right) = - \begin{pmatrix} - B_{1, k, \bs{\xi}}\left(x_1\right) & B_{2, k, \bs{\xi}}\left(x_1\right) & \cdots & B_{k+l, k, \bs{\xi}}\left(x_1\right) \\ - B_{1, k, \bs{\xi}}\left(x_2\right) & B_{2, k, \bs{\xi}}\left(x_2\right) & \cdots & B_{k+l, k, \bs{\xi}}\left(x_2\right) \\ - \vdots & \vdots & \ddots & \vdots \\ - B_{1, k, \bs{\xi}}\left(x_{\card{\bs{x}}}\right) & B_{2, k, \bs{\xi}}\left(x_{\card{\bs{x}}}\right) & \cdots & B_{k+l, k, \bs{\xi}}\left(x_{\card{\bs{x}}}\right) \\ - \end{pmatrix} -\end{equation} -where $B_{j, k, \bs{\xi}} \left( \bs{x} \right)$ is the $j^{th}$ B-spline defined by de Boor's recursive algorithm -% -\begin{equation} - \label{eq:deboor_recursion} - B_{j, k, \bs{\xi}} \left(x\right) = - \omega_{j, k, \bs{\xi}}\left(x\right) B_{j, k-1, \bs{\xi}}\left(x\right) + \left(1 - - \omega_{j+1, k, \bs{\xi}} \right) B_{j+1, k-1, \bs{\xi}}\left(x\right), -\end{equation} -with -\begin{equation} - \label{eq:recurrence_relation_null} - B_{j, k, \bs{\xi}}\left(x\right) = 0 \quad \text{for } x \notin - \left[\xi_{j}, \xi_{j+k}\right), \quad - B_{j, 1, \bs{\xi}}\left(x\right) = - \begin{cases} - 1 & x \in \left[\xi_{j}, \xi_{j+1} \right)\\ - 0 & \text{otherwise}, - \end{cases} -\end{equation} -and -\begin{equation} - \label{eq:omega} - \omega_{jk}\left(x\right) = - \begin{cases} - \hfil 0 & \hfil x \leq \xi_{j} \\ - \frac{x - \xi_{j}} {\xi_{j + k - 1} - \xi_{j}} & \xi_{j} < x < \xi_{j+ k - 1} \\ - \hfil 1 & \hfil \xi_{j + k - 1} \leq x - \end{cases}. -\end{equation} -The basis is a partition of unity over the support of $\bs{x},$ -\ie, all rows sum to one, and the span of the basis defines a spline -space $\mathbb{S}_{k, \bs{\xi}} = \text{span } \bs{B}_{k, \bs{\xi}}.$ - -\subsection{Building B-spline basis} -Build B-spline basis via \code{bsplines}. -<<>>= -str(bsplines) -@ -<>= -stopifnot(length(formals(bsplines)) == 5L) -# if the above fails verify that the description of the arguments below are -# correct. -@ -\noindent There are \Sexpr{length(formals(bsplines))} arguments for the -\code{bsplines} call. -\begin{itemize} - \item \code{x} is a numeric vector as noted in the above notation. - \item \code{iknots} is a numeric vector of locations for the internal knots. - \item \code{df} an integer value for the number of degrees of freedom. - \code{df} will be equal to the \code{order + length(iknots)}. - \item \code{bknots} boundary knots. This needs only be a numeric vector of - length two. The \code{bsplines} call will generate the full knot sequence - $\bs{\xi}$ with the proper multiplicity of the boundary knots. - \item \code{order} the polynomial order. By default \code{order = 4L} - generates fourth-order B-splines, also called cubic B-splines as polynomial - order the polynomial degree plus one. -\end{itemize} -You only need to specify \code{iknots} or \code{df}. If both are provided a -warning will be given and \code{iknots} will be used in the construction of the -basis matrix. - -<>= -x <- seq(0, 6, length = 500) # will be used again -bmat <- bsplines(x = x, - iknots = c(1.0, 1.5, 2.3, 4.0, 4.5)) -bmat -@ - -\subsection{cpr::bsplines vs splines::bs} -The standard installation of \proglang{R} includes the -\pkg{splines} package and the \code{splines::bs} function for -generating the basis matrix of B-splines, \ie, the matrix shown in -\eqref{eq:basis_matrix}. The provided \pkg{cpr::bsplines} function -for generating B-spline basis matrices with the class -\code{cpr\_bs} has arguments and defaults which are beneficial, and required, -for the use in the control polygon reduction algorithm. -The differences in the functional arguments and the attributes -of the return objects between \code{splines::bs} and \code{cpr::bsplines} are -listed in Table~\ref{tab:bs_vs_bsplines}. -\begin{table} - \caption{Comparison of the arguments, with default values, for \code{splines::bs} and - \code{cpr::bsplines}. The attributes for the resulting \code{splines::bs} and - \code{cpr\_bs} objects are also reported.} - \label{tab:bs_vs_bsplines} - \centering -<>= -stopifnot( - names(attributes(bmat)) == - c("dim", "order", "iknots", "bknots", "xi", "xi_star", "class", "call", "environment")) -@ - -\begin{tabular}{lll} \hline - & \code{splines::bs} & \code{cpr::bsplines} \\ \hline - Arguments & & \\ - & \code{x} & \code{x} \\ - & \code{df} & \code{df} \\ - & \code{knots} & \code{iknots} \\ - & \code{degree = 3} & \code{order = 4L} \\ - & \code{Boundary.knots = range(x)} & \code{bknots = range(x)} \\ - & \code{intercept = FALSE} & -- \\ \hline -Attributes & & \\ - & \code{dim} & \code{dim} \\ - & \code{degree} & \code{order} \\ - & \code{knots} & \code{iknots} \\ - & \code{Boundary.knots} & \code{bknots} \\ - & \code{intercept} & -- \\ - & -- & \code{xi} \\ - & -- & \code{xi\_star} \\ - & \code{class} & \code{class} \\ - & -- & \code{call} \\ - & -- & \code{environment} \\ \hline -\end{tabular} -\end{table} - -A major difference between the two functions is related to the \code{intercept} -argument of \code{splines::bs}. By default, \code{splines::bs} will omit the -first column of the basis whereas \code{cpr::bsplines} will return the whole -basis. The omission of the first column of the basis generated by -\code{splines::bs} allows for additive \code{splines::bs} calls to be used on -the right-hand-side of a regression formula and generate a full rank design -matrix. If additive \code{cpr::bsplines} calls, or additive \code{splines::bs} -with \code{intercept = TRUE}, are on the right-hand-side of the regression -equation the resulting design matrix will be rank deficient. This is a result -of the B-splines being a partition of unity. As the CPR algorithm is based on -having the whole basis, {he \code{cpr::bsplines} function is provided to make it -easy to work with the whole basis without having to remember to use non-default -settings in \code{splines::bs}. The default call \code{splines::bs(x)} is -replicated by \code{cpr::bsplines(x)[, -1]} and the default call -\code{cpr::bsplines(x)} is replicated \code{splines::bs(x, intercept = TRUE)}. - -Specifying the polynomial order and knot sequence between the two functions -differ between \code{splines::bs} and \code{cpr::bsplines}. \code{splines::bs} -uses the polynomial \code{degree} whereas \code{cpr::bsplines} uses the -polynomial \code{order} (order = degree + 1) to define the splines. The default -for both \code{splines::bs} and \code{cpr::bsplines} is to generate cubic -B-splines. - -The call to \code{splines::bs} which will generate the same basis matrix as -\code{bmat} above is: -<<>>= -bmat_bs <- splines::bs(x = seq(0, 6, length = 500), - knots = c(1.0, 1.5, 2.3, 4.0, 4.5), - intercept = TRUE) -all.equal(unclass(bmat), unclass(bmat_bs), check.attributes = FALSE) -@ - -<>= -stopifnot(all.equal(unclass(bmat), unclass(bmat_bs), check.attributes = FALSE)) -@ - -For both \code{splines::bs} and \code{cpr::bsplines} only the degrees of freedom -or the internal knots need to be specified. If the end user specifies both, the -specified knots take precedence. If only \code{df} is specified then \code{df - -order} internal knots will be generated. - -There is a very important difference in way \code{splines::bs} and -\code{cpr::bsplines} build the sequence of internal knots. \code{splines::bs} -uses quantiles of the whole numeric vector $\bs{x}$ whereas \code{cpr::bsplines} -uses quantiles of the unique values of $\bs{x} \backslash \{\min(\bs{x}), -\max(\bs{x}) \}.$ -For example: the internal knots for a B-spline with 9 degrees of freedom on a -forth order spline with an intercept, \code{splines::bs} will generate a -sequence of internal knots via a call equivalent to -<<>>= -y <- sample(runif(100, 0, 6), 500, replace = TRUE) -# k <- degrees of freedom - order + (1L - intercept) -degf <- 9 -ordr <- 4 -k <- degf - ordr + (1 - as.integer(TRUE)) -p <- seq(0, 1, length = k + 2L)[-c(1, k + 2L)] -stats::quantile(y, probs = p) -@ -\noindent whereas \code{cpr::bsplines} will generate a sequence equivalent to -<<>>= -stats::quantile(unique(y)[-c(1, length(unique(y)))], - probs = seq(1, degf - ordr, by = 1) / (degf - ordr + 1)) -@ -\noindent The function \code{cpr::trimmed_quantile} is provided to generate such -sequences. -<<>>= -trimmed_quantile(y, probs = seq(1, degf - ordr, by = 1) / (degf - ordr + 1)) -@ - -The return object from both \code{splines::bs} and \code{cpr::bsplines} is a -matrix. The attributes returned include the argument values used to construct -the basis. The major difference in the attributes between the two objects is -that \code{cpr::bsplines} returns the full knot sequence, $\bs{\xi},$ in the -\code{xi} element and the Greville sites, $\bs{\xi}^{*},$ in the \code{xi\_star} -element which is useful for building control polygons. Lastly, the classes for -the two objects differ: \code{splines::bs} returns a three classes, -\code{c("bs", "basis", "matrix")} and \code{cpr::bsplines} returns two classes, -\code{c("cpr\_bs", "matrix")}. - -\subsection{Plotting B-spline Basis} -The \pkg{cpr} package provides an plotting -method, built on top of \pkg{ggplot2}~\citep{R-ggplot2}, -for \code{cpr\_bs} objects. -<