diff --git a/DESCRIPTION b/DESCRIPTION
index 4a91f36..9a2c671 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -14,7 +14,10 @@ Description: Anscombe's quartet are a set of four two-variable datasets that
     dataset, which is shifted, scaled and rotated in order to achieve target
     summary statistics. The general shape of the initial dataset is retained.
     The target statistics can be supplied directly or calculated based on a
-    user-supplied dataset.
+    user-supplied dataset. The 'datasauRus' package
+    provides further examples
+    of datasets that have markedly different scatter plots but share many
+    sample summary statistics.
 Imports:
     datasets,
     graphics,
diff --git a/cran-comments.md b/cran-comments.md
index b16048c..2384954 100644
--- a/cran-comments.md
+++ b/cran-comments.md
@@ -1,3 +1,9 @@
+This is a resubmission.
+
+I have reset par() at the end of the vignette because I changed par() earlier.
+
+There are no obvious references for the methods used in this package, because they are essentially trivial shifting, scaling and rotating of data (the latter using Cholesky decomposition of the sample covariance matrix). However, your suggestion has prompted me to include in the DESCRIPTION a citation of the related datasauRus package.
+
 ## R CMD check results
 
 0 errors | 0 warnings | 0 notes
diff --git a/vignettes/intro-to-anscombiser.Rmd b/vignettes/intro-to-anscombiser.Rmd
index 6ecc32d..3a69a44 100644
--- a/vignettes/intro-to-anscombiser.Rmd
+++ b/vignettes/intro-to-anscombiser.Rmd
@@ -21,9 +21,9 @@ The `anscombiser` package is named after the famous Anscombe's quartet datasets
 
 ## Creating datasets with identical summary statistics
 
-The `datasauRus` package (@datasauRus) provides further examples of datasets that have markedly different scatter plots but nevertheless share many sample summary statistics. These datasets were produced by using a simulated annealing algorithm that seeks to morph incrementally an initial dataset towards a target shape while maintaining the same sample summary statistics (@dpaper). In principle, any set of summary statistics can be used. Indeed, @datasauRus provides not only dataset that have the same values of Anscombe's statistics (essentially sample means, variances and correlation) but also datasets that are constrained to share the same sample median, interquartile range and Spearman's correlation.
+The `datasauRus` package (@datasauRus) provides further examples of datasets that have markedly different scatter plots but nevertheless share many sample summary statistics. These datasets were produced by using a simulated annealing algorithm that seeks to morph incrementally an initial dataset towards a target shape while maintaining the same sample summary statistics (@dpaper). In principle, any set of summary statistics can be used. Indeed, @datasauRus provides not only datasets that have the same values of Anscombe's statistics (essentially sample means, variances and correlation) but also datasets that are constrained to share the same sample median, interquartile range and Spearman's rank correlation.
 
-The `anscombiser` package takes a simpler and quicker approach to the same problem, using Anscombe's statistics. It uses shifting, scaling and rotating to transform the observations in an input dataset to achieve a target set of Anscombe's statistics. These statistics can be set directly or by calculating them from a target dataset, perhaps one of the Anscombe quartet. If the input dataset has statistics that are similar to the target statistics then the output dataset will look rather similar to the input dataset. Otherwise, the output dataset will be a squashed and/or rotated version of the input dataset, but the general shape of the input dataset will still be visible. It will be like viewing the input dataset from a different perspective.
+The `anscombiser` package takes a simpler and quicker approach to the same problem, using Anscombe's statistics. It uses shifting, scaling and rotating to transform the observations in an input dataset to achieve a target set of Anscombe's statistics. These statistics can be set directly or by calculating them from a target dataset, perhaps one of Anscombe's quartet. If the input dataset has statistics that are similar to the target statistics then the output dataset will look rather similar to the input dataset. Otherwise, the output dataset will be a squashed and/or rotated version of the input dataset, but the general shape of the input dataset will still be visible. It will be like viewing the input dataset from a different perspective. Thus, we can easily create many datasets that have different general natures but share the same values of Anscombe's statistics. In addition, this method works in more than two dimensions.