Commit

Resubmission
paulnorthrop committed Oct 10, 2020
1 parent 112c7e9 commit 3ca7234
Showing 3 changed files with 12 additions and 3 deletions.
5 changes: 4 additions & 1 deletion DESCRIPTION
@@ -14,7 +14,10 @@ Description: Anscombe's quartet are a set of four two-variable datasets that
dataset, which is shifted, scaled and rotated in order to achieve target
summary statistics. The general shape of the initial dataset is retained.
The target statistics can be supplied directly or calculated based on a
-user-supplied dataset.
+user-supplied dataset. The 'datasauRus' package
+<https://cran.r-project.org/package=datasauRus> provides further examples
+of datasets that have markedly different scatter plots but share many
+sample summary statistics.
Imports:
datasets,
graphics,
6 changes: 6 additions & 0 deletions cran-comments.md
@@ -1,3 +1,9 @@
+This is a resubmission.
+
+I have reset par() at the end of the vignette because I changed par() earlier.
+
+There are no obvious references for the methods used in this package, because they are essentially trivial shifting, scaling and rotating of data (the latter using Cholesky decomposition of the sample covariance matrix). However, your suggestion has prompted me to include in the DESCRIPTION a citation of the related datasauRus package.
+
## R CMD check results

0 errors | 0 warnings | 0 notes
4 changes: 2 additions & 2 deletions vignettes/intro-to-anscombiser.Rmd
@@ -21,9 +21,9 @@ The `anscombiser` package is named after the famous Anscombe's quartet datasets

## Creating datasets with identical summary statistics

-The `datasauRus` package (@datasauRus) provides further examples of datasets that have markedly different scatter plots but nevertheless share many sample summary statistics. These datasets were produced by using a simulated annealing algorithm that seeks to morph incrementally an initial dataset towards a target shape while maintaining the same sample summary statistics (@dpaper). In principle, any set of summary statistics can be used. Indeed, @datasauRus provides not only dataset that have the same values of Anscombe's statistics (essentially sample means, variances and correlation) but also datasets that are constrained to share the same sample median, interquartile range and Spearman's correlation.
+The `datasauRus` package (@datasauRus) provides further examples of datasets that have markedly different scatter plots but nevertheless share many sample summary statistics. These datasets were produced by using a simulated annealing algorithm that seeks to morph incrementally an initial dataset towards a target shape while maintaining the same sample summary statistics (@dpaper). In principle, any set of summary statistics can be used. Indeed, @datasauRus provides not only datasets that have the same values of Anscombe's statistics (essentially sample means, variances and correlation) but also datasets that are constrained to share the same sample median, interquartile range and Spearman's rank correlation.

-The `anscombiser` package takes a simpler and quicker approach to the same problem, using Anscombe's statistics. It uses shifting, scaling and rotating to transform the observations in an input dataset to achieve a target set of Anscombe's statistics. These statistics can be set directly or by calculating them from a target dataset, perhaps one of the Anscombe quartet. If the input dataset has statistics that are similar to the target statistics then the output dataset will look rather similar to the input dataset. Otherwise, the output dataset will be a squashed and/or rotated version of the input dataset, but the general shape of the input dataset will still be visible. It will be like viewing the input dataset from a different perspective.
+The `anscombiser` package takes a simpler and quicker approach to the same problem, using Anscombe's statistics. It uses shifting, scaling and rotating to transform the observations in an input dataset to achieve a target set of Anscombe's statistics. These statistics can be set directly or by calculating them from a target dataset, perhaps one of Anscombe's quartet. If the input dataset has statistics that are similar to the target statistics then the output dataset will look rather similar to the input dataset. Otherwise, the output dataset will be a squashed and/or rotated version of the input dataset, but the general shape of the input dataset will still be visible. It will be like viewing the input dataset from a different perspective.

Thus, we can easily create many datasets that have different general natures but share the same values of Anscombe's statistics. In addition, this method works in more than two dimensions.
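
Editorial illustration (not part of the commit): the shift, scale and rotate idea described in the vignette text, together with the Cholesky decomposition mentioned in cran-comments.md, might be sketched in base R as follows. The function `match_stats` and its arguments are hypothetical names chosen for this sketch; it is not the package's own code or API, only one way to match a sample mean vector and sample covariance matrix under the assumption that both covariance matrices are positive definite.

```r
# Hypothetical helper (NOT part of anscombiser): transform x so that its
# sample means and sample covariance matrix equal the supplied targets.
match_stats <- function(x, target_mean, target_cov) {
  x <- as.matrix(x)
  # Shift: centre each column at zero
  xc <- sweep(x, 2, colMeans(x))
  # chol() returns an upper-triangular R with t(R) %*% R equal to its argument
  r_in <- chol(stats::cov(x))
  r_out <- chol(target_cov)
  # Scale and rotate: remove the input covariance, then impose the target one
  y <- xc %*% solve(r_in) %*% r_out
  # Shift again: move the column means to the target means
  sweep(y, 2, target_mean, "+")
}

# Targets calculated from the first of Anscombe's datasets (x1, y1)
target <- as.matrix(datasets::anscombe[, c("x1", "y1")])

# An arbitrary input dataset with a very different shape
set.seed(1)
input <- cbind(runif(50), rexp(50))

output <- match_stats(input, colMeans(target), stats::cov(target))
```

Because sample correlation is determined by the sample covariance matrix, matching means and covariances also matches the variances and correlation. The same function works unchanged for inputs with more than two columns, provided the targets have matching dimensions.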
