Nick/link check (#359)
* Internal: Try new link checker

* Internal: Add codespell and fix typos.

* Internal: See if codespell precommit finds config.

* Internal: Found config. Now enable reading it

Closes #358
ntjohnson1 authored Jan 6, 2025
1 parent 589239b commit c3249ad
Showing 31 changed files with 114 additions and 64 deletions.
12 changes: 9 additions & 3 deletions .github/workflows/markdown-check.yml
@@ -7,8 +7,14 @@ on:
     branches: [ "main" ]

 jobs:
-  markdown-link-check:
+  check-links:
+    name: runner / linkspector
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v4
-      - uses: gaurav-nelson/github-action-markdown-link-check@v1
+      - uses: actions/checkout@v4
+      - name: Run linkspector
+        uses: umbrelladocs/action-linkspector@v1
+        with:
+          github_token: ${{ secrets.github_token }}
+          reporter: github-pr-review
+          fail_on_error: true
7 changes: 7 additions & 0 deletions .pre-commit-config.yaml
@@ -13,3 +13,10 @@ repos:
           --extra-keys=metadata.language_info metadata.vscode metadata.kernelspec cell.metadata.vscode,
           --drop-empty-cells
         ]
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.3.0
+    hooks:
+      - id: codespell
+        args: [ --toml, "pyproject.toml"]
+        additional_dependencies:
+          - tomli
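The `tomli` entry exists so the hook's isolated environment can parse `pyproject.toml` on Pythons older than 3.11 (the commit message's "Found config. Now enable reading it"). A minimal sketch of that config-lookup pattern, assuming nothing about codespell's internals:

```python
# Hypothetical sketch of reading a [tool.codespell] table from pyproject.toml.
# On Python >= 3.11 the stdlib tomllib suffices; older interpreters need the
# tomli backport, which is why the hook declares it as a dependency.
try:
    import tomllib  # Python >= 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # backport for older Pythons

with open("pyproject.toml", "rb") as f:
    config = tomllib.load(f).get("tool", {}).get("codespell", {})

print(config.get("ignore-words-list", []))
```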
4 changes: 2 additions & 2 deletions CHANGELOG.md
@@ -12,7 +12,7 @@
- Aligning comparison operator output for data classes (https://github.com/sandialabs/pyttb/pull/331)
- Improved:
- Getting starting documentation (https://github.com/sandialabs/pyttb/pull/324)
- - Development enviroment (https://github.com/sandialabs/pyttb/pull/329, https://github.com/sandialabs/pyttb/pull/330)
+ - Development environment (https://github.com/sandialabs/pyttb/pull/329, https://github.com/sandialabs/pyttb/pull/330)
- Documentation (https://github.com/sandialabs/pyttb/pull/328, https://github.com/sandialabs/pyttb/pull/334)

# v1.8.0 (2024-10-23)
@@ -93,7 +93,7 @@
- Addresses ambiguity of -0 by using `exclude_dims` (`numpy.ndarray`) parameter
- `ktensor.ttv`, `sptensor.ttv`, `tensor.ttv`, `ttensor.ttv`
- Use `exlude_dims` parameter instead of `-dims`
- - Explicit nameing of dimensions to exclude
+ - Explicit naming of dimensions to exclude
- `tensor.ttsv`
- Use `skip_dim` (`int`) parameter instead of `-dims`
- Exclude all dimensions up to and including `skip_dim`
16 changes: 11 additions & 5 deletions CONTRIBUTING.md
@@ -35,19 +35,25 @@ current or filing a new [issue](https://github.com/sandialabs/pyttb/issues).
```
git checkout -b my-new-feature-branch
```
- 1. Formatters and linting
+ 1. Formatters and linting (These are checked in the full test suite as well)
1. Run autoformatters and linting from root of project (they will change your code)
-    ```commandline
-    ruff check . --fix
-    ruff format
-    ```
+    ```commandline
+    ruff check . --fix
+    ruff format
+    ```
1. Ruff's `--fix` won't necessarily address everything and may point out issues that need manual attention
1. [We](./.pre-commit-config.yaml) optionally support [pre-commit hooks](https://pre-commit.com/) for this
1. Alternatively, you can run `pre-commit run --all-files` from the command line if you don't want to install the hooks.
1. Check typing
```commandline
mypy pyttb/
```
1. Not included in our pre-commit hooks because of slow runtime.
+ 1. Check spelling
+    ```commandline
+    codespell
+    ```
+    1. This is also included in the optional pre-commit hooks.
1. Run tests (at desired fidelity)
1. Just doctests (enabled by default)
2 changes: 1 addition & 1 deletion README.md
@@ -32,7 +32,7 @@ low-rank tensor decompositions:
[`cp_apr`](https://pyttb.readthedocs.io/en/stable/cpapr.html "CP decomposition via Alternating Poisson Regression"),
[`gcp_opt`](https://pyttb.readthedocs.io/en/stable/gcpopt.html "Generalized CP decomposition"),
[`hosvd`](https://pyttb.readthedocs.io/en/stable/hosvd.html "Tucker decomposition via Higher Order Singular Value Decomposition"),
- [`tucker_als`](https://pyttb.readthedocs.io/en/stable/tuckerals.html "Tucker decompostion via Alternating Least Squares")
+ [`tucker_als`](https://pyttb.readthedocs.io/en/stable/tuckerals.html "Tucker decomposition via Alternating Least Squares")

## Quick Start

4 changes: 2 additions & 2 deletions docs/source/tutorial/algorithm_cp_als.ipynb
@@ -122,7 +122,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Increase the maximium number of iterations\n",
"## Increase the maximum number of iterations\n",
"Note that the previous run kicked out at only 10 iterations, before reaching the specified convegence tolerance. Let's increase the maximum number of iterations and try again, using the same initial guess."
]
},
@@ -337,7 +337,7 @@
"source": [
"## Recommendations\n",
"* Run multiple times with different guesses and select the solution with the best fit.\n",
"* Try different ranks and choose the solution that is the best descriptor for your data based on the combination of the fit and the interpretaton of the factors, e.g., by visualizing the results."
"* Try different ranks and choose the solution that is the best descriptor for your data based on the combination of the fit and the interpretation of the factors, e.g., by visualizing the results."
]
}
],
2 changes: 1 addition & 1 deletion docs/source/tutorial/algorithm_gcp_opt.ipynb
@@ -19,7 +19,7 @@
"tags": []
},
"source": [
"This document outlines usage and examples for the generalized CP (GCP) tensor decomposition implmented in `pyttb.gcp_opt`. GCP allows alternate objective functions besides sum of squared errors, which is the standard for CP. The code support both dense and sparse input tensors, but the sparse input tensors require randomized optimization methods.\n",
"This document outlines usage and examples for the generalized CP (GCP) tensor decomposition implemented in `pyttb.gcp_opt`. GCP allows alternate objective functions besides sum of squared errors, which is the standard for CP. The code support both dense and sparse input tensors, but the sparse input tensors require randomized optimization methods.\n",
"\n",
"GCP is described in greater detail in the manuscripts:\n",
"* D. Hong, T. G. Kolda, J. A. Duersch, Generalized Canonical Polyadic Tensor Decomposition, SIAM Review, 62:133-163, 2020, https://doi.org/10.1137/18M1203626\n",
2 changes: 1 addition & 1 deletion docs/source/tutorial/algorithm_hosvd.ipynb
@@ -94,7 +94,7 @@
"metadata": {},
"source": [
"## Generate a core with different accuracies for different shapes\n",
"We will create a core `tensor` that has is nearly block diagonal. The blocks are expontentially decreasing in norm, with the idea that we can pick off one block at a time as we increate the prescribed accuracy of the HOSVD. To do this, we define and use a function `tenrandblk()`."
"We will create a core `tensor` that has is nearly block diagonal. The blocks are expontentially decreasing in norm, with the idea that we can pick off one block at a time as we increase the prescribed accuracy of the HOSVD. To do this, we define and use a function `tenrandblk()`."
]
},
{
2 changes: 1 addition & 1 deletion docs/source/tutorial/class_sptensor.ipynb
@@ -17,7 +17,7 @@
"metadata": {},
"source": [
"## Creating a `sptensor`\n",
"The `sptensor` class stores the data in coordinate format. A sparse `sptensor` can be created by passing in a list of subscripts and values. For example, here we pass in three subscripts and a scalar value. The resuling sparse `sptensor` has three nonzero entries, and the `shape` is the size of the largest subscript in each dimension."
"The `sptensor` class stores the data in coordinate format. A sparse `sptensor` can be created by passing in a list of subscripts and values. For example, here we pass in three subscripts and a scalar value. The resulting sparse `sptensor` has three nonzero entries, and the `shape` is the size of the largest subscript in each dimension."
]
},
{
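A rough sketch of the constructor that cell describes (subscript/value argument shapes assumed from the tutorial text, not verified against this commit):

```python
# Sketch: build a sparse tensor from three subscripts and one value each.
import numpy as np
import pyttb as ttb

subs = np.array([[0, 0, 0], [1, 1, 1], [3, 2, 4]])  # three nonzero locations
vals = np.array([[1.0], [1.0], [1.0]])              # one value per subscript
S = ttb.sptensor(subs, vals)
print(S.shape)  # (4, 3, 5): one past the largest subscript in each dimension
```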
2 changes: 1 addition & 1 deletion docs/source/tutorial/class_sumtensor.ipynb
@@ -54,7 +54,7 @@
"metadata": {},
"source": [
"## Creating sumtensors\n",
"A sumtensor `T` can only be delared as a sum of same-shaped tensors T1, T2,...,TN. The summand tensors are stored internally, which define the \"parts\" of the `sumtensor`. The parts of a `sumtensor` can be (dense) tensors (`tensor`), sparse tensors (` sptensor`), Kruskal tensors (`ktensor`), or Tucker tensors (`ttensor`). An example of the use of the sumtensor constructor follows."
"A sumtensor `T` can only be declared as a sum of same-shaped tensors T1, T2,...,TN. The summand tensors are stored internally, which define the \"parts\" of the `sumtensor`. The parts of a `sumtensor` can be (dense) tensors (`tensor`), sparse tensors (` sptensor`), Kruskal tensors (`ktensor`), or Tucker tensors (`ttensor`). An example of the use of the sumtensor constructor follows."
]
},
{
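A short sketch of the constructor described above (list-of-parts argument and `parts` attribute assumed from the tutorial text):

```python
# Sketch: a sumtensor holding one dense and one sparse part of identical
# shape; the parts are stored as-is rather than added together.
import pyttb as ttb

T1 = ttb.tenones((3, 4, 5))                # dense part
T2 = ttb.sptenrand((3, 4, 5), nonzeros=6)  # sparse part, same shape
T = ttb.sumtensor([T1, T2])
print(len(T.parts))  # 2
```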
2 changes: 1 addition & 1 deletion docs/source/tutorial/class_tenmat.ipynb
@@ -16,7 +16,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We show how to convert a `tensor` to a 2D numpy array stored with extra information so that it can be converted back to a `tensor`. Converting to a 2D numpy array requies an ordered mapping of the `tensor` indices to the rows and the columns of the 2D numpy array."
"We show how to convert a `tensor` to a 2D numpy array stored with extra information so that it can be converted back to a `tensor`. Converting to a 2D numpy array requires an ordered mapping of the `tensor` indices to the rows and the columns of the 2D numpy array."
]
},
{
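A sketch of that ordered mapping, with `to_tenmat` and its `rdims` argument assumed from pyttb's current API:

```python
# Sketch: unfold mode 0 to the rows; the remaining dims span the columns.
import numpy as np
import pyttb as ttb

T = ttb.tensor(np.arange(24.0).reshape((2, 3, 4), order="F"))
A = T.to_tenmat(rdims=np.array([0]))  # rows <- dim 0, cols <- dims 1 and 2
print(A.data.shape)  # (2, 12)
```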
4 changes: 2 additions & 2 deletions docs/source/tutorial/class_tensor.ipynb
@@ -107,7 +107,7 @@
"metadata": {},
"source": [
"## Specifying trailing singleton dimensions in a `tensor`\n",
"Likewise, trailing singleton dimensions must be explictly specified."
"Likewise, trailing singleton dimensions must be explicitly specified."
]
},
{
@@ -136,7 +136,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## The constitutent parts of a `tensor`"
"## The constituent parts of a `tensor`"
]
},
{
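A sketch of the trailing-singleton rule that cell describes (the constructor's `shape` argument is assumed from pyttb's current API):

```python
# Sketch: a third mode only exists if the shape names it explicitly.
import numpy as np
import pyttb as ttb

data = np.ones((4, 3))
T2 = ttb.tensor(data)                   # 2-way tensor, shape (4, 3)
T3 = ttb.tensor(data, shape=(4, 3, 1))  # explicit trailing singleton kept
print(T2.ndims, T3.ndims)  # 2 3
```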
2 changes: 1 addition & 1 deletion docs/source/tutorial/class_ttensor.ipynb
@@ -630,7 +630,7 @@
"metadata": {},
"source": [
"### Compare visualizations\n",
"We can compare the results of reconstruction. There is no degredation in doing only a partial reconstruction. Downsampling is obviously lower resolution, but the same result as first doing the full reconstruction and then downsampling."
"We can compare the results of reconstruction. There is no degradation in doing only a partial reconstruction. Downsampling is obviously lower resolution, but the same result as first doing the full reconstruction and then downsampling."
]
},
{
8 changes: 4 additions & 4 deletions profiling/algorithms_profiling.ipynb
@@ -90,7 +90,7 @@
" label:\n",
" The user-supplied label to distinguish a test run.\n",
" params:\n",
" Paramters passed to the algorithm function.\n",
" Parameters passed to the algorithm function.\n",
" 'rank' may be given to the CP algorithms; 'tol' and 'verbosity' to hosvd.\n",
" \"\"\"\n",
"\n",
@@ -108,7 +108,7 @@
" # stop collecting data, and send data to Stats object and sort\n",
" profiler.disable()\n",
"\n",
" # save profiling ouput to sub-directory specific to the function being tested.\n",
" # save profiling output to sub-directory specific to the function being tested.\n",
" output_directory = f\"./pstats_files/{algorithm_name}\"\n",
" if not os.path.exists(output_directory):\n",
" os.makedirs(output_directory) # create directory if it doesn't exist\n",
@@ -155,7 +155,7 @@
" label:\n",
" The user-supplied label to distinguish a test run. This will be used in the output file name.\n",
" params:\n",
" Paramters passed to the algorithm function.\n",
" Parameters passed to the algorithm function.\n",
" 'rank' may be given to the CP algorithms; 'tol' and 'verbosity' to hosvd.\n",
" \"\"\"\n",
"\n",
@@ -410,7 +410,7 @@
"source": [
"### Generating all algorithms' profiling images\n",
" \n",
"The cell bellow will generate all profiling images for all algorithms in `./gprof2dot_images/<specific_algorithm>`"
"The cell below will generate all profiling images for all algorithms in `./gprof2dot_images/<specific_algorithm>`"
]
},
{
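The helper shown in these hunks follows the standard `cProfile`/`pstats` flow; a self-contained sketch, with a toy workload standing in for the pyttb algorithms:

```python
import cProfile
import os
import pstats

profiler = cProfile.Profile()
profiler.enable()
sum(i * i for i in range(100_000))  # stand-in for an algorithm under test
profiler.disable()

# save profiling output to a sub-directory specific to the function tested
output_directory = "./pstats_files/example"
os.makedirs(output_directory, exist_ok=True)
stats = pstats.Stats(profiler).sort_stats("cumulative")
stats.dump_stats(f"{output_directory}/example.pstats")
```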
20 changes: 20 additions & 0 deletions pyproject.toml
@@ -40,6 +40,7 @@ dev = [
# Also in pre-commit
"ruff>=0.7,<0.8",
"pre-commit>=4.0,<5.0",
"codespell>=2.3.0,<2.4.0"
]
doc = [
"sphinx >= 4.0",
@@ -120,3 +121,22 @@ addopts = "--doctest-modules pyttb"
filterwarnings = [
"ignore:.*deprecated.*:"
]

+[tool.codespell]
+skip = [
+    # Built documentation
+    "./docs/build",
+    "./docs/jupyter_execute",
+    # Project build artifacts
+    "./build"
+]
+count = true
+ignore-words-list = [
+    # Conventions carried from MATLAB ttb (consider changing)
+    "ans",
+    "siz",
+    # Tensor/repo Nomenclature
+    "COO",
+    "nd",
+    "als",
+]
2 changes: 1 addition & 1 deletion pyttb/cp_als.py
@@ -76,7 +76,7 @@ def cp_als( # noqa: PLR0912,PLR0913,PLR0915
Example
-------
-    Random initialization causes slight pertubation in intermediate results.
+    Random initialization causes slight perturbation in intermediate results.
`...` is our place holder for these numeric values.
Example using default values ("random" initialization):
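A hypothetical usage sketch of the reproducibility point in that docstring: pinning numpy's global RNG makes the "random" initialization, and hence the intermediate values the doctest elides with `...`, repeat across runs (`tenrand` and the three-value return are assumed from pyttb's API):

```python
import numpy as np
import pyttb as ttb

np.random.seed(0)  # fix the RNG so "random" init is reproducible
T = ttb.tenrand((3, 4, 5))
M, Minit, info = ttb.cp_als(T, rank=2, maxiters=10)
```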
20 changes: 10 additions & 10 deletions pyttb/cp_apr.py
@@ -104,7 +104,7 @@ def cp_apr( # noqa: PLR0913
assert init.ndims == N, "Initial guess does not have the right number of modes"
assert (
init.ncomponents == rank
), "Initial guess does not have the right number of componenets"
), "Initial guess does not have the right number of components"
for n in range(N):
if init.shape[n] != input_tensor.shape[n]:
assert False, f"Mode {n} of the initial guess is the wrong size"
@@ -256,7 +256,7 @@ def tt_cp_apr_mu( # noqa: PLR0912,PLR0913,PLR0915
M.normalize(normtype=1)
Phi = [] # np.zeros((N,))#cell(N,1)
for n in range(N):
-        # TODO prepopulation Phi instead of appen should be faster
+        # TODO prepopulation Phi instead of append should be faster
Phi.append(np.zeros(M.factor_matrices[n].shape))
kktModeViolations = np.zeros((N,))

@@ -488,7 +488,7 @@ def tt_cp_apr_pdnr( # noqa: PLR0912,PLR0913,PLR0915

if isinstance(input_tensor, ttb.sptensor) and isSparse and precompinds:
# Precompute sparse index sets for all the row subproblems.
-        # Takes more memory but can cut exectuion time significantly in some cases.
+        # Takes more memory but can cut execution time significantly in some cases.
if printitn > 0:
print("\tPrecomuting sparse index sets...")
sparseIx = []
@@ -847,7 +847,7 @@ def tt_cp_apr_pqnr( # noqa: PLR0912,PLR0913,PLR0915

if isinstance(input_tensor, ttb.sptensor) and precompinds:
# Precompute sparse index sets for all the row subproblems.
-        # Takes more memory but can cut exectuion time significantly in some cases.
+        # Takes more memory but can cut execution time significantly in some cases.
if printitn > 0:
print("\tPrecomuting sparse index sets...")
sparseIx = []
@@ -989,12 +989,12 @@ def tt_cp_apr_pqnr( # noqa: PLR0912,PLR0913,PLR0915
delg[:, lbfgsPos] = tmp_delg
rho[lbfgsPos] = tmp_rho
else:
-                # Rho is required to be postive; if not, then skip the L-BFGS
+                # Rho is required to be positive; if not, then skip the L-BFGS
# update pair. The recommended safeguard for full BFGS is
# Powell damping, but not clear how to damp in 2-loop L-BFGS
if dispLineWarn:
warnings.warn(
"WARNING: skipping L-BFGS update, rho whould be "
"WARNING: skipping L-BFGS update, rho would be "
f"1 / {tmp_delm * tmp_delg}"
)
# Roll back lbfgsPos since it will increment later.
@@ -1384,7 +1384,7 @@ def tt_linesearch_prowsubprob( # noqa: PLR0913
max_steps:
maximum number of steps to try (suggest 10)
suff_decr:
-        sufficent decrease for convergence (suggest 1.0e-4)
+        sufficient decrease for convergence (suggest 1.0e-4)
isSparse:
sparsity flag for computing the objective
data_row:
@@ -1414,7 +1414,7 @@

stepSize = step_len

-    # Evalute the current objective value
+    # Evaluate the current objective value
f_old = -tt_loglikelihood_row(isSparse, data_row, model_old, Pi)
num_evals = 1
count = 1
@@ -1613,7 +1613,7 @@ def get_search_dir_pqnr( # noqa: PLR0913
lbfgsSize = delta_model.shape[1]

# Determine active and free variables.
-    # TODO: is the bellow relevant?
+    # TODO: is the below relevant?
# If epsActSet is zero, then the following works:
# fixedVars = find((m_row == 0) & (grad' > 0));
# For the general case this works but is less clear and assumes m_row > 0:
@@ -1747,7 +1747,7 @@ def calculate_phi( # noqa: PLR0913
Pi: np.ndarray,
epsilon: float,
) -> np.ndarray:
"""Calcualte Phi.
"""Calculate Phi.
Parameters
----------
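A toy sketch of the safeguard described in the `tt_cp_apr_pqnr` hunk above: an L-BFGS curvature pair is kept only when `rho = 1 / (s · y)` would be positive; otherwise the update is skipped (full BFGS would use Powell damping instead):

```python
import numpy as np

s = np.array([0.10, -0.20])  # change in the model row (delta model)
y = np.array([0.30, 0.05])   # change in the gradient (delta grad)
sy = float(s @ y)
if sy > 0:
    rho = 1.0 / sy           # safe to store this (s, y) pair
else:
    print(f"skipping L-BFGS update, rho would be 1 / {sy}")
```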
2 changes: 1 addition & 1 deletion pyttb/gcp/optimizers.py
@@ -512,7 +512,7 @@ def lbfgsb_func_grad(vector: np.ndarray):

lbfgsb_info["final_f"] = final_f
lbfgsb_info["callback"] = vars(monitor)
-        # Unregister monitor in case of re-use
+        # Unregister monitor in case of reuse
self._solver_kwargs["callback"] = monitor.callback

# TODO big print output
2 changes: 1 addition & 1 deletion pyttb/hosvd.py
@@ -116,7 +116,7 @@ def hosvd( # noqa: PLR0912,PLR0913,PLR0915
ranks[k] = np.where(eigsum > eigsumthresh)[0][-1]

if verbosity > 5:
print("Reverse cummulative sum of evals of Gram matrix:")
print("Reverse cumulative sum of evals of Gram matrix:")
for i, a_sum in enumerate(eigsum):
print_msg = f"{i: d}: {a_sum: 6.4f}"
if i == ranks[k]:
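A sketch of the truncation rule visible in this hunk: `ranks[k]` keeps the last index whose reverse cumulative eigenvalue sum still exceeds the threshold. The construction of `eigsum` as a reverse cumulative sum is an assumption based on the printed label:

```python
import numpy as np

eigvals = np.array([4.0, 2.0, 0.5, 0.01])  # evals of a Gram matrix, descending
eigsum = np.cumsum(eigvals[::-1])[::-1]    # reverse cumulative sum
eigsumthresh = 0.4
rank = np.where(eigsum > eigsumthresh)[0][-1]
print(rank)  # 2 -> truncate after the third eigenvalue
```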
6 changes: 3 additions & 3 deletions pyttb/ktensor.py
@@ -945,7 +945,7 @@ def to_tenmat(
Mapping of column indices.
cdims_cyclic:
When only rdims is specified maps a single rdim to the rows and
-            the remaining dimensons span the columns. _fc_ (forward cyclic)
+            the remaining dimensions span the columns. _fc_ (forward cyclic)
in the order range(rdims,self.ndims()) followed by range(0, rdims).
_bc_ (backward cyclic) range(rdims-1, -1, -1) then
range(self.ndims(), rdims, -1).
@@ -1378,7 +1378,7 @@ def normalize(

if sort:
if self.ncomponents > 1:
-                # indices of srting in descending order
+                # indices of string in descending order
p = np.argsort(self.weights)[::-1]
self.arrange(permutation=p)

@@ -2300,7 +2300,7 @@ def viz( # noqa: PLR0912, PLR0913
>>> fig, axs = K.viz(show_figure=False) # doctest: +ELLIPSIS
>>> plt.close(fig)
-        Define a more realistic plot fuctions with x labels,
+        Define a more realistic plot functions with x labels,
control relative widths of each plot,
and set mode titles.