Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move self diagnostic mostly to the SDK side and add more compatibilit… #1658

Merged
merged 1 commit into from
Oct 9, 2024

Conversation

LeiGlobus
Copy link
Contributor

@LeiGlobus LeiGlobus commented Sep 4, 2024

This refactors the endpoint specific self-diagnostic command into an executable script that is installed with the SDK.

  • The old command now redirects to the new one, with a deprecation warning.
  • The default is to print to console, with the -z option creating a zipfile as before.
  • Some env/test related code has been moved to make it more accessible, for example _home() and ensure_compute_dir() previously in tokenstore.py (this is now in compute_dir.py, incorporated with Reid's LoginManager related PR)
  • The tests in -endpoint have been moved and modified/added to in their new -sdk location
  • The hardware_report.py tests are now included by default, though there were some overlaps
  • Now an endpoint UUID can be specified, and the -e flag will try using the executor to run a hello world function/task on it
  • The endpoint config/logs and local endpoint related info depends on whether globus-compute-endpoint is installed
  • The cat() method has been slightly refactored to work outside of click.echo, and now takes a list of wildcards in order to able to sort all matched files alphabetically to group endpoint files consecutively
  • Because it's installed with the SDK, potentially on non-linux systems, windows/mac equivalent have been added when there's an easy replacement, but not all commands are available in all OSes
  • A few new modules (colorama etc) were used to print OS-flexible colored/highlighted text, though they are not preserved in the resulting diagnostic.txt output.

The new self-diagnostic syntax examples:

globus-self-diagnostic -k 1 -p       # the -k limits log file sizes in case giant, the -p prints output to console instead of creating a zip file
globus-self-diagnostic               # This generates a zip file and prints only the test headers that are run while they are running
globus-self-diagnostic -p -e ep_uuid # This prints to console and sends a new task to the specified endpoint UUID

@LeiGlobus LeiGlobus added the no-news-is-good-news This change does not require a news file label Sep 4, 2024
@LeiGlobus LeiGlobus self-assigned this Sep 4, 2024
@LeiGlobus LeiGlobus marked this pull request as draft September 4, 2024 19:51
@LeiGlobus LeiGlobus force-pushed the sdk-self-diag-sc-33842 branch from 30a4aa8 to 5c576cd Compare September 6, 2024 04:01
@LeiGlobus LeiGlobus marked this pull request as ready for review September 6, 2024 04:10
@LeiGlobus LeiGlobus force-pushed the sdk-self-diag-sc-33842 branch from 5c576cd to b926067 Compare September 6, 2024 04:32
@LeiGlobus LeiGlobus removed the no-news-is-good-news This change does not require a news file label Sep 6, 2024
compute_sdk/globus_compute_sdk/sdk/utils/__init__.py Outdated Show resolved Hide resolved
compute_sdk/tests/unit/conftest.py Outdated Show resolved Hide resolved
compute_sdk/globus_compute_sdk/sdk/diagnostic.py Outdated Show resolved Hide resolved
@LeiGlobus LeiGlobus force-pushed the sdk-self-diag-sc-33842 branch 2 times, most recently from 8c5d88e to fb7baec Compare September 30, 2024 19:42
Copy link
Contributor

@khk-globus khk-globus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a rebase is in progress. I'll continue when that's all done. Meanwhile, "comments in progress" as well! 😉

compute_endpoint/globus_compute_endpoint/cli.py Outdated Show resolved Hide resolved
compute_endpoint/globus_compute_endpoint/cli.py Outdated Show resolved Hide resolved
compute_endpoint/globus_compute_endpoint/cli.py Outdated Show resolved Hide resolved
compute_endpoint/globus_compute_endpoint/cli.py Outdated Show resolved Hide resolved
@LeiGlobus LeiGlobus force-pushed the sdk-self-diag-sc-33842 branch 4 times, most recently from 006e6f5 to 08a3db9 Compare October 2, 2024 17:21
@LeiGlobus LeiGlobus force-pushed the sdk-self-diag-sc-33842 branch from 08a3db9 to 6be0681 Compare October 3, 2024 19:25
compute_sdk/globus_compute_sdk/sdk/diagnostic.py Outdated Show resolved Hide resolved
compute_sdk/setup.py Outdated Show resolved Hide resolved
@LeiGlobus LeiGlobus force-pushed the sdk-self-diag-sc-33842 branch from 122cf7e to d6563e1 Compare October 4, 2024 16:44
Copy link
Contributor

@khk-globus khk-globus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good; a few minor comments. Reid has already brought up the items of any importance that stood out to me, so you two go ahead.

Beyond those, I do wonder if the default output should be minimal, along the lines of what was globus-compute-endpoint self-diagnostic -z. Right now, there's a bunch of header information for each test, which I'm thinking is largely superfluous for most users. Just the file is all that's necessary, and makes it similarly easier to reference, show, and request of users in #help. Perhaps adding -v/--verbose to emit these headers instead?

Meanwhile, I observe that at least one test appears to fail when it it's not obvious to me that it should:

$ ll /etc/os-release
lrwxrwxrwx 1 root root 21 Sep 10 07:18 /etc/os-release -> ../usr/lib/os-release

$ globus-compute-diagnostic
...
== Diagnostic: python:cat(/etc/os-release) ==
No file named -
...

But generally, I'm looking forward to getting this out -- it's a nice addition!

compute_endpoint/globus_compute_endpoint/cli.py Outdated Show resolved Hide resolved
compute_endpoint/globus_compute_endpoint/cli.py Outdated Show resolved Hide resolved
compute_endpoint/tests/unit/test_cli_behavior.py Outdated Show resolved Hide resolved
compute_endpoint/tests/unit/test_endpointmanager_unit.py Outdated Show resolved Hide resolved
compute_sdk/tests/unit/test_authorizer_login_manager.py Outdated Show resolved Hide resolved
compute_sdk/tests/unit/test_hardware_report.py Outdated Show resolved Hide resolved
compute_sdk/tests/unit/test_diagnostic.py Outdated Show resolved Hide resolved
Comment on lines 235 to 240
"""
Some following tests previously used fakefs, which apparently is no longer
compatible with the new structure with lower level system calls. See:

https://stackoverflow.com/questions/75789398/how-to-use-pyfakefs-pytest-to-test-a-function-using-multiprocessing-queue-via # noqa
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm; this begins to imply to me that these are no longer unit tests then. Crossing the process barrier is one indication of that. Semantic, perhaps, but I wonder if we can mock an appropriate something else, fixture-ize it, or do we move these tests?

On the other hand, I'm not looking too hard at this particular test file at the moment, so I may be missing something more fundamentally obvious.

compute_sdk/tests/unit/test_diagnostic.py Outdated Show resolved Hide resolved
@LeiGlobus LeiGlobus force-pushed the sdk-self-diag-sc-33842 branch from 497b1ef to 2d2152c Compare October 5, 2024 20:16
Copy link
Contributor

@khk-globus khk-globus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is nearing the stage of bike-shedding; I'm ready to have this go in. @rjmello ?

compute_endpoint/tests/unit/test_endpointmanager_unit.py Outdated Show resolved Hide resolved
compute_sdk/tests/unit/test_diagnostic.py Outdated Show resolved Hide resolved
compute_endpoint/globus_compute_endpoint/cli.py Outdated Show resolved Hide resolved
@LeiGlobus
Copy link
Contributor Author

Note that sometimes one of the executor tests (such as tests/unit/test_executor.py::test_executor_shutdown_cancel_futures ) fails only on PY 3.8. That test seems unlinked to any of the changes of this PR, and is an occasional failure. Furthermore, I installed 3.8.18 on both my mac and Ubuntu and that test passes repeatedly (at least 5/5 tries), so I'll attribute it to an environment weirdness and not a real failure, and re-running it til it passes is a valid strategy.

@LeiGlobus LeiGlobus force-pushed the sdk-self-diag-sc-33842 branch 3 times, most recently from 199836c to 7aaa2ee Compare October 8, 2024 18:56
rjmello
rjmello previously approved these changes Oct 8, 2024
@LeiGlobus LeiGlobus force-pushed the sdk-self-diag-sc-33842 branch from db64f91 to be324ac Compare October 9, 2024 15:56
@LeiGlobus LeiGlobus requested a review from rjmello October 9, 2024 16:06
@LeiGlobus LeiGlobus merged commit b4979c1 into main Oct 9, 2024
21 checks passed
@LeiGlobus LeiGlobus deleted the sdk-self-diag-sc-33842 branch October 9, 2024 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants