CI meeting notes
Attendees: Heather, Brad, Aymen, Giannis, Matteo, Shantenu
- End User docs
- Create new tab for 'Getting Started' on header
- Brad and Heather will work on the page next week
- Next hackathon is Aug 14 at 1 pm ET
- LandCover middleware
- keep image discovery for concurrency
Attendees: Brad, Giannis, Aymen, Heather, Matteo, Shantenu
- New tiling
- LandCover middleware
- Brad is working on middleware script with 5 stages
- Rivers middleware
- In work
- Hackathon review
- Matteo is developing document for Jupyter notebooks
- Brad is developing document for mock-up of end user instructions
Attendees:
Heather and Bento meet on simplifying Seals and Penguins parameters.
Attendees: Brad, Giannis, Aymen, Samira
- Rivers PR
- Giannis will test Seals tiling for Rivers to eliminate gdal calls
- Other
- Aymen will update middleware to python3
- Brad will work on entk_script for LandCover
Attendees: Brad, Giannis, Aymen, Heather, Shantenu, Matteo, Bento, George, Abdullah
- Rivers PR
- Penguins practice hackathon - https://pypi.org/project/iceberg-penguins.search/
No meeting
Attendees: Brad, Giannis, Aymen, Matteo, Shantenu
- packaging/installation testing/debugging hackathon
- Idea is to have use case teams swap packaging to evaluate usability of documentation and ease of installation.
- Brad to propose possible dates for LandCover and Rivers to work on Penguins/Seals in late June, then LandCover and Rivers swap when their respective packaging is ready.
- imports in python
- Brad fixed some of the paths and will look at other subdirs to see if there are others to fix before trying to install and run on bridges again.
- LandCover PR and notebooks
- PR approved
- Helen, Brad, Brian working on Notebooks
- PEARC paper
- completed copyright transfer
- Brad is working on expanding the paper - send him any ideas or edits
- due Mon, May 18
- Other
- IWP - Chandi/Ehsan working on tensorflow update
- Gulf Coast - Giannis has pipeline complete. Needs to evaluate run time. There may be restrictions on bridges if jobs run more than 12 hours.
Attendees: Brad, Giannis, Aymen, Samira
- Review of all issues/tickets.
- many issues closed, comments added, and pull requests merged
- project page updated
- Upcoming Rivers and LandCover PRs
- Samira pushed complete pipeline in python, will separate tiling, predicting, mosaicing tasks (3-4 weeks)
- Giannis can move ahead with one-script pipeline
- LandCover pipeline complete, awaiting Helen's approval to issue pull request
- Brad is looking into memory issue for class.py (landcover) on bridges RM-small. Aymen suggests looking at nvprof tool.
- Finish Penguins packaging
- Giannis fixed setup.py so rasterio works with numpy
- Aymen will push final commits eliminating visualization today
- Then, Brad will test install, merge with devel, merge with master, and post to pypi
- Gulf coast - Brad sent Giannis more data for testing
- IWP - up and running
Attendees: Brad, Giannis, Aymen, Matteo, Samira
- Tickets
- Disk quota per user
- got supplement to 15 TB (~12 for ICEBERG)
- will set a soft limit of 1 TB per user to be notified of any large changes
- Python conversion of Rivers
- CI will move forward with matlab version from devel_samira, while Samira modifies training output to work with python
- IWP
- Aymen will test with new Bridges solution
- Penguins packaging interactive session
- removed conda from Brad's .bashrc
- removed unneeded dependencies
- reordered dependencies in setup.py
- there is still some issue with numpy (see commit cb56a4c)
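A minimal sketch of how the 1 TB soft limit per user could be checked. It assumes each top-level directory under the project space belongs to one user and that the limit is in decimal terabytes; neither is stated in the notes, and the function names are hypothetical.

```python
import os

TB = 10**12  # decimal terabyte (assumption: the notes do not say which unit applies)


def directory_size(path):
    """Return the total size in bytes of the regular files under path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks so linked data is not counted twice.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def users_over_limit(root, soft_limit_bytes=1 * TB):
    """Return {directory name: size in bytes} for top-level dirs above the limit."""
    over = {}
    for entry in sorted(os.listdir(root)):
        user_dir = os.path.join(root, entry)
        if os.path.isdir(user_dir):
            size = directory_size(user_dir)
            if size > soft_limit_bytes:
                over[entry] = size
    return over
```

Walking a large file system this way is slow, so a site-provided quota tool is likely the better option where one exists.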
Attendees: Aymen, Brad, Giannis, Samira, Shantenu, Matteo, Heather
- /pylon5 space
- currently 400GB over our 4TB allocation
- Samira has no imagery to work with
- Brad will hassle users to clean up
- Giannis will submit ticket for tips on how to monitor and maintain
- penguins packaging
- still in work. Brad and Aymen will meet next week.
- IWP
- There seems to be an Intel configuration change on bridges that broke the code. Aymen will submit a ticket.
Attendees: Brad, Giannis, Aymen
- XSEDE allocation
- 125K GPU hours on bridges (110 dedicated to ICEBERG)
- 100K GPU hours on comet
- only 4 TB on /pylon5, we will clean up what is not needed there now and can submit a supplement request if needed.
- Reminder that Jetstream will be down this Sunday 3/15 9-12 CDT
- Rivers PR
- Feature/entk_pipeline merged to devel_samira and branch deleted
- devel_samira not merged yet (will be when we have replaced matlab code)
- LandCover PR
- experiment/classification merged to devel
- branch not deleted (need to add shapefile step)
- IWP
- Aymen will start profiling now that Chandi/Ehsan reviewed output and provided test data
- there may be read/write bottlenecks to watch out for
- results expected mid-April (4/17)
- Penguins release with middleware due Mar 31.
- Rutgers paper update - due today! Aymen will present at next AHM.
Attendees: Brad, Giannis, Aymen, Heather, Shantenu, Matteo
- decided to shift CI meeting back to 3 pm ET. AHM remains at 1 pm for this semester.
- Tickets - Rivers PR, LandCover issue #56
- Waiting on EnTK "tags" issue fix, or can use older version (which works for seals) and test.
- Brad will use LandCover and Penguins python code to replace matlab.
- Brad will meet with Helen next week.
- Penguins integration with the middleware
- 4D geolocation will also be integrated. ASIFT and RANSAC are complete. Will check with PGC to see if they have code for geolocation and orthorectification.
- EarthCube abstract/notebooks
- Perhaps Landcover or Rivers can attend AHM.
- Will create "tutorial" notebooks based on http://radical-cybertools.github.io/radical-pilot/basic_example.html
- Containers
- Users familiar with containers will use them (maybe expect them), but they are not required for functionality or by NSF.
- Rutgers paper update
- deadline extended to Mar 13.
- XSEDE proposal https://docs.google.com/document/d/10EJpggTwYsgTUZghKzz2Rfnre5KeCrfWuhnTp8K5GQc/edit
- Penguins/Middleware packaging - final cleanups and posting
- IWP - dependencies
- Gulf Coast - possible NSF supplement
- Rivers - can EnTK/RP script be developed from what we have?
- 2020 - deadlines, conferences, moves, etc.
- Calculated core hours and storage needed for each use case
- Giannis and Brad will finish next week
- Created new ticket so Aymen can have access to AI-GPU
- Working errors with Tylar and PGC
- Need model from Samira asap
- Invited to submit FGCS paper arising from escience paper. 2-3 other papers planned. PEARC deadline is Jan 22. Brad is moving to CO, but will be working remotely.
- XRAC proposal
- Penguins packaging/middleware
- Seals pull request
- IWP - test data and allocation
- Rivers - xsede ticket and AGU
- Due Dec 15, draft is here. Brad to send to PI teams for input.
- Check dependencies and inputs.
- Check file paths.
- IWP provided test data. They will submit their own XSEDE proposal.
- Ticket closed, keras module now includes skimage. Samira will present poster and EarthCube town hall at AGU and meet with possible collaborators.
Other: LandCover pull request - Giannis to update travis and pylint
- Rivers (Samira will join call) - XSEDE ticket opened
- Penguins packaging
- IWP
- Gulf Coast - forwarded issue #23 to Matt
- Seals pull request
- LandCover - atmcorr spectrum posted, meeting Helen next week to work on classification
Tickets
- Samira opened ticket but gets permission error, will contact XSEDE again.
- Can use and modify setup.py from Seals.
- Aymen and Ehsan will meet next week.
- USF-IMARS should remedy syntax error on their end.
- Will check with Bento on syntax error and what version dependencies can be removed.
- We should merge atmcorr updates. Meeting with Helen next week.
Other: confirmed Dec 6 f2f at Rutgers.
- Tickets
- Documentation (Readthedocs,wiki, notebooks)
- Timeline for Rivers release
- Matt McCarthy - Gulf coast landcover
- Going through issues
- Start generating auto documentation from code development. Brad created a branch with notebooks that contain examples on how to use ICEBERG.
- Rivers is to be released a month from now.
- XSEDE proposal to go in October 15th. ICW to be included.
- Presentation from Matt about the workflow.
Next steps:
- Matt and Tylar send us the github link to their code.
- Rutgers and Brad create EnTK Ensemble script around code.
- Test on XSEDE
- IWP - Chandi, Ehsan, Anna
- Summary of Samira's visit to Rutgers
Item 1: Bluewaters is currently the largest machine supporting image processing. RADICAL has used Bluewaters, but not for image processing. Bluewaters GPUs are pretty old. gapminder.org will be replaced by the permafrost discovery repository. XSEDE software and hardware environments are different from commercial clouds. ICEBERG appreciates the software and hardware building blocks of NSF-supported supercomputers for building end-to-end data analysis pipelines. The analysis pipelines from ICEBERG are somewhat similar. Make sure that the ICEBERG middleware can support the end-to-end use case, and separate the support of the use case from resource allocation. Need to understand the performance of the individual components. High-level plan: 1) get the components integrated and running on Bridges, 2) performance analysis, and 3) find a suitable platform to execute on. There is no scaling experiment yet that tells us what the allocation requirements are. Bluewaters wants to move forward with high-res imagery. We can fork the IWP repo in ICEBERG; this does not change ownership or code control, since we are integrating with ICEBERG. We are concentrating on prediction, not necessarily training.
Item 2:
- Codecov
- Samira visit
- Workshop - XSEDE accounts; who is presenting what
- Release status
Item 2: Three-point meeting: 1) CNN, 2) Git, Linux, Bridges, 3) RADICAL-EnTK pipeline on Bridges. There are 300 images on UCSB; these are not on Bridges.
Item 1: Codecov: what is the purpose of Codecov and how to interpret it.
Item 3: The CI team will probably be in a supporting role while Brad presents the demos and such. PNGO and PGC are coming to the workshop. ICEBERG requirements and moving forward will be based
Item 4: We are on time. Aymen should open a PR in the next few days.
- Penguins Use Case
- Rivers Use Case
- Workshop resource requirements
- Use case packaging - First project wide release.
- Workshop agenda
- XRAC proposal
- Tickets
- Hieu and Aymen met. Aymen is developing RP tools for this use case.
- Samira, Brad, and Giannis met this week. Giannis is working on running rivers on bridges. There is some conflict between keras and python. Rivers use case also led to discussion of Travis requirements. What is needed, what is overkill? We will present at next All-Hands Meeting.
- Giannis will update requirements and Brad will submit next week.
- Release decision to be made Aug 1. Will include at least Middleware and Seals and possibly Penguins.
- Workshop agenda
- Giannis to verify the allocation request for ICEBERG and update the proposal. Request 20TB of space initially and expand to 100TB progressively. Due Jul 15.
- Rivers on bridges
- Tickets: CLI config file pull request, Penguins pull request
- Distribution strategy
- Samira joined the call to clarify the bridges setup. To do:
- translate tiling from Matlab to python
- Samira to provide 3 band mosaic code on github and 3 band images on google
Samira is planning a visit to SBU or Rutgers in July.
- Config file pull request approved and merged. Aymen reinstalled Penguins code and it works on bridges.
- EnTK will be distributed separately from domain codes. New issue opened to package Seals for pip install.
- Discussion with PGC.
Tickets:
- Tickets
- Penguins
- Release readiness
- XSEDE workshop
- Use case kernels ready for release are Seals and Penguins. Landcover will probably be there, Rivers probably not.
- Ask for a Jetstream VM for two weeks.
- Tickets
- LandCover pull request
- Virtual environments on bridges
- XSEDE accounts for Hieu, Helen, Brian
- New use case "application" process
- Seawulf access
- Closed several outstanding Penguins issues. Aymen is reviewing the pull request. Also, will publicize the tiling code via the ICEBERG website.
- Helen did a code walkthrough of the LandCover pull request. Pull request accepted and merged!
- Create a script that installs everything that we need on Bridges and provide it as an executable. Virtualenv ticket opened.
- Sent XSEDE portal address to general channel to open accounts.
- Discussed the process for new use cases. From the CI perspective we need a use case document and an SRS document.
- Document produced by Giannis is sufficient. Shantenu should request the access.
- Tickets and issues
- RADICAL access to Seawulf
- Penguins
- Rivers
- Any other business
Item 1: Close all tickets from the ASIFT use case and reopen them when the use case picks up again. To do: Aymen.
Item 2: RADICAL can have access to Seawulf; Shantenu should submit a proposal. It is a SLURM cluster. We can arrange a meeting with Seawulf support if assistance is needed.
Item 3: Penguins in Bridges by next week.
Item 4: Giannis to give the code a try until next Friday.
Item 5: Workshop around 15-20.
- Issues and PRs
- ICEBERG Command
- Workshop
Item 3: Giannis to do a back-of-the-envelope calculation of the memory consumption of RP and EnTK for 20 users.
- Pull requests from use cases
- Rivers code on Bridges
- ASIFT
- Any other business
- Cosmetic comments to all use cases.
- Rivers is almost ready to run on Bridges.
- There are two PRs open for ASIFT, one from Mike and one from Aymen. These need to be merged after reviews. Aymen is working on an image parser that generates a CSV file of the task arguments. A PR will be opened a week from now.
- Giannis created a document to define input parameters and is expecting responses from sites.
- Possibly create a proposal for workshop users.
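A minimal sketch of the kind of image parser mentioned for ASIFT above: it lists the images in a directory and writes one CSV row per image for use as task arguments. The column names, the file extensions, and the function name are assumptions for illustration; the notes do not describe Aymen's actual implementation.

```python
import csv
import os

# Assumed image extensions; the notes do not say which formats are used.
IMAGE_EXTENSIONS = (".tif", ".tiff")


def write_task_arguments(image_dir, csv_path):
    """Write one CSV row per image found in image_dir and return the row count."""
    rows = []
    for name in sorted(os.listdir(image_dir)):
        if name.lower().endswith(IMAGE_EXTENSIONS):
            full = os.path.join(image_dir, name)
            # Hypothetical columns: the image path and its size in bytes.
            rows.append({"image_path": full, "size_bytes": os.path.getsize(full)})
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["image_path", "size_bytes"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Each row can then be read back with `csv.DictReader` and turned into the arguments of one task.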
- Seals use case PRs #48 and #44. Discussion on Validation suite.
- Penguins use case
- ASIFT
- Rivers
- Testing
- Other business
Item 3: Mike talked to Dr. Christoffer Heckman. He suggested that the ASIFT implementation results be fed to a neural network to classify whether a result is good or not. There is a meeting with David Crandall on Monday and Aymen will join the call. There is code for the RANSAC filter and tiling preprocessing.
Item 1: We are waiting for Bento's approval of PR #48, which will trigger a release to master. Brad or Bento will follow up on issue #17.
Item 5: Giannis has developed a document on testing. We will add more code examples and distribute it next week. We will use Travis for all other use cases. No code will be pushed to devel branches unless it passes the Travis test. Create a template covering all the kinds of tests that we would like to perform for every use case, with clear instructions about Travis. We will have a Q/A session for all use cases regarding the four tests.
Item 2: Hieu pushed his code and has a fix1 branch for fixing and breaking out his code.
Item 4: Started pushing the training code in the devel branch of their repo.
- Seals use case
- ASIFT
- ICEBERG Middleware and User Interface
- Testing
- Next Use Cases
- Next FtF
Item 1: Issue 46 is blocking the release. PR 48 is holding it. There might be a need for new repos for reproducing paper results.
Item 2: The provided dataset contains corrupted images and Aymen obtained a new dataset from Utah university. A stress test with 10k by 10k pixel tiles is being executed. Progress on ASIFT is becoming a serious bottleneck; project PIs should resolve the bottleneck.
Item 3: 1) ICEBERG 1.0 can be command line. It should support NSF HPC and any extra strategic resources. It should support scalable pipelines and a level of automation in deciding where these pipelines should land (i.e. the type of resources). Ideally 1.0 should be functional and include an autonomic middleware for resource selection. 2) Do we need an interface like the one in ArcGIS, or ArcGIS itself as an interface? Shantenu's suggestion: start probing the capabilities of Web interfaces and see where we are going.
Item 4: Create acceptance test infrastructure to make sure that any development is up to the standards of the ICEBERG middleware. Developers from the science sides are responsible for producing unit tests for their code.
Item 5: Rivers are working on development and getting up to speed with Linux and XSEDE. Landcover has no new development since the team is in the field.
Item 6: Next FtF should be in the second week of Feb. Probably Feb 22nd.
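For the 10k by 10k pixel tile stress test in Item 2, a minimal sketch of how tile windows could be computed for an image of a given size. It assumes non-overlapping tiles with smaller tiles at the right and bottom edges; the notes do not say how the actual tiling code handles edges or overlap, and the function name is hypothetical.

```python
def tile_windows(width, height, tile_size=10_000):
    """Yield (x_offset, y_offset, tile_width, tile_height) windows covering the image."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            # Edge tiles are clipped so they never extend past the image.
            yield (x, y, min(tile_size, width - x), min(tile_size, height - y))


# Example: a 25,000 x 10,000 pixel image yields two full tiles and one narrower edge tile.
windows = list(tile_windows(25_000, 10_000))
```

The windows can be passed to whichever raster library reads the imagery, one window per task.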
- Rivers Use Case
- Seals Validation Suite
- ASIFT CPU-GPU comparison
- ICEBERG middleware discussion
- Testing
- Next use cases
Item 1: Discussed how to connect to XSEDE and Bridges with Enbo and Samira (Rivers): how to set up the environment and how to execute something small.
Item 2: As soon as Bento says okay with the output, merge PR #44.
- Seal Use case Milestone
- XRAC proposal
- Testing and continuous integration
- April Release
- Landcover
Item 1: Data transfer is starting on Monday. Seawulf will be shut down from December 24. The milestone is moved 3 weeks ahead. To do: Get the whole dataset on Bridges (Brad). Make sure memory consumption and process launching are not an issue on Bridges (Giannis).
Item 2: Try to have all the input from next week, so that Giannis can finalize the text
Item 3: Introduced testing concepts and Travis. Programmers from each site should try to participate during the CI meetings. Giannis should send documentation and examples for testing.
Item 4: Try to come up with an ICEBERG middleware and try to include Seals, Penguins, Atmospheric correction, and other preprocessing modules. We want to see whether ASIFT can be included.
Item 5: Got an update for development phase.
- Seal Use case Milestone progress
- ASIFT Use case Milestone progress
- XRAC proposal for ICEBERG
- December 7th FtF
Item 1: We expect a pull request by Tuesday at the latest for the kernels to be in sync with devel. New models are being created; expect them in 4-5 days. Data are moving to Bridges.
Item 2: Brad will contact Mike. Try to find a way to speed up this use case. The same applies to the Rivers use case.
Item 3: Use case paragraphs will be filled in. Domain science teams will provide us with some computation estimates. The CI team should convert those into resource requests.
Item 4: Start thinking about a three-day workshop during 2019.
- XSEDE supplement
- performance and scaling for website
- use case tickets
- other business
Item 1: Giannis and Shantenu will create the supplement. Brad has to create the Globus and move the data to Bridges.
Item 2: Matteo suggested adding a scaling plot of the kernel that Bento is using.
Item 3: Bento has a new kernel that will be incorporated in the pipeline. Start a preliminary discussion of the Landcover use case pipeline and iterate on the use case document.
Item 4: Brad had a meeting with the website developers and they started adding content to the website.
- Tickets
- Milestone Progress
- Codacy
- Any other business
Item 1: Giannis is responsible for Seals #34. The ASIFT repo is going to be restructured as soon as Mike commits code for Phase 2.
Item 2: Seals and ASIFT milestone projection is still valid.
Item 3: Send email to PSC for zoom meeting.
Item 4: Remove Codacy from use case repos. Giannis will try to find a GitHub hook or similar service for code coverage.
Item 5: Prepare the Landcover and Rivers use cases for next Friday.
- Validation Suites for all use cases: Discuss what the input/output might be.
- SRS documents. As per Heather’s request let’s try to have a version 1 of the Seals use case and then find the common parts between use cases.
- Tickets from use cases
- Any other business
Item 1: There is an access point to Google Drive with the input and output
Item 2: Finalized the Seals use case SRS document.
Item 3: Discussed the tickets on Seals, and ASIFT based on the milestones:
Item 4: Discussed LandCover use case: 1) Brad sent a pipeline figure to the CI mailing list. 2) The whole dataset is about 100TB with 30k images. 3) Open a ticket with XSEDE about data policy for 100TB.
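
As a rough check on the LandCover figures in Item 4, the average image size implied by about 100TB over 30k images can be worked out as below. This is a sketch that assumes decimal units (1 TB = 10^12 bytes) and treats both figures as approximate; the helper name is hypothetical.

```python
TB = 10**12  # decimal terabyte (assumption)
GB = 10**9   # decimal gigabyte (assumption)


def average_image_size_gb(total_bytes, image_count):
    """Return the mean size per image in gigabytes."""
    return total_bytes / image_count / GB


avg = average_image_size_gb(100 * TB, 30_000)
print(f"{avg:.2f} GB per image")  # prints "3.33 GB per image"
```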