Rutgers Activity
Ioannis Paraskevakos edited this page Aug 7, 2020 · 135 revisions
- Aymen is working on implementing an IWP wrapper that terminates the code as soon as the results (.shp files) are written to the local file system of the remote resource.
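A wrapper of this kind could be sketched as below. This is a minimal illustration using only the Python standard library, not the actual IWP wrapper: the function name `run_until_shapefiles`, its parameters, and the polling strategy are all assumptions. It launches a command, polls an output directory for `.shp` files, and terminates the process once the expected number of files exists.

```python
import subprocess
import time
from pathlib import Path


def run_until_shapefiles(cmd, output_dir, expected=1, poll_s=0.5, timeout_s=3600):
    """Run cmd and stop it once `expected` .shp files exist in output_dir.

    Hypothetical helper: returns the sorted list of shapefiles found.
    Raises TimeoutError if the files do not appear within timeout_s.
    """
    output_dir = Path(output_dir)
    proc = subprocess.Popen(cmd)
    deadline = time.monotonic() + timeout_s
    try:
        while time.monotonic() < deadline:
            found = sorted(output_dir.glob("*.shp"))
            if len(found) >= expected:
                return found
            if proc.poll() is not None:
                # The process ended on its own; report whatever it produced.
                return sorted(output_dir.glob("*.shp"))
            time.sleep(poll_s)
        raise TimeoutError(f"no shapefiles in {output_dir} after {timeout_s} s")
    finally:
        # Terminate the child if it is still running, escalating to kill.
        if proc.poll() is None:
            proc.terminate()
            try:
                proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
```

Note that the existence check is naive: a file that exists may still be in the middle of being written, and a shapefile normally comes with sidecar files (`.shx`, `.dbf`), so a real wrapper would likely need a stronger completion signal.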
- As part of the CI task to transition the Middleware to notebooks, Aymen is reading about JupyterHub.
- Updating the Rivers EnTK script to run at scale
- Giannis and Aymen transitioned ICEBERG-middleware to Python 3 and updated the penguins and middleware packages.
- Aymen integrated the Penguins use case with the ICEBERG-middleware under this pull request here
- Aymen downgraded the IWP-env packages and disabled the TensorFlow warnings in an attempt to make the IWP code exit normally (failed).
- Debugging Gulf coast use case pipeline.
- Developing IWP EnTK pipeline (work is on hold now due to this ticket #9)
- Decomposed the IWP main code into (Divide-Inference-Stitch).
- Restructured the code from MPI fashion to serial fashion (no internal parallelization) here.
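- We profiled the code in a slightly different way, using the vprof heatmap, to identify its bottlenecks:

      from vprof import runner

      def divide(input_image):
          ...  # some work

      # 'h' selects the code heatmap; stats go to a vprof server started with `vprof -r`
      runner.run(divide, 'h', args=(input_image,), host='localhost', port=8000)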
- We target three important metrics here: (1) execution time (TX), (2) physical memory consumption (MEM), and (3) GPU usage.
- We profiled 1 GeoTIFF image with a size of 1.6 GB (20,000 x 18,500 pixels) on XSEDE Bridges using 2 P100 GPUs:
  - Divide image profiling (CPU only): TX: 47.52 seconds. MEM: 2097 MB.
  - Inference image profiling (GPU/CPU): TX: 55 minutes. MEM: 12.06 MB. GPU:

    | GPU | PID | Type | Process name | GPU memory usage |
    |---|---|---|---|---|
    | 0 | 3208 | C | ...n/anaconda3/envs/polygons/bin/python3.6 | 15649MiB |
    | 1 | 3209 | C | ...n/anaconda3/envs/polygons/bin/python3.6 | 15649MiB |

  - Stitch image profiling (CPU only): TX: 52.12 seconds. MEM: 895 MB.
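For a quick cross-check of TX and MEM without a profiler server, a standard-library sketch along these lines may help. It is not the vprof-based method used above: `profile_stage` and the stand-in `divide` function are hypothetical, and `tracemalloc` only sees memory allocated through Python's allocators, so it reports neither the process's physical memory nor GPU memory.

```python
import time
import tracemalloc


def profile_stage(func, *args, **kwargs):
    """Return (result, elapsed_seconds, peak_mib) for one call to func."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak / 2**20


def divide(n_tiles):
    # Hypothetical stand-in for the Divide step: one 1 MiB buffer per tile.
    return [bytearray(1024 * 1024) for _ in range(n_tiles)]


tiles, tx, mem = profile_stage(divide, 8)
print(f"Divide (stand-in): TX: {tx:.4f} seconds. MEM: {mem:.1f} MiB.")
```

The same helper can wrap the Inference and Stitch steps, which gives one TX/MEM pair per stage in the same layout as the numbers reported above.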
- Implementing the IWP PST model (the first stage is done here)
- Started developing the workflow script for Gulf Coast
- Working on ICEBERG-middleware issues #16 and #27
- The IWP issue was solved today, thanks to the XSEDE help desk.
- Working on Profiling the IWP code.
- Working on integrating the Penguins to the Middleware.
- Working on Ticket #14
- Integrated the Penguins use case with the middleware.
- Working on profiling the IWP code.
- The next steps are:
1. Implement the IWP use case as stages and tasks using the PST model with EnTK.
2. Execute the IWP use case at scale.
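The Pipeline-Stage-Task (PST) idea behind these steps can be illustrated in plain Python. The sketch below deliberately does not use the EnTK API: `run_pipeline`, the task factories, and the toy "image" are all hypothetical stand-ins. It only shows the execution semantics assumed here, where stages run in order and the tasks inside one stage may run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def run_pipeline(stages, first_input, max_workers=4):
    """Run stages in order; tasks inside one stage run concurrently.

    `stages` is a list of (name, make_tasks) pairs, where make_tasks takes
    the previous stage's results and returns a list of zero-argument tasks.
    """
    previous = first_input
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name, make_tasks in stages:
            futures = [pool.submit(task) for task in make_tasks(previous)]
            # Waiting on every future acts as the barrier between stages.
            previous = [f.result() for f in futures]
            results[name] = previous
    return results


def divide_tasks(image):
    # One task that splits the toy image into tiles of 4 values each.
    return [lambda: [image[i:i + 4] for i in range(0, len(image), 4)]]


def inference_tasks(divide_results):
    # One task per tile; summing stands in for model inference.
    return [partial(sum, tile) for tile in divide_results[0]]


def stitch_tasks(inference_results):
    # One task that combines the per-tile results.
    return [lambda: sum(inference_results)]


stages = [
    ("divide", divide_tasks),
    ("inference", inference_tasks),
    ("stitch", stitch_tasks),
]
print(run_pipeline(stages, list(range(12))))
```

With this structure, executing at scale amounts to creating one such pipeline per image, with the middleware deciding how many run at once.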
- Finalized code for executing rivers use case at scale
- Working on the Geolocation use case for an extended version of the eScience paper.
- Working on integrating Penguins with the middleware
- Working on Gulf coast use case. I am able to run based on the latest instructions from Tyler.
- Aymen is still working on the Geolocation use case for the extended paper (eScience 2019). If you are interested in the work, please check my commits :). https://github.com/iceberg-project/Geolocation
- Aymen is working on the Geolocation use case for the extended paper (eScience 2019). If you are interested in the work, please check my commits :). https://github.com/iceberg-project/Geolocation
- Finalizing XRAC proposal for 2020.
- Working on issue #16 of ICEBERG-middleware
- Working on IWP.
- Working on an extended version of eScience paper
- Testing Gulf coast use case on Bridges.
- Finalizing XRAC proposal for 2020.
- Working on issue #16 of ICEBERG-middleware
- Got access to Ehsan's GPU-AI allocation, based on this ticket here.
- Opened another ticket regarding the IWP iwp_inference.py code not running on Bridges, based on this ticket here.
- Working on running the entire provided Penguins dataset using a RADICAL-Pilot script, based on this ticket here
- IWP : We had a meeting with Ehsan, and we are working on fixing the code to run on Bridges.
- Tested Gulf coast use case on Bridges.
- Environment used: Conda Python 3.7 with scikit-image and gdal installed from the conda-forge channel
- Found a series of syntax errors. Ticket
- Reviewed Seals issue 62 pull request from Brad
- Working on fixing gdal polygon readings on Bridges.
- Still trying to run the IWP code on Bridges with P100 GPUs.
- Working with Ehsan on the ticket (DL model).
- Working on porting IWP code to P100 nodes.
- Requested imagery from Ehsan in this ticket https://github.com/iceberg-project/Ice_wedge_polygons/issues/2
- Fixed ICEBERG Seals python dependencies for Bridges and uploaded a new release on PyPI.
- Working towards ICEBERG-middleware release.
- Created a Jetstream image for ICEBERG middleware. Working with TACC to resolve deployment issues.
- Creating presentation for eScience conference
- Penguins RP code is ready to run and merged with ICEBERG release.
- Strong/Weak Scaling experiments are coming soon.
- The IWP code requires GPU-AI nodes, which require a proposal to get access to them.
- ICEBERG Seals v1.2.1 uploaded to PyPI
- ICEBERG Middleware development is progressing. It is installed from PyPI under the name ICEBERG. Currently testing Seals use case support.
- Penguins RP code was implemented and is under testing here.
- Pull request from /feature/rp_script to /devel was created.
- Running and profiling the IWP code on Bridges.
- Create a pipeline to execute the Rivers use case on Bridges. It currently succeeds with 3 images. Pull Request #14
- ICEBERG Middleware development is progressing. Included a dataset discovery class. Next step is to include the Seals use case.
- Preparing ICEBERG middleware to be installable through PyPi.
- The eScience paper, titled "Workflow Design Analysis for High Resolution Satellite Image Analysis", was accepted. Started working on addressing the reviewers' comments.
- Created RADICAL-Pilot code for the Penguins use case.
- A new branch under Penguins repository was created here.
- Pull request will be created soon to merge both branches.
- Executed Rivers code on Bridges. Working on a pipeline to run at scale.
- Started working on ICEBERG middleware to support Seals.
- Executed Penguins code on Bridges successfully.
- Designed and implemented Radical Pilot code for the Penguins use case, code is in testing mode here RP_penguins.
- PR from minor_fix to devel is merged now.
- Ticket #7 is solved and closed.
- Testing the rp_penguins.py code on Bridges.
- Scale experiments are coming soon (the entire Penguins dataset).
- Submitted a paper to eScience 2019. Acceptance notification is July 8th. An arXiv version of the paper can be found here
- Started working on Rivers use case. First priority is to run the code on PSC Bridges.
- Working on Bridges RP code.
- Debugging Penguins code.
- Writing a paper based on the Seals use case. Title: Workflow Design Analysis for High Resolution Satellite Image Analysis
- Tile_planner.py (verbose) bug fixed.
- Ticket #30 is officially closed now.
- mmacferrin pull request was accepted and merged with devel branch.
- Working on running Penguins code on Bridges, issue #7 still not solved.
- New ticket was opened regarding the hardcoded file paths in the penguins repo, issue #8.
- Created Anaconda environment to run the penguins code on XSEDE Bridges.
- Testing the Penguins code on Bridges is failing due to this ticket #7.
- Working on closing the current pull request for the ASIFT use case.
- Working on Matching the Aerial and Satellite images (provided by Heather and Brad).
- Created an ICEBERG command line. There is an open Pull Request discussing bugs and usability of the command.
- Writing a paper based on the Seals use case. Deadline April 29th.
- Implementing fully working pipeline (Image parsing - Keypoint Generating - Ransac filter).
- Creating a unit test for the Image parser.
- Creating PR for feature/entk, under review.
- Closing tickets #19, #20, #22, #23, #25, #27.
- Working on the Seal use-case experimental design.
- Did spot-check memory profiling of the tiling and GPU tasks. The tiling task's memory usage cannot be explained by image size alone; the maximum reached 20 GB. We hypothesize it may be affected by the image attributes. The CNN remains at 2 GB regardless of image size. Currently, this reduces the maximum concurrency to 4 tiling tasks and 2 GPU tasks.
- fast_imas_IPOL is now cloned under iceberg project.
- Fixing Ransac code based on the issue 27.
- Pull request fix_issue27 was merged.
- Implementing the EnTK pipeline with 2 stages (Keypoints generation and Ransac filter).
- Working on a new experiment to characterize the performance of the entire EnTK pipeline.
- Working with Giannis to characterize different tiling methods.
- issue 22 and 27 are closed now.
- Giannis is working on the Seal use-case experimental design.
- 4D geolocation descriptors (Phase 1 keypoint generation) characterization. We characterized the performance of fast_imas_IPOL for SURF, SIFT, and RootSIFT, as well as the CudaSift descriptor.
- The performance characterization was done in terms of TTX (total time to execute).
- We used the GeoTIFF Landsat Image Mosaic of Antarctica (LIMA) for testing.
- We started with tiles of 2000 x 2000 pixels and went up to tiles of 5000 x 5000 pixels.
- We did not apply any image transformations to the image pairs.
- Performance characterization was done on XSEDE Bridges.
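To make the effect of tile size on task count concrete, here is a small sketch of a tile grid computation. The helper names and the 10000 x 8000 pixel example image are assumptions for illustration; the source does not state the dimensions of the imagery used in these experiments.

```python
import math


def tile_windows(width, height, tile):
    """Yield (x_off, y_off, w, h) windows covering a width x height image.

    Edge tiles are clipped, so they may be smaller than tile x tile.
    """
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield x, y, min(tile, width - x), min(tile, height - y)


def tile_count(width, height, tile):
    """Number of tiles needed to cover the image."""
    return math.ceil(width / tile) * math.ceil(height / tile)


# Hypothetical image size; 2000 and 5000 are the tile sizes named above.
for tile in (2000, 5000):
    print(tile, tile_count(10000, 8000, tile))
```

Smaller tiles mean more, shorter keypoint-generation tasks, while larger tiles mean fewer tasks with a higher memory footprint each, which is the trade-off a TTX characterization across tile sizes exposes.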
- Seals release v1.1 is done!
- Giannis created a Testing Protocol document. Link: Testing Guidelines
- Giannis introduces a new experimental implementation for the Seals use case.
- Meeting Notes between Aymen and Mike. Giannis and Matteo participated as well:
- ASIFT workflow Diagram will be updated by adding another stage “Image preprocessing”.
- Current Key points generation algorithm is default SIFT.
- Measure the accuracy of each descriptor (currently only human verification can measure accuracy; Mike will update this).
- Characterizing the current ASIFT implementation (SIFT/Root SIFT/SURF) in terms of TTX, false positive and false negative matches.
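Once a set of human-verified matches is available, false positives and false negatives could be counted with something like the sketch below. The function name and the representation of a match as a pair of keypoint ids are assumptions, not part of the ASIFT code.

```python
def match_quality(predicted, verified):
    """Compare predicted keypoint matches against human-verified ones.

    Both arguments are iterables of hashable match identifiers, for example
    (source_keypoint_id, target_keypoint_id) pairs (a hypothetical format).
    """
    predicted, verified = set(predicted), set(verified)
    tp = len(predicted & verified)   # matches found and verified
    fp = len(predicted - verified)   # matches found but not verified
    fn = len(verified - predicted)   # verified matches that were missed
    return {
        "true_positives": tp,
        "false_positives": fp,
        "false_negatives": fn,
        "precision": tp / (tp + fp) if predicted else 0.0,
        "recall": tp / (tp + fn) if verified else 0.0,
    }


# Made-up example data, for illustration only.
print(match_quality({(1, 10), (2, 20), (3, 31)}, {(1, 10), (2, 20), (4, 40)}))
```

Running this per descriptor (SIFT, Root SIFT, SURF) alongside TTX would put accuracy and cost in the same table.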
- Command Line Interface ticket #15 (closed).
- Read me files ticket #21 (closed).
- Forking the ASIFT GitHub repo.
- Aymen is compiling and executing the new ASIFT code.
- If the code passes the compiling and processing step, Aymen will adapt the code to the EnTK pipeline and start working on weak and strong scaling.
- Aymen created a new performance characterization comparing the current CPU implementations (SIFT/ROOT-SIFT); the plot is here
- We executed the whole dataset and found a bug in the code. This resulted in PR #48
- The underlying infrastructure underutilizes the resources. We are investigating to understand why.
- Executed Seals Use Case with all images. Working on understanding performance and creating an execution model. Profiling Notebook
- Working on Validation Suite and README files for users.
- Seals Milestone progress
- Aymen created plots for the GPU SIFT compared to the CPU SIFT. Notebook
- Executed Seals Use Case with 3096 actual images. Profiling Notebook
- Seals Milestone progress
- Aymen created plots for the GPU SIFT compared to the CPU SIFT. Notebook
- Pull Request #42 of the Seals repo provides a fully tested infrastructure for the Seals use case.
- Seals Milestone progress
- Aymen is running another stress test(CPU-ASIFT) here with the current dataset and replicating Mike's results.
- Aymen is investigating and profiling the (GPU-ASIFT) here
- Executing a scaling test on Bridges up to 4608 tasks using the actual implementation for the milestone. Preliminary data can be found here
- Seals Milestone progress
- Testing infrastructure with actual images.
- Aymen updated the ASIFT Jupyter Notebook and added a brief explanation of the generated plots. The preliminary plots can be found here
- Executing a scaling test on Bridges up to 4608 tasks using the actual implementation for the milestone. Preliminary data can be found here
- Seal Milestone progress
- Aymen updated the ASIFT Jupyter Notebook and added a brief explanation of the generated plots. The preliminary plots can be found here
- Executed a scaling test on Bridges in which up to 4608 tasks were submitted (issue). We are executing more tests with 4608 tasks and slowly moving to the full-blown use case.
- Seal Milestone progress
- Giannis worked on Seals SRS with Brad
- Aymen profiled the ASIFTv2.1 and created this notebook here
- Giannis resolved the EnTK issues that Brad faced in issue #33 of the Seals Repo and updated EnTK instructions.
- Giannis restructured the Seals repo.
- Seal Milestone progress.
- Aymen's meeting with Mike last Monday (October 8, 2018):
- Aymen opened a ticket for the ASIFT_SRS document; the link for the ticket is HERE.
- Aymen opened a ticket with Mike for ASIFT Phase 2 (RANSAC quality filter) to follow its progress; the ticket is HERE.
- Aymen and Mike decided to redesign the workflow diagram for Phase3 and Phase4.
- Aymen is profiling the ASIFTv2.1 without the EnTK on Bridges.
- Aymen implemented the adaptive EnTK script based on the new ASIFTv2.1, supporting GeoTIFF imagery using the GDAL Python library: ASIFTv2.1 EnTK Code
- Aymen partially updated the ASIFT SRS document and will keep updating it with the new requirements.
- Aymen started new experiments for the new EnTK ASIFTv2.1 code.
- Giannis implemented feature 4.1 of the SRS document, as well as, Command Line Interface.
- A Pull Request is submitted. Link can be found here
- Identified that the filesystem and the method of execution are the main reasons behind the behavior of the strong and weak scaling results. This notebook shows the methodology used.
- We had a meeting with Brad that started laying down the next steps for the two use cases under development
- Giannis and Shantenu attended the EarthCube workshop last week.
- Aymen is investigating different image matching and image comparison algorithms.
- Aymen implemented the adaptive EnTK script, finished the full pipeline stages, and committed the code to GitHub; the code can be found here.
- Aymen ran the Strong/weak scaling for the ASIFT(PHASE1) on COMET.
- Aymen created a Jupyter Notebook that contains all of the plots, the hypothesis about the problem, and the next steps, here. The password will be shared on Slack.
- Giannis further investigates the reasons behind the strong and weak scaling performance numbers (found here).
- Giannis started experimentation with EnTK 0.7.
- Giannis opened PR for Seals EnTK v 0.7 script. PR is here
- Giannis runs strong and weak scaling experiments with 8 pipelines using a single image. Initial timing notebook can be found here. Password will be sent on Slack.
- Aymen developed the ASIFT EnTK script based on the new EnTK 0.7. The code is here
- Aymen started strong and weak scaling experiments on Comet with ASIFT.
- Giannis develops the Seals EnTK script based on EnTK 0.7. Code is here
- Giannis runs strong and weak scaling experiments with 8 pipelines using a single image. Initial timing notebook can be found here. Password will be sent on Slack.
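For reference, the quantities usually derived from strong and weak scaling runs can be computed as in the sketch below. The function names and the timings in the example are made up for illustration; they are not results from these experiments.

```python
def strong_scaling(times):
    """Speedup and efficiency for a fixed problem size.

    `times` maps core count to TTX in seconds. The smallest core count is
    the baseline. Returns {cores: (speedup, efficiency)}.
    """
    base_cores = min(times)
    base_time = times[base_cores]
    out = {}
    for cores in sorted(times):
        speedup = base_time / times[cores]
        out[cores] = (speedup, speedup / (cores / base_cores))
    return out


def weak_scaling(times):
    """Efficiency when the problem size grows in proportion to the cores.

    `times` maps core count to TTX in seconds. Ideal efficiency is 1.0,
    meaning TTX stays constant as both cores and work increase.
    """
    base_time = times[min(times)]
    return {cores: base_time / times[cores] for cores in sorted(times)}


# Hypothetical timings, for illustration only.
print(strong_scaling({1: 100.0, 2: 50.0, 4: 40.0}))
print(weak_scaling({1: 100.0, 4: 125.0}))
```

In strong scaling the total work stays fixed while cores increase; in weak scaling the work per core stays fixed, so any growth in TTX points to overheads such as the filesystem or the execution method.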
- Giannis created software requirements specification document to be discussed, after syncing with Matteo.
- Aymen developed the ASIFT EnTK script based on the new EnTK 0.7. The code is here
- Aymen updated ASIFT EnTK script with the latest version of ASIFTv2 code and runs strong and weak scaling experiments on Comet.
- Giannis finished the development of Seals EnTK script based on ticket #1 and #5 of the Seal Use Case. Code is here
- Giannis tested and executed some initial experiments with data provided by Brad. Initial timing notebook can be found here. Password will be sent on Slack.
- Giannis created development guidelines document to be discussed.
- Aymen created the first stage from the PST model of the ASIFT use case using the RADICAL-EnTK. Waiting for rest of the code. Initial script can be found here. Remote execution is delayed due to RADICAL-Pilot Issue #1638 and SAGA Issue #676
- Aymen initiated test on local computer to complete the full pipeline and test it.
- Aymen is working on building PST model for the ASIFT Use Case, also working on Stage1 (Phase-1 in the Use_Case (Generate GCPs)) using Entk.
- Giannis finished and tested a RADICAL-Pilot script supporting the Seals use case. Waiting for data to run at scale
- Will executed strong and weak scaling experiments with the Blob detector algorithm. Data with RADICAL-Pilot, data with Spark
- Will executed strong and weak scaling experiments with the Watershed algorithm using RADICAL-Pilot (RP). The data from Strong and Weak scaling can be found here. Python Scripts
- Ioannis finished the watershed Spark implementation. Data from Weak and Strong scaling experiments can be found here
- Dask implementation is being debugged. It cannot scale for more than 512 images. Target 4096 images.
- Ioannis created an EnTK prototype script implementing the Seals pipeline as discussed and captured by Brad's document.
- Ioannis and Brad had a hands-on call on how to use EnTK and RADICAL-Pilot.
- Blob detector implementation is left behind mainly due to Watershed Dask implementation.
- Aymen walked through the documentation of RADICAL-Pilot and Ensemble Toolkit. In addition, he is reading about the ASIFT use case
- Will executes strong and weak scaling experiments with the Watershed algorithm using RADICAL-Pilot (RP).
- Dask implementation is ready for testing and experimentation.
- Ioannis created a document discussing how the Seals use case can be implemented using the PST model. Edited by Brad
- Ioannis and Will are moving forward implementing a Blob detection algorithm using RP, Dask and Spark. We will try to use the algorithm that exists in the Seals repo
- Will started experimenting with the Watershed algorithm using RADICAL-Pilot and running experiments on Comet. Along with Ioannis, he will implement it for Dask and Spark.
- Ioannis is identifying algorithms of interest to create Mini-Apps that will allow us to characterize these algorithms. Candidates:
- Watershed
- Blob detection
- Digital Elevation Maps
- Jake and Alex working with Bento have prepared a status report and will present
- Ash and Raj are developing UCSB use case. They reported progress to Matteo and Shantenu
- Will is exploring the GitHub repositories and use cases, and examining whether to focus on (i) software engineering or (ii) cyberinfrastructure components.
- Ioannis and Will are looking for ways to incorporate details from two papers into the ICEBERG project