.github
└── workflows
    └── build.yml
get_urls
├── get_urls.py
└── get_urls_reqs.txt
make_stac
├── make_stac.py
└── make_stac_reqs.txt
config.yml

The EOAP Generator
Description & purpose: This Notebook introduces the EOAP generation tool created to help users make compliant EO application packages ready to be run using the EODH workflow runner.
Author(s): Alastair Graham, Dusan Figala
Date created: 2024-11-08
Date last modified: 2025-01-07
Licence: This file is licensed under Creative Commons Attribution-ShareAlike 4.0 International. Any included code is released using the BSD-2-Clause license.
Copyright (c) , All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Introduction
One of the Pathfinder delivery partners, Oxidian, has created a tool to help DevOps specialists and specialist technicians create compliant EOAPs that will run on the EODH. The eoap-gen tool can be found here: https://github.com/EO-DataHub/eoap-gen
It is described as “a CLI tool for generating Earth Observation Application Packages including CWL workflows and Dockerfiles from user supplied python scripts”.
Requirements
The tool needs three things in order to create a working EOAP:
* Python scripts. These must use argparse or click, and their parameters will be mapped to the CWL CommandLineTool inputs.
* A pip requirements file for each script being wrapped into the EOAP.
* A compliant eoap-gen configuration file.
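As a rough illustration of the first requirement, the sketch below shows a script that declares its parameters with argparse. The option names and defaults mirror the inputs of the example configuration later in this Notebook, but the script body, the generated lines and the output file handling are assumptions made for illustration; they are not taken from the eoap-gen repository.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Declare the command line options that would be exposed as inputs."""
    parser = argparse.ArgumentParser(description="Hypothetical example step")
    parser.add_argument(
        "--catalog",
        type=str,
        default="supported-datasets/ceda-stac-fastapi",
        help="full catalog path",
    )
    parser.add_argument(
        "--collection",
        type=str,
        default="sentinel2_ard",
        help="collection id",
    )
    return parser


def main(argv=None) -> Path:
    args = build_parser().parse_args(argv)
    # Placeholder result: a real script would query the catalog here.
    lines = [f"{args.catalog}/{args.collection}/item-{i}" for i in range(2)]
    out_file = Path("urls.txt")
    # Join without a trailing newline so that splitting the file contents
    # on "\n" later does not produce an empty final entry.
    out_file.write_text("\n".join(lines))
    return out_file


if __name__ == "__main__":
    main()
```

Each script written this way would be paired with its own pip requirements file, even if that file only lists a handful of packages.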
Steps
A full tutorial is provided with the repository (see https://github.com/EO-DataHub/eoap-gen/blob/main/ades_guide.md). Here, we will outline the main steps required in using the eoap-gen tool.
The first thing a user needs to do is understand the workflow that they want to wrap. At its simplest, a workflow has three steps:
* find your input data,
* process your input data, and
* create a STAC output of the processed data.
The eoap-gen tool always requires these steps, and when using the workflow runner (WR, also known as ADES) on the EODH the output must always be a directory containing a STAC catalog. When using the EODH it is recommended that the Python API client pyeodh is used to access the API endpoints on the Hub.
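To make the "directory containing a STAC catalog" requirement more concrete, the following minimal sketch writes a bare catalog.json using only the Python standard library. The function name, the catalog id and the description are hypothetical, and the catalog contains no items; a real final step would also add one item per processed output, and whatever the workflow runner additionally expects is not covered here.

```python
import json
from pathlib import Path


def write_minimal_catalog(out_dir: str, catalog_id: str = "example-catalog") -> Path:
    """Write a bare STAC catalog (catalog.json) into out_dir and return its path."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    catalog = {
        "type": "Catalog",
        "stac_version": "1.0.0",
        "id": catalog_id,
        "description": "Outputs of a hypothetical workflow run",
        # A real catalog would also link to one item per processed output.
        "links": [
            {"rel": "root", "href": "./catalog.json", "type": "application/json"}
        ],
    }
    catalog_file = out_path / "catalog.json"
    catalog_file.write_text(json.dumps(catalog, indent=2))
    return catalog_file
```

Libraries such as pystac can build and validate the same structure; the standard library is used here only to keep the sketch free of dependencies.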
The directory structure recommended when using the eoap-gen tool is the one listed at the top of this document.
Although the tool simplifies the process, creating these packages is still complex. A configuration file is needed, and this is then used to create the EOAP. More information can be found in the repository for the tool, but the example of a configuration file for a single-step workflow (below) demonstrates the need to understand the full data processing chain.
id: resize-collection
doc: Resize collection cogs
label: Resize collection cogs
inputs:
  - id: catalog
    label: catalog
    doc: full catalog path
    type: string
    default: supported-datasets/ceda-stac-fastapi
  - id: collection
    label: collection id
    doc: collection id
    type: string
    default: sentinel2_ard
outputs:
  - id: stac_output
    type: Directory
    source: step3/stac_catalog
steps:
  - id: get_urls
    script: playground/get_urls.py
    requirements: playground/get_urls_reqs.txt
    inputs:
      - id: catalog
        source: resize-collection/catalog
      - id: collection
        source: resize-collection/collection
    outputs:
      - id: urls
        type: string[]
        outputBinding:
          loadContents: true
          glob: urls.txt
          outputEval: $(self[0].contents.split('\n'))
      - id: ids
        type: string[]
        outputBinding:
          loadContents: true
          glob: ids.txt
          outputEval: $(self[0].contents.split('\n'))

Once the required files are in place the user needs to execute the eoap-gen tool. The specific command will change with the file names, but the following code snippet shows the form it would take:
eoap-gen generate \
    --config=eoap-gen-config.yml \
    --output=eoap-gen-out \
    --docker-url-base=ghcr.io/user/repo \
    --docker-tag=main

Other tools
Other useful tools that you may want to try include:
cwltool
The cwltool is “the reference implementation of the Common Workflow Language open standards. It is intended to be feature complete and provide comprehensive validation of CWL files as well as provide other tools related to working with CWL”. It is a command line tool designed to run locally and is an excellent piece of software for checking that CWL is compliant. It is designed for use on Linux and will also run on a Mac or on Windows (through WSL, the Windows Subsystem for Linux). It can use Docker, Podman, Singularity and other container runtimes to containerise command line components.
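As one possible way of using cwltool for compliance checks, the sketch below wraps its --validate option in a small Python helper. The helper names and the file name workflow.cwl are hypothetical, and the code assumes that cwltool has already been installed and is available on the PATH.

```python
import shutil
import subprocess


def validate_command(cwl_file: str) -> list[str]:
    """Build the cwltool command that validates a CWL document without running it."""
    return ["cwltool", "--validate", cwl_file]


def validate(cwl_file: str) -> bool:
    """Return True if cwltool exits successfully when validating cwl_file."""
    if shutil.which("cwltool") is None:
        raise RuntimeError("cwltool is not installed or not on PATH")
    result = subprocess.run(
        validate_command(cwl_file), capture_output=True, text=True
    )
    # Print any diagnostics so that validation problems are visible to the user.
    if result.returncode != 0:
        print(result.stderr)
    return result.returncode == 0
```

Calling validate("workflow.cwl") on a generated workflow before submitting it to the workflow runner may catch problems early, although passing validation does not guarantee that the workflow will run successfully.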
scriptcwl
Scriptcwl is a Python package for creating CWL workflows, and the latest documentation gives an in-depth explanation of its use. Be aware that this tool has not been updated for many years.
cwl-utils
Still actively developed, cwl-utils provides Python utilities and autogenerated classes for loading and parsing CWL documents. Although not specific to EOAPs, this set of tools may be helpful when developing your workflows. Documentation is relatively sparse.