.github
└── workflows
    └── build.yml
get_urls
├── get_urls.py
└── get_urls_reqs.txt
make_stac
├── make_stac.py
└── make_stac_reqs.txt
config.yml
The EOAP Generator
Description & purpose: This Notebook introduces the EOAP generation tool created to help users make compliant EO application packages ready to be run using the EODH workflow runner.
Author(s): Alastair Graham, Dusan Figala
Date created: 2024-11-08
Date last modified: 2025-01-07
Licence: This file is licensed under Creative Commons Attribution-ShareAlike 4.0 International. Any included code is released using the BSD-2-Clause license.
Copyright (c) , All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Introduction
One of the Pathfinder delivery partners, Oxidian, has created a tool to help dev-ops specialists or specialist technicians to create compliant EOAPs that will run on the EODH. The eoap-gen tool can be found here: https://github.com/EO-DataHub/eoap-gen
It is described as “a CLI tool for generating Earth Observation Application Packages including CWL workflows and Dockerfiles from user supplied python scripts”.
Requirements
The tool needs three things to create a working EOAP:
* Python scripts. These must use argparse or click, and their parameters will be mapped to the CWL CommandLineTool inputs.
* A pip requirements file for each script being wrapped into the EOAP.
* A compliant eoap-gen configuration file.
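As an illustration of the first requirement, the following is a minimal sketch of a script that exposes its parameters through argparse. The argument names (--catalog, --collection) mirror the inputs used in the example configuration later in this Notebook; the function names and the placeholder URL logic are hypothetical, and a real script would query the catalogue instead.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line parameters that would become CWL inputs."""
    parser = argparse.ArgumentParser(description="Find input data URLs (sketch).")
    parser.add_argument("--catalog", type=str, required=True, help="full catalog path")
    parser.add_argument("--collection", type=str, required=True, help="collection id")
    return parser


def main(argv=None, out_dir="."):
    """Parse the arguments and write one URL per line to urls.txt."""
    args = build_parser().parse_args(argv)
    # Placeholder only (assumption): a real script would search the catalogue here.
    urls = [f"{args.catalog}/{args.collection}/item-{i}" for i in range(2)]
    # One URL per line, so that a later step can read the file back and split it.
    Path(out_dir, "urls.txt").write_text("\n".join(urls))
    return urls

# A real script would call main() under an `if __name__ == "__main__":` guard.
```

Each script of this kind would be accompanied by its own pip requirements file, such as get_urls_reqs.txt in the recommended directory layout.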
Steps
A full tutorial is provided with the repository (see https://github.com/EO-DataHub/eoap-gen/blob/main/ades_guide.md). Here, we will outline the main steps involved in using the eoap-gen tool.
The first thing a user needs to do is understand the workflow that they want to wrap. At its simplest, a workflow has three steps:
* find your input data,
* process your input data, and
* create a STAC output of the processed data.
For the eoap-gen tool these steps will always be required, and when using the workflow runner (WR, also known as ADES) on the EODH the output must always be a directory containing a STAC catalog. When using the EODH, it is recommended that the Python API client pyeodh is used to access the API endpoints on the Hub.
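To make the output requirement concrete, the sketch below writes a directory containing a minimal STAC catalog using only the Python standard library. The function name, the catalog id and the description are hypothetical, and STAC version 1.0.0 is assumed; a real workflow would normally also add items describing the processed data, for example with a library such as pystac.

```python
import json
from pathlib import Path


def write_minimal_catalog(out_dir, catalog_id="example-output"):
    """Write a bare STAC catalog.json into out_dir and return its path."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    catalog = {
        "type": "Catalog",
        "stac_version": "1.0.0",  # assumed version
        "id": catalog_id,  # hypothetical id
        "description": "Outputs of an example workflow",
        # Only a root link here; processed items would be added as "item" links.
        "links": [
            {"rel": "root", "href": "./catalog.json", "type": "application/json"},
        ],
    }
    path = out / "catalog.json"
    path.write_text(json.dumps(catalog, indent=2))
    return path
```

The directory passed as out_dir is then what the final step of the workflow would expose as its Directory output.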
The directory structure shown at the top of this Notebook is recommended when using the eoap-gen tool.
Although the tool simplifies the process, creating these packages is still complex. A configuration file is needed, and this is then used to create the EOAP. More information can be found in the repository for the tool, but the example configuration file for a single-step workflow (below) demonstrates the need to understand the full data processing chain.
id: resize-collection
doc: Resize collection cogs
label: Resize collection cogs
inputs:
  - id: catalog
    label: catalog
    doc: full catalog path
    type: string
    default: supported-datasets/ceda-stac-fastapi
  - id: collection
    label: collection id
    doc: collection id
    type: string
    default: sentinel2_ard
outputs:
  - id: stac_output
    type: Directory
    source: step3/stac_catalog
steps:
  - id: get_urls
    script: playground/get_urls.py
    requirements: playground/get_urls_reqs.txt
    inputs:
      - id: catalog
        source: resize-collection/catalog
      - id: collection
        source: resize-collection/collection
    outputs:
      - id: urls
        type: string[]
        outputBinding:
          loadContents: true
          glob: urls.txt
          outputEval: $(self[0].contents.split('\n'))
      - id: ids
        type: string[]
        outputBinding:
          loadContents: true
          glob: ids.txt
          outputEval: $(self[0].contents.split('\n'))
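The outputBinding entries in this configuration locate a text file written by the script (glob), read it (loadContents) and split its contents on newlines (outputEval) to produce a string array. The Python sketch below mimics that split to show one practical consequence: a file that ends with a trailing newline yields an empty final element. The helper name is hypothetical, and this is only an analogy for the JavaScript expression, not code used by eoap-gen.

```python
def split_like_output_eval(contents):
    # Mimics the expression self[0].contents.split('\n') from the configuration.
    return contents.split("\n")


without_trailing = split_like_output_eval("https://a\nhttps://b")
with_trailing = split_like_output_eval("https://a\nhttps://b\n")

print(without_trailing)  # ['https://a', 'https://b']
print(with_trailing)     # ['https://a', 'https://b', '']
```

For this reason it may be safer for a script to write its lines joined by newlines without a final newline, or for the downstream step to ignore empty strings.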
Once the required files are in place, the user needs to execute the eoap-gen tool. The specific command will change with the file names, but the following code snippet shows the form it would take:
eoap-gen generate \
    --config=eoap-gen-config.yml \
    --output=eoap-gen-out \
    --docker-url-base=ghcr.io/user/repo \
    --docker-tag=main
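Because the command changes with the file names, it can be convenient to assemble it programmatically. The sketch below builds the same call as an argument list, using only the subcommand and flags that appear in the snippet above; the helper name is hypothetical and the values are the placeholder ones from that snippet.

```python
import shlex


def build_eoap_gen_command(config, output, docker_url_base, docker_tag):
    """Assemble the eoap-gen call as an argument list (hypothetical helper)."""
    return [
        "eoap-gen",
        "generate",
        f"--config={config}",
        f"--output={output}",
        f"--docker-url-base={docker_url_base}",
        f"--docker-tag={docker_tag}",
    ]


cmd = build_eoap_gen_command(
    "eoap-gen-config.yml", "eoap-gen-out", "ghcr.io/user/repo", "main"
)
# Print the command as a single shell-quoted line for inspection.
print(shlex.join(cmd))
```

Once eoap-gen is installed, a list like this could be passed to subprocess.run(cmd, check=True) rather than being typed by hand.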
Other tools
Other useful tools that you may want to try include:
cwltool
cwltool is “the reference implementation of the Common Workflow Language open standards. It is intended to be feature complete and provide comprehensive validation of CWL files as well as provide other tools related to working with CWL”. It is a command-line tool designed to run locally and is an excellent way to check that CWL is compliant. It is designed for use on Linux and will also run on macOS or on Windows (through WSL, the Windows Subsystem for Linux). It can use Docker, Podman, Singularity and others for the containerisation of command-line components.
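As a minimal sketch of such a compliance check, the helper below calls cwltool with its --validate option from Python. The function name and any file path passed to it are hypothetical, and the sketch assumes cwltool has been installed separately (for example with pip); it returns None when the executable cannot be found.

```python
import shutil
import subprocess
from typing import Optional


def validate_cwl(cwl_path: str) -> Optional[bool]:
    """Run `cwltool --validate` on a CWL file.

    Returns True if validation succeeds, False if it fails,
    or None if cwltool is not available on the PATH.
    """
    if shutil.which("cwltool") is None:
        return None
    result = subprocess.run(
        ["cwltool", "--validate", cwl_path],
        capture_output=True,
        text=True,
    )
    # A zero exit code is treated here as a successful validation.
    return result.returncode == 0
```

Calling validate_cwl on the workflow file generated by eoap-gen before submitting it to the workflow runner may catch structural problems early.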
scriptcwl
Scriptcwl is a Python package for creating CWL workflows, and the latest documentation gives an in-depth explanation of its use. Be aware that this tool has not been actively developed or updated for many years.
cwl-utils
Still actively developed, cwl-utils provides Python utilities and autogenerated classes for loading and parsing CWL documents. Although not specific to EOAPs this set of tools may be helpful when developing your workflows. Documentation is relatively sparse.