# If needed you can install a package in the current AppHub Jupyter environment using pip
# For instance, we will need at least the following libraries
import sys
!{sys.executable} -m pip install --upgrade pyeodh pandas matplotlib numpy pillow folium
Pathfinder Phase Workshop: Processing Data
Description & purpose: This Notebook is designed to showcase the functionality of the Earth Observation Data Hub (EODH) as the project approaches the end of the Pathfinder Phase. It provides a snapshot of the Hub, the pyeodh
API client and the various datasets as of February 2025.
Author(s): Alastair Graham, Dusan Figala
Date created: 2025-02-18
Date last modified: 2025-02-21
Licence: This notebook is licensed under Creative Commons Attribution-ShareAlike 4.0 International. The code is released using the BSD-2-Clause license.
Copyright (c) , All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Presentation set up
The following cell only needs to be run on the EODH AppHub. If you have a local Python environment running, please install the required packages as you would normally.
EODH and EOAP
The EODH compute architecture is built around a collection of OGC standards which when brought together are known as EO Application Packages (EOAP). These are complex constructions of code and data, and at their core is the concept of a Common Workflow Language (CWL) workflow. To run CWL workflows you need a CWL runner, and the EODH Workflow Runner provides that. The EOAPs require a workflow description in CWL, a Docker container, bespoke scripts and links to the data. In the case of EODH, the data inputs and outputs are to be provided as STAC catalogues. Oxidian, as part of our work developing integrations for the Hub, have created a generator tool eoap-gen
that abstracts away much of the complexity (check out the training materials repository and website for more details).
Oxidian have also developed a QGIS plugin to allow desktop users to discover, parameterise and execute workflows on the Hub. We will look at that more at the bottom of this document.
# Imports
import pyeodh
import os
from requests import HTTPError
First the user needs to connect to the Workflow Runner. You will need your user account name (likely your GitHub user name at the time of the workshop) and API key (as obtained earlier in the exercises).
Here, those details have been saved in secrets.txt which contains the two lines below where USERNAME and API_KEY are specific to the user:
- USER=“USERNAME”
- PSWD=“API_KEY”
You can either create a secrets.txt file in the same folder as this Notebook and run the next code cell, OR you can manually copy and paste your details details into the next-but-one code cell and run that. Make sure you save your Notebook after making any edits. Do not share your credentials.
# Make sure secrets.txt exists in the same folder as this Notebook, and is correctly formatted
# If you have the correct permissions you can join different platforms by changing base_url
with open("secrets.txt", "r") as file:
= file.readlines()
lines = lines[0].strip().split("=")[1].strip('"')
username = lines[1].strip().split("=")[1].strip('"')
token
= pyeodh.Client(
clientwfr ="https://eodatahub.org.uk", username=username, token=token
base_url
)= clientwfr.get_ades() wfr
# IGNORE if you ran the previous cell successfully
= "Replace with your username"
username = "Replace with your API token"
token
= pyeodh.Client(
clientwfr ="https://eodatahub.org.uk", username=username, token=token
base_url
)= clientwfr.get_ades() wfr
The next cell demonstrates a simple CWL workflow. Do not change the code in this cell or your workflow will return errors.
The workflow takes in an array of file names. These would be URLs to online resources, so you could imagine obtaining them via a search of a STAC catalogue. It then starts gdal in a Docker container and passes the URLs into the container. gdal_translate is then used to resize the resolution of the image pixels to 5% of their original. The outputs are passed to another Docker container that creates a STAC compatible output and stores it in an AWS S3 bucket attached to your workspace.
As you can see, in the case of simple workflows, it is likely easier to use the base commands. However, if you want to generate STAC outputs, or run any workflow where it acts as a service, or need connections between platforms, then there are situations where EOAPs can be beneficial.
= r"""$graph:
cwl_yaml - class: CommandLineTool
id: 'resize_make_stac'
inputs:
- id: 'files'
doc: FILES
type:
type: array
items: File
inputBinding: {}
requirements:
- class: DockerRequirement
dockerPull: ghcr.io/eo-datahub/user-workflows/resize_make_stac:main
- class: InlineJavascriptRequirement
doc: "None\n"
baseCommand:
- python
- /app/app.py
outputs:
- id: 'stac_catalog'
outputBinding:
glob: .
type: Directory
- class: CommandLineTool
inputs:
- id: 'url'
inputBinding:
position: 1
prefix: /vsicurl/
separate: false
type: string
- id: 'fname'
inputBinding:
position: 2
valueFrom: $(inputs.url.split('/').pop() + "_resized.tif")
type: string
- id: 'outsize_x'
inputBinding:
position: 4
prefix: -outsize
separate: true
type: string
- id: 'outsize_y'
inputBinding:
position: 5
type: string
outputs:
- type: File
outputBinding:
glob: '*.tif'
id: 'resized'
requirements:
- class: DockerRequirement
dockerPull: ghcr.io/osgeo/gdal:ubuntu-small-latest
- class: InlineJavascriptRequirement
baseCommand: gdal_translate
id: 'resize_process'
- class: Workflow
id: 'resize-urls'
inputs:
- id: 'urls'
label: urls
doc: urls
type:
type: array
items: string
- id: 'outsize_x'
label: outsize_x
doc: outsize_x
default: 5%
type: string
- id: 'outsize_y'
label: outsize_y
doc: outsize_y
default: 5%
type: string
outputs:
- id: 'stac_output'
outputSource:
- 'resize_make_stac/stac_catalog'
type: Directory
requirements:
- class: ScatterFeatureRequirement
- class: StepInputExpressionRequirement
- class: InlineJavascriptRequirement
label: Resize urls
doc: Resize urls
steps:
- id: 'resize_process'
in:
- id: 'outsize_x'
source: 'outsize_x'
- id: 'outsize_y'
source: 'outsize_y'
- id: 'url'
source: 'urls'
- id: 'fname'
valueFrom: $(inputs.url.split('/').pop() + "_resized.tif")
out:
- id: 'resized'
run: '#resize_process'
scatter:
- 'url'
scatterMethod: dotproduct
- id: 'resize_make_stac'
in:
- id: 'files'
source: 'resize_process/resized'
out:
- id: 'stac_catalog'
run: '#resize_make_stac'
cwlVersion: v1.0
"""
If you try to deploy a workflow that already exists (i.e. with the same name) in the EODH, you will get an error. It is therefore good practice to run a check and delete any process with the same name - assuming you are confident that you want to overwrite it. Otherwise, consider renaming your workflow.
try:
"resize-urls").delete()
wfr.get_process(except Exception:
print("Process not found, no need to undeploy.")
Process not found, no need to undeploy.
# Deploy the CWL workflow code to the workflow runner
= wfr.deploy_process(cwl_yaml=cwl_yaml) process
If you want to know what workflows you have access to in your workspace on the Hub, the following code will create a print out.
# List the workflows available in the user workspace
for p in wfr.get_processes():
print(p.id)
The following code executes the workflow for the two dataset URLs listed
Note: If you get a 409 error this usually means the process already exists and you need to handle the existing process by either calling process.delete() first then deploying again, or calling process.update()
= wfr.get_process("resize-urls").execute(
myjob
{"urls": [
"https://dap.ceda.ac.uk/neodc/sentinel_ard/data/sentinel_2/2023/11/20/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif",
"https://dap.ceda.ac.uk/neodc/sentinel_ard/data/sentinel_2/2023/11/20/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif",
],"outsize_x": "5%",
"outsize_y": "5%",
}
)print("job submitted")
job submitted
We can check the status of the job using attributes on the job object:
myjob.refresh()print("Status: ", myjob.status)
print("Message: ", myjob.message)
print(myjob.self_href)
Status: successful
Message: ZOO-Kernel successfully run your service!
https://eodatahub.org.uk/api/catalogue/stac/catalogs/user/catalogs/myws/jobs/8be7184e-4148-11f0-b446-d2c52c451fa3
If you follow the URL output from the previous cell you will see something similar to the following. The two immediately helpful bits of information are the jobID
and status
(hopefully it’s successful!). Ypou may need to refresh the web page if the workflow takes a while to cmplete. You will need to be logged in to your EODH account to be able to view this file and any links from it as these are tied to your workspace. as you can hopefully see from the cell below, there are logs to each of the stages of the workflow, all of which are accessible.
{"id": "8be7184e-4148-11f0-b446-d2c52c451fa3",
"jobID": "8be7184e-4148-11f0-b446-d2c52c451fa3",
"type": "process",
"processID": "resize-urls",
"created": "2025-06-04T13:33:49.254Z",
"started": "2025-06-04T13:33:49.254Z",
"finished": "2025-06-04T13:35:45.932Z",
"updated": "2025-06-04T13:35:45.651Z",
"status": "successful",
"message": "ZOO-Kernel successfully run your service!",
"links": [
{"title": "Status location",
"rel": "monitor",
"type": "application/json",
"href": "https://eodatahub.org.uk/api/catalogue/stac/catalogs/user/catalogs/myws/jobs/8be7184e-4148-11f0-b446-d2c52c451fa3",
},
{"title": "Result location",
"rel": "http://www.opengis.net/def/rel/ogc/1.0/results",
"type": "application/json",
"href": "https://eodatahub.org.uk/api/catalogue/stac/catalogs/user/catalogs/myws/jobs/8be7184e-4148-11f0-b446-d2c52c451fa3/results",
},
{"href": "https://eodatahub.org.uk/api/catalogue/stac/catalogs/user/catalogs/myws/jobs/8be7184e-4148-11f0-b446-d2c52c451fa3/resize_process.log",
"title": "Tool log resize_process.log",
"rel": "related",
"type": "text/plain",
},
{"href": "https://eodatahub.org.uk/api/catalogue/stac/catalogs/user/catalogs/myws/jobs/8be7184e-4148-11f0-b446-d2c52c451fa3/resize_process_2.log",
"title": "Tool log resize_process_2.log",
"rel": "related",
"type": "text/plain",
},
{"href": "https://eodatahub.org.uk/api/catalogue/stac/catalogs/user/catalogs/myws/jobs/8be7184e-4148-11f0-b446-d2c52c451fa3/resize_make_stac.log",
"title": "Tool log resize_make_stac.log",
"rel": "related",
"type": "text/plain",
},
{"href": "https://eodatahub.org.uk/api/catalogue/stac/catalogs/user/catalogs/myws/jobs/8be7184e-4148-11f0-b446-d2c52c451fa3/node_stage_out.log",
"title": "Tool log node_stage_out.log",
"rel": "related",
"type": "text/plain",
},
], }
Following the final URL (node_stage_out.log in the case above) should take you to a detailed log of the full process. (For purposes of this notebook, an internal request method of the client is used to fetch the log.) A selection of that log is shown in the cell below. From here it is possible to access the resized output files (ending in _resized.tif
) and download them. They are also catalogued in STAC so you can use pyeodh to access them, or you can use the Catalogue user interface and look under User Datasets
for a catalogue with the same Job_ID
and find links to the outputs that way.
= next(
stage_out_url for link in myjob.links if link.title == "Tool log node_stage_out.log"
link.href
)
= clientwfr._request_raw("Get", stage_out_url)
resp
print(resp.text)
2025-06-04T13:35:21.462497Z - node-stage-out-pod-wtqordyt - Running stageout with /var/lib/cwl/stga52a6749-e3a8-42ca-bb1f-6b2068fcbd44/_7_ah2mt workspaces-eodhp processing-results/resize-urls 8be7184e-4148-11f0-b446-d2c52c451fa3 resize-urls https://eodhp/catalogs/user/catalogs/myws/jobs/8be7184e-4148-11f0-b446-d2c52c451fa3
2025-06-04T13:35:21.757825Z - node-stage-out-pod-wtqordyt - No Collections found, generating new collection
2025-06-04T13:35:21.761511Z - node-stage-out-pod-wtqordyt - Upload asset /tmp/outputs/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.tif to workspaces-eodhp at myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.tif
2025-06-04T13:35:21.966891Z - node-stage-out-pod-wtqordyt - Replacing item href with https://myws.eodatahub-workspaces.org.uk/files/workspaces-eodhp/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.tif
2025-06-04T13:35:21.967640Z - node-stage-out-pod-wtqordyt - Upload asset /tmp/outputs/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.tif to workspaces-eodhp at myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.tif
2025-06-04T13:35:22.067940Z - node-stage-out-pod-wtqordyt - Replacing item href with https://myws.eodatahub-workspaces.org.uk/files/workspaces-eodhp/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.tif
2025-06-04T13:35:22.068412Z - node-stage-out-pod-wtqordyt - Upload collection.json to https://myws.eodatahub-workspaces.org.uk/files/workspaces-eodhp/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3.json
2025-06-04T13:35:22.124622Z - node-stage-out-pod-wtqordyt - Writing to https://myws.eodatahub-workspaces.org.uk/files/workspaces-eodhp/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3.json
2025-06-04T13:35:22.124695Z - node-stage-out-pod-wtqordyt - Uploading to myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3.json at workspaces-eodhp
2025-06-04T13:35:22.249939Z - node-stage-out-pod-wtqordyt - Upload S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized to s3://workspaces-eodhp/myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json
2025-06-04T13:35:22.363070Z - node-stage-out-pod-wtqordyt - Writing to s3://workspaces-eodhp/myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json
2025-06-04T13:35:22.363141Z - node-stage-out-pod-wtqordyt - Uploading to myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json at workspaces-eodhp
2025-06-04T13:35:22.475474Z - node-stage-out-pod-wtqordyt - Upload S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized to s3://workspaces-eodhp/myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json
2025-06-04T13:35:22.531831Z - node-stage-out-pod-wtqordyt - Writing to s3://workspaces-eodhp/myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json
2025-06-04T13:35:22.531906Z - node-stage-out-pod-wtqordyt - Uploading to myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json at workspaces-eodhp
2025-06-04T13:35:22.630253Z - node-stage-out-pod-wtqordyt - Generated new collection and uploaded to S3
Upload catalog.json to https://myws.eodatahub-workspaces.org.uk/files/workspaces-eodhp/processing-results/resize-urls/catalog.json
2025-06-04T13:35:22.687977Z - node-stage-out-pod-wtqordyt - Writing to https://myws.eodatahub-workspaces.org.uk/files/workspaces-eodhp/processing-results/resize-urls/catalog.json
Uploading to myws/processing-results/resize-urls/catalog.json at workspaces-eodhp
2025-06-04T13:35:22.871396Z - node-stage-out-pod-wtqordyt - Upload new Catalog resize-urls to https://myws.eodatahub-workspaces.org.uk/files/workspaces-eodhp/processing-results/resize-urls.json
2025-06-04T13:35:23.018524Z - node-stage-out-pod-wtqordyt - Writing to https://myws.eodatahub-workspaces.org.uk/files/workspaces-eodhp/processing-results/resize-urls.json
2025-06-04T13:35:23.018624Z - node-stage-out-pod-wtqordyt - Uploading to myws/processing-results/resize-urls.json at workspaces-eodhp
2025-06-04T13:35:23.146271Z - node-stage-out-pod-wtqordyt - STAC outputs found, sending to ingester
Now harvesting files
Harvesting file myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json
2025-06-04T13:35:23.168479Z - node-stage-out-pod-wtqordyt - Uploading file to workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls/catalogs/catalog/collections/col_8be7184e-4148-11f0-b446-d2c52c451fa3/items/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json
2025-06-04T13:35:23.198019Z - node-stage-out-pod-wtqordyt - File myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json harvested successfully
2025-06-04T13:35:23.198121Z - node-stage-out-pod-wtqordyt - Harvesting file myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json
2025-06-04T13:35:23.220392Z - node-stage-out-pod-wtqordyt - Uploading file to workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls/catalogs/catalog/collections/col_8be7184e-4148-11f0-b446-d2c52c451fa3/items/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json
2025-06-04T13:35:23.244825Z - node-stage-out-pod-wtqordyt - File myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json harvested successfully
Harvesting file myws/processing-results/resize-urls/catalog.json
2025-06-04T13:35:23.267997Z - node-stage-out-pod-wtqordyt - Uploading file to workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls/catalogs/catalog.json
2025-06-04T13:35:23.295110Z - node-stage-out-pod-wtqordyt - File myws/processing-results/resize-urls/catalog.json harvested successfully
Harvesting file myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3.json
2025-06-04T13:35:23.312629Z - node-stage-out-pod-wtqordyt - Uploading file to workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls/catalogs/catalog/collections/col_8be7184e-4148-11f0-b446-d2c52c451fa3.json
2025-06-04T13:35:23.340390Z - node-stage-out-pod-wtqordyt - File myws/processing-results/resize-urls/catalog/col_8be7184e-4148-11f0-b446-d2c52c451fa3.json harvested successfully
2025-06-04T13:35:23.340485Z - node-stage-out-pod-wtqordyt - Harvesting file myws/processing-results/resize-urls.json
2025-06-04T13:35:23.360570Z - node-stage-out-pod-wtqordyt - Uploading file to workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls.json
2025-06-04T13:35:23.393955Z - node-stage-out-pod-wtqordyt - File myws/processing-results/resize-urls.json harvested successfully
['workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls/catalogs/catalog/collections/col_8be7184e-4148-11f0-b446-d2c52c451fa3/items/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json', 'workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls/catalogs/catalog/collections/col_8be7184e-4148-11f0-b446-d2c52c451fa3/items/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json', 'workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls/catalogs/catalog.json', 'workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls/catalogs/catalog/collections/col_8be7184e-4148-11f0-b446-d2c52c451fa3.json', 'workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls.json']
2025-06-04T13:35:23.394022Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.392 INFO [140481365014336] ClientConnection:193 | [<none> -> pulsar://pulsar-proxy.pulsar:6650] Create ClientConnection, timeout=10000
2025-06-04T13:35:23.394056Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.392 INFO [140481365014336] ConnectionPool:124 | Created connection for pulsar://pulsar-proxy.pulsar:6650-pulsar://pulsar-proxy.pulsar:6650-0
2025-06-04T13:35:23.397322Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.395 INFO [140481269921536] ClientConnection:410 | [10.0.77.64:56596 -> 172.20.133.142:6650] Connected to broker
2025-06-04T13:35:23.410705Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.408 INFO [140481269921536] HandlerBase:115 | [persistent://public/default/harvested, ades-outputs-stac-8be7184e-4148-11f0-b446-d2c52c451fa3] Getting connection from pool
2025-06-04T13:35:23.411682Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.409 INFO [140481269921536] BinaryProtoLookupService:85 | Lookup response for persistent://public/default/harvested, lookup-broker-url pulsar://pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650, from [10.0.77.64:56596 -> 172.20.133.142:6650]
2025-06-04 13:35:23.409 INFO [140481269921536] ClientConnection:193 | [<none> -> pulsar://pulsar-proxy.pulsar:6650] Create ClientConnection, timeout=10000
2025-06-04 13:35:23.409 INFO [140481269921536] ConnectionPool:124 | Created connection for pulsar://pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650-pulsar://pulsar-proxy.pulsar:6650-0
2025-06-04T13:35:23.419866Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.418 INFO [140481269921536] ClientConnection:412 | [10.0.77.64:56606 -> 172.20.133.142:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650, proxy: , physical address:pulsar://pulsar-proxy.pulsar:6650
2025-06-04T13:35:23.426325Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.424 INFO [140481269921536] ProducerImpl:147 | Creating producer for topic:persistent://public/default/harvested, producerName:ades-outputs-stac-8be7184e-4148-11f0-b446-d2c52c451fa3 on [10.0.77.64:56606 -> 172.20.133.142:6650]
2025-06-04T13:35:23.431732Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.430 INFO [140481269921536] ProducerImpl:220 | [persistent://public/default/harvested, ades-outputs-stac-8be7184e-4148-11f0-b446-d2c52c451fa3] Created producer on broker [10.0.77.64:56606 -> 172.20.133.142:6650]
2025-06-04T13:35:23.431804Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.430 INFO [140481269921536] HandlerBase:138 | Finished connecting to broker after 21 ms
2025-06-04T13:35:23.432328Z - node-stage-out-pod-wtqordyt - Sending message via pulsar ...
{"id": "8be7184e-4148-11f0-b446-d2c52c451fa3", "workspace": "myws", "source-uri": "https://eodhp/catalogs/user/catalogs/myws/jobs/8be7184e-4148-11f0-b446-d2c52c451fa3", "bucket_name": "workspaces-eodhp", "updated_keys": [], "added_keys": ["workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls/catalogs/catalog/collections/col_8be7184e-4148-11f0-b446-d2c52c451fa3/items/S2A_20231120_latn501lonw0036_T30UVA_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json", "workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls/catalogs/catalog/collections/col_8be7184e-4148-11f0-b446-d2c52c451fa3/items/S2A_20231120_latn519lonw0037_T30UVC_ORB037_20231120132420_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif_resized.json", "workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls/catalogs/catalog.json", "workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls/catalogs/catalog/collections/col_8be7184e-4148-11f0-b446-d2c52c451fa3.json", "workflow-harvester/8be7184e-4148-11f0-b446-d2c52c451fa3/myws/catalogs/processing-results/catalogs/resize-urls.json"], "deleted_keys": [], "source": "8be7184e-4148-11f0-b446-d2c52c451fa3/", "target": "catalogs/user/catalogs/"}
2025-06-04T13:35:23.439374Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.437 INFO [140481365014336] ProducerImpl:800 | [persistent://public/default/harvested, ades-outputs-stac-8be7184e-4148-11f0-b446-d2c52c451fa3] Closing producer for topic persistent://public/default/harvested
2025-06-04T13:35:23.440670Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.439 INFO [140481269921536] ProducerImpl:764 | [persistent://public/default/harvested, ades-outputs-stac-8be7184e-4148-11f0-b446-d2c52c451fa3] Closed producer 0
2025-06-04T13:35:23.441120Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.439 INFO [140481365014336] ClientImpl:665 | Closing Pulsar client with 0 producers and 0 consumers
2025-06-04T13:35:23.441574Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.439 INFO [140481210824448] ClientConnection:1334 | [10.0.77.64:56606 -> 172.20.133.142:6650] Connection disconnected (refCnt: 2)
2025-06-04T13:35:23.441624Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.439 INFO [140481210824448] ClientConnection:1334 | [10.0.77.64:56596 -> 172.20.133.142:6650] Connection disconnected (refCnt: 2)
2025-06-04T13:35:23.441675Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.440 INFO [140481365014336] ProducerImpl:757 | Producer - [persistent://public/default/harvested, ades-outputs-stac-8be7184e-4148-11f0-b446-d2c52c451fa3] , [batching = off]
2025-06-04T13:35:23.442341Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.440 INFO [140481365014336] ClientConnection:282 | [10.0.77.64:56606 -> 172.20.133.142:6650] Destroyed connection to pulsar://pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650-0
2025-06-04T13:35:23.442414Z - node-stage-out-pod-wtqordyt - 2025-06-04 13:35:23.440 INFO [140481365014336] ClientConnection:282 | [10.0.77.64:56596 -> 172.20.133.142:6650] Destroyed connection to pulsar://pulsar-proxy.pulsar:6650-0
The difference in the input (top) and output (bottom) (for this cloudy image!) is shown below:
As development continues the number of demonstration workflows available to users should increase. With uptake of the generator tool, users will also be able to create their own bespoke workflows. In time, the Hub will contain organisational accounts and the ability to share workflow files.
If a user wants to deploy a workflow that they have found online, they can do so for compliant CWL files by submitting the URL (as shown below).
n.b. you need the raw content link of the workflow resource. The resource can be found here and a repository of examples here
# Deploy a workflow using a URL to a .cwl file hosted online
= wfr.deploy_process(
convert_url_proc ="https://raw.githubusercontent.com/EO-DataHub/eodhp-ades-demonstration/e4ed00b16add58f9b516921ca9ca9e1068dd27c1/convert-url-app.cwl"
cwl_url
)print(convert_url_proc.id, convert_url_proc.description, ": Deployed")
print("")
# List the available workflows
for p in wfr.get_processes():
print(p.id)
# Tidy up youe Workflow Runner userspace
try:
"resize-urls").delete()
wfr.get_process("convert-url").delete()
wfr.get_process(except Exception:
print("Process not found, no need to undeploy.")
If needed for tracking the progress of a workflow, you can enable verbose logging using th efollowing code. Set this in the same cell as the import statements, at the end of the cell.
pyeodh.set_log_level(10)
Other interactions with workflows
Screenshots are provided here for the QGIS Plugin. If you are a user of QGIS the plugin is now available in the Plugin Repository and can be installed in the usual way.
Once a CWL file has been deployed to a user’s workspace (or in future, to organisational or shared workspaces) then it is possible for users with suitable permissions to view their list of available workflows through the QGIS plugin.
The user can choose and parameterise a workflow using the QGIS plugin as shown in this screenshot. A series of defaults are provided with all workflow files so a user can run the workflow straight away using those. To do so in code, the user provides an empty dictionary.
The status of a running job: "status": "running"
can be monitored using the QGIS plugin.
The outputs are accessible and loadable from the QGIS plugin.
Original image:
Resampled image:
The workflow outputs are available in a users’s workspace and are attached to the job id
presented through the QGIS plugin. A user can open the workflow output from within the plugin to be displayed in the QGIS map window.