Particle formation is an atmospheric process in which aerosol particles form at specific spatial locations and grow in diameter over the course of a few hours. It is studied for its role in climate change and human respiratory health.
The use case primarily aims to (1) harmonize the information describing particle formation; (2) represent information, specifically the meaning of data, in an appropriate computer language; and (3) acquire and curate information in infrastructure.
|Background||Contact Person||Organization||Contact email|
|ICT||Markus Stocker||TIB, PANGAEA||firstname.lastname@example.org|
|RI-Domain||Jaana Bäck||University of Helsinki||email@example.com|
|e-Infrastructure||Yann Le Franc||EUDAT||firstname.lastname@example.org|
|ICT||Robert Huber||UniHB, PANGAEA||email@example.com|
Data Use, Data Acquisition (primarily)
Data Publication (possibly)
Relevant Data Use Community Behaviors
Relevant Data Use Community Roles
Section 1.1 provides a summary of the primary aims of this use case. We begin this section by providing a more detailed description of the aims. Where applicable, we discuss how these aims align with FAIR Principles (Wilkinson, 2016). Aims marked optional will be addressed if time permits.
There exist multiple, institutionally and geographically distributed, research groups that perform the scientific task of interpreting particle size distribution observational data to detect and characterize the occurrence of particle formation at determinate spatiotemporal locations. Two groups well-known to the authors of this use case are the Atmospheric Aerosol Physics research group at the University of Eastern Finland and the Aerosol Cloud Climate Interactions research group at the University of Helsinki.
The second impact is the possible systematic acquisition and curation of the explicit and formal (i.e., machine-actionable) meaning of data, in addition to the data themselves. Rather than merely acquiring data products in the form of, e.g., visualizations such as maps or plots (whose implicit information content is not available to machines), this use case aims to set an example for how infrastructures can systematically acquire and curate truthful, meaningful, well-formed data (i.e., information) whose meaning is explicit and formal. Furthermore, we expect that harmonized information generated by distributed research groups will be easier for infrastructure to acquire, and thus to curate and possibly publish. As such, the use case contributes to advancing infrastructures from current data systems to information- and knowledge-based systems (Stocker, 2017) that manage information about natural worlds and their phenomena of interest (in addition to information about people, organizations, instrumentation, publications, etc.).
A key challenge is to bring together representatives of the research community studying particle formation and to reach agreement on how to harmonize the information describing particle formation. It is unclear whether such agreement is desired and achievable. At this stage it is also unclear whether the required people can be motivated to attend the planned workshop.
A third difficulty is that it remains unclear whether infrastructure can systematically acquire, curate and potentially publish the information describing particle formation as envisioned in this use case.
In the basic scenario, research groups of the atmospheric aerosol particle formation research community, and specifically their individual researchers, are offered a service that implements a scientific workflow. The workflow covers the interpretation of particle size distribution observational data and the systematic acquisition, curation and possible publishing of the resulting information describing particle formation.
Of interest to advanced scenarios is also the possibility to openly publish information describing particle formation as well as the support for functionality relevant to data publishing, such as persistent identification and citation of information describing particle formation.
The required components are Jupyter, the implementation of the scientific workflow as a Jupyter Notebook, an RDF database with SPARQL endpoint, as well as a Python library with specialized functions. Figure 2 shows a visualization of the prototype implementation of the scientific workflow. The components are containerized using Docker and can easily be deployed on infrastructures such as EGI. Indeed, this has already been tested with the deployment at http://22.214.171.124:8888. Recently, we have adopted JupyterHub in order to support authentication of multiple users and management of individual notebooks.
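To make the role of the RDF database and SPARQL endpoint more concrete, the following is a minimal sketch of how a workflow step might represent one particle formation event as RDF statements and build a query to retrieve such events. It is not the use case's Python library: the namespace `http://example.org/particle-formation#`, the class `FormationEvent`, the property names, and the functions `to_ntriples` and `events_at_query` are all assumptions made for illustration, and only the Python standard library is used.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical namespace: the use case does not name a vocabulary here.
EX = "http://example.org/particle-formation#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


@dataclass(frozen=True)
class FormationEvent:
    """One interpreted particle formation event (illustrative fields only)."""
    event_id: str   # assumed to contain only IRI-safe characters
    place: str      # free-text label of the spatial location
    start: datetime
    end: datetime


def escape_literal(text: str) -> str:
    """Escape characters that may not appear raw in a quoted RDF literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(event: FormationEvent) -> str:
    """Serialize the event as N-Triples, one statement per line."""
    subject = f"<{EX}{event.event_id}>"
    pairs = [
        (f"<{RDF_TYPE}>", f"<{EX}FormationEvent>"),
        (f"<{EX}place>", f'"{escape_literal(event.place)}"'),
        (f"<{EX}start>", f'"{event.start.isoformat()}"^^<{XSD_DATETIME}>'),
        (f"<{EX}end>", f'"{event.end.isoformat()}"^^<{XSD_DATETIME}>'),
    ]
    return "\n".join(f"{subject} {p} {o} ." for p, o in pairs)


def events_at_query(place: str) -> str:
    """Build a SPARQL SELECT query listing events recorded at a place."""
    return (
        f"PREFIX ex: <{EX}>\n"
        "SELECT ?event ?start ?end WHERE {\n"
        "  ?event a ex:FormationEvent ;\n"
        f'         ex:place "{escape_literal(place)}" ;\n'
        "         ex:start ?start ;\n"
        "         ex:end ?end .\n"
        "}\n"
        "ORDER BY ?start"
    )


if __name__ == "__main__":
    # Made-up example values, not observational results.
    example = FormationEvent(
        "event-1",
        "Station A",
        datetime(2018, 4, 1, 9, 0, tzinfo=timezone.utc),
        datetime(2018, 4, 1, 13, 0, tzinfo=timezone.utc),
    )
    print(to_ntriples(example))
    print(events_at_query("Station A"))
```

In a deployment, the N-Triples text would be loaded into the RDF database and the query string sent to its SPARQL endpoint. Both steps are left out because the source does not specify the endpoint's address or update interface.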
Overall, the use case is arguably already at a fairly advanced stage. While further technical advances are possible, the more critical advancements now depend on collaborative work with the research community, such as reaching agreement on how to represent information describing particle formation and adopting the scientific workflow as a service.
We envision the following implementation plan. First, we plan to organize the aforementioned workshop during Q1 2018 and hold the workshop during Q2 2018, possibly in April ahead of the next ENVRIweek, which would allow for presenting results on aims 1-3 during ENVRIweek. The successful execution of the workshop is a milestone for this use case.
Finally, linking the scientific workflow with the ENVRIplus Knowledge Base would support the selection of observational data sources and, possibly, the automated retrieval of data required in workflow execution. This also relates to Theme 2 activities, and the implementation may serve as a demonstrator in this context.
The use case expects the following (primary) outputs:
Floridi, L. (2011). The Philosophy of Information. Oxford University Press.