DIASPARA metric database creation script

DIASPARA WP3.2 working document

Creation of metric db, version = build
Author

Briand Cédric, Oliviero Jules, Helminen Jani

Published

26-04-2025

Apart from the main db report, which describes all the vocabularies used in this document, we have to build a separate database for metrics. In the project these correspond to life history traits (LHT), but the scope is wider: the data should also correspond to the time series and sampling databases developed by WGEEL. These two data structures (series and sampling) are very similar, and both hold very similar group metrics and individual metrics.

The first was developed initially to store data about the series used in recruitment. In practice, it consists of three tables. The t_series_ser table (Figure 1, top, in blue) contains the series id and description, with columns describing the sampling details, the stage used, the method, and so on. This is the main identifier of the series, which is used as a reference in all dependent tables. The second table, t_dataseries_das (Figure 1, on the right), holds the annual values of each series. These are typically annual counts for recruitment, along with additional effort data. Linked to these are the group metrics used to describe the series: mean age of eel, mean size, proportion of glass eel among the yellow eels, proportion of females, and so on (Figure 1, in orange). Finally, we can link individual metrics. Each individual metric is recorded for a single fish. These cover size, weight and sex, but can also hold data about quality and contamination. They are in essence the life history traits analysed by WP2 in DIASPARA (Figure 1, in pink).

Figure 1: Diagram for series
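The series structure described above can be sketched as follows. This is only a minimal illustration using an in-memory SQLite database from the Python standard library, not the actual WGEEL schema: only the table names t_series_ser and t_dataseries_das come from the text, while all column names, the two metric table names (group_metric, individual_metric), the stage code and the example values are hypothetical. The metric tables are also linked directly to the series here, which is a simplification.

```python
import sqlite3

# Only t_series_ser and t_dataseries_das are named in the report.
# Every column name and both metric table names are hypothetical placeholders.
SERIES_SCHEMA = """
CREATE TABLE t_series_ser (
    ser_id INTEGER PRIMARY KEY,
    ser_description TEXT,
    ser_stage TEXT,
    ser_method TEXT
);
CREATE TABLE t_dataseries_das (
    das_id INTEGER PRIMARY KEY,
    das_ser_id INTEGER NOT NULL REFERENCES t_series_ser (ser_id),
    das_year INTEGER NOT NULL,
    das_value REAL,   -- annual value, e.g. a recruitment count
    das_effort REAL   -- additional effort data
);
CREATE TABLE group_metric (
    grm_id INTEGER PRIMARY KEY,
    grm_ser_id INTEGER NOT NULL REFERENCES t_series_ser (ser_id),
    grm_year INTEGER,
    grm_metric TEXT NOT NULL,  -- e.g. mean size, proportion of females
    grm_value REAL
);
CREATE TABLE individual_metric (
    inm_id INTEGER PRIMARY KEY,
    inm_ser_id INTEGER NOT NULL REFERENCES t_series_ser (ser_id),
    inm_fish_id INTEGER NOT NULL,  -- one fish can carry several metrics
    inm_metric TEXT NOT NULL,      -- e.g. size, weight, sex
    inm_value REAL
);
"""


def build_series_db():
    """Create the illustrative series tables in an in-memory database."""
    connection = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    connection.execute("PRAGMA foreign_keys = ON")
    connection.executescript(SERIES_SCHEMA)
    return connection


con = build_series_db()
# Made-up example rows: one series, one annual value, one group metric
# and one individual metric, all referring to series 1.
con.execute("INSERT INTO t_series_ser VALUES (1, 'example series', 'G', 'trap')")
con.execute("INSERT INTO t_dataseries_das VALUES (1, 1, 2020, 1250.0, 30.0)")
con.execute("INSERT INTO group_metric VALUES (1, 1, 2020, 'mean_length_mm', 68.5)")
con.execute("INSERT INTO individual_metric VALUES (1, 1, 101, 'length_mm', 71.0)")
con.commit()
```

In this sketch every dependent table carries a foreign key to the series id, which mirrors the role of t_series_ser as the reference for all dependent tables.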

The second type of data structure was developed to hold the data collected for DCF. These can be metrics collected from sampling by fishermen, data coming from the analysis of electrofishing data, or other experimental sampling that is not reported as series. Currently the two structures for series and sampling are very close. The only differences are that there is no annual number linked to the sampling data, and that sampling data are not linked to a stage in the first table, so the stage is added in the fish table. The difference in table structure is illustrated below in the tables highlighted in yellow (Figure 2).

Figure 2: Diagram for sampling
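The two differences of the sampling structure (no annual values table, and a stage stored with each fish rather than with the sampling) can be sketched in the same way. Again this is a hedged illustration and not the WGEEL schema: the table names sampling, fish and fish_metric, all column names, the stage codes and the values are hypothetical.

```python
import sqlite3

# Hypothetical names throughout: the report only states that the sampling
# structure has no annual number and that the stage sits in the fish table.
SAMPLING_SCHEMA = """
CREATE TABLE sampling (
    sai_id INTEGER PRIMARY KEY,
    sai_name TEXT NOT NULL          -- note: no stage column at this level
);
CREATE TABLE fish (
    fi_id INTEGER PRIMARY KEY,
    fi_sai_id INTEGER NOT NULL REFERENCES sampling (sai_id),
    fi_stage TEXT NOT NULL,         -- stage is recorded per fish
    fi_year INTEGER
);
CREATE TABLE fish_metric (
    fm_id INTEGER PRIMARY KEY,
    fm_fi_id INTEGER NOT NULL REFERENCES fish (fi_id),
    fm_metric TEXT NOT NULL,
    fm_value REAL
);
"""

sampling_con = sqlite3.connect(":memory:")
sampling_con.execute("PRAGMA foreign_keys = ON")
sampling_con.executescript(SAMPLING_SCHEMA)

# One sampling scheme holding fish of two different stages (made-up data).
sampling_con.execute("INSERT INTO sampling VALUES (1, 'example sampling')")
sampling_con.executemany(
    "INSERT INTO fish VALUES (?, ?, ?, ?)",
    [(1, 1, "Y", 2021), (2, 1, "Y", 2021), (3, 1, "S", 2021)],
)
sampling_con.executemany(
    "INSERT INTO fish_metric VALUES (?, ?, ?, ?)",
    [(1, 1, "length_mm", 310.0), (2, 2, "length_mm", 330.0), (3, 3, "length_mm", 650.0)],
)
sampling_con.commit()

# Because the stage lives in the fish table, summaries per stage
# are obtained by joining fish to its metrics.
mean_length_by_stage = sampling_con.execute(
    """
    SELECT fi_stage, AVG(fm_value)
    FROM fish JOIN fish_metric ON fm_fi_id = fi_id
    WHERE fm_metric = 'length_mm'
    GROUP BY fi_stage
    ORDER BY fi_stage
    """
).fetchall()
```

Keeping the stage at fish level lets a single sampling scheme mix stages, which a stage column in the first table would not allow.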

The database development highlighted in the current report has several objectives:

Creating the database structure from WGEEL (TODO)

git issue #23 Write simplified structure from WGEEL

The main issue will require merging the two table structures (sampling and series) and adapting them to the migdb vocabulary.

Once this is done, a beta version, probably not yet fully adapted, will be released.

milestone metric DB beta version

Import data from WGEEL (TODO)

If this works, then the rest should follow smoothly.

git issue #25 Import data from WGEEL

The release date for that one is:

Milestone release alpha

The metric release will therefore come after WGEEL, but hopefully some of the work will have started by then and can be discussed during WGEEL.

Import data from WP2 (TODO)

Adapt to ICES format (TODO)

Import to ICES

Acknowledgements

 

The EU is not responsible for the content of the project.