After the scientific sources were retrieved from WOS, the publications from which data were extracted were selected through a screening process (Fig. 2). First, publications must report experiments on kitchen garden crops. Cereals are excluded, with the exception of sweetcorn, which is relatively common in kitchen gardens. Second, the metallic elements investigated must be among the 14 elements listed in BAPPET, i.e. As, Cd, Co, Cr, Cu, Hg, Mo, Ni, Pb, Sb, Se, Tl, V and Zn. Cadmium (Cd), lead (Pb) and zinc (Zn) are the most represented elements in the dataset, with more than 50% of the soil/plant pairings dedicated to the study of their transfer (Fig. 3). Third, only experiments carried out on soils are selected: those conducted in hydroponic conditions, or with direct use of substrates such as compost or biosolids without soil, are excluded. Fourth, publications must provide data on concentrations of metal pollutants in plants and soils, or bioconcentration factors (BCF). These results must be presented species by species and concern only the edible parts of the plant. Thus, data from a mix of different plant species, from the entire plant, or derived from models are rejected. Finally, to enable relevant comparisons with user data, publications contributing to BAPPET must explicitly state whether results are reported in fresh or dry weight.

Fig. 2 Decision-making flowchart for data implementation in the BAPPET dataset. BCF stands for bioconcentration factor.

Fig. 3 Distribution of soil/plant pairings across the different MTE listed in the BAPPET dataset. The labels indicate the percentage of soil/plant pairings for which the MTE has been entered.

At the end of this selection process, 528 publications were recorded in BAPPET, including 37 experimental reports from French public institutes, 2 site diagnostic reports from private companies and 3 thesis manuscripts.
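The screening criteria described above can be sketched as a sequence of yes/no checks. This is only an illustrative sketch: the `Candidate` class, its field names and the rejection messages are hypothetical, and the order of the checks follows the text rather than the exact layout of the flowchart.

```python
from dataclasses import dataclass

# The 14 elements listed in the text.
BAPPET_ELEMENTS = frozenset(
    ["As", "Cd", "Co", "Cr", "Cu", "Hg", "Mo", "Ni", "Pb", "Sb", "Se", "Tl", "V", "Zn"]
)


@dataclass
class Candidate:
    """Hypothetical summary of one publication (field names are assumptions)."""
    kitchen_garden_crop: bool        # sweetcorn counts, other cereals do not
    elements: set                    # chemical symbols investigated
    grown_on_soil: bool              # False for hydroponics or soil-free substrates
    has_concentrations_or_bcf: bool  # plant and soil concentrations, or BCF
    per_species_edible_part: bool    # not mixed species, whole plant or model output
    states_fresh_or_dry: bool        # weight basis explicitly reported


def screen(pub):
    """Return (accepted, reason); the first failing criterion is reported."""
    checks = [
        (pub.kitchen_garden_crop, "not a kitchen garden crop"),
        (bool(pub.elements & BAPPET_ELEMENTS), "no BAPPET element studied"),
        (pub.grown_on_soil, "not grown on soil"),
        (pub.has_concentrations_or_bcf, "no concentrations or BCF"),
        (pub.per_species_edible_part, "not per species and edible part"),
        (pub.states_fresh_or_dry, "weight basis not stated"),
    ]
    for passed, reason in checks:
        if not passed:
            return False, reason
    return True, "accepted"


# Illustrative inputs only, not records from the dataset.
print(screen(Candidate(True, {"Cd", "Pb"}, True, True, True, True)))
print(screen(Candidate(True, {"Cd"}, False, True, True, True)))
```

Returning the first failing reason makes it easy to audit why a given document was left out.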

Description of data extracted and standardization

Various parameters influencing the transfer of MTE from the growth medium to the plant, as well as information on the context of the experiments and on sampling and analysis procedures, were extracted from the selected publications and recorded in BAPPET (Fig. 1). Data were standardized to enable comparisons between heterogeneous published results. These parameters and the standardization process are described in this section.

The first two columns of the dataset give the experiment number and the bibliographic reference code. A unique experiment number is assigned to each soil/plant pairing, defined as a single plant species, variety or edible part grown in one single condition. A unique reference code, made up of the first three letters of the author’s name and the year of publication, is assigned to each source document.
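As a rough illustration of how such a reference code could be built, the sketch below joins the first three letters of a surname with the publication year. The function name, the use of upper case, the absence of a separator and the example surname are all assumptions; the text does not specify the exact formatting used in BAPPET.

```python
def reference_code(author_surname, year):
    """Build a code from the first three letters of a surname and the year.

    Upper-casing and direct concatenation are assumptions made for this sketch.
    """
    # Keep letters only, so apostrophes, hyphens or spaces do not end up in the code.
    letters = "".join(ch for ch in author_surname if ch.isalpha())
    if not letters:
        raise ValueError("surname must contain at least one letter")
    return f"{letters[:3].upper()}{year}"


# Hypothetical author, for illustration only.
print(reference_code("Dupont", 2015))
```

Two documents by authors sharing the same first three letters in the same year would collide under this scheme, so a real implementation would need a disambiguation rule that the text does not describe.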

The following parameters concern the nature of the analysed MTE and its speciation. The element is entered as normalized data from a list of 14 MTE referred to by their chemical symbols: As, Cd, Co, Cr, Cu, Hg, Mo, Ni, Pb, Sb, Se, Tl, V, Zn. The MTE ‘speciation’ column is a free-text entry indicating the oxidation number (e.g. Cr(VI)) or the formula (e.g. Na2SeO3) of the compound.

The next 26 parameters describe the studied plant and associated metadata. First, the category of the edible part analysed is indicated (‘Plant type’), using a standardized list of 10 plant types based on the consumed part: leaf-, fruit-, root-, tuber-, stem-, bulb-, inflorescent-vegetable, legume, fruit and herb (Tables 1, 2). The leaf vegetable category is the most represented in the dataset, accounting for a third of the data (Table 1).

Then, plant identification is provided through common names in French and in English (‘Species (en)’), as well as scientific Latin names (‘Species (lat)’). Since the same kitchen garden crop can be referred to by multiple vernacular and botanical names in different source documents, plant names were standardized using the nomenclature of the European and Mediterranean Plant Protection Organization (EPPO) Global Database⁶. Taxonomic ranks below species (subspecies, variety, cultivar, entered as free text under ‘Plant variety’) are also included when specified in the source document.

Other information concerns plant sampling, with i) the sampling stage, indicating the duration of cultivation (in weeks or months) before the plant was sampled, ii) the edible organ analysed, iii) whether the plant had reached maturity (Yes or No) according to the authors’ indications, and iv) the sampling effort, i.e. the number of samples analysed. Information on plant preparation (washing/peeling) before MTE analysis is also recorded.

Descriptive statistics of the MTE concentration in the edible part analysed, expressed in mg of element per kg of plant (mg/kg) after unit conversion when necessary, are available in the dataset: mean, standard deviation, minimum, maximum, median and geometric standard deviation. A column (‘DW or FW’) indicates whether MTE concentrations are expressed on a dry or fresh plant weight basis. Detection and quantification limits of the apparatus (LOD/LOQ, given in mg/kg) are included when provided in the source document. Bioconcentration factors are also recorded, with the corresponding statistics (mean, standard deviation, minimum and maximum), if they were calculated in the source document as the concentration of MTE in the edible part of the plant divided by the concentration of MTE in the soil. These parameters (plant MTE concentration and BCF) are entered as free decimal numbers according to the information extracted from the source document. When data were only available as graphs, the online tool WebPlotDigitizer⁷ was used to convert them into numerical values.

The dry matter content of the plant is also included in the dataset when provided, after systematic unit conversion to % (‘DW (%)’).
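The quantities described above are related by simple arithmetic, sketched below for a user who wants to compare their own measurements with dataset entries. The BCF ratio follows the definition given in the text. The dry/fresh weight conversion is a standard mass-balance relation applied on the user side, not a processing step reported for the dataset. Function names and example values are illustrative assumptions.

```python
def bcf(plant_mg_per_kg, soil_mg_per_kg):
    """Bioconcentration factor: concentration in edible part / concentration in soil."""
    if soil_mg_per_kg <= 0:
        raise ValueError("soil concentration must be positive")
    return plant_mg_per_kg / soil_mg_per_kg


def dry_to_fresh(conc_dw, dry_matter_percent):
    """Convert a dry-weight concentration (mg/kg DW) to fresh weight (mg/kg FW)."""
    if not 0 < dry_matter_percent <= 100:
        raise ValueError("dry matter content must be in (0, 100] %")
    return conc_dw * dry_matter_percent / 100.0


def fresh_to_dry(conc_fw, dry_matter_percent):
    """Convert a fresh-weight concentration (mg/kg FW) to dry weight (mg/kg DW)."""
    if not 0 < dry_matter_percent <= 100:
        raise ValueError("dry matter content must be in (0, 100] %")
    return conc_fw * 100.0 / dry_matter_percent


# Illustrative values only: 0.5 mg/kg in the plant, 2.0 mg/kg in the soil,
# and a vegetable with 10 % dry matter.
print(bcf(0.5, 2.0))
print(dry_to_fresh(2.0, 10.0))
```

Both concentrations passed to `bcf` should share the same weight basis and unit; this is why the ‘DW or FW’ column matters when values are compared.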

The dataset also compiles the model equation and the coefficient of determination (R²) when MTE concentrations in plants are modelled from soil parameters in the original study.

The next 13 parameters concern the characteristics of the soil, i.e. the plant growing medium. The BAPPET dataset first provides a general description of the soil and subsoil based on details given in the source document, including nature, texture and land use. More precisely, the proportion of clay-sized particles is given as mean, minimum and maximum, expressed in %. Average proportions of sand- and silt-sized particles are also provided in %. Values originally expressed in g/kg in the source documents were converted to % for each particle class.

Another soil characteristic available for each experiment is the pH, described by average, minimum and maximum values. Soil organic matter and organic carbon contents are also reported, both expressed in % after unit conversion when necessary. Additionally, the BAPPET dataset includes the soil cation exchange capacity (CEC), expressed in cmol+/kg. CEC values originally expressed in meq/100 g in the source documents were converted accordingly.
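The two soil unit conversions mentioned above are straightforward: 1 g/kg equals 0.1 %, and 1 meq/100 g is numerically equal to 1 cmol+/kg. A minimal sketch follows, assuming a small lookup table of conversion factors. The table, the unit strings and the function name are illustrative and do not come from the BAPPET workflow.

```python
# Multiplicative factors keyed by (source unit, target unit).
# Unit spellings are assumptions made for this sketch.
FACTORS = {
    ("g/kg", "%"): 0.1,              # 10 g/kg = 1 %
    ("%", "%"): 1.0,
    ("meq/100 g", "cmol+/kg"): 1.0,  # numerically identical units
    ("cmol+/kg", "cmol+/kg"): 1.0,
}


def convert(value, from_unit, to_unit):
    """Convert a soil property value, refusing unit pairs that are not listed."""
    try:
        factor = FACTORS[(from_unit, to_unit)]
    except KeyError:
        raise ValueError(f"no conversion from {from_unit!r} to {to_unit!r}") from None
    return value * factor


# Illustrative values only: 250 g/kg of clay and a CEC of 12.5 meq/100 g.
print(convert(250, "g/kg", "%"))
print(convert(12.5, "meq/100 g", "cmol+/kg"))
```

Rejecting unknown unit pairs, rather than passing values through unchanged, avoids silently mixing units in a standardized column.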

Finally, information on soil sampling time is provided through normalized data indicating whether sampling occurred i) before crop planting, ii) during crop growth, iii) at crop harvest, or iv) after crop harvest.

The fifth part of the dataset gathers data describing the medium identified as the potential source of plant contamination. The first parameter specifies the nature of the investigated medium (i.e. soil, irrigation water, groundwater, runoff water, or air), with the unit of the MTE concentrations measured in this medium given in brackets. MTE concentrations are expressed (after unit conversion of the original data when necessary) in mg/kg or µg/kg for soil, mg/L for water, and mg/m³ for air.

The next two columns (‘Extraction’ and ‘Extractants’) indicate, respectively, the MTE pool investigated (i.e. total, pseudo-total, available, bound to a specific chemical compound of the soil, or leached) and the methods and chemical extractants used to access these MTE pools.

As for plants, the concentrations of MTE in the investigated medium are statistically described in the dataset with mean, standard deviation, minimum, maximum and median. The online tool WebPlotDigitizer⁷ was also used to extract numerical data from graphs.

When provided in the original study, the detection and quantification limits of the apparatus (LOD/LOQ, expressed in mg/kg) used to analyse MTE concentrations in the medium are also included in the dataset.

The next cluster of columns describes the background of the experiment. The first column gives the experiment type according to the following standardized modalities: indoor or outdoor (in field, kitchen garden or pots), with or without shelter.

The environmental context described in the source document is also reported in the dataset. Identified contexts include agricultural, industrial, not anthropized, urban, and artificial environments. In some cases, a study may be associated with several environmental contexts simultaneously.

Further information is then provided on the origin of MTE, describing the type of activities and sources responsible for the presence of MTE in the growing environment. This parameter, closely linked to the environmental context, includes agricultural, industrial, not anthropogenic, urban and artificial sources. Additionally, a specific modality is included for past or ongoing mining activities, which are reported in several studies as sources of MTE contamination.

This section on experimentation background concludes with a free-text comment column, providing any additional information on experimental design which may help users to understand the experiment.

The bibliographic references of the source documents are given directly in the last part of the dataset for each experiment recorded, except for confidential data from private companies. The first two columns of this section list the first author, and either the second author or ‘et al.’ for publications with more than two authors. The title of the source document is then given, as well as the year of publication, the name of the journal in which the document is published, the volume, the issue number and the pages. The DOI of the source document is also indicated when available.

Additional information includes the country where the experiment was conducted and, finally, the nature of the source document, categorized using a normalized code: ‘ART’ for peer-reviewed scientific article, ‘EDR’ for environmental diagnostic report, ‘EXP’ for public experimental report and ‘THESE’ for thesis manuscript.
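For users filtering the dataset by document type, the four normalized codes listed above can be held in a small lookup table. The sketch below is illustrative only: the function name and the sample list of codes are hypothetical and do not reflect actual dataset contents.

```python
from collections import Counter

# Codes and labels as given in the text.
DOCUMENT_TYPES = {
    "ART": "peer-reviewed scientific article",
    "EDR": "environmental diagnostic report",
    "EXP": "public experimental report",
    "THESE": "thesis manuscript",
}


def describe_document_type(code):
    """Return the label for a normalized code, or raise on an unknown code."""
    key = code.strip().upper()
    if key not in DOCUMENT_TYPES:
        raise ValueError(f"unknown document type code: {code!r}")
    return DOCUMENT_TYPES[key]


# Made-up sample of codes, for illustration only.
sample = ["ART", "ART", "EXP", "THESE", "ART"]
print(Counter(sample))
print(describe_document_type("EDR"))
```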
