In this tutorial, we focus on validation from both a numerical and a conceptual point of view. The procedure often reported in the literature of (repeatedly) dividing a dataset at random into a calibration set and a test set must be applied with care. It can be justified only when there is no systematic stratification of the objects that would affect the validated estimates or figures of merit such as RMSE or R². Typical levels of validation are repeatability, reproducibility, and instrument and raw-material variation. Examples of how a single data set can be validated across this background information illustrate that the chosen level affects the figures of merit as well as the dimensionality of the models. Even more important is the robustness of the models when predicting future samples. A further aspect brought to attention is validation of the overall conclusions drawn when observing a specific system. One example is to apply several methods for finding the significant variables and to check whether there is a consensus subset that also matches what is reported in the literature or what is expected from the underlying chemistry.
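The caution about random calibration/test splits can be illustrated with a minimal sketch on simulated data. Nothing here is taken from the tutorial itself: the batch structure, the ordinary least-squares model, and all names (`rmse_cv`, `signature`, `offset`) are assumptions chosen for illustration. Each simulated raw-material batch leaves its own fingerprint in X and shifts y by its own offset, so a model that has seen every batch can learn the offsets, whereas a model validated on an unseen batch cannot.

```python
import numpy as np

def rmse_cv(X, y, folds):
    """RMSE of cross-validated predictions from an ordinary least-squares model.

    `folds` is a sequence of index arrays; each array is held out once.
    """
    n = len(y)
    residuals = []
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Fit on the calibration part only (intercept + all variables)
        A = np.column_stack([np.ones(train.sum()), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        # Predict the held-out objects
        B = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        residuals.append(y[test_idx] - B @ coef)
    return float(np.sqrt(np.mean(np.concatenate(residuals) ** 2)))

# Simulated data (hypothetical): 6 batches, 15 objects each, 20 variables
rng = np.random.default_rng(0)
n_batches, n_per_batch, n_vars = 6, 15, 20
batch = np.repeat(np.arange(n_batches), n_per_batch)

signature = rng.normal(size=(n_batches, n_vars))  # batch fingerprint in X
offset = np.linspace(-3.0, 3.0, n_batches)        # batch effect on y
analyte = rng.normal(size=batch.size)

X = signature[batch] + 0.1 * rng.normal(size=(batch.size, n_vars))
X[:, 0] += analyte                                # variable 0 carries the signal
y = analyte + offset[batch] + 0.1 * rng.normal(size=batch.size)

# Validation level 1: random 5-fold split that ignores the batch structure
random_folds = np.array_split(rng.permutation(batch.size), 5)
# Validation level 2: leave one whole batch out at a time
group_folds = [np.flatnonzero(batch == g) for g in range(n_batches)]

rmse_random = rmse_cv(X, y, random_folds)
rmse_group = rmse_cv(X, y, group_folds)
print(f"random 5-fold RMSECV:       {rmse_random:.2f}")
print(f"leave-one-batch-out RMSECV: {rmse_group:.2f}")
```

In this constructed setting the random split should report the smaller error, because objects from every batch appear in each calibration set. The leave-one-batch-out estimate is the one that speaks to predicting samples from a future batch.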

Validation of chemometric models - A tutorial / Westad, Frank; Marini, Federico. - In: ANALYTICA CHIMICA ACTA. - ISSN 0003-2670. - PRINT. - 893:(2015), pp. 14-24. [10.1016/j.aca.2015.06.056]

Validation of chemometric models - A tutorial

MARINI, Federico
2015

chemometrics; cross-validation; resampling; test set; validation; biochemistry; analytical chemistry; spectroscopy; environmental chemistry
01 Journal publication::01a Journal article
Files attached to this record
File: Westad_Validation-of-chemometric_2015.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 1.63 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/845388
Citations
  • PMC 30
  • Scopus 248
  • ISI 230