Preparation of the databases
Paleoceanographer accepts the calibration and sampling databases as rectangular matrices in CSV format. In the calibration database, the first column is reserved for core-top sample identifiers. The subsequent columns hold the taxonomic variables, with each value giving the frequency of a taxon within an assemblage. The last five columns are reserved for the variables to be estimated. The first three of these hold the oceanographic variables used by the Autoevaluation and Parameters functions (SST, seasonality, and SSS). The final two columns are used by the Analogs function to store distances and geographic coordinates.
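The calibration layout described above can be illustrated with a short Python sketch that splits a header row into its three groups of columns. This is not part of Paleoceanographer: the column names, the sample values, and the helper split_calibration_header are all hypothetical, and only the positions of the columns follow the description in the text.

```python
import csv
import io

# Hypothetical calibration table. Every column name and value is invented
# for illustration; only the column positions follow the described layout.
CALIBRATION_CSV = """sample,taxon_a,taxon_b,taxon_c,sst,seasonality,sss,distance,coordinates
CT01,0.50,0.30,0.20,18.2,6.1,35.4,0,0
CT02,0.10,0.60,0.30,24.7,3.0,36.1,0,0
"""

N_TRAILING = 5  # last five columns: variables to be estimated


def split_calibration_header(header):
    """Return (id_column, taxon_columns, trailing_columns) from a header row."""
    # Need one identifier column, at least one taxon, and the five trailing columns.
    if len(header) < 1 + 1 + N_TRAILING:
        raise ValueError("calibration table has too few columns")
    return header[0], header[1:-N_TRAILING], header[-N_TRAILING:]


rows = list(csv.reader(io.StringIO(CALIBRATION_CSV)))
id_column, taxon_columns, trailing_columns = split_calibration_header(rows[0])

# First three trailing columns: SST, seasonality, SSS.
oceanographic_columns = trailing_columns[:3]
# Final two trailing columns: those used by the Analogs function.
analogs_columns = trailing_columns[3:]

print(id_column, taxon_columns, oceanographic_columns, analogs_columns)
```

Slicing by position rather than by name mirrors the text, which defines the layout through column order alone.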
The sampling database must also reserve the first column for sample identifiers and must include the same taxonomic variables, in the same order as in the calibration database. Unlike the calibration database, however, the sampling database must not contain the five trailing columns for the oceanographic and geographic variables.
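Before running an analysis, it may be worth checking that the two databases agree. The following Python sketch compares a sampling header against a calibration header under the rules stated above. The function check_sampling_header and all column names are hypothetical and are not part of Paleoceanographer.

```python
N_TRAILING = 5  # trailing calibration columns that the sampling table must omit


def check_sampling_header(calibration_header, sampling_header):
    """Return a list of problems; an empty list means the headers are compatible."""
    # Taxa sit between the identifier column and the five trailing columns.
    expected_taxa = calibration_header[1:-N_TRAILING]
    # In the sampling table, everything after the identifier should be a taxon.
    actual_taxa = sampling_header[1:]

    problems = []
    if len(actual_taxa) != len(expected_taxa):
        problems.append(
            f"expected {len(expected_taxa)} taxon columns, found {len(actual_taxa)}"
        )
    # Compare taxa position by position; column numbers are 1-based,
    # with column 1 being the sample identifier.
    for position, (expected, actual) in enumerate(
        zip(expected_taxa, actual_taxa), start=2
    ):
        if expected != actual:
            problems.append(
                f"column {position}: expected {expected!r}, found {actual!r}"
            )
    return problems


# Hypothetical headers for illustration only.
calibration = ["sample", "taxon_a", "taxon_b", "taxon_c",
               "sst", "seasonality", "sss", "distance", "coordinates"]
matching = ["sample", "taxon_a", "taxon_b", "taxon_c"]
reordered = ["sample", "taxon_b", "taxon_a", "taxon_c"]

print(check_sampling_header(calibration, matching))
print(check_sampling_header(calibration, reordered))
```

A sampling table that still carries the five trailing columns is reported through the column-count check, since those columns would otherwise be read as extra taxa.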