Importing Data

Data relating to a previously created data model can be imported using the Data Importer tool. This can be accessed using the Open in importer right-click action on a selected compatible file in the Data manager pane. Currently, only Excel spreadsheets (.xls, .xlsm and .xlsx), as well as .csv files and .txt files (with tab separated values) can be imported.

../_images/open_in_importer.png

Limitations of the tool

Before getting into an example, it's worth noting the limitations placed on input spreadsheets. Data should be arranged with one observation (Entity) per column and separate rows for each variable (FactKind). For the auto-importing to work most effectively, each individual observation has to be in its own unique row. If values for one observation are split across different regions of the sheet (or different sheets) they have to be linked manually, which can be time consuming.

The data importer will also work on a transposed version of the spreadsheet (i.e. switching rows/columns in the above paragraph).

../_images/data-importer-tidy-data-term.png

An example spreadsheet with column-wise data, showing "tidy data" terminology.

If an input spreadsheet doesn't fit with these limitations, it may not be able to be imported successfully!

FactKind Detection

After selecting the spreadsheet file to be imported, it is displayed in the application. Files with multiple sheets are displayed in separate tabs.

../_images/data-importer-selected-model.png

Next, we select the appropriate model from the Model dropdown menu and click Detect Fact Kinds. Selecting a model will also show its structure, allowing verification that the selected model fits with the data.

../_images/data-importer-detect-fact-kinds.png

The application will attempt to automatically match FactKinds in the model to the spreadsheet, with successful matches being highlighted green. However, this is not always possible to do automatically for every part of the model and some conflicts have to be resolved manually.

Next, we can create a manual pairing between the Solvent Name FactKind and the alternate label S. Name used in the sheet. Clicking the button now labelled Match Solvent Name to cell S1, A4 will create the match (see screenshot below).

../_images/data-importer-manually-match-fact-kinds.png

Also, if the "Detect FactKinds" detects an incorrect pairing, we can delete it by selecting it in the middle table and pressing the - button.

A drop-down menu also allows setting the numerical formatting used for data, which can be set to either US (1,234.56) or EU (1.234,56).

Entity Creation

The next step is to parse the actual data in the sheet, creating individual Entities for each selected row or column. We click the "Detect Entities" button. The detected entities will be highlighted in blue.

../_images/data-importer-detect-entities.png

We can also choose only some cells to detect entities, by selecting the cells before pressing "Detect Entities"

If additional values need to be assigned to an entity, such as values in another sheet, we can select more cells and press "Detect Entities" again without losing the first ones which were detected.

When the created Entities are ready to be saved, click the Save Entities button.

Template Creation

To import excel files with similar structures in bulk, you must save a template mapping cells to fact kinds and entities.

After reviewing that the created Entities contain the correct data, click the Save Ingestion Template button next to the Save Entities button. This will open a new dialog where a name must be assigned to the import template. The name provided here must not be blank and must be unique. After entering a name that meets the above criteria, click the OK button to save this template to the database.

../_images/template-name-dialog.png