Publishing biodiversity data

ARBIMS uses GBIF's IPT (Integrated Publishing Toolkit) to publish biodiversity data and manage published resources (datasets).

IPT is a Java application developed by GBIF to support the publication of biodiversity datasets in a common format: the Darwin Core Archive. The IPT's two primary functions are therefore to 1) encode existing species occurrence datasets and checklists, such as records from natural history collections or observations, in the Darwin Core standard to enhance data interoperability, and 2) publish and archive data and metadata for broad use as a Darwin Core Archive, a set of files following a standard format.

Biodiversity data (occurrence data or checklists) published through the ARCOS IPT installation must therefore comply with the Darwin Core standard. Users are advised to map their data to the terms of this standard before starting the data publication process.

The publication process then follows the steps below:


Step 1: Mapping data fields to the terms of Simple Darwin Core


The Simple Darwin Core standard has a total of 176 terms. Data published to the ARBIMS Portal does not need to have all of these fields populated, but the ARCOS Biodiversity Data policy sets out a minimum set of fields that must be non-empty. Please note that this is a requirement of the policy, not of the standard itself, which does not make any term mandatory. When mapping data, users are asked to avoid "overloading" a term (putting data under a field whose term definition does not match the data well). Instead, useful information that does not match any field should be "payloaded" into the dynamicProperties field, as recommended by the standard. Users are also asked to use the "controlled vocabulary" for the values of fields that recommend one.
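The "payloading" idea can be sketched in a few lines of Python. This is only an illustration, not part of IPT: the DWC_TERMS set is a small subset of the standard chosen for the example, the groupSize and nestCount fields are hypothetical, and storing the extra values as JSON is one common way to fill dynamicProperties. The sketch also assumes the source record does not already carry a dynamicProperties value.

```python
import json

# Illustrative subset of Darwin Core terms; the full standard defines many more.
DWC_TERMS = {
    "occurrenceID", "basisOfRecord", "scientificName", "eventDate",
    "decimalLatitude", "decimalLongitude", "country", "dynamicProperties",
}

def payload_extras(record):
    """Keep recognised terms and move all other non-empty fields into dynamicProperties."""
    mapped = {k: v for k, v in record.items() if k in DWC_TERMS}
    extras = {k: v for k, v in record.items()
              if k not in DWC_TERMS and v not in ("", None)}
    if extras:
        # Assumption: the extras are serialised as a JSON object with sorted keys.
        mapped["dynamicProperties"] = json.dumps(extras, sort_keys=True)
    return mapped

record = {
    "scientificName": "Gorilla beringei",
    "basisOfRecord": "HumanObservation",
    "groupSize": 7,   # hypothetical field with no matching term
    "nestCount": 5,   # hypothetical field with no matching term
}
result = payload_extras(record)
print(result["dynamicProperties"])
```

Keeping the unmatched values in a single structured field avoids overloading an unrelated term while still preserving the information for later use.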

The field mapping exercise is the single most important step in the data publication process, since data quality depends very much on how well the data holder matches data values to the corresponding Simple Darwin Core terms and on how many fields are populated with data.
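Before uploading, it can help to check how completely the mapped fields are populated. The following is a minimal sketch under stated assumptions: the REQUIRED list is hypothetical (the actual minimum set is defined in the ARCOS Biodiversity Data policy), and the rows are invented sample records.

```python
# Hypothetical minimum set; the real list is defined in the ARCOS Biodiversity Data policy.
REQUIRED = ["scientificName", "basisOfRecord", "eventDate"]

def is_filled(row, field):
    """A field counts as populated when it is present and not blank."""
    value = row.get(field)
    return value is not None and str(value).strip() != ""

def fill_report(rows, required=REQUIRED):
    """Return (fill rate per field, list of (row index, missing required fields))."""
    if not rows:
        return {}, []
    fields = sorted({name for row in rows for name in row})
    rates = {f: sum(is_filled(r, f) for r in rows) / len(rows) for f in fields}
    problems = []
    for index, row in enumerate(rows):
        missing = [f for f in required if not is_filled(row, f)]
        if missing:
            problems.append((index, missing))
    return rates, problems

rows = [
    {"scientificName": "Gorilla beringei", "basisOfRecord": "HumanObservation",
     "eventDate": "2014-03-02", "country": "Rwanda"},
    {"scientificName": "Pan troglodytes", "basisOfRecord": "",
     "eventDate": "2014-03-05", "country": ""},
]
rates, problems = fill_report(rows)
print(rates)
print(problems)
```

The fill rates give a rough picture of how many fields carry data, and the problem list points to the records that would need attention before publication.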

Since IPT accepts delimited text files (such as CSV) as source data, users are encouraged to use spreadsheets (such as MS Excel sheets) for this mapping exercise: once the exercise is complete, the data can be saved in a format that is ready to be uploaded to IPT.
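For data holders who prefer to script the renaming rather than edit column headers by hand, the mapping can be sketched with Python's standard csv module. The source column names in COLUMN_MAP (Species, Date, Lat, Lon) are hypothetical and would be replaced by the headers of the actual spreadsheet export.

```python
import csv
import io

# Hypothetical mapping from a data holder's own column names to Darwin Core terms.
COLUMN_MAP = {
    "Species": "scientificName",
    "Date": "eventDate",
    "Lat": "decimalLatitude",
    "Lon": "decimalLongitude",
}

def to_dwc_csv(source_text):
    """Rewrite CSV text so that its header row uses Darwin Core term names."""
    reader = csv.DictReader(io.StringIO(source_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(COLUMN_MAP.values()),
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow({dwc: row[src] for src, dwc in COLUMN_MAP.items()})
    return out.getvalue()

source = "Species,Date,Lat,Lon\nGorilla beringei,2014-03-02,-1.45,29.53\n"
dwc_csv = to_dwc_csv(source)
print(dwc_csv)
```

The resulting text can be saved as a .csv file and used as the source data file for the resource.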

IPT also accepts connections to most SQL-based databases, so users who keep their data in a database can load it directly without first exporting it to a flat file. In this case, the structure of the data returned by the database query should match the Simple Darwin Core terms; to simplify the field matching process, the use of column aliases is recommended.
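The effect of column aliases can be tried out locally before configuring a database source. The sketch below uses an in-memory SQLite database purely as a stand-in; the sightings table and its column names are hypothetical, and a query of the same shape would be written against the data holder's own database.

```python
import sqlite3

# In-memory SQLite database used only to demonstrate the aliasing technique.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sightings (id INTEGER, species TEXT, seen_on TEXT, lat REAL, lon REAL)"
)
conn.execute(
    "INSERT INTO sightings VALUES (1, 'Gorilla beringei', '2014-03-02', -1.45, 29.53)"
)

# Each alias renames a local column to the Darwin Core term it corresponds to.
QUERY = """
SELECT id      AS occurrenceID,
       species AS scientificName,
       seen_on AS eventDate,
       lat     AS decimalLatitude,
       lon     AS decimalLongitude
FROM sightings
"""

cursor = conn.execute(QUERY)
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
print(columns)
print(rows)
```

Because the returned column names already match Darwin Core terms, little or no manual field matching should be needed afterwards.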



Step 2: Log in to the ARCOS IPT site and create a new resource


The administrator of the ARCOS IPT installation will need to give data contributors a set of login credentials (including a username and password). To be able to publish datasets to IPT, the user account must be at the Manager or Administrator level.

Once you have received your login credentials, please sign in to the ARCOS IPT site at: http://arbmis.arcosnetwork.org/ipt

To initiate the dataset publication process, click the "Manage Resources" menu, go to the CREATE NEW RESOURCE section at the bottom of the page and fill in the name of the new dataset to be published. Optionally, you can choose the type of resource to be published: Occurrence, Checklist or Metadata only.




Step 3: Upload the prepared source data



Step 4: Author the metadata



IPT uses the Ecological Metadata Language (EML), a metadata specification developed by the ecology discipline and for the ecology discipline. The IPT metadata pages are descriptive enough for users to fill in the series of forms needed to complete their dataset metadata profile. Users are encouraged to go through all 12 pages of the IPT metadata authoring facility and fill in all relevant information about their data.


Step 5: Publish your new dataset

