LibGuides: Data management: Data publishing and preservation

Publishing data and metadata

The openness of research data increases the visibility and impact of your research, creates new research opportunities, and facilitates disciplinary and interdisciplinary collaboration. Open data improves the transparency and reliability of science, empowering and democratizing science.

Publishing research data may create more opportunities for the researcher to gain merit, e.g. through more citations or registered downloads. In this way, the researcher can gain acknowledgement for stages of the research process other than the published article.

Research data and related published research results produced at Arcada should be published openly and made available for shared use. The discoverability and citability of research data are to be ensured. When reusing data, normal citation practice applies.

When opening your data, consider the following questions:

1. How to describe and publish the metadata of your data?

Metadata are data about data and describe the context, content, structure, compilation, and management of research data (see the section on Metadata and data documentation on this page). Informative metadata is the key to making data open, understandable, and reusable.

2. What part of the data will be opened and published?

If you cannot open the data, open and publish the metadata of your research data. Note that the metadata of data containing personal information may be opened even when the actual data cannot be.
Data with personal information can only be published anonymized. Pseudonymized data is still personal data, and therefore cannot be opened without explicit consent for that purpose. See Anonymisation and Personal Data by the Finnish Social Science Data Archive (FSD).
The consent of the data subject is required for opening material from which the research participants are directly identifiable. If you plan to share data that includes personal information, contact the Data Protection Officer of Arcada at dataprotection@arcada.fi.
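
The difference between pseudonymized and anonymized data can be illustrated with a short sketch. It assumes a hypothetical list of participant names and a hypothetical secret key; the function names are made up for this example and do not come from any Arcada or FSD tool. The point is only that anyone who holds the key can link a code back to a person, which is why pseudonymized data still counts as personal data.

```python
import hashlib
import hmac


def pseudonymize(name: str, secret_key: bytes) -> str:
    """Replace a name with a code derived from the name and a secret key."""
    digest = hmac.new(secret_key, name.encode("utf-8"), hashlib.sha256).hexdigest()
    return "P-" + digest[:10]


def reidentify(code: str, candidates: list, secret_key: bytes):
    """Show that whoever holds the key can link a code back to a person."""
    for name in candidates:
        if pseudonymize(name, secret_key) == code:
            return name
    return None


# Hypothetical example values, not real participants.
participants = ["Anna Example", "Ben Example"]
secret_key = b"store-this-key-separately-from-the-data"

coded = [pseudonymize(name, secret_key) for name in participants]

if __name__ == "__main__":
    print(coded)
    # The link to the person can be restored, so the data are not anonymous.
    print(reidentify(coded[0], participants, secret_key))
```

Anonymized data, in contrast, would leave no key or other route that leads back to the participant.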

3. Where will the data be published?

Apply for storage space in IDA by contacting datamanagement@arcada.fi.

IDA is a data storage service that is part of the Fairdata services offered by the Ministry of Education and Culture and produced by CSC. Please read the instructions on how to Apply for IDA storage space.

Or choose another suitable repository for sharing and opening your data at the start of the project.

Criteria for choosing a repository include:
- A repository which uses persistent identifiers (DOI, URN). See The use of Persistent Identifiers for Research Datasets: Recommendation by the Finnish Scientific Community for Open Research.
- A repository which publishes machine-readable metadata and uses a known metadata standard.
- A repository often used by your colleagues. Also check the recommendations of the publishers, learned societies, and funders in your own field.
- A repository which allows you to choose the terms of use under which the data can be reused, and states them clearly as part of the metadata.
To find repositories that specialize in a particular data type, search re3data.org, a registry covering over 2,000 research data repositories.
Other general repositories include:
- Aila by the Finnish Social Science Data Archive (FSD). You can contact them directly at asiakaspalvelu.fsd@uta.fi for further assistance.
- Zenodo by the OpenAIRE project and CERN.

4. When will the data be available? Do you need to set any embargo period?

5. Which license will you use to open and share your data? Licensing is necessary for publishing data. It is recommended to use Creative Commons (CC) licenses for open research data.

6. Will some part of the data be destroyed? See Data disposal by the Finnish Social Science Data Archive (FSD) and Five steps to decide what data to keep by the Digital Curation Centre (DCC).

Metadata and data documentation

Data documentation means describing the data. It is data about data and provides information about the who, what, when, where, why, and how of the data. Investing time in documenting the data makes them easier to understand for both others and yourself, and decreases the risk of false interpretation. Data documentation can consist of a readme file (human-readable) and metadata (machine-readable):

Readme files are text documents (e.g. in the format .txt) providing information about data files to ensure they are interpreted correctly. A readme file explains what data a research project has, how the data were created, where the data originate from, how to interpret them, what the abbreviations mean, what software is needed to use the data, and how the data have been modified. It can also include information about the title, creator, funder, relevant dates of data collection and publication, location, methodology, subject, file formats, file naming system and folder structure, data version, license, and repository.

Write a readme file about your data and data files. Put the readme file in the most obvious place in the data file folders to ensure that it can be noticed and seen immediately.
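
As a starting point, a readme file can be generated from a simple list of fields. The sketch below is only one possible layout: the field labels follow the items listed above, while the example values, the file name README.txt, and the function names are assumptions made for this illustration.

```python
from pathlib import Path

# Hypothetical example values; replace them with the details of your own data.
README_FIELDS = {
    "Title": "Example survey dataset",
    "Creator": "Firstname Lastname",
    "Funder": "Example funder",
    "Data collection dates": "2024-01-01 to 2024-03-31",
    "Methodology": "Online questionnaire",
    "File formats": ".csv",
    "File naming system": "survey_<wave>_<date>.csv",
    "Abbreviations": "Q = question, R = respondent",
    "Data version": "1.0",
    "License": "CC BY 4.0",
}


def build_readme(fields: dict) -> str:
    """Return the readme text with one 'Label: value' line per field."""
    lines = [f"{label}: {value}" for label, value in fields.items()]
    return "\n".join(lines) + "\n"


def write_readme(folder: Path, fields: dict) -> Path:
    """Write README.txt to the top level of the data folder and return its path."""
    folder.mkdir(parents=True, exist_ok=True)
    readme_path = folder / "README.txt"
    readme_path.write_text(build_readme(fields), encoding="utf-8")
    return readme_path


if __name__ == "__main__":
    # Preview the text; call write_readme(Path("my_data_folder"), README_FIELDS)
    # to save it next to the data files.
    print(build_readme(README_FIELDS))
```

Writing the file to the top level of the data folder keeps it in the most obvious place for anyone opening the folder.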

Metadata are technical data that describe a research dataset. When making data FAIR, metadata play a key role. Systematically described research data is the key to making your data understandable, findable and reusable.

Metadata should be machine-readable. Standard methods for data documentation, called metadata standards, are available and should be used if suitable for the data. The Fairdata Qvain metadata tool makes describing and publishing research data smooth and effortless for researchers without requiring technical skills.

Data described and published with the Qvain metadata tool are automatically transferred to the Finnish metadata warehouse Metax, which is integrated with both Etsin (research dataset finder) and the Finnish National Research Information Hub (in Finnish: Tutkimustietovaranto, a service also commissioned by the Ministry of Education and CSC).

See Qvain User Guide.
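
To give an idea of what machine-readable metadata can look like, the sketch below stores a dataset description as JSON and checks that a few core fields are filled in. The field names and the set of required fields are illustrative assumptions only; they are not the schema used by Qvain, Metax, or any particular metadata standard, and the identifier is a placeholder.

```python
import json

# Illustrative field names, NOT the schema of any specific service or standard.
metadata = {
    "title": "Example survey dataset",
    "creator": ["Firstname Lastname"],
    "description": "Responses to an example questionnaire.",
    "issued": "2024-04-30",
    "identifier": "doi:10.1234/example",  # placeholder, not a real DOI
    "license": "CC-BY-4.0",
    "keywords": ["survey", "example"],
    "file_formats": ["text/csv"],
}

# Assumed minimum for this sketch; follow your repository's own requirements.
REQUIRED_FIELDS = ("title", "creator", "identifier", "license")


def missing_fields(record: dict, required=REQUIRED_FIELDS) -> list:
    """Return the required field names that are absent or empty."""
    return [name for name in required if not record.get(name)]


def to_json(record: dict) -> str:
    """Serialize the record as indented JSON that software can parse."""
    return json.dumps(record, indent=2, ensure_ascii=False, sort_keys=True)


if __name__ == "__main__":
    print(missing_fields(metadata))
    print(to_json(metadata))
```

In practice a metadata tool produces a record like this for you; the sketch only shows why structured fields are easier for software to process than free text.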

Other important issues include data formats, file naming conventions, version control, and directory structure. See Data formats and organizing.

For more information, see:

Data documentation by CSC
Data description and metadata by the Finnish Social Science Data Archive (FSD)
Making a research project understandable - Guide for data documentation by Siiri Fuchs and Mari Elisa Kuusniemi at Helsinki University Library
Disciplinary Metadata by the Digital Curation Centre (DCC)

Long-term preservation of data

Long-term preservation means that data is preserved for more than 25 years. When creating your data, you need to consider how long it will be preserved. Also remember to check the data preservation period required by your discipline, funder, and publisher.

The Finnish Ministry of Education and Culture has established the Fairdata-PAS service (Digital Preservation Service for Research Data, DPS for Research Data) for Finnish research organizations for long-term preservation of the nationally most significant research data.

See Digital Preservation (Fairdata-PAS): Guidelines for UH Evaluators by the University of Helsinki.

If you are interested in Fairdata-PAS, contact datamanagement@arcada.fi.

FAIR data principles

The FAIR data principles, formulated and published by FORCE11, are a set of guiding principles for good data management and open access to research data. FAIR is an acronym that stands for Findable, Accessible, Interoperable and Reusable. Research data that are published according to the FAIR principles should be easy to find, access, transfer or combine, and reuse.

To ensure that your data and/or their metadata are FAIR, take the following steps:

- Save your data in an open file format such as Rich Text Format (.rtf) or .csv.
- Archive your data in an established digital repository at the end of the project. Remember to choose a repository that provides a persistent identifier (PID), such as DOI or URN.
- Create descriptive metadata for the data. Most of the FAIR data principles concern metadata. See Data documentation and metadata.
- License your data with a license that clearly states the conditions and restrictions for reuse.
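
The steps above can be sketched as a small script that saves tabular data as .csv and runs a simple self-check before publication. This is a rough illustration, not an official FAIR assessment: the dictionary keys (files, pid, description, license), the function names, and the list of accepted extensions are assumptions made for the example.

```python
import csv
from pathlib import Path

# Extensions named in the steps above; extend this set for your own field.
OPEN_EXTENSIONS = {".csv", ".rtf"}


def save_as_csv(rows: list, path) -> None:
    """Write a non-empty list of dicts to a UTF-8 .csv file with a header row."""
    fieldnames = list(rows[0].keys())
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def fair_checklist(dataset: dict) -> dict:
    """Return True or False for each of the four steps listed above."""
    files = dataset.get("files", [])
    open_format = bool(files) and all(
        Path(name).suffix.lower() in OPEN_EXTENSIONS for name in files
    )
    return {
        "open_file_format": open_format,
        "persistent_identifier": bool(dataset.get("pid")),
        "descriptive_metadata": bool(dataset.get("description")),
        "license": bool(dataset.get("license")),
    }


if __name__ == "__main__":
    # Hypothetical dataset description; the PID is still missing here.
    example = {
        "files": ["survey_results.csv"],
        "pid": "",
        "description": "Responses to an example questionnaire.",
        "license": "CC BY 4.0",
    }
    for step, done in fair_checklist(example).items():
        print(f"{step}: {'ok' if done else 'missing'}")
```

A check like this only confirms that each item is present; whether the metadata are informative enough still needs human judgement.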

To learn more:

It is recommended to use the Fairdata services offered by the Ministry of Education and Culture and produced by CSC – IT Center for Science Ltd for data management, data storage, metadata creation, dataset dissemination and distribution as well as digital preservation of research data. The services include:

- IDA (Research Data Storage): Safe storage for research data.
- Qvain (Research Metadata Tool): A metadata tool for describing and publishing datasets.
- Etsin (Research Dataset Finder): Discover, access and download research data from all fields of science.
- PAS (Digital Preservation Service for Research Data): Reliable preservation of digital information for decades or even centuries.

Read How to make the research dataset FAIR? and learn more about the Fairdata services.