Reusable #1: (Meta)data are richly described with a plurality of accurate and relevant attributes
Labelling your data with relevant attributes, most often in the form of metadata, not only helps others discover your data. It also helps humans and machines understand its context. Such attributes can include purpose and processing statements, the equipment used, software versions, and so on. Imagine coming across your own data as a stranger would. Now think of the contextual information that would help you decide whether the data are relevant to your specific needs, and whether you would be able to understand how the data were created. Be generous when adding attributes to your data. An attribute that seems irrelevant to you may be exactly what other people, or machines, filter and query on.
Speaking of machines, your best choice is to use controlled vocabularies, persistent identifiers, or similar mechanisms to make the contextual description unambiguous. Repositories aimed at specific disciplines, communities, or data types often offer the best support for assigning, maintaining, and querying domain-specific metadata.
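To make the point about filtering and querying concrete, here is a minimal sketch in Python. The records, the attribute names, and the vocabulary URI (vocab.example.org) are made up for illustration; a real repository would define its own fields and vocabularies.

```python
# Hypothetical metadata records; all names and values are illustrative only.
records = [
    {
        "title": "Survey A",
        "equipment": "drone camera",
        "software_version": "1.4",
        # A controlled-vocabulary term referenced by URI (placeholder URI).
        "subject": "https://vocab.example.org/terms/land-cover",
    },
    {
        "title": "Survey B",
        "equipment": "handheld GPS",
        "subject": "https://vocab.example.org/terms/land-cover",
    },
    # A sparsely described record: only a title.
    {"title": "Survey C"},
]


def find(records, **criteria):
    """Return the records whose attributes match every given criterion exactly."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


matches = find(records, subject="https://vocab.example.org/terms/land-cover")
print([r["title"] for r in matches])
```

Survey C carries no subject term, so a query on that attribute cannot find it, however relevant its content may be.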
Reusable #1.1: (Meta)data are released with a clear and accessible data usage license
Licensing data and metadata is an important aspect of the FAIR principles, whether you retain some or all rights or release your data as completely open data. In a FAIR context, a license is a standardised, machine-readable statement that tells end users exactly how they can use the data, and under which conditions.
You can apply many different licenses to data. One of the most common choices is the Creative Commons license suite, in which you explicitly state whether and how you are to be cited when your data are reused, along with the re-sharing options for derived works. Typically, you choose a license when depositing data in a repository. You do not have to apply the same license to all your data, and stating full copyright is a license in itself. The worst thing you can do is to apply no license at all, because your data are then most likely protected by copyright by default, even if they are publicly accessible.
If you use other people’s data, you should always identify the conditions of how and when you can use the data. This may affect the way you can work with their data.
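As a rough illustration of what a machine-readable license statement makes possible, the sketch below looks up the license recorded in a metadata record. The record layout and the function name reuse_conditions are assumptions made for this example; the identifiers are SPDX identifiers for three Creative Commons licenses, and the lookup is deliberately not exhaustive.

```python
# Non-exhaustive lookup from SPDX license identifiers to Creative Commons URLs.
LICENSE_URLS = {
    "CC0-1.0": "https://creativecommons.org/publicdomain/zero/1.0/",
    "CC-BY-4.0": "https://creativecommons.org/licenses/by/4.0/",
    "CC-BY-SA-4.0": "https://creativecommons.org/licenses/by-sa/4.0/",
}


def reuse_conditions(record):
    """Describe the reuse conditions stated in a metadata record (a plain dict)."""
    license_id = record.get("license")
    if license_id is None:
        # No license at all: the data are most likely protected by copyright.
        return "No license stated: treat the data as protected by copyright."
    url = LICENSE_URLS.get(license_id)
    if url is None:
        return f"License '{license_id}' is not in this lookup: check its terms manually."
    return f"Reuse is governed by {license_id} ({url})"


print(reuse_conditions({"title": "Survey A", "license": "CC-BY-4.0"}))
print(reuse_conditions({"title": "Survey B"}))
```

A record without a license field falls into the first branch, which mirrors the default-copyright situation described above.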
Reusable #1.2: (Meta)data are associated with detailed provenance
Much of the value of data lies in the ability of a machine or a human to judge their origin, and thereby often to evaluate whether the data are reusable in a new context. This includes knowing how the data were created, by whom, and with which type of equipment. Have the data been processed, or are they raw? If they were processed, what did the workflow look like? And so on. This is quite similar to the methods section of a paper or article, and you can refer to this type of documentation from your data set. However, keep in mind that such documentation might not be machine-readable.
Remember to include provenance information about who you are, and how you would like the data set to be cited or credited if it is used elsewhere.
The easiest way to get started is to think of yourself as a re-user of your own data. But before you do so, you must clear your head of everything you know about the data set. What details would you need in order to evaluate and trust a given data set? If this is hard to imagine, try finding other people's data sets and see whether you think they have enough provenance.
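One way to keep provenance readable for machines as well as humans is to record it as structured fields rather than free text. The following is only a sketch: the Provenance class, its field names, and the sample values are hypothetical and do not follow any particular provenance standard.

```python
from dataclasses import dataclass, field


@dataclass
class Provenance:
    """Hypothetical provenance record; field names are illustrative only."""
    created_by: str
    method: str
    equipment: str
    processing_steps: list = field(default_factory=list)
    preferred_citation: str = ""

    def is_raw(self):
        """Data are considered raw when no processing steps are recorded."""
        return not self.processing_steps

    def summary(self):
        """Return a one-line, human-readable description of the provenance."""
        if self.is_raw():
            state = "raw"
        else:
            state = "processed: " + " -> ".join(self.processing_steps)
        return f"{self.created_by}; {self.method}; {self.equipment}; {state}"


prov = Provenance(
    created_by="A. Researcher",
    method="field survey",
    equipment="handheld GPS",
    processing_steps=["remove duplicates", "convert to CSV"],
    preferred_citation="A. Researcher (year). Survey B [data set].",
)
print(prov.summary())
```

Because the processing steps are a list rather than a paragraph, a program can tell raw from processed data without having to interpret prose.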
Reusable #1.3: (Meta)data meet domain-relevant community standards
Working with data sets from a variety of sources is much simpler if everybody agrees on a standardised way of organising and describing the data. That is why many disciplines have created metadata standards for describing data, along with lists of recommended file formats. Keeping in line with these standards helps new data enter the data ecosystem in a form that is easy and suitable for others to reuse. You should therefore always be on the lookout for standards within your community and try to adhere to them. Of course, not everything can be standardised, and many research disciplines are breaking new ground where no standards exist yet. In that case you will have to turn to more generic standards, or begin inventing new ones.
Note that 'standard' in this sense is not a quality measurement indicating a high or low level of quality of the (meta)data. Quality should always be judged by the people who are re-using the data in their specific contexts.
Go to the webpage for A FAIRy tale for more information about the FAIR principles.
Based on 'A FAIRy tale' CC-BY-SA 4.0 ‘DK Fair på tværs’.