Non-synonymous terms
We’ll be focusing on publishing and archiving
For the remaining slides we are going to assume that we are at the point of submitting our manuscript.
increased visibility / citation
funding agency (see the NWO Open Science Program)
journal requirements (see, e.g., PLOS Publishing policies, Nature Publishing policies)
community expects it
Figure 1. Distribution of reporting errors per paper for papers from which data were shared and from which no data were shared.
more efficient, less redundant science - by allowing others to build upon our work
Five selfish reasons to work reproducibly by Florian Markowetz
For research to be reproducible, the research products (data, code) need to be publicly available in a form that people can find and understand. Ideally, both data and code are FAIR.
Catalog the artifacts you produced during this workshop (3 minutes)
share? yes!
share? maybe?
share? no!
Advice: One way to determine what you need to publish is to go through and redo the analyses in your paper. Make note of the data, code, and notes you needed to do that analysis. Make sure all of that is available. This might seem time-consuming, but it ensures that what you think you did is what you actually did.
You can make your code and data public at any point of the research process.
However, at the point of paper submission, the results in your paper should be reproducible, so the data and code used in the paper should be published.
Discuss: Publishing research products (code, data) separately is different from including them as journal supplementary materials. Why?
You will likely have different artifacts:
Possible workflow:
Do’s
Don’ts
Using standard data formats is sometimes required, but even when it’s not, conforming to standards greatly increases opportunities for re-use and understanding.
README
that describes the data or software package
CITATION.cff
file which can also be used in R
Documenting your research:
collect all of the to-be-archived artifacts from the preceding lessons into a directory
write a README file that describes the contents of the directory
put a license or waiver on it
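The three steps above can be sketched in Python. This is a minimal illustration, not a prescribed workflow: the function name `build_archive`, its parameters, and the file names `README` and `LICENSE` are assumptions made for this example, and the license text is whatever you pass in.

```python
from pathlib import Path
import shutil


def build_archive(artifacts, archive_dir, description, license_text):
    """Copy artifacts into archive_dir, then add README and LICENSE files.

    All names here are hypothetical; adapt them to your own project.
    """
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)

    # Step 1: collect the to-be-archived artifacts into one directory.
    copied = []
    for src in map(Path, artifacts):
        shutil.copy2(src, archive / src.name)
        copied.append(src.name)

    # Step 2: write a README that describes the contents of the directory.
    lines = [description, "", "Contents:"]
    lines += [f"- {name}" for name in sorted(copied)]
    (archive / "README").write_text("\n".join(lines) + "\n", encoding="utf-8")

    # Step 3: put a license or waiver on it.
    (archive / "LICENSE").write_text(license_text + "\n", encoding="utf-8")
    return archive
```

For example, `build_archive(["data.csv", "analysis.py"], "archive", "Data and code for our paper", "CC0-1.0")` would create an `archive` directory holding both files, a README listing them, and a LICENSE file. A generated README is only a starting point; a useful one also explains how the files relate to the results in the paper.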
Copyright applies to creative works
Typically not copyrightable:
Note: This is not an exhaustive list. Ask your data steward or library if you need help!
CC0 enables scientists, educators, artists and other creators and owners of copyright- or database-protected content to waive those interests in their works and thereby place them as completely as possible in the public domain, so that others may freely build upon, enhance and reuse the works for any purposes without restriction under copyright or database law.
From the Panton Principles:
[…] in the scholarly research community the act of citation is a commonly held community norm when reusing another community member’s work.
Community norms can be a much more effective way of encouraging positive behaviour, such as citation, than applying licenses. A well functioning community supports its members in their application of norms, whereas licences can only be enforced through court action and thus invite people to ignore them when they are confident that this is unlikely.
Discussion: What are some of the challenges of publishing research products? What are some of the concerns that people have?
More and more specialized staff and services are available at university libraries. They provide great resources for data and software management, as well as information about and access to repositories. They are particularly good at thinking about data archives and increasingly provide support with code as well.