Are Creative Commons licenses appropriate for data?

commented Apr 8, 2017 by Daniel Mietchen (2.8k points)

3 Answers

Thomas · Answer 1 · 2015-08-06T12:42:43+0000

Well, this gets complicated and legal. (Caveat: I am not a lawyer.) According to Creative Commons, their licenses:

"give everyone from individual creators to large companies and institutions a simple, standardized way to grant copyright permissions to their creative work."

In short, CC licenses apply to creative works and are meant to relax or waive the copyright protections automatically guaranteed to authors (e.g., by common law tradition in Commonwealth countries).

The applicability of CC licenses to data depends on whether data can be copyrighted. If data cannot be copyrighted, then there is no point to putting a CC license on them because those licenses waive rights that the data creators do not have.

So what kinds of works are protected by copyright? Though laws vary across jurisdictions (and thus make this question difficult to answer), two important principles are the "Idea-Expression divide" and "the threshold of originality". In the former, only expressions of ideas can be copyrighted, while ideas themselves cannot be. In the latter, among expressions, only those that are original are protected (thus reproductions of works do not earn copyright protection de novo).

Thus data have copyright protection only if they are an expression of an idea rather than the idea itself, and only if they are not simply "facts" (i.e., they are sufficiently original).

In the United States, this almost universally means that data cannot be copyrighted. A classic legal case here is Feist Publications, Inc., v. Rural Telephone Service Co. (1991), in which the U.S. Supreme Court ruled that telephone number listings in a phonebook are not protected by copyright. Importantly, nothing produced by the federal government has copyright protection (all federal government works are in the public domain, but this does not necessarily apply to other levels of government).

In Europe, however, databases do have copyright-like protection (the sui generis database right). Not all databases are protected; protection requires "qualitatively and/or quantitatively a substantial investment in either the obtaining, verification or presentation of the contents". Such rights last for 15 years.

Thus, one has to determine whether the "data" being discussed merit copyright on their own. It may be that "data" refer to works that are themselves copyrighted (e.g., original written works, such as newspaper articles). Those "data" are protected but not because they are data, rather because they are creative works. Someone who has compiled those works into a database in the United States has no copyright protection for the works (unless they have obtained those rights for each "data point" from the original author(s)). In Europe, however, the compilation of those data into a database may entitle the compiler to a limited database right.

In conclusion, CC licenses make sense if one has copyright protections to give away. If not, then CC licenses make no sense because the data are probably free to use anyway. If the data do merit protection (by satisfying a European-style threshold of investment, an American-style threshold of originality, or some other national standard), then I believe the argument made in the linked webpage rests purely on the opinion that CC0 (or the public domain, where that principle exists) is preferable to more restricted waivers of rights.

This post has been migrated from the Open Science private beta at StackExchange (A51.SE)

Rex Kerr · Answer 2 · 2015-08-09T18:01:48+0000

Figshare has a FAQ that covers this, sort of. Whether the Panton Principles were formulated using the same reasoning I do not know, but I have heard the Figshare reasoning on several occasions. Dryad has a stronger and lengthier explanation (click on "Why does Dryad use Creative Commons Zero" in their FAQs).

The fear is that since some data cannot be copyrighted, if you use something other than CC0, people may not realize that some data can be used without attribution (as it is not subject to copyright). This reduces use of the data, which is contrary to the goals of openness.

The claim has also been made that it might make various sorts of data aggregation difficult. Dryad, for instance, imagines that you can competently aggregate data from 50,000 sources and asks you to envision lawsuits from not attributing the sources.

Both of these strike me as tantamount to saying, "We are bad programmers, and careless scientists."

Scientifically, you want to know where your data come from so you can fix them if the upstream source fixes them. You don't want messy aggregate data sets with no record of where the pieces came from. Yes, you have to be a good enough programmer to keep the tiny bit of metadata about where each piece came from (which can also be used for attribution) associated with the data itself. For instance, if you can grab data from 50,000 sources and can't even manage to say who they are from, what confidence should we have in the quality of your work, analysis, conclusions, etc.? Having licenses enforce this kind of basic good practice seems like an advantage, not a detriment, to me.
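To make the "tiny bit of metadata" point concrete, here is a minimal sketch in Python of carrying provenance alongside each data point during aggregation. Everything in it is hypothetical and for illustration only: the `Source` and `Record` types, the `aggregate` and `attributions` helpers, and the example.org URLs are my own assumptions, not anything Figshare or Dryad provides.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Source:
    """Hypothetical provenance record for one upstream dataset."""
    name: str
    url: str
    license: str


@dataclass(frozen=True)
class Record:
    """One data point together with the source it came from."""
    value: float
    source: Source


def aggregate(datasets: Iterable[tuple[Source, Iterable[float]]]) -> list[Record]:
    """Merge datasets, tagging every value with its source."""
    return [Record(v, src) for src, values in datasets for v in values]


def attributions(records: Iterable[Record]) -> list[str]:
    """Return one attribution line per distinct source, in first-seen order."""
    seen: dict[Source, None] = {}  # dict keeps insertion order
    for r in records:
        seen.setdefault(r.source, None)
    return [f"{s.name} ({s.license}), {s.url}" for s in seen]


# Illustrative inputs only; names, URLs, and values are made up.
lab_a = Source("Lab A survey", "https://example.org/a", "CC-BY-4.0")
lab_b = Source("Lab B survey", "https://example.org/b", "CC0-1.0")
merged = aggregate([(lab_a, [1.0, 2.0]), (lab_b, [3.0])])
print("\n".join(attributions(merged)))
```

Because each record keeps a reference to its source, the attribution list falls out of the aggregated data for free, and the same field tells you which upstream set to re-fetch when a source issues a correction.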

That doesn't mean that every license is appropriate for data. Viral non-commercial licenses really limit how data can be used: once they get mixed into a data set, companies basically have to stop touching the data. Contrary to the ideal of sharing, this actually poisons sharing by legally enforcing a forbidden class of users. But the idea that CC0 is the only thing that's appropriate for data is not well-founded, even if it is a common view. CC-BY is really not problematic; the requirements are minimal. (In fact, Figshare sensibly allows CC-BY for figures and so on; last I checked, Dryad insists on CC0 for everything. And both say you should provide attribution anyway as good custom.)

This post has been migrated from the Open Science private beta at StackExchange (A51.SE)
