
The digital object identifier (DOI) system has become quite popular in the science publishing industry. The idea is that research outputs, such as articles and data sets, are assigned a DOI, which is then hyperlinked to an electronic copy of the resource. If for some reason the URI for the resource changes, the metadata for the resource need simply be updated in the DOI registry, and the DOI will continue to link to the resource's correct location. So my question is...

Of what value are DOIs in (open) science?

I understand the issue of resource stability and link rot. I too have found many URLs for departmental servers in the abstracts/introductions of ≤ 5-year-old research articles, and have experienced the frustration of pasting a URL into my web browser only to get an error saying the page doesn't exist (Joe Student graduated or Jane Faculty got hired elsewhere or some such). I think there is growing recognition that you can't just dump stuff on a server somewhere: research outputs need a stable online home.

So, assuming our research outputs do have a stable online home (article posted on arXiv or published through a publisher; code posted on GitHub [1] or a similar service; data posted to figshare), then what value do DOIs add to the equation? In what way are they better than URIs to the various resources?

[1] "Actually, code on GitHub isn't necessarily stable. The owner of a repository can delete the repository any time." -- Yeah, I just don't really see this as a big problem. I mean, maybe GitHub isn't the (final) answer. Maybe having Zenodo clone and store an immutable copy of your repo at the time of publication is the answer. But then the question remains: why would a DOI for that Zenodo repo be superior to a URI for the same Zenodo repo?

This post has been migrated from the Open Science private beta at StackExchange (A51.SE)
If you think that this thread should be migrated to Academia or another SE site because the OpenScience beta is closing, please edit the list of questions shortlisted for the migration [here](http://meta.openscience.stackexchange.com/questions/73/).


1 Answer


DOIs are just one type of persistent identifier. PURLs (Persistent URLs) are another, and we would hope that arXiv IDs are another too. What a DOI provides is simply a persistent identifier; its advantage is that it is widely adopted within scientific publishing.

You are mistaken to think that URLs to papers published through a publisher are stable. I can recall numerous occasions where the URL to a paper didn't work because the publisher had changed the structure of their website.

Why are DOIs better than URIs? I think it comes down to the fact that a DOI is an identifier entirely removed from the structure or layout of the website where the resource is located. URIs often aren't just identifiers of content on the web; they can also encode the filesystem layout, for example, or other structure of the website. The owner of the website might wish to change that structure to make use of a new technology or as part of a redesign. If they do, the link from the identifier to the resource breaks, and they have no consistent way to signal this short of maintaining redirects from the old URIs to the new ones forever. Separating the identifier from the layout of the web resource means the identifier doesn't change when the website or server hosting the resource changes.

Also, in this scenario the DOI wouldn't change, whereas after a website redesign the URI to the resource would be different, or you'd end up with multiple identifiers for the same work.
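
To make the indirection concrete, here is a minimal sketch of the idea in Python. It is a toy model only, not the real DOI infrastructure: the `Registry` class, the identifier `10.9999/example.1`, and the `repo.example.org` URLs are all made up for illustration.

```python
# Toy model of a persistent-identifier registry.
# Hypothetical throughout: this is not the DOI system's actual API.


class Registry:
    """Maps a persistent identifier to the resource's current URL."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, url):
        """Record where the identified resource currently lives."""
        self._locations[identifier] = url

    def update(self, identifier, new_url):
        """Point an existing identifier at a new location."""
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_url

    def resolve(self, identifier):
        """Return the current URL for the identifier."""
        return self._locations[identifier]


registry = Registry()

# The identifier that ends up printed in a paper's reference list.
citation = "10.9999/example.1"
registry.register(citation, "https://repo.example.org/files/2015/paper.pdf")
old_url = registry.resolve(citation)

# The host redesigns its site: the URL changes, the cited identifier does not.
registry.update(citation, "https://repo.example.org/records/42")
new_url = registry.resolve(citation)
```

Anyone who cited `old_url` directly now holds a dead link unless the host keeps a redirect alive; anyone who cited the identifier still reaches the resource after a single registry update.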


why would a DOI for that Zenodo repo be superior to a URI for the same Zenodo repo?

If Zenodo changed the layout of their site, the URI would change, and they'd need to maintain a redirect from the original URI to the new location indefinitely. The DOI would stay the same; only its entry in the registry would need updating.

Also, arXiv specifically has arXiv Article-IDs so that if they reorganise the web interface there is still a persistent identifier of each article (and revision). If there were no additional benefit over the URI, arXiv wouldn't have gone to the trouble of creating their own persistent identifiers. (One presumes cost is [or was] an issue in arXiv not providing DOIs.)

The same concept lies behind all these persistent identifiers (PURLs, arXiv IDs, etc.). A URI references a location, not the work itself, even if in referencing the location it also acts as an identifier. The key thing is persistence.
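
As a rough illustration of what you would actually write in a citation, the sketch below builds a resolver link from a bare identifier. The `https://doi.org/` and `https://arxiv.org/abs/` prefixes are the usual public resolver addresses; the function name `resolver_url` and the two example identifiers are placeholders I made up, not real records.

```python
from urllib.parse import quote

# Public resolver prefixes for two identifier schemes.
RESOLVERS = {
    "doi": "https://doi.org/",
    "arxiv": "https://arxiv.org/abs/",
}


def resolver_url(scheme, identifier):
    """Build a resolver link from an identifier (hypothetical helper)."""
    base = RESOLVERS[scheme.lower()]
    # Percent-encode unusual characters but keep the "/" that DOIs contain.
    return base + quote(identifier, safe="/")


# Placeholder identifiers, used only to show the shape of the result.
doi_link = resolver_url("doi", "10.1234/example-doi")
arxiv_link = resolver_url("arXiv", "1234.56789")
```

The point is that the part you cite is the identifier; the resolver, not the hosting site's current page layout, decides where it leads.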


Ask Open Science used to be called Open Science Q&A but we changed the name when we registered the domain ask-open-science.org. Everything else stays the same: We are still hosted by Bielefeld University.
