Mc-stan documentation license precludes easy reuse

I was hoping to make the Stan documentation available on https://devdocs.io/. This website is an open-source tool for developers to search popular code documentation through a single interface. It works by scraping open-source documentation available as HTML or other markup formats and transforming it into a uniform searchable database.

Unfortunately, the license for the Stan documentation is CC-BY-ND 4.0 (see docs/LICENSE at master · stan-dev/docs · GitHub). The “ND” part of that license (no derivatives) pretty strongly suggests to me that the mc-stan documentation is not compatible with ingestion into devdocs.io, but I’m no lawyer. Another oddity: the PDF version of the documentation is already available as CC-BY 4.0. So I suppose I could scrape the text in the PDF, but not the underlying Rmarkdown or HTML files?

What was the purpose of using the no-derivatives license for the Stan documentation? Could the license be changed to CC-BY 4.0 instead? Why does the PDF have a different license than the Rmarkdown that was used to create the HTML and PDF files? Can I scrape text from the HTML files as long as the text also exists in the PDF? Could I run the process of transforming the Markdown into a PDF myself and use the intermediate HTML/LaTeX or other files instead of the final PDF for ingestion into devdocs.io?

The @SGB needs to weigh in here.

@Bob_Carpenter - do you remember why the “ND”?

@winni2k - devdocs looks great, but the Stan docs are versioned. How would this work on devdocs?

It supports versioning, tho I don’t know how it’s deployed.

I’m not a lawyer either, but what part of the ND license is the issue? Just reading from the link below, it seems like changing the format does not create a derivative work?

https://creativecommons.org/licenses/by-nd/4.0/

Interesting point. I am just starting to get familiar with how devdocs ingests documentation, so I cannot yet speak to how much transformation actually occurs beyond reformatting. In the meantime, here is the relevant excerpt from the full license:

Section 1 – Definitions.

  a. Adapted Material means material subject to Copyright and Similar Rights that is derived from or based upon the Licensed Material and in which the Licensed Material is translated, altered, arranged, transformed, or otherwise modified in a manner requiring permission under the Copyright and Similar Rights held by the Licensor. For purposes of this Public License, where the Licensed Material is a musical work, performance, or sound recording, Adapted Material is always produced where the Licensed Material is synched in timed relation with a moving image.

What if the metadata of the doc HTML files is rewritten, or sections are merged/split for better presentation in devdocs? Is the material then being “arranged … in a manner requiring permission under the Copyright…”?

Now we need to talk to lawyers to figure it out. Sounds a bit like the license issue with Kallisto; see “I was wrong (part 2)” on Bits of DNA.

My experience has been that each time a new doc version is released, that version is scraped and added to the menu of available docs. It is up to the user to select which version(s) of the code they want to include in their personal library.

Modified to be numbered.

  1. The thinking around no-derivatives is that we didn’t want people rewriting our documentation and distributing it as Stan documentation. My understanding of “no derivatives” is that you can’t modify the text. I think just redistributing it in a different display format would be fine. At least it was our intent for it to be fine. That feels like it agrees with the CC FAQ, but I’m also not a lawyer.

  2. Only with the approval of the @SGB. I’ll try to ping them outside this thread as well to make that happen. I’d be OK changing it to CC-BY.

  3. Sloppiness on our part. We didn’t intend to release under different licenses. Both should be CC-BY ND at this point after we made the change to ND. But obviously they’re not, which you could exploit to release a derivative product of the pdf.

4, 5. I’m not sure what you’re asking, but there’s no reason to scrape HTML because they’re all available as markdown docs.

I’m not really understanding how the two are similar.

Skimming the actual license it seems like section 2.a.4 says reformatting is totally fine?

  4. Media and formats; technical modifications allowed. The Licensor authorizes You to exercise the Licensed Rights in all media and formats whether now known or hereafter created, and to make technical modifications necessary to do so. The Licensor waives and/or agrees not to assert any right or authority to forbid You from making technical modifications necessary to exercise the Licensed Rights, including technical modifications necessary to circumvent Effective Technological Measures. For purposes of this Public License, simply making modifications authorized by this Section 2(a)(4) never produces Adapted Material.

Right-o, I’ll talk to the folks at devdocs.io and see if they would like to take on the risk of a new license – this would be the first documentation in devdocs that uses an ND clause.

I’ve contacted the SGB and @spinkney said they want to talk to a lawyer. Pretty much everyone’s response as soon as licenses get involved. So we should be able to roll this over to CC BY for the HTML doc, but it’s not going to be immediate with lawyers involved on our side.

Ok. I’ll keep an eye out for the change.

Have the folks at devdocs gotten back to you? IMO, reading over the license, it seems totally fine for your use case (see the section quoted in my earlier comment, which I think directly addresses it).

I looked into changing our license from CC BY-ND to CC BY and discussed it with our governing body. The obstacle to doing this is that the GitHub repository lists the license as CC BY-ND and the project doesn’t own the copyright to its code (the individual devs do). That means we can’t just change the license to CC BY going forward. So that’s not going to happen.

We didn’t intend to prevent people from reformatting the output. We wanted to stop people providing incomplete or badly modified doc for Stan. But we’re not lawyers and if you want more assurance, I’m afraid you’re going to have to talk to your own lawyers about what rights the CC BY-ND license grants. Sorry about that.

Thanks for everyone’s input! I have been trying to run this by the devdocs team for several weeks now, but no response yet. I’ll keep trying…

Hi everyone, I came across this blog post from Creative Commons that I thought might be relevant to this discussion: Why Sharing Academic Publications Under “No Derivatives” Licenses is Misguided - Creative Commons

Creative Commons argues in this post that the ND clause should not be used for academic publications because it is poorly suited for upholding academic integrity, which includes academic rigor, and because it precludes the kind of adaptation of work that is integral to academic progress. I see many parallels in the issues pointed out in the post and the issues we have encountered in this discussion.

The authors of the blog post also point out that

  1. the ND clause is not open access
  2. the definition of derivative work may vary across legal jurisdictions
  3. translations to other languages are prohibited under the ND clause

Do we even know if it is legal for mc-stan to create and distribute PDFs from user-contributed markdown files under the ND clause? It sounds to me like a PDF created from markdown is already a derivative work.

This was brought to the attention of the SGB last summer. We consulted with NumFOCUS and decided to continue to use the license.

Thank you for your attention to this, Mitzi. I now buy the argument that converting Markdown to PDF can be seen as a format conversion under the CC-BY-ND license. I think even the kind of processing that devdocs performs probably counts as reformatting.
