New governance, release with new features

governance
releases
Author

Toby Dylan Hocking

Published

January 30, 2024

I am proud to report that today, the first major new data.table features in several years have been released to CRAN!

This new release, version 1.15.0, is remarkable because it is the first new feature release using the new community governance, which was adopted last month.

Here is a brief timeline of the recent activities.

Governance discussion

An ongoing discussion

In Aug 2023, I started a discussion in issue#5676 about the process and goal of creating a formal governance document. 17 people commented on that issue, and the consensus was to publish a first draft community governance document in Nov 2023. Discussion in related issues included:

  • What are the guiding principles of the project? issue#5693
  • What is within scope for features? issue#5722
  • What roles/permissions should we define, and how can people obtain them? My proposal.
  • What conventions should we use for version numbers? issue#5715
  • What code of conduct should we adopt? issue#5708
  • What communication should we expect between the CRAN maintainer and the rest of the dev team? issue#5714

At first, it was not clear that commenters on that issue would agree to adopt a community governance structure, out of respect for the original creator, Matt Dowle, who had not yet expressed his approval of the process. However, that changed on 11 Sep 2023, when I posted Matt’s letters of collaboration that he signed in support of my NSF POSE project (I asked Matt over email, and he agreed that I could post them publicly). After that, there was much stronger support for the proposed process.

Governance draft and adoption

Governor Sea Lion

After much discussion in the issues linked above, I published an initial draft of the governance document on 27 Nov 2023 in PR#5772. That PR was extensively reviewed by four of the current/active contributors: Michael Chirico, Jan Gorecki, Tyson Barrett, and Ben Schwendinger. After several rounds of comments and revisions, Jan merged the PR on 14 Dec 2023, which signaled the official adoption of the new governance of the project. It can be viewed in the GOVERNANCE.md file in the git repo.

Importantly, the new governance defines five roles for people involved in the project:

Contributor: Any member of the public at large who participates in issue discussions, code reviews, or pull requests for data.table.

Project member: Anyone who has contributed a substantial accepted update - whether technical or documentation based - to data.table.

Reviewer: A project member who volunteers to help review other contributions.

Committer: Given merge permissions on main GitHub branch; responsible for reviewing and incorporating updates. Currently: myself, Jan, and Michael.

CRAN maintainer: Responsible for organizing new releases on GitHub and CRAN. Currently Tyson Barrett.

Interestingly, CRAN maintainer and Committer permissions are largely orthogonal, and in fact Tyson does not currently have the Committer role permissions.

Any of these roles can be obtained using the process described in the governance document. If you are at all interested, we could definitely use your help! Please get in contact by commenting on issues/PRs on GitHub.

Release 1.15.0 with new features

The new measure function

As outlined in the governance document, section CRAN updates, each release should be discussed and approved by consensus in an issue. The issue that we used for this 1.15.0 release is issue#5823. As can be seen in the NEWS.md file, this new release includes 20 NOTES, 55 BUG FIXES, 41 NEW FEATURES, and 1 BREAKING CHANGE. Among the new features, I am most excited about one that I implemented: the new measure() function, which makes it easier to do complex wide-to-long reshape operations, as below:

library(data.table)
(iris.dt <- data.table(iris)[1])
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <num>       <num>        <num>       <num>  <fctr>
1:          5.1         3.5          1.4         0.2  setosa
melt(iris.dt, measure.vars=measure(part, dim, sep="."))
   Species   part    dim value
    <fctr> <char> <char> <num>
1:  setosa  Sepal Length   5.1
2:  setosa  Sepal  Width   3.5
3:  setosa  Petal Length   1.4
4:  setosa  Petal  Width   0.2
melt(iris.dt, measure.vars=measure(value.name, dim, sep="."))
   Species    dim Sepal Petal
    <fctr> <char> <num> <num>
1:  setosa Length   5.1   1.4
2:  setosa  Width   3.5   0.2
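
To show what measure() saves, here is a minimal sketch of the first reshape above done by hand, for comparison only. It assumes the same one-row iris.dt; the names tall.measure and tall.manual are made up for this illustration, and the manual steps are just one way to get an equivalent result, not how measure() works internally.

```r
library(data.table)
iris.dt <- data.table(iris)[1]

# With measure(): column names are split on "." into the part and dim columns.
tall.measure <- melt(iris.dt, measure.vars=measure(part, dim, sep="."))

# Without measure(): melt first, keeping the old column names as character...
tall.manual <- melt(iris.dt, id.vars="Species", variable.factor=FALSE)
# ...then split those names on a literal "." into two new columns...
tall.manual[, c("part", "dim") := tstrsplit(variable, ".", fixed=TRUE)]
# ...and finally drop the combined name and reorder the columns.
tall.manual <- tall.manual[, .(Species, part, dim, value)]

# Both tables should contain the same part, dim, and value columns.
print(tall.measure)
print(tall.manual)
```

The manual version takes three steps and an intermediate variable column, whereas measure() states the desired output columns directly in the call to melt().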

Outlook for the future

Since adopting the community governance document, there has been a lot of new activity on GitHub, and I am looking forward to seeing even more in the months to come. For example, now that we have adopted a code of conduct, we are eligible to apply for NumFOCUS funding (see the discussion in issue#5676). Finally, if you use data.table and are interested in contributing to our next release, we could use your help, so please contact us in an issue/PR.
