Announcement: data.table translation projects

announcements
grant
funding opportunity
Published

October 17, 2023

In 2023-2025, National Science Foundation (NSF) has provided funds to support the project “Expanding the data.table ecosystem for efficient big data manipulation in R.” One of the goals of this project is to create translations from English to other languages, in order to make data.table more accessible.

Motivation: make data.table more accessible

data.table is widely used (1400+ other R packages depend on it), so it is an essential part of the R data analysis ecosystem. One of the three goals of the NSF project is to create new documentation materials, to encourage a wider diversity of users and contributors (see the other post for an overview of project goals). In this project, an important part of the documentation efforts will be creating translations of data.table documentation, in languages other than English. The goal of this translation project is to make data.table easier and more accessible, for people who do not natively speak English.

In fact, data.table already has its source code strings (errors, warnings, etc) translated into Chinese, which is a good start. The goal of this project will be expanding the translations to other written materials (vignettes, slides, etc), as well as other languages with a substantial R user base. Based on a prior analysis by Michael Chirico, priority languages will include Chinese, Portuguese, Spanish, French, Russian, Arabic, Hindi.

Grant-supported translations

The types of materials that we would like to translate are:

  • errors/warnings/messages defined in strings in R and C code. These are referred to as “messages” in the terminology of gettext. The potools package by Michael Chirico can help create the files in the required format. Some method of continuous updating/maintenance will be encouraged, such as github, or the R project weblate.
  • most important vignettes (intro, import, reshape)
  • other documentation (cheat sheets, slides, etc)

Over the course of the NSF project (Sep 2023 to Aug 2025), there will be 20 translation awards, each US$500.

How to apply

Applications should be submitted to toby.hocking@r-project.org, and should include the following information:

  • who is the project leader? Projects which include two or more people are encouraged, in order to proof-read translations, and ensure quality. The project leader should be a trusted member of the R community, who will be responsible for proof-reading and approving other project members’ translations.
  • how many people will be translating, and how many US$500 awards are you requesting?
  • which non-English language?
  • how many regional dialects are represented on your team? For example, Spanish dialects: Mexican, Chilean, etc.
  • for each person in your project, how many years using R and data.table?
  • what documents/materials do you propose to translate? (messages/vignettes/other)
  • what is your proposed timeline for completing the translation of these documents/materials?
  • what are your plans for ensuring that your translation is consistent with other R translations? (for example, base R messages, vignettes in other packages) In detail, there are several possible ways to translate any given text from English. For example, “computer” could be translated to Spanish as “ordinator” or “computador” or “computadora” but for consistency, only one of these should be used across all R documentation.
  • after the initial translation is complete, what is your plan for continued maintenance? In other words, when the corresponding English source files are updated on GitHub, what is your plan for being aware of those updates, and making corresponding updates of your translation, for a future release of data.table?
  • how will your translation project help create an expanded data.table ecosystem of users and contributors? You should make an argument that there are a large number of native speakers of your target language, using data such as in a prior analysis by Michael Chirico.

Applications will be reviewed on a monthly basis, using the following criteria:

  • What is the probability of success of this translation project, and its continued maintenace?
  • Will this translation project help create an expanded ecosystem of users and contributors?
No matching items