Open Digital Mathematics Library

1 Background

This site is being created as a response to the forthcoming article and recent report [1] from Jim Pitman and collaborators about an Open Digital Mathematics Library (ODML). The site is written with skelml, a simple tool that has recently been developed to use GitHub together with LaTeXML. This model provides a neutral, supported, and standards-compliant place to collaborate. To join, please visit and fork https://github.com/holtzermann17/skelodml and send us pull requests. You can also edit the wiki or submit issues directly.

[1] National Research Council, Developing a 21st Century Global Library for Mathematics Research, Washington, D.C.: The National Academies Press, 2014. (Henceforth, the Report.)

2 ODML stakeholders

Use cases for an Open Digital Mathematics Library for different stakeholder groups are as follows [2]:

[2] Adapted from Corneli and Mikroyannidis, “Crowdsourcing Education: A Role-Based Analysis”, published in Collaborative Learning 2.0: Open Educational Resources, Alexandra Okada, Teresa Connolly, and Peter Scott (eds.), IGI Global, 2012.

  I. (Student) The ODML is helpful for me because it connects me with high-quality, low-cost learning resources that are personalized to my current skill set and learning goals.

  II. (Teacher) The ODML helps me find and review high-quality, low-cost learning resources, connecting with other classrooms around the world.

  III. (Researcher) The ODML helps me find relevant readings and rapidly build knowledge in new research areas, integrates useful computational tools, and helps me share my research and discuss the publications of others. [3]

      [3] Researchers are the primary target audience considered in the Report. “The committee envisions its target digital library users to be working research mathematicians and advanced graduate students beginning their research careers throughout the world (hence the word global). The library discussed does not specifically target students below the advanced graduate student level or researchers outside of mathematics, although both sets would likely constitute some of the library’s user base. Having a clear understanding of the target user base directly impacts the types of content the library targets and the types of services it provides.” (p. 9)

  IV. (Administrator) The ODML is helpful for me because it means that we don’t have to develop everything in-house: we can contribute to a high-quality shared open source resource that is customizable to the needs of our organization.

  V. (Advocate) The ODML is helpful for me because the organization lets me advance the projects of the particular constituencies I serve transparently, getting feedback and help from other parties working in closely related areas.

  VI. (Regulator) The ODML is helpful for me because it serves the whole field of mathematics and mathematical science, and helps show clearly how various areas and programmes are doing.

3 ODML focal areas

We are interested in research and development in the following areas:

  A. an ontology for mathematics (or, for a start, a mathematical wordnet)

  B. a semantically meaningful encoding of mathematical definitions, theorems, formulas, lemmas, etc.

  C. semantic mathematical search

  D.

  E. tools for assisting mathematical proof (and didactics) based on semantically encoded mathematics

  F. manual and computerized (machine-learning based) tagging of mathematical papers

  G. annotation tools for PDF files and scanned papers

  H. support for open semantic docs and authoring workflows: (X)HTML, Kohlhase’s sTeX, etc.

  I. community/authority management like MathOverflow, for management and production of collections

  J. structured data management tools: integration with Zotero, BibServer, the Semantic Web, and other tools

  K. OCR for mathematics

  L. user interface enhancements that make it easier to work with mathematical content

4 ODML organization

We propose to offer:

  1. a light-weight integration and organizational layer for coordinating work on the topics listed above

  2. an index of mailing lists relevant to the effort

  3. an overview of relevant technologies

  4. an overview of relevant sources of content

5 Analysis of the offering

All of the stakeholders are served by useful content (4), and a significant body of content can be generated using OCR on existing public domain material (K). [4] Of course, born-digital content is also very useful, and even better if it is written in a way that allows for increasing exposure of underlying semantics (H).

[4] The Report recommends against funding comprehensive OCR efforts, while at the same time suggesting that some OCR-related efforts should continue. One way out of the dilemma is to notice that significant progress could be made with relatively little funding; see, for example, an unfunded grant proposal for $21,400, pitched to the Wikimedia Foundation.

Students (I): Personalization is supported by semantic encodings and tools for navigating, assembling, and interacting with the content in useful ways (A, B, C, E).

Teachers (II): As for students (A, B, C, E); teachers are also interested in tools for managing classroom data (I, J).

Researchers (III): There are several use cases.

  • Finding relevant readings and rapidly building knowledge may be similar to the student needs described above (A, B, C, E) but the greater sophistication of researchers will also allow them to make good use of less-structured tools (D, F).

  • Making use of integrated computational tools requires transparent encodings, which will often be well-served by open/transparent authoring tools (H).

  • Sharing and discussing research is well-served by annotation tools and community workflow management systems (G, I); data management tools and cutting-edge UI features will also be useful (J, L).

Administrators (IV): Particularly well-served by integrated open tools and content that they can “just use” (1, 3, 4).

Advocates (V): Advocates (representing companies, professional societies, university systems, etc.) are well served by places to discuss priorities and a light-weight infrastructure that can collect the key ideas and turn them into actionable projects (1, 2, 3).

Regulators (VI): In addition to wanting to participate in the discussions, like other advocates, regulators are particularly well served by robust and meaningful collections of data (J).
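
The analysis above amounts to a small lookup table from stakeholder groups to the items that serve them. As a minimal sketch (not part of the ODML repository), it could be written down in Python so that coverage can be checked mechanically. The names `SERVED_BY`, `SERVES_ALL`, `stakeholders_for`, and `uncited_items` are hypothetical, and the table reflects one reading of the text: items 4, K, and H are treated as serving every group, and Regulators are listed only with J.

```python
# Hypothetical encoding of the stakeholder analysis in this section.
# Keys are stakeholder groups (Section 2); values are focal areas A-L
# (Section 3) and offerings 1-4 (Section 4) cited for that group.
SERVED_BY = {
    "I Student": ["A", "B", "C", "E"],
    "II Teacher": ["A", "B", "C", "E", "I", "J"],
    "III Researcher": ["A", "B", "C", "E", "D", "F", "H", "G", "I", "J", "L"],
    "IV Administrator": ["1", "3", "4"],
    "V Advocate": ["1", "2", "3"],
    "VI Regulator": ["J"],
}

# Assumption: the opening paragraph of the analysis is read as saying
# that these items serve all stakeholder groups.
SERVES_ALL = ["4", "K", "H"]

ALL_ITEMS = list("ABCDEFGHIJKL") + ["1", "2", "3", "4"]


def stakeholders_for(item):
    """Return the stakeholder groups that the analysis cites for an item."""
    if item in SERVES_ALL:
        return list(SERVED_BY)
    return [group for group, items in SERVED_BY.items() if item in items]


def uncited_items(all_items):
    """Return items that the analysis does not cite for any group."""
    cited = set(SERVES_ALL)
    for items in SERVED_BY.values():
        cited.update(items)
    return [item for item in all_items if item not in cited]


if __name__ == "__main__":
    print(stakeholders_for("J"))
    print(uncited_items(ALL_ITEMS))
```

A table like this would make it easy to notice when a new focal area is added without being tied to any stakeholder group.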

6 Plans

We need to describe the state of the art for items A–L in some detail, and check whether this list of focal areas is complete. Ideally we would get feedback from people in the various stakeholder groups described above. Building on this, we can look for a reasonable division of work. Often, this will just mean integrating projects that are underway. Work on integration or enhancing some of the focal areas may fit together into a grant proposal. For example, items G, H and K might fit naturally together, if we want to have annotated versions of legacy documents that correspond on a line-by-line basis to open docs on the web.

7 Examples

As a simple illustration of the sort of thing that can be usefully stored in this repository, we’ve added a document with instructions for installing BibServer, one of the relevant technologies. Eventually, this repository will contain many similar how-tos and useful pointers to tools and content.

As a more detailed narrative example, we will add a response to the above priorities from the point of view of PlanetMath.org. PlanetMath’s historical strengths are in community management software that can be used to discuss mathematical content (I). It has a modern software system, Planetary, which is based on Drupal 7 and LaTeXML. Some natural next steps for the project would be to integrate PlanetMath’s software with BibServer (J) and improve its authoring tools and its UI so that it can be used to author and curate nice-looking textbooks and monographs (H, L). Basic improvements would integrate Frédéric Wang’s work on MathML display, and integrate Git for authoring [5]. Git integration has already been demoed on MathHub.info, but the more complex authority model on PlanetMath poses additional challenges. Further improvements would make Planetary’s interface more comfortable for reading long documents, that is, more similar to the experience of reading PDF articles, but with the benefit that documents can be assembled, adjusted, and in other ways interacted with on the fly [6]. Finally, working at the (inter-)library level, we would build on the ElasticSearch engine used by BibServer, Drupal’s Universally Unique Identifier module, and Planetary’s existing Semantic Web integration so that individual instances of Planetary can communicate and interoperate effectively, supporting “interlibrary loans” and global searches (C). This would be a useful early contribution to the broader ODML effort: the new system could be straightforwardly adapted to support various interconnected libraries suited to different needs, making it easy to build real working systems demoing further features as those become available.

[5] Here we are inspired by Peter Ralph’s skelml and the recent successful development of the open Homotopy Type Theory book by the Univalent Foundations Program at the Institute for Advanced Study.

[6] Recent work in development at Authorea.com is similar in spirit.

8 Related work

The Digital Public Library of America (DPLA) provides mailing lists and tools, connecting users to a range of library resources.

(More to follow!)