Supporting Research Data Management and Data Intensive Research: DIRISA

The Data Intensive Research Initiative of South Africa (DIRISA) promotes, enables and coordinates a data intensive research ecosystem in support of national science and strategic priorities. It does so through several key objectives: providing a robust and advanced national data infrastructure and services, promoting sound data stewardship practices, and developing the underpinning expertise in data management and data intensive research. DIRISA operates as an overarching national data organisation and, as such, advocates research data sharing, coordinates publicly funded initiatives and advises on a strategic agenda for data intensive research.

DIRISA is expanding the national data infrastructure to accommodate increasing demands for research data storage. It also coordinates the national e-Science postgraduate teaching and training platform and a regional research data centre managed by a consortium of academic institutions. The data services developed by DIRISA include a research data repository providing 20 GB of storage for registered users, a long-term data storage facility, and data discovery and analytical services that support more rapid data-based research innovation across all academic disciplines.

To support researchers in managing their research data, DIRISA has released an operational version of a research data management planning tool called DMP Online. It is becoming standard practice for research funders to require a data management plan as part of a research proposal, because such a plan supports improved data-driven research across all disciplines. As a collaborative effort by DIRISA and Universities South Africa (USAf), DMP Online has been demonstrated to local universities and research institutions to promote its adoption nationally.

DIRISA is implementing a Digital Object Identifier (DOI) service that allows users to assign a DOI to their research data collections. The use of these DOIs greatly improves the management of digital objects, including data and valuable national assets.

DIRISA hosts a national research data workshop annually, where researchers share their experiences and give input on their research data needs. DIRISA also conducts the Student Datathon Challenge that gives students an opportunity to use research data to produce creative and innovative solutions that help to solve some of South Africa’s challenges. This activity encourages the use of open data to promote open science and the development of data science skills early on in the careers of students.

Key Strategic Initiatives: DIRISA

  1. Provide a robust and reliable national data infrastructure for data intensive research, with services such as a petascale (8 PB) research data repository, and data sharing and archiving services through the deployment of a 40 PB long-term storage facility. DIRISA itself is conducting research and development on software-defined data storage technologies specifically for local conditions.
  2. Support R&D that yields improved and scaled-up technologies that underpin manufacturing and logistics, and the Fourth Industrial Revolution. Localised services for research data management, based on market needs, are being developed.
  3. Federate existing research data repositories, such as Ilifu, into the DIRISA Tier 1 data node.
  4. Together with the NRF and NIPMO, develop standards and policy recommendations to regulate the open and ethical use and management of research data. For the DSI, develop strategies and frameworks for Open Data, Big Data and Open Science.
  5. Continue to run the National eScience Postgraduate Teaching and Training Platform (NEPTTP) programme and expand it to other universities; coordinate workshops and training events in the data sciences and data management to re-skill and upskill researchers.
  6. Pursue partnerships with the IT industry and academia for co-funded collaborative development of IT services.
  7. Provide a recommended national strategy for research big data.
  8. Develop Open Data policy recommendations supporting the national Open Science framework, and guidelines to preserve valuable and important research data collections hosted by institutions and projects such as SARIR and the NRS.