Professor Martin Wittenberg, head of the social science data archive DataFirst. (Photo: University of Cape Town)

DataFirst, a research unit and data service based at the University of Cape Town, provides researchers with online access to survey and administrative microdata from South Africa and other African countries.

It has become the leading social science data archive on the African continent and the only one with the “data seal of approval”. It has placed South African and African data at the core of some of the most exciting research projects, ranging from public health to development and inequality.

DataFirst is led by Professor Martin Wittenberg, whose fascination with social survey data dates back to 1995 when data from the first nationally representative survey of South Africans became available. That survey had been conducted by the South African Labour and Development Research Unit, with assistance from the World Bank.

“It was an eye-opener that one could put numbers to the extent of poverty and unemployment in South Africa,” says Wittenberg. “Angus Deaton (who won the Nobel Prize in Economics in 2015 for his contribution to the measurement of economic outcomes) was a key advisor on that survey. He showed us that there were many interesting policy issues that could be investigated using the tools of modern econometrics.

“He used the South African data to investigate some of those — for example, the impact of the old age pension — and he gave us the code as to how to do that work. I spent a good part of 1995 and the next few years with my honours students, working through his notes and code.”

DataFirst helps researchers use the data through its online helpdesk and offers formal training courses in microdata analysis. It also trains African data managers in microdata curation, conducts research on the quality and usability of South African microdata, and works with African microdata producers to improve the quality of their data products.

DataFirst acts as an intermediary, making data available in a form that is relatively easy for academic researchers to use. It does not, however, “own” the data, so it has to ensure that it releases the information in a manner that is consistent with the wishes of the original data producer.

“I’m proud that DataFirst has made it much easier for young academics to obtain data in a form that is usable for research,” says Wittenberg. “The data that we distribute is at the core of many of the debates about transformation and development. Furthermore, we run numerous courses to give researchers the skills to be able to do high-quality work.”

There are several areas of research that DataFirst is pursuing. It is trying to find and preserve datasets from the apartheid era, because while data post-1994 exists, much of the data that was gathered before then has been lost.

It is also looking at new sources of data (in particular administrative data) to complement the traditional survey-based information. This is in addition to DataFirst supporting local projects and initiatives, such as the analysis of grant programmes and minimum wages, and assisting with the framing of the National Development Plan.

A major benefit of the data being publicly available and easily discoverable is that some of the biggest names in economic development can contribute to the analysis of South African problems.