Rethinking Data Collection: Collect the Collected!


By Esther Smits, KIT Impact, Evaluation & Learning

I give him a questioning look: “And? How did it go today?” He sighs and explains: “It was not easy; people are not eager to speak to us. They were already visited by a team from the Ministry of Agriculture last week and by an NGO the week before. Everybody is asking the same questions. They are tired of it and feel they get nothing in return.” After taking some time to reflect, I conclude that his answer perfectly sums up the run on data in the development business. Survey fatigue is common among respondents; new data teams come to collect data, but the analysis is usually done and the conclusions drawn far away from the communities they affect, and the results rarely make it back.

The situation described above happened about a year ago in Ghana, in pre-COVID times. The fatigue is understandable, given how frequently data is collected by individual projects (often at regular intervals in the lifetime of a project), organisations or governments. In an age where data collection is becoming cheaper and logistically easier, the demand for project evaluation and transparency is increasing from all sides. Primary field data collection is now standard practice for most projects.

Many people believe that it is better or easier to collect new data than to look back at pre-existing data. After all, we often hear that every project is unique and that collecting data tailored to a project’s goals is the only way to properly report on its outputs, outcomes and impact.

However, while every project has unique elements, many agricultural projects also share similarities. And, consequently, so do their project evaluations. There are, for example, few datasets that do not include information on household wealth, food security, gender roles or agricultural activity. These similarities create opportunities for learning.

Now is the time to make the most of your pre-existing data

As with most organisations, COVID-19 has dramatically influenced the day-to-day nature of our work. Heavy travel restrictions mean that for the time being, we are no longer in the field, collecting and analysing data and learning first-hand about people’s experiences. At the same time, the pandemic is creating new demands for data, and the need for evidence-based decision-making is more urgent than ever. We see this as an opportunity to reflect on our usual methods and to look for alternative ways of doing our work.

At KIT, our Impact, Evaluation and Learning team has always focused on understanding how projects perform and identifying the factors that contribute to that performance, and that hasn’t changed. To some extent, we can continue our work by organising remote data collection, facilitated by our strong network of local partners. But we are also making use of what we already have: a gold mine of pre-existing data.

We live in an age where data is increasingly available. It is hard to find an organisation that does not have dozens of datasets stored on its computers. Perhaps the data hasn’t been properly cleaned yet? Maybe it was only partially analysed, focusing on a limited number of variables? Or possibly collecting new evaluation data was simply considered easier?

Whatever the case, it was doubtless not used to its full potential. With movement and human interaction restricted and new data collection next to impossible, now is the time to dive into forgotten, neglected and underutilised datasets. Using existing data overcomes logistical barriers, and it also reduces costs, time, travel and the survey fatigue experienced by many communities.

At KIT, we know the value of existing datasets first-hand. Over the years, we have collected over 50,000 unique observations from agricultural households across 26 countries. When clients started coming to us with additional questions about their projects, we often realised that we already had the data to provide them with the information they needed. We learned that even though the surveys might not have been designed for it, combining disparate results could still provide a wealth of information. Datasets can also be compared and cross-analysed, and their data and conclusions triangulated and validated. This way, we continue learning.

Four things you can do with your existing data

Problem analysis helps you to dive into the issues that impact the performance of your activities. Is food security a problem? Why? Who is food insecure? What are the gender dynamics impacting project participation? What is the wealth situation of your target group? How diverse are income-generating activities of participants? Understanding such dynamics will enable you to better tailor your future activities to your project goals.

Segmentation analysis of your targeted population helps in customising your approach to specific needs. Which households face which issues? What are the specific characteristics of your targeted households? Segmentation is also useful to improve programme design and tailor activities to the needs of specific groups. Are you planning COVID-19-specific activities? Segmentation can also help you tailor your response.

Validating and testing underlying assumptions of the theory of change or intervention logic based on project data can support the adaptation of your intervention strategy during implementation. A good example is the question: “Does higher income translate into food security?” Many projects have this ambition as a project goal, but are activities translating into meaningful gains? Diving into your data can help you answer such questions.
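As a rough illustration of what such a check could look like, here is a minimal Python sketch. It assumes a household dataset with an income figure and a yes/no food security indicator; the records, the field names (`income`, `food_secure`) and the function name are all made up for this example and do not come from any KIT dataset. It splits households at the median income and compares the share of food-secure households in each half.

```python
from statistics import median

# Hypothetical household records; field names and values are illustrative only.
households = [
    {"id": 1, "income": 100, "food_secure": False},
    {"id": 2, "income": 150, "food_secure": False},
    {"id": 3, "income": 200, "food_secure": True},
    {"id": 4, "income": 250, "food_secure": False},
    {"id": 5, "income": 300, "food_secure": True},
    {"id": 6, "income": 350, "food_secure": False},
    {"id": 7, "income": 400, "food_secure": True},
    {"id": 8, "income": 450, "food_secure": True},
]


def food_security_by_income_group(records):
    """Split households at the median income and return the share of
    food-secure households in the lower and upper income group."""
    cutoff = median(r["income"] for r in records)
    groups = {"lower": [], "upper": []}
    for r in records:
        key = "upper" if r["income"] > cutoff else "lower"
        groups[key].append(r["food_secure"])
    # True counts as 1 and False as 0, so sum/len gives the share.
    return {name: sum(flags) / len(flags) for name, flags in groups.items() if flags}


shares = food_security_by_income_group(households)
print(shares)  # {'lower': 0.25, 'upper': 0.75}
```

A gap between the two groups points to an association, not to proof that income causes food security. A real analysis would need a properly defined food security measure, a much larger sample and attention to other factors such as household size or location.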

Data sharing prevents similar data from being collected multiple times. Now, more than ever, it is important to make good use of scarce resources, and the demand for transparency continues to grow. The information needed to answer many of the questions you have can likely be extracted from other existing datasets on the topic. Holding discussions with like-minded organisations and answering research questions together contribute enormously to learning and collaboration and ensure the optimal use of data.

Are we saying that we should never again collect data? No!

New data provides us with important up-to-date information. But, we argue, if we have questions we still want to answer, we should look carefully at all of the existing information we have and see if the answers lie within. Especially now, when an evidence-based COVID-19 response is needed and time is of the essence, there is a need to focus on the gaps and not try to reinvent the wheel. Not only does this save precious resources and overcome (temporary) practical challenges; it also respects the many, many survey respondents who have given us their time and insight over the years.

Shall we do some digging?

About the author: Esther Smits is an interdisciplinary social scientist with a focus on environmental and agricultural development. She holds an MSc in Development Economics from Wageningen University & Research and specialises in impact evaluation research. She has experience with quantitative as well as qualitative impact evaluation methods, which she likes to combine in mixed-methods designs.