Developing adaptive designs for longitudinal studies

One of the biggest challenges in survey research is ensuring the participation of respondents. This is especially important in longitudinal studies, as cumulative non-response can significantly decrease response rates and can be selective, leading to non-response bias.

Researchers are exploring different solutions to this issue, such as optimizing the use of incentives or using multiple modes of interviewing (e.g., face-to-face, web, telephone). One avenue that has received attention recently is adaptive designs. This strategy involves dividing the data collection into different stages and targeting some cases with interventions (such as extra effort or larger incentives) to maximize response rates or to minimize selection bias.

This strategy has received considerable attention in recent years, as it could lead to a more efficient use of funds without increasing the burden on respondents. Nevertheless, it has received little attention in the context of longitudinal studies, even though the wealth of information available from prior waves could lead to particularly efficient designs.

In a recent paper published in the Journal of Survey Statistics and Methodology, we explore what would happen if such an adaptive design were used in two leading panel studies in the UK and Australia: Understanding Society and the Household, Income and Labour Dynamics in Australia (HILDA) Survey. More precisely, we divided the fieldwork into two stages: in the first stage approximately 90% of the cases were interviewed, while the remaining 10% took part in the second stage. We investigated what happens if effort in the latter stage is reduced by 25% (for example, due to budget constraints).

We explored four strategies for targeting the effort in this last stage of data collection (a sketch of how such selection rules might look in code follows the list):

  • follow up the individuals most likely to improve sample balance (Sim1 Rind). This should help minimize non-response bias.
  • follow up the individuals most likely to respond during follow-up (Sim2 Resp). This should help maximize response rates and the costs saved.
  • combine the first two measures by ranking cases separately on each and following up those best ranked on either measure (Sim3 Rank).
  • combine the first two measures by summing the two likelihoods for each case (Sim4 Sum).
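To make the selection rules concrete, here is a minimal Python sketch of how the four strategies could be implemented. Everything here is hypothetical scaffolding: the data frame, the column names (`p_respond` for the estimated probability of responding during follow-up, `balance_gain` for the estimated improvement in sample balance), and the 25% effort reduction applied as a 75% case cap are illustrative stand-ins, not the models used in the paper.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical follow-up pool with illustrative scores.
pool = pd.DataFrame({
    "case_id": np.arange(1000),
    "p_respond": rng.uniform(0.05, 0.95, 1000),    # chance of responding
    "balance_gain": rng.uniform(0.0, 1.0, 1000),   # gain in sample balance
}).set_index("case_id")

n_target = int(0.75 * len(pool))  # keep 75% of the follow-up effort

# Sim1 Rind: target the cases most likely to improve sample balance.
sim1 = pool.nlargest(n_target, "balance_gain").index

# Sim2 Resp: target the cases most likely to respond.
sim2 = pool.nlargest(n_target, "p_respond").index

# Sim3 Rank: rank cases separately on each measure and keep those
# with the best rank on either one.
best_rank = np.minimum(
    pool["balance_gain"].rank(ascending=False),
    pool["p_respond"].rank(ascending=False),
)
sim3 = best_rank.nsmallest(n_target).index

# Sim4 Sum: sum the two scores and keep the highest totals.
sim4 = (pool["balance_gain"] + pool["p_respond"]).nlargest(n_target).index
```

Taking the best rank on either measure (Sim3) lets cases that are strong on either criterion into the follow-up pool, which is why it behaves as a compromise between Sim1 and Sim2.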

We simulated these strategies over multiple waves of Understanding Society and HILDA to see the long-term effects of pursuing them. For cases that were not selected under each strategy we changed the outcomes to non-response (regardless of the true outcome), while for those that were targeted we kept their observed response outcomes.
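In code, this simulation step amounts to overwriting the recorded outcomes. A minimal sketch continuing from above, assuming a hypothetical boolean series `observed` of true wave outcomes indexed by case ID:

```python
import pandas as pd

def simulate_outcomes(observed: pd.Series, targeted: pd.Index) -> pd.Series:
    """Simulated outcomes under one targeting strategy: cases not
    issued to follow-up become non-respondents regardless of their
    true outcome; targeted cases keep their observed outcome."""
    simulated = pd.Series(False, index=observed.index)
    simulated.loc[targeted] = observed.loc[targeted]
    return simulated

# e.g. outcomes_sim1 = simulate_outcomes(observed, sim1)
```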

To understand how the different strategies affect data quality we looked at four indicators: R-indicator scores (a proxy for how selective the responding sample is; larger is better), average response rates, five-year response rates, and the proportion of calls saved.
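For readers unfamiliar with the first of these: the R-indicator is commonly computed from the spread of estimated response propensities, following Schouten, Cobben and Bethlehem's formulation R = 1 - 2·S(ρ̂). A short sketch (the paper's exact propensity model and any weighting may differ):

```python
import numpy as np

def r_indicator(propensities: np.ndarray) -> float:
    """R-indicator: 1 - 2 * SD of estimated response propensities.
    Identical propensities mean a perfectly representative response
    (R = 1); more variable propensities push R towards 0."""
    return 1.0 - 2.0 * float(np.std(propensities, ddof=1))
```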

Our results show that the strategy targeting under-represented individuals (Sim1 Rind) manages to do so to some degree, leading to an improvement in R-indicators even compared to the full sample. This strategy does lower response rates, but it saves around 12% (HILDA) and 22% (UKHLS) of contact attempts. The strategy that aims to maximize response rates (Sim2 Resp) has a worse R-indicator, but response rates are slightly better and it saves 18.5% (HILDA) and 24.4% (UKHLS) of contact attempts. The last two strategies represent a compromise between these two.

                    Current    Sim1     Sim2     Sim3     Sim4
                    practice   Rind     Resp     Rank     Sum
HILDA Survey main sample
  Ave R-ind         0.936      +0.036   -0.029   +0.005   -0.015
  Ave Resp          92.0%      -4.4%    -4.1%    -4.3%    -4.1%
  5-year Resp       86.4%      -12.6%   -11.8%   -12.6%   -12.0%
  Cost (% calls)               -12.4%   -18.5%   -14.6%   -15.7%
UKHLS sample
  Ave R-ind         0.756      +0.055   -0.039   +0.013   +0.003
  Ave Resp          59.5%      -4.0%    -2.9%    -3.4%    -3.3%
  5-year Resp       44.8%      -8.8%    -5.8%    -7.4%    -7.8%
  Cost (% calls)               -22.2%   -24.4%   -22.6%   -23.4%

Survey outcomes based on different adaptive design strategies in Understanding Society (UKHLS) and HILDA. Sim columns show the change relative to current practice.

Of the four simulated methods to reduce the follow-up field effort by 25%, the one that provides the best compromise between sample balance and response rates is Sim3 Rank, where the best-ranked cases for improving either response rates or sample balance were issued to follow-up. This strategy results in a sample balance similar to that achieved under current practice, but it comes at a cost of around 3–6 percentage points in the response rate each wave and a drop of around 8–13 percentage points in the balanced panel response rate after four waves. Admittedly, this drop in response is uncomfortable and potentially unpalatable, but it demonstrates the consequences of reduced funding and the associated fieldwork effort in a longitudinal household panel study.

This work highlights the complex trade-offs involved in targeting cases. While we cannot improve all the outcomes concurrently, adaptive designs do give us the tools to make these decisions explicit.


To find out more about this research, have a look at the full paper in the Journal of Survey Statistics and Methodology.

