Resilience: reference measures based on longer-term consequences are needed to unlock the potential of precision livestock farming technologies for quantifying this trait



Introduction
Climate change, with its increasing frequency of environmental disturbances (Hansen et al., 2012), and the growing societal demands for sustainable food production, put pressure on the livestock sector. To deal with these pressures, more complex traits such as resilience must be considered in our management strategies and in our breeding programs. Resilient animals respond well to environmental disturbances, and have a decreased probability of needing assistance to overcome the environmental challenge. They therefore have better welfare and health.
Resilience is increasingly seen as an important trait that has a key role in sustainable livestock systems (ATF, 2018). It is a trait that is important to farmers and other livestock stakeholders, who value animals that take care of themselves, that bounce back quickly upon challenges, and stay in the herd for a long time (Ollion et al., 2018). The opportunities to breed for it and the economic benefits of doing so have been reported (Berghof et al., 2019). There is considerable interest within the livestock industry in implementing genetic selection and genomic management strategies that favour resilience (Knap and Doeschl-Wilson, 2020), which requires a clearer understanding of what resilience means and quantification of this trait. Accordingly, this paper consists of two main parts: First, the concept and possibilities of measuring resilience are discussed in a more general way. Secondly, the development of operational measures and their validation are demonstrated using a dairy cow example.
The first part, concerning the concept and possibilities of measuring resilience, tackles three key issues that are: (1) what is resilience, and how it relates to notions of e.g. robustness or disease resistance presented in the past, and why its direct quantification is difficult; (2) what are its long-term consequences, and can these be used to assess resilience of animals on farm, and at population level, and (3) how precision livestock farming (PLF) technologies, and the associated time-series measures, provide opportunities for quantification of this trait. It should be noted that in this paper we consider resilience only at the level of the animal, and not at the farm systems level, which is associated with other driving factors (e.g. Muñoz-Ulecia et al., 2020).

What is resilience?
That resilience is an important concept in animal science is demonstrated by the plethora of papers published in the past decades (see Berghof et al., 2019; Colditz and Hine, 2016). When evaluating the past literature, it becomes clear that defining exactly what resilience is remains hard, and that the definition depends on the context in which it is evaluated. However, the general concept is rather well agreed upon, and comes down to "resilience is the capacity of an animal to respond to environmental disturbances", which explicitly focuses on the reaction of the animal to variations in its environment (Scheffer et al., 2018). It implies abilities to absorb an environmental challenge through buffering mechanisms (metabolic and behavioural), and to temporarily modulate the allocation of available resources to life functions, down-prioritizing those that are non-vital, and up-prioritizing those that are needed to meet the challenge. As such, resilience is conceptually similar to the notion of homeostasis in physiology and, from an evolutionary point-of-view, results from the fitness benefit of being able to overcome environmental challenges (relative to the costs of having resilience mechanisms). As will become clear in what follows, it is useful to keep in mind that whilst resilience can provide the animal with benefits, in some environments these may not outweigh the (energetic and entropy) cost of having resilience mechanisms, and that therefore different types of environments may favour or penalise selection for resilience (Strandberg, 2009). One recent definition of resilience found in the literature is "the ability of an animal to be minimally affected in its functioning by an external disturbance, or to quickly return to the state that it had before the challenge" (Colditz and Hine, 2016, among others). In this definition, the 'state' does not refer to the productive state only, but can also be the physiological, behavioural, cognitive or health state.
When evaluating the definition of Colditz and Hine (2016), it becomes apparent that there is a potential confusion between different terminologies relating to complex traits describing how animals cope with certain stressors. More specifically, the confusion exists between robustness and resilience in the animal science literature. This confusion is not merely academic, and we believe that it is important to clearly distinguish the two as they represent different animal capacities that are valuable in the different types of environments that are displayed in Figure 1. Schematically we can consider environments to be 'on average' good or poor, and we can consider that they can be stable or variable (Figure 1).

Figure 1:
Schematic representation of environment quality (y-axis in the insert graphs) through time (x-axis in the insert graphs). The solid red lines represent the environmental quality, and the dotted line shows the same reference level in all four insert graphs. The four combinations of environments that are on average good or poor, and stable or variable, are shown. The need for animal robustness and/or resilience is indicated for each case.
As stated above, resilience is the ability of an animal to 'bounce back' from a disturbance, which implicitly is of relatively short duration. This implies that resilient animals are needed when the environment is variable. Robustness, however, is the ability to adjust the general level of performance to fit the average environmental quality, and in particular to fit to constraining environments. This is typically achieved by finding an allocation of resources to different life functions that is sustainable in the long term (Roff et al., 2002). As such, robustness is conceptually similar to the notion of homeorhesis in physiology (Bauman and Currie, 1980). Accordingly, most definitions of robustness emphasize the capability to cope with environments that are unfavourable for a long time. A good and stable environment requires very little resilience or robustness, whereas a poor and variable environment requires both resilience and robustness. For resilience, elasticity in performance on key parameters is therefore more important than their levels, whereas for robustness the opposite is the case. The popular analogy for resilience is the reed, a plant that bends easily when there is a gust of wind, springing back up once the gust has passed. The popular analogy for robustness is the oak tree, which does not bend but resists the wind (this analogy by La Fontaine (1668) is captured in the painting "The oak and the reed" by Achille Michallon, 1816). The distinctions between resilience and robustness, and the ways in which they are linked, are discussed in more detail in Friggens et al. (2017) and Strandberg (2009). It seems likely that for animals to thrive in different types of environments they will require different mixes of robustness and resilience.
It has been shown that there is significant genetic variation in these traits, even after adjustment for production level (Rostellato et al., 2021), which suggests that selection of the optimal blend of resilience and robustness is feasible.
An additional complexity to these notions becomes evident when considering the type of environmental challenge being faced by the animal (see Figure 2). It is not expected that animals respond equally well to different environmental disturbances (Savietto et al., 2015). This is important from the perspective of the underlying biological mechanisms (Gross and Bruckmaier 2019). For example, it is unlikely that there is a full overlap between the mechanisms needed to respond appropriately to a disease challenge and those needed to respond to a nutritional or climate-related challenge.

Figure 2:
A non-exhaustive representation of environmental factors causing challenges for the animal. In this representation, challenges are visualized as perturbations (green) in the milk yield curve of a given cow.
More specifically in animal sciences, resilience often refers to 'disease resilience', and relates to pathogen/health challenges only. In this context, it is important to also disentangle resilience from the terms resistance and tolerance, as they cover different aspects of how an animal deals with infections. Doeschl-Wilson et al. (2012) and Boulton et al. (2018) provided the following definitions in this context: 'resistance' is an animal's response to decrease the pathogen burden; 'tolerance' is an animal's performance level at an increasing pathogen burden; and 'resilience' refers to an animal's ability to maintain its production level during an infection. Based on the considerations outlined above, however, the latter definition refers more to the concept of robustness than to the concept of resilience.
It becomes very easy to get bogged down in these definitions; not least because it is well established, from a biological systems perspective, that the different properties (resilience, robustness, resistance, tolerance) are intertwined at different levels. For example, robustness (e.g. stability of milk production) is frequently the consequence of resilience at the underlying level (e.g. flexibility in the physiological mechanisms in the mammary gland, Gross and Bruckmaier (2019)).
Whilst there is a general agreement on the need for resilient animals, and on the broad concepts, it is clear from the above that there is no single direct measure of resilience. And yet, this is needed when we want to breed and monitor our livestock for it. Additionally, the considerations raised when discussing what resilience is (and is not), lead to several questions that are important when seeking to quantify resilience in livestock animals:
- Is it possible, despite the complexity of this trait, to get enough grip on the notion of resilience to come up with useful quantification measures independent of the specific definition?
- To this end, should resilience to specific types of challenges be considered or can we aim to phenotype animals that are 'generally' resilient to all types of challenges?
- What tools do we have available to quantify resilience, and how can these be evaluated?
The rest of this paper is devoted to exploring how quantification of resilience is possible and how candidate resilience indicators can be evaluated. We consider this to be the key step towards having widely applicable operational methods for assessing animal resilience. The position taken with regards to the second question is that a generic definition is possible regardless of the challenge type or biological level considered (Scheffer et al., 2018). Still, when characterizing differences between animals in resilience the measures used will probably depend upon the type of disturbance being considered.

Long-term consequences of resilience
The question posed above "Is it possible, despite the complexity of this trait, to get enough grip on the notion of resilience to come up with useful quantification measures?" can be answered by considering the long-term benefits of resilience. All other things being equal, we expect that a resilient animal will live longer than a less resilient animal. In other words, we expect the benefits of resilience to accumulate, not least because biologically it is the reason for which resilience mechanisms evolved in the first place, i.e. resilience is the capacity of an animal to respond to environmental disturbances and thus safeguard its future ability to contribute genes to the next generation.
Additionally, consideration of the long-term consequences of resilience allows us to circumvent the above-mentioned difficulty of finding consensus on resilience definitions per se, as it builds on the assumption that differences in resilience between animals (or genotypes) will reveal themselves over time.
The arguments for this are as follows: 1) a more resilient animal is more likely to overcome any given environmental disturbance than a less resilient animal; 2) it is also more likely to overcome subsequent environmental disturbances; 3) over sufficiently long periods of time animals in the same environment will have been exposed to similar numbers of challenges; and 4) thus over sufficiently long periods of time the probability of survival of the more resilient animal will be greater than that of the less resilient animal. These considerations highlight a key point: even if resilience mechanisms function on a short timescale in response to a disturbance, they can only be judged to be favourable in the long term. Given this, any proposed operational definition of resilience based on short-term measures will need to be validated against long-term 'accumulated consequence' measures.
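The accumulation argument can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it assumes that challenges are independent and that each animal has a fixed probability of overcoming any single challenge. The function name and the two probabilities are hypothetical values chosen for illustration only.

```python
def survival_probability(p_overcome: float, n_challenges: int) -> float:
    """Probability of overcoming n challenges in a row.

    Assumes independent challenges with a constant per-challenge
    probability, which is a simplification made for this sketch.
    """
    if not 0.0 <= p_overcome <= 1.0:
        raise ValueError("p_overcome must be in [0, 1]")
    return p_overcome ** n_challenges


# Hypothetical per-challenge probabilities of overcoming a disturbance.
P_MORE_RESILIENT = 0.98
P_LESS_RESILIENT = 0.95

for n in (1, 5, 20):
    more = survival_probability(P_MORE_RESILIENT, n)
    less = survival_probability(P_LESS_RESILIENT, n)
    # The ratio more/less grows with n: a small short-term advantage
    # becomes a large difference in long-term survival.
    print(f"{n:2d} challenges: more={more:.3f} less={less:.3f} ratio={more / less:.2f}")
```

Under these assumptions, a small difference per challenge is hard to detect after one disturbance but becomes pronounced after many, which is why short-term measures need to be judged against accumulated outcomes.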
The most obvious measure of the long-term, accumulated benefits of resilience is lifespan, but this is not an easy measure to use for a number of reasons. Clearly, from the perspective of monitoring animal resilience in an on-farm management context it cannot be directly used since phenotypically it is only known once the animal has died. This does not preclude its use for quantifying more directly measurable resilience proxies (or for genomic evaluation), provided that sufficiently large historical data sets exist in which both longevity and the short-term resilience measures have been included. However, this should be approached with caution: typically these long-term consequence measures are heavily influenced by factors other than resilience, i.e. production parameters, farm management and harshness of the local production environment (Ahlman et al., 2011), and as such they should, strictly speaking, only be used to compare animals kept in the same environment. Further, they do not provide any information on the underlying nature of the resilience, which may be important when considering resilience to specific types of environmental disturbances or when addressing the question "can we aim to phenotype animals that are 'generally' resilient to all types of challenges?". Nevertheless, productive lifespan data can be, and has been, used in genetic analyses to identify genotypes that survive longer and are thus a priori more resilient (Rostellato et al., 2021; Tarrés et al., 2006). The challenge lies in adequately accounting for the changing (farm) environments in which this productive life is realized, including aspects of e.g. replacement animal management. Such studies show why resilience is needed, and they also demonstrate the value of considering the long-term consequences of resilience, and thus the value of such data for developing other more operational measures of resilience.

Towards operational measures of resilience
Measuring resilience, that is measuring the ability of an animal to 'bounce back' from a disturbance, requires time-series data. It is only with time-series data that one can quantify the extent to which an environmental disturbance elicits a response from the animal, and the rate at which it subsequently recovers. As shown in Figure 3, even if animals take the same total time to 'bounce back' they can have quite different rates of response and recovery (see also Figures 4 and 5 in Ben Abdelkrim et al., 2021b). Acquiring the necessary time series to capture these profiles of response and recovery requires intensive recording, which is difficult to achieve manually and explains why there were relatively few challenge experiments until recently (Friggens et al., 2016). The advent of precision livestock farming (PLF) technologies and their increased adoption on large numbers of farms (Lora et al., 2020) offers great promise in this context. They provide continuous streams of information at individual animal level from different combinations of performance and behaviour such as: milk production, activity, rumination, body temperature, feed intake, live weight, and body condition score, as well as some markers of key physiological parameters for disease and reproduction (e.g. Herd Navigator, Friggens et al. (2010)). Because these technologies generate high-frequency time series of data within a specific farm context, they are ideal for capturing response and recovery profiles to environmental disturbances and thereby developing new resilience indicators (Adriaens et al., 2020, Ben Abdelkrim et al., 2021a, Poppe et al., 2020, Grodkowski et al., 2018). PLF technologies (i.e. continuous low-cost high-frequency measures) can thus be used as a means to measure resilience in commercial populations on a sufficiently large scale to provide phenotypes for genetic evaluation (Poppe et al., 2020).
However, in order to use these data, intelligent and automated ways for interpretation are needed. Several generic options emerge for moving from raw PLF data towards operational measures:
- Use the within-animal variance associated with a relevant time-series of measures, which needs to be of sufficiently high frequency to capture the variance due to disturbances. Examples of this approach are provided by Elgersma et al. (2018) and Poppe et al. (2020, 2021) for milk yield of dairy cattle, and Putz et al. (2019) for feed intake of growing-finishing pigs.
- Identify the frequency of events, i.e. the number of times an animal deviates from the expected time series which is based on its own individual trajectory. This relies on setting up criteria for defining a deviation relative to a baseline; examples of this have been provided by Codrea et al. (2011) and Adriaens et al. (2020, 2021a) using milk yield as an example. This is also the approach inherent in accelerometer-based oestrus detection systems in cattle.
- Characterise for each deviation event the amplitude of response and rate of recovery relative to the expected time series, as well as other 'shape' parameters. Methods for doing this range from purely statistical approaches, e.g. using differential smoothing with B-splines (Codrea et al., 2011), to models that define a baseline function from biological principles, and even to the use of elasticity models (Sadoul et al., 2015).
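The three options above can be sketched on a single daily time series. This is a minimal illustration and not the method of any of the cited studies: it assumes that an expected (unperturbed) curve is already available for the animal, and the function name, the 85% threshold and the 3-day minimum duration are hypothetical choices made for this example.

```python
import numpy as np


def resilience_indicators(observed, expected, threshold=0.85, min_days=3):
    """Return (residual_variance, events) for one animal.

    observed, expected: daily values (e.g. milk yield) of equal length.
    threshold, min_days: hypothetical criteria defining a deviation event,
    i.e. at least `min_days` consecutive days below `threshold` * expected.
    """
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    relative = observed / expected  # 1.0 means "on the expected curve"

    # Option 1: within-animal variance of the deviations from expectation.
    residual_variance = float(np.var(observed - expected, ddof=1))

    # Options 2 and 3: find deviation events and describe their shape.
    below = relative < threshold
    events = []
    start = None
    # The trailing False acts as a sentinel that closes a run at the series end.
    for day, flag in enumerate(np.append(below, False)):
        if flag and start is None:
            start = day
        elif not flag and start is not None:
            if day - start >= min_days:
                segment = relative[start:day]
                nadir = start + int(np.argmin(segment))
                events.append({
                    "start": start,
                    "end": day,  # first day back above the threshold
                    "amplitude": 1.0 - float(segment.min()),
                    "days_to_nadir": nadir - start,
                    "days_to_recover": day - nadir,
                })
            start = None
    return residual_variance, events


# Toy example: a flat expected yield of 30 kg/day with one 5-day dip.
expected = np.full(30, 30.0)
observed = expected.copy()
observed[10:15] = [24.0, 18.0, 21.0, 24.0, 25.0]
variance, events = resilience_indicators(observed, expected)
print(variance, len(events), events[0])
```

The number of events corresponds to the second option, and the amplitude and recovery fields to the third. How these numbers should be valued in terms of resilience is a separate question that the sketch does not answer.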
Approaches such as that of Adriaens et al. (2020) combine all of the above options. In order to develop those resilience measures, each option should be calibrated and validated against established resilience indicators, ideally in the short, medium and long term. It is one thing to be able to quantify perturbations, but quite another to give them a resilience value. Which of the response-recovery profiles shown in Figure 3 is best can only be judged by looking at their consequences on longer-term outcomes. In an animal setting, these longer-term outcomes can include the number of disease events, the ability to reproduce, and ultimately the lifespan. Thus, for a candidate measure to be validated as a practical resilience indicator it must be shown to be positively correlated with the accumulated benefits to the animal of better resilience. It should also be remembered that whilst we seek generic methods for distilling PLF data into resilience indicators, the choice of PLF data sources will influence the extent to which different types of challenges are captured. We can, for example, expect time-series measures of somatic cell count to be a good indicator of how a cow deals with a challenge posed by mastitis pathogens, but should not be surprised that it does not pick up nutritional challenges. Further, the interpretation of amplitudes of response and rates of recovery may well be perturbation-type specific, as a big drop in intake is probably a reliable short-term indicator when facing heat stress, but will only be observed after a longer exposure when responding to poor feed quality. It is also unlikely that any one single measure captures all of the particular types of resilience. However, with a multivariate analysis, several PLF information sources could be combined in order to extract and combine different facets of resilience (e.g. Friggens, 2010, Adriaens et al., 2020). Not all deviations from normal sensor levels will be informative about resilience.
Therefore, it is expected that combining data on multiple functional traits, such as both production and behavioural activity level, might give a more accurate reflection of resilience (Scheffer et al., 2018). There is recent evidence to support this (Sadoul et al., 2017, Mendes et al., 2021).
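The validation step discussed in this section, i.e. checking that a candidate proxy is associated with a longer-term outcome among animals kept in the same environment, can be sketched as a within-farm rank correlation. This is an illustrative sketch only: Spearman's correlation is an assumption made here and not necessarily the statistic used in the cited studies, and all function names and the record layout are hypothetical. The proxy is assumed to be oriented so that higher values mean better resilience.

```python
from collections import defaultdict
from statistics import mean


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (var_x * var_y) ** 0.5


def within_farm_correlations(records):
    """records: iterable of (farm_id, proxy_value, reference_value)."""
    by_farm = defaultdict(lambda: ([], []))
    for farm, proxy, reference in records:
        by_farm[farm][0].append(proxy)
        by_farm[farm][1].append(reference)
    return {farm: spearman(p, r) for farm, (p, r) in by_farm.items()}


# Toy data: (farm, proxy score, reference measure such as adjusted lifespan).
toy = [("A", 0.2, 900), ("A", 0.5, 1400), ("A", 0.9, 2100),
       ("B", 0.1, 1500), ("B", 0.4, 1200), ("B", 0.8, 800)]
print(within_farm_correlations(toy))
```

Computing the correlation per farm, rather than over the pooled data, keeps the comparison among animals that shared the same environment.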

Case study: towards operational measures of resilience in dairy cows
The rest of this paper focuses on how the approach described above can be applied in practice using a dairy cow example. The general scheme for doing this is shown in Figure 4. Resilience in dairy cows is important, as it impacts on many aspects of dairy production sustainability. For example, cows that cope well with the challenges posed will produce and reproduce well, which decreases the workload for the farmer, optimizes the input-output balance in terms of nutrients and milk production, and improves welfare. Assessing resilience in practice is important, as it provides a measure of how a cow copes over time with the disturbances in her specific farm environment, and allows selection of animals based on both the phenotype and the genotype (via genomic selection). Here we develop a rationale for how the long-term consequences of resilience can be assessed in dairy cows, and how this allows PLF-based proxies to be evaluated and benchmarked. To this end, we start from the assumption that resilience mechanisms must provide longer-term benefits that can be measured at the end of life, on which animals can be ranked within farm. Those benefits may include increased productive lifespan and reduced frequency of health and reproductive problems. The relative resilience rank, which implicitly takes the farm environment into account, can be used as a reference value for validating new measures of resilience. Three obvious candidates for a relative resilience rank are productive lifespan, the number of disease events, and the ability to re-calve.

Figure 4:
Schematic representation of the process for developing operational proxies of resilience that have been validated against reference measures of the long-term, accumulated, consequences of differences in resilience.

Productive lifespan
A recent review by Schuster et al. (2020) has described the progressive decline of cows' productive lifespan in most intensive dairy systems. Currently, in such systems, the age at culling of dairy cows lies between 4.5 and 6 years (De Vries and Marcondes, 2020). Although it is not the only factor affecting it, it is expected that resilience positively affects productive lifespan (defined as time from first calving to culling). Other aspects also playing an important role include milk production level and robustness, depending on farm management and farmers' preferences (replacement rate and young stock, culling policy, nutrition, disease pressure, market forces, etc.). This indicates that if productive lifespan is to be used as a reference measure of resilience it should be adjusted for 1) herd, year, and season, 2) milk production level, and 3) measurable quantities reflecting robustness such as the number of disease events or body weight (Rostellato et al., 2021). Sticking to the assumption that robustness is about modulation of the partition between life functions to deal with stable differences in environment, it is proposed that body weight and body condition score (which allow us to deduce size and fatness, Friggens et al. (2007)) can be used to reflect the element of robustness that is not captured by milk production level. In other words, as a function, productive lifespan depends on: herd-year-season, body weight (BW), body condition score, milk yield and a residual. The residual includes the relative differences in resilience between individual animals, and the average values per animal can then be used to rank animals on their relative productive lifespan (i.e. adjusted to the average of the herd, and the other factors in the function). There are promising approaches for this that have recently been published e.g. Dunne et al. (2018), and it has been applied to quantify between farm differences in variability in milk yield (Poppe et al., 2021). 
This adjusted productive lifespan should give a reference measure for resilience, but one that requires considerable data depth in terms of detail and sources. For it to be useful for the validation of new measures of resilience, it requires that sufficient numbers of historical records of lifespan are available where the potentially new measures of resilience were also being made historically on these animals. This clearly is a significant constraint on the use of productive lifespan as a reference measure.
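The adjustment described above can be sketched as a linear model whose residual serves as the adjusted productive lifespan. This is a simplified illustration under stated assumptions: an ordinary least-squares fit is used, herd-year-season is handled by centring all variables within each herd-year-season class, and censoring of animals that are still alive is ignored. It is not the model of the cited studies, and the function and variable names are hypothetical.

```python
import numpy as np


def adjusted_lifespan_residuals(lifespan, hys, covariates):
    """Residual productive lifespan after adjusting for herd-year-season
    (hys) and covariates such as body weight, body condition score and
    milk yield.

    lifespan:   (n,) productive lifespan, e.g. in days
    hys:        (n,) herd-year-season label per animal
    covariates: (n, k) matrix with one column per covariate
    """
    y = np.asarray(lifespan, dtype=float).copy()
    X = np.asarray(covariates, dtype=float).copy()
    hys = np.asarray(hys)

    # Centring within each herd-year-season class removes its fixed effect.
    for group in np.unique(hys):
        mask = hys == group
        y[mask] -= y[mask].mean()
        X[mask] -= X[mask].mean(axis=0)

    # Least-squares slopes for the covariates on the centred data.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


# Toy data: two herd-year-season classes and one covariate (milk yield level).
hys = np.array(["a", "a", "a", "b", "b", "b"])
milk = np.array([[1.0], [2.0], [3.0], [1.0], [2.0], [3.0]])
lifespan = np.array([1051.0, 1098.0, 1151.0, 1249.0, 1302.0, 1349.0])
residuals = adjusted_lifespan_residuals(lifespan, hys, milk)
print(residuals)  # larger residual = longer lifespan than expected
```

Animals can then be ranked on the residual within herd. With real data, a survival model that accommodates censored records would usually be more appropriate than this least-squares sketch.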

Number of disease events
Another approach would be to use the number of disease events in a sufficiently long time period, e.g. from calving to calving. If the time period is too short, then the absence of health and reproductive problems could simply be due to the absence of environmental disturbances in that short period. Thus, the measures need to be made over a timespan that is sufficient to ensure that the differences between animals are attributable to their resilience rather than to chance due to a limited frequency of challenges encountered.
Unfortunately, the accurate recording of number of disease events over a sufficiently long period is complicated to achieve on commercial farms (Adriaens et al., 2021b), and it is hard to argue that it would capture resilience to environmental disturbances such as a heat wave or nutritional challenges. In other words, this is a reference measure more suited for resilience to health challenges. The most straightforward way to assess the number of disease events would be to rely on treatment registrations, typically recorded by the farmer. This proves quite complicated, as there are issues such as defining which events are not resilience related (e.g. preventive events such as hoof trimming, or accidents such as a broken leg). Similarly, there are different (clusters of) diseases, e.g. metabolic diseases, respiratory diseases, reproductive disorders, infectious diseases (acute and chronic), all assumed to have a different effect on, or relation with, resilience. This poses a number of challenges in creating a gold standard resilience ranking: should disease events just be counted, or should they be weighted by their severity, and if so how? And when is a disease case a new case? This additionally raises the issue of time-windows for dealing with repeated disease detections. These questions are common to all work being carried out to evaluate new methods for disease detection (van Dixhoorn et al., 2018; Kamphuis et al., 2013), and it has been shown that the reported precision of such methods (typically reported specificities and sensitivities) is highly sensitive to how these issues are treated (Kamphuis et al., 2010).
When the number of disease events is being used to provide a reference measure to benchmark resilience with, disease detection must be done using methods that are independent of the PLF technologies being used to create the resilience proxy. This generally means relying on veterinary or farmer observations, and it has been shown that some events will not be recorded (Adriaens et al., 2021b). This can therefore lead to the situation where the reference measure is less accurate than the new measure being tested, which requires relatively sophisticated approaches to resolve. Whilst keeping in mind the above issues, high quality and consistent disease recording, coupled with good cross-validation, can provide a reference measure against which to evaluate resilience proxies.
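The question of when a disease case counts as a new case can be made concrete with a simple time-window rule. The sketch below is one possible convention, not a recommendation from the cited work: treatments for the same disease that follow the previous treatment within a fixed window are merged into one case. The function name and the 14-day default window are hypothetical.

```python
from datetime import date


def count_disease_cases(treatment_dates, window_days=14):
    """Count cases from treatment dates recorded for one cow and one disease.

    A treatment starts a new case only if more than `window_days` have
    passed since the previous treatment; otherwise it extends the
    current case. `window_days` is an assumed, tunable convention.
    """
    cases = 0
    previous = None
    for day in sorted(treatment_dates):
        if previous is None or (day - previous).days > window_days:
            cases += 1
        previous = day
    return cases


records = [date(2020, 1, 1), date(2020, 1, 3), date(2020, 1, 10), date(2020, 3, 1)]
print(count_disease_cases(records, window_days=14))
print(count_disease_cases(records, window_days=5))
```

The same records yield a different number of cases depending on the window, which illustrates why the chosen convention needs to be reported alongside any reference measure built on disease counts.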

Ability to re-calve
In the context where it is not reasonable to use productive lifespan (i.e. because relevant datasets are not available or when within-lifetime evaluations are wanted), an easier reference measure for resilience could be the ability of a cow to re-calve (Adriaens et al., 2020). The term "ability to re-calve" is preferred because it makes explicit that we are looking to establish how prior resilience measures affect future reproductive performance. The advantage is that this information is available in each parity rather than at just one end-point, making it a trait that is easier to select for in breeding programmes. The assumption here is that poorer resilience will negatively impact reproductive performance. This is expected to hold for all types of environmental disturbances. However, as with productive lifespan, it is not expected that variability in ability to re-calve solely comes from variability in resilience, so adjustment is needed to this measure, and not just for management factors such as length of voluntary waiting period. It is well established that the probability of conception is negatively affected by, e.g. a high peak milk yield or by a too high mobilisation of body reserves in early lactation, i.e. body weight loss from calving to nadir. By analogy with the arguments put forward for adjusting productive lifespan, this latter effect is seen as being more to do with robustness than resilience. Thus, appropriately adjusted ability to re-calve should provide a reference measure for resilience suitable for evaluating precision livestock technology-derived proxies.
The case study presented here uses ability to re-calve as the reference measure against which different resilience proxy measures could be evaluated. This study was carried out in a multi-partner European project (GenTORE; www.gentore.eu) and the aim was to build a resilience ranking that could be readily replicated in the different countries, which could then be used to evaluate different measures of short-term resilience. The resilience rank was developed via a series of discussions among experts and then evaluated using observational data from dairy herds. As the data were purely records of normal farm practice with no intervention, ethical approval for this work was not needed. The basis for the resilience ranking equation is the number of re-calvings, and within each lactation, bonus or penalty points are added or subtracted that allow distinguishing the resilience between animals with the same parity number. These bonus or penalty points are for traits assumed to link to resilience that both reflect the animals' ability to re-calve and that are readily available, or could reasonably be expected to be recorded on commercial farms, and include:
- number of inseminations: bonus points will be assigned for every first insemination during a lactation. Penalty points will be subtracted for each additional insemination, with the penalty increasing for each additional insemination.
- 305-day milk yield as estimated via national milk production registrations: a cow producing above the average of her peers in the herd (same parity) will receive bonus points for each % above the peer average. Penalty points will be given for each % below the peer average.
- number of curative treatment events: penalty points will be given for each curative treatment day caused by a health event. This does not account for preventive treatments like hoof trimming.
- calving interval: bonus points will be assigned for each day of shorter interval compared to the herd-peer average. Penalty points will be given for each day of calving interval above the herd-peer average.
The above list provides a relative resilience rank within parity. It can then be extended to give a relative resilience rank across lactations by including the following:
- age at first calving: this will be compared to the herd average. Bonus points will be given when age at first calving (expressed in days) is lower than the herd average. Penalty points will be given in the opposite case.
This is effectively a lifetime resilience ranking score. It is analogous to a multi-trait index in genetic evaluation, but one that is available at any time-point in an animal's life. Additional corrections could also be considered, such as calving ease, feed efficiency, or relevant herd factors and culling reasons. However, as these are less readily available, we did not include them in our current ranking equation.
The combination of these terms yields an equation that allows animals on a specific farm to be ranked for their resilience based on concretely measurable and available parameters. Still, it is important to determine how the different component elements are weighted in order to arrive at one 'ability to re-calve' gold standard. A pragmatic balance will need to be struck between these different traits based on consideration of the relative value of the different components. For instance, should we allow the points deducted for curative treatment days to be sufficiently high so that a second parity cow with e.g. 10 treatment days has a lower gold standard value than a first parity cow that has no recorded health events? This consideration should also take into account the ranges of the different component traits as well as their reliability. Ideally, the weights to be used could be informed by population level analyses that have quantified the hazard ratios for these different factors in epidemiological and survival analyses (e.g. Rostellato et al., 2021). Examples of relative resilience rankings as proposed by six partners of GenTORE, including the weightings used, are given in Table 1.
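The structure of such a ranking equation, and the weighting question raised above, can be illustrated with a minimal sketch. All names and all weight values in it are hypothetical placeholders chosen for illustration; they are not the Table 1 weights of any GenTORE partner, and the linearly escalating insemination penalty is only one possible reading of "increasing" penalty points.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Weights:
    """Hypothetical weights (assumptions, NOT the Table 1 values)."""
    points_per_calving: float = 500.0        # base points for each calving
    first_insemination_bonus: float = 10.0   # bonus for the first insemination
    extra_insemination_step: float = 5.0     # k-th extra insemination costs k * step
    per_milk_pct: float = 1.0                # per % above/below peer 305-day yield
    per_treatment_day: float = 5.0           # penalty per curative treatment day
    per_calving_interval_day: float = 1.0    # per day shorter/longer than peers
    per_afc_day: float = 0.5                 # per day of age at first calving vs herd


@dataclass(frozen=True)
class Lactation:
    inseminations: int
    yield_305d_kg: float
    peer_yield_305d_kg: float                # herd average, same parity
    treatment_days: int                      # curative treatments only
    calving_interval_days: Optional[float]   # None if the cow has not re-calved
    peer_calving_interval_days: float


def lactation_score(lac: Lactation, w: Weights) -> float:
    """Points for one lactation: base points plus bonuses minus penalties."""
    score = w.points_per_calving
    if lac.inseminations >= 1:
        score += w.first_insemination_bonus
        extra = lac.inseminations - 1
        # Sum of step * 1, step * 2, ..., step * extra (escalating penalty).
        score -= w.extra_insemination_step * extra * (extra + 1) / 2
    milk_pct = 100.0 * (lac.yield_305d_kg - lac.peer_yield_305d_kg) / lac.peer_yield_305d_kg
    score += w.per_milk_pct * milk_pct
    score -= w.per_treatment_day * lac.treatment_days
    if lac.calving_interval_days is not None:
        score += w.per_calving_interval_day * (
            lac.peer_calving_interval_days - lac.calving_interval_days
        )
    return score


def lifetime_score(lactations: List[Lactation], afc_days: float,
                   herd_afc_days: float, w: Weights) -> float:
    """Sum of lactation scores plus a one-off age-at-first-calving term."""
    total = sum(lactation_score(lac, w) for lac in lactations)
    return total + w.per_afc_day * (herd_afc_days - afc_days)


# Invented example cows: a first-parity cow without health events, and a
# second-parity cow with 10 curative treatment days in her second lactation.
cow_a = [Lactation(1, 9000.0, 9000.0, 0, None, 400.0)]
cow_b = [Lactation(2, 9000.0, 9000.0, 0, 410.0, 400.0),
         Lactation(1, 9900.0, 9000.0, 10, None, 400.0)]

default = Weights()
harsh = replace(default, per_treatment_day=60.0)

score_a = lifetime_score(cow_a, 730.0, 730.0, default)
score_b = lifetime_score(cow_b, 730.0, 730.0, default)
score_b_harsh = lifetime_score(cow_b, 730.0, 730.0, harsh)
print(score_a, score_b, score_b_harsh)
```

With these placeholder weights the second-parity cow scores 965 points against 510 for the first-parity cow. Raising the treatment-day penalty to 60 points per day drops her to 415 points, below the first-parity cow. This is the kind of trade-off the choice of weights has to settle.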
With one of the weightings in Table 1 (scheme 'B'), applied to a Dutch single-herd dataset with 1800 cows for which the culling date was known (Ouweltjes et al., 2019), we showed that cows that reached the next parity had a higher resilience score than cows that were culled (Table 2). The data in Table 2 show that cows that did not reach their second parity scored on average 412 points, which is far lower than the 500 points with which they started the parity. This could be due to lower milk production, higher age at first calving, and/or more curative treatment days. Cows that reached the 6th or higher parity already had a high resilience score in their first parity (502 points). This shows that, already in the first parity, the resilience ranking score can differentiate between cows that have a high ability to re-calve (and therefore live long and have a high productive lifespan) and cows that have a lower ability to re-calve.
Table 1. Resilience weights applied in a multi-site study by six partners in GenTORE (https://www.gentore.eu/wp31.html). An example of the calculation of a resilience score using this approach is given in Adriaens et al. (2020).
Table 2. The total resilience score per parity, as the sum of cows that survived to the next parity and cows that were culled, calculated using weighting scheme 'B' from Table 1.
In order to examine the usefulness of such a reference measure that reflects the ability to re-calve (which is our gold standard), we provide examples of the relation between milk yield deviations, and also accelerometer-derived deviations, and this gold standard. The choice of these two PLF technology-derived measures was based on several points. Sensors for recording individual milk yield are the most prevalent type of PLF technology installed in dairy herds (Lora et al., 2020), and the variation in milk yield has been shown to be genetically correlated with increased longevity (Elgersma et al., 2018, Poppe et al., 2020).
Moreover, accelerometers are widely adopted for housed cattle, and deviations derived from them have been associated with increased probability of disease (e.g. de Mol et al., 2013). Accelerometers have also been shown to readily detect changes in behaviour (feeding, ruminating, general activity) in cows at pasture (Yin et al., 2019). Adriaens et al. (2020) showed that a resilience ranking based on the above practical definition of ability to re-calve can be predicted with sensor data derived from milk yield and/or accelerometers. In their study, cows with a high resilience ranking (high lifetime resilience score), and thus cows assumed to be resilient, had fewer drops in their milk yield and also had more stable activity dynamics (both fewer spikes in daily activity, indicating oestrus events, and fewer drops in daily activity related to poor locomotion and other health events). However, this study also showed that there were farm-to-farm differences in the prediction equations. These differences underline the importance of the local production environment (including farm management and culling reasons) for the probability that a cow stays in the herd. Thus, whilst it seems that the adjustment for the non-resilience factors that affect productive lifespan and resilience rank is sufficient for deriving breeding values and genetic correlations using genetic evaluations on large populations (Poppe et al., 2020, Rostellato et al., 2021), this appears not to be the case for the use of resilience proxies based on PLF technologies for on-farm management. In this latter case, although results are promising, a better accounting for the local production environment is needed to explain the non-resilience factors that affect the reference measures of a resilience rank. In this context, promising methods that use the animal responses themselves (collectively in a herd context) to detect when an environment is challenging have recently been proposed (Garcia-Baccino et al., 2021).
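As an illustration of how a PLF-derived proxy might be benchmarked against the reference ranking within each farm, the sketch below computes a Spearman rank correlation per farm. This is an assumed, simplified evaluation and not the prediction method of Adriaens et al. (2020). The farm labels, proxy values (number of milk yield drops) and reference scores are invented.

```python
from collections import defaultdict
from math import sqrt
from typing import Dict, Iterable, List, Tuple


def average_ranks(values: List[float]) -> List[float]:
    """Ranks starting at 1; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns NaN if either input has no variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def within_farm_spearman(
    records: Iterable[Tuple[str, float, float]]
) -> Dict[str, float]:
    """records: (farm_id, proxy_value, reference_score), one tuple per cow."""
    proxies: Dict[str, List[float]] = defaultdict(list)
    references: Dict[str, List[float]] = defaultdict(list)
    for farm, proxy, reference in records:
        proxies[farm].append(proxy)
        references[farm].append(reference)
    return {farm: spearman(proxies[farm], references[farm]) for farm in proxies}


# Invented data: proxy = number of milk yield drops, reference = resilience score.
records = [
    ("A", 1, 900), ("A", 3, 700), ("A", 5, 400), ("A", 2, 800),
    ("B", 2, 500), ("B", 4, 600), ("B", 1, 700), ("B", 3, 400),
]
result = within_farm_spearman(records)
print(result)
```

Ranking within farm keeps the comparison relative to herd mates, in line with the reference rank being defined within farm. A proxy whose correlation differs markedly between farms, as for the two invented farms here, would point to the farm-to-farm differences discussed above.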

Concluding remarks
This paper discussed the need for operational measures of resilience that can be deployed at large scale across different farm types. Such measures are needed to provide more precise phenotypes of resilience for use in farm management, but also for use in animal breeding. Such operational measures should provide a window on the local production environment that gives the farmer an evidence base for breeding and culling decisions, and that takes into account what is expressed by the animals in their specific environments.
It is clear from the above discussion that any universal definition of resilience will be too broad to be operational. Indeed, resilience should be seen as a latent construct, i.e. it cannot be directly measured, not least because any measure of response and recovery reflects both the animal's resilience and the perceived size of the environmental disturbance, which can vary over time, depending on multiple animal and farm-related contexts. This leads to the following two points:
1. Any postulated operational measure of resilience to challenge should be constructed from a sufficient number of indicators that each individually capture different facets of resilience, such that when combined they better reflect the full resilience response.
2. Any postulated operational measure of resilience to challenge will have to be validated against reference measures that are not "resilience", but rather the accumulated consequences of good resilience (i.e. productive lifespan, ability to re-calve, etc.).
Clearly, the choice of indicators used, and to some extent the reference measures, will depend upon whether one seeks to develop general resilience measures or to develop measures of resilience to a specific type of challenge. In this paper, we highlight the potential of PLF technologies to provide operational measures. We also provide a rationale and example of how to construct a reference ranking based on accumulated long-term consequences of resilience for evaluating new resilience proxies. In the absence of clear gold standards (and simple direct measures of resilience), the construction of resilience proxies and reference measures will proceed by using the principles of construct validation. These involve the principle of convergent validation, i.e. a search for convergence across independent measures of the same (conceptually related) construct, and the principle of discriminant validation, i.e. a search for divergence across independent measures of different (conceptually unrelated) constructs (described for animal science by Waiblinger et al., 2006). It seems evident that progress towards operational definitions of resilience will emerge from focused stepwise data analyses, and not from isolated discussions of concepts. Further, to achieve validity across widely different farm types, this will need to be extended to include the locally relevant economic, breeding, and management contexts. Nevertheless, the first studies taking the operational approach to quantifying and defining resilience (many of which are discussed above) show the promise of a data-driven approach when applied within a biologically logical framework.

Data accessibility
No dedicated scripts or data were used in this study (the example data in Table 2 are simple means of descriptive data for cows (no. of lactations, no. of inseminations, 305d MY, etc) that have no intrinsic value as such).