Why evaluate marine protected areas? First of all because resources are limited with any management program and any management strategy comes with costs. When you allocate funds to one management strategy relative to others, it detracts from alternative strategies, redirecting money and time that way. If that turns out to be using an ineffective management strategy and you've failed to evaluate it and recognize it as ineffective, then you've jeopardized the resource, especially if at the same time you were to relax other regulatory activities. And a consequence of that is you've provided this false sense of security for conservation and management of the resource. On the other hand, the sooner the benefits or costs are determined, the more rapidly you can spin up that strategy. If you evaluate it and find out that it's working successfully, then it makes a lot of sense to redirect funds rapidly in that direction. Or if it doesn't work then you compensate somehow by changing the design or you bail out of it entirely.
And of course evaluation is fundamental for adaptive management, which should evolve over time and refine one's design.
The final reason--and possibly the most important--is that it is the law now. There are sunset clauses attached to many of the protected areas that have been established: they are required by law to demonstrate their effectiveness within a certain period or they will no longer exist. This is pretty serious.
If you did a good job of designing reserves and you properly evaluate them then you know at that point whether they're working or not. On the other hand, if you've done a poor job of designing them but have done a good job of evaluating, then you're drawn to the conclusion that these things may not work very well.
Likewise, even if you've done a good job of designing reserves but do a poor job of evaluating their effectiveness, then they may be surpassing the goals you set but you can't detect it. Of course the worst situation is if you've done a poor job of each.
There are three things I want to emphasize with respect to evaluation. First of all, we're talking about effectiveness, not effects. You set reserves up for a specific purpose. People have bought into these because you've said they're going to do certain things, and those are exactly the kind of criteria you're going to have to evaluate them on. Not just general effects, but their effectiveness in meeting the goals that they were designed for. The other thing is we're trying to ascribe causality to reserves. When you compare inside and outside reserves, you want to get to the point where you can say the differences are in fact because we've made them protected areas, and decouple that from any site effects or other confounding variables.
The problem is that we find ourselves in a situation of moving from boatloads of fishes to recognizing now that we have boatloads of uncertainty. There are four particular sources of uncertainty when it comes to evaluating marine protected areas. The first is referred to as causal uncertainty, which concerns the effect the MPA has on the community that you're trying to protect. The problem here is your inability to ascribe causality because of all these other confounding factors. There's always causal uncertainty in any kind of experiment. The second is natural variation: natural processes are also influencing the community, which makes estimates more difficult and uncertain. The third is measurement uncertainty, which arises when you go out to try to measure how the community is responding to an MPA. And the fourth comes after you've collected the data and you're in the process of analysis and interpretation: the analyses and the interpretations are based on models--either statistical or conceptual models--of how you expected that system to respond, and they may be wrong as well.
Figure (Mark Carr): The four sources of uncertainty when evaluating marine protected areas.
Goals and effectiveness
The fundamental components are first creating the objectives, or goals. Those particular objectives define a particular effectiveness parameter, or a response variable, that you're going to use to ascertain the effectiveness of the reserve. It can be a component, like the abundance of a certain species such as a keystone species, a targeted exploited species or a rare species, or it could be a process like interaction strength between a keystone species and other species in the system. For fisheries management there's a whole host of parameters you could decide on.
Then you figure out a target level of effectiveness, or the level of response that you're hoping to achieve. In other words, figure out what you're hoping to get. What I will argue is that the best response variable is the difference that you see between a marine protected area and a less-protected area (LPA). In that process you define the magnitude of the response that you expect to see, with some tolerable limits and stated levels of confidence. "Tolerable limits" means that you're willing to accept a certain level of response. You also define the spatial extent and the temporal expectation--when and for how long--of that response. From this, the effectiveness of the MPA is the difference between the target level and the measured value, which can be plotted out as changes over time. You plot those values and you look at whether the response approaches your target level of effectiveness, or at least the lower tolerable limit, within the time frame that you required. So you measure effectiveness with respect to the effectiveness parameter relative to the target level of effectiveness.
Fundamental principles of measuring design effectiveness
The design requires four fundamental principles. The first one is the number of reserves that you have to work with. I argue that the more you have the better. The second is the timing of the monitoring relative to the establishment of the reserve. If you are fortunate enough to have the rare opportunity to start collecting data prior to establishment, you can employ a BACI (before-after, control-impact) design, especially if it's only one reserve you have to work with. Alternatively, and more likely, you're going to start monitoring at or after the time of establishment of a reserve, in which case you'll have to do trend analyses with regression approaches to look at the response.
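To make the before-after, control-impact comparison concrete, here is a minimal sketch of its core contrast: the change inside the MPA minus the change in the less-protected area over the same period. The function name and the density values are hypothetical and are not taken from any real survey. A full analysis would test this contrast statistically (for example as an interaction term in an analysis of variance) rather than report only a point estimate.

```python
from statistics import mean


def baci_contrast(mpa_before, mpa_after, lpa_before, lpa_after):
    """Return (MPA after - MPA before) - (LPA after - LPA before).

    Each argument is a list of repeated measurements of the response
    variable (e.g. fish density per transect) for that site and period.
    """
    mpa_change = mean(mpa_after) - mean(mpa_before)
    lpa_change = mean(lpa_after) - mean(lpa_before)
    return mpa_change - lpa_change


# Hypothetical densities (individuals per transect), for illustration only.
mpa_before = [4.0, 5.0, 6.0]
mpa_after = [8.0, 9.0, 10.0]
lpa_before = [4.0, 4.0, 4.0]
lpa_after = [5.0, 5.0, 5.0]

effect = baci_contrast(mpa_before, mpa_after, lpa_before, lpa_after)
print(f"BACI contrast: {effect:.2f}")
```

Subtracting the change at the less-protected site removes any trend shared by both sites, so only the portion of the change specific to the protected site remains.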
To recap, the effectiveness parameter has to be determined. It has to be a response variable for which you can examine a protection-by-time interaction in the trend analysis, or an MPA vs. LPA difference. Or you can use a general linear model approach, a regression or analysis of variance, when you're trying to look at the different covariates and factors that might be influencing the response or confounding the reserve effect.
Another thing is the scale of inference. Are you trying to just make inferences about differences inside and outside of these reserves, or are you trying to figure out how the whole system is responding to the presence of these things? The problem with only one reserve is that it's easily confounded spatially or temporally. It's very difficult to ever ascribe causality to the effects of a marine protected area when there's only one. It also means you have very limited inference. The best you can ever say is that particular reserve had that particular effect, but you can never say much more about the effect of MPAs in general. But there are some very good examples in New Zealand. That was a single little reserve where they were able to see pronounced effects that suggested the potential value of these things.
On the other hand, multiple reserves can remove or account for confounding variables, unlike the single reserve, because you can treat these as covariates and try and decipher the MPA effect separate from them, and you definitely get a broader inference about general reserve effects.
What about the timing of monitoring? The worst-case scenario is when a reserve has been established and then you start monitoring some time down the line. This can be bad if the MPA and the LPA show parallel trends and no divergence: although the MPA has a higher response than the less-protected area, you simply never know whether that's because that site has always been higher in that particular variable. The site could have been chosen as an MPA in the first place because it has a higher density of some species, for example. Or there could have been an increase but you just didn't detect it.
A better scenario is when you can at least start monitoring at the time that you establish the reserve. In that case, for a given response variable that meets your objectives for that MPA such as density, size, or diversity of species, you can sample these through time. Especially if you have multiple reserves, you can ask whether there's a reserve-by-time interaction by examining if the slopes are different from one another. You can do that for a single reserve, but it's much more tenuous because you don't have replication and it's difficult to decouple from site effects.
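As a sketch of what comparing slopes might look like with replicate sites, the example below fits an ordinary least-squares slope to each site's time series and then takes the difference between the mean MPA slope and the mean LPA slope. The function names and data are hypothetical. This gives only a descriptive estimate of the reserve-by-time interaction; a formal test would fit the interaction term in a linear model and assess its uncertainty.

```python
from statistics import mean


def ols_slope(times, values):
    """Ordinary least-squares slope of values regressed on times."""
    t_bar = mean(times)
    v_bar = mean(values)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    sxx = sum((t - t_bar) ** 2 for t in times)
    return sxy / sxx


def slope_difference(times, mpa_series, lpa_series):
    """Mean per-site slope in MPAs minus mean per-site slope in LPAs."""
    mpa_slopes = [ols_slope(times, series) for series in mpa_series]
    lpa_slopes = [ols_slope(times, series) for series in lpa_series]
    return mean(mpa_slopes) - mean(lpa_slopes)


# Hypothetical densities sampled yearly from the time of establishment.
years = [0, 1, 2, 3]
mpa_sites = [[5.0, 7.0, 9.0, 11.0], [4.0, 5.0, 6.0, 7.0]]
lpa_sites = [[5.0, 5.0, 5.0, 5.0], [4.0, 4.5, 5.0, 5.5]]

interaction = slope_difference(years, mpa_sites, lpa_sites)
print(f"Slope difference (MPA - LPA): {interaction:.2f} per year")
```

A difference near zero would correspond to parallel trends, while a clearly positive value is what a reserve-by-time interaction would look like in this simplified setting.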
The nice thing about looking at trends of differences over time is that whether you're starting with a pristine MPA that you hope is going to persist relative to a degrading response outside the reserve, or whether you're starting with an area that has been overfished already and measuring the response, the bottom line is that the delta in the response variable will be the same; the deltas are getting bigger and bigger over time. That can be tested with a regression analysis.
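To illustrate that point, the sketch below computes the MPA minus LPA difference at each sampling time and regresses it on time for two hypothetical scenarios: a pristine MPA beside a degrading outside area, and an overfished MPA that recovers beside a stable outside area. All numbers and names are made up for illustration, and a real analysis would also test whether the slope differs from zero.

```python
from statistics import mean


def ols_slope(times, values):
    """Ordinary least-squares slope of values regressed on times."""
    t_bar = mean(times)
    v_bar = mean(values)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    sxx = sum((t - t_bar) ** 2 for t in times)
    return sxy / sxx


def delta_trend(times, mpa, lpa):
    """Return the MPA - LPA differences and their slope over time."""
    deltas = [m - l for m, l in zip(mpa, lpa)]
    return deltas, ols_slope(times, deltas)


years = [0, 1, 2, 3]

# Scenario A: pristine MPA persists while the outside area degrades.
deltas_a, slope_a = delta_trend(years, [10.0, 10.0, 10.0, 10.0],
                                [10.0, 8.0, 6.0, 4.0])

# Scenario B: overfished MPA recovers while the outside area stays low.
deltas_b, slope_b = delta_trend(years, [2.0, 4.0, 6.0, 8.0],
                                [2.0, 2.0, 2.0, 2.0])

print(deltas_a, slope_a)
print(deltas_b, slope_b)
```

In both scenarios the differences grow at the same rate even though the underlying trajectories are opposite, which is why the difference is a convenient response variable to regress on time.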
Scale of inference
The scale of inference is the kicker. I've been talking about this at the local level--making these comparisons inside and outside of the reserves to see if there's an effect within the reserve. But at the regional level we want to know how the whole area is responding to the establishment of reserves within it. This is far more problematic because it's likely to be a much slower response than the initial response you see within the reserve. You also have to account for what happens to fishing effort: establishing reserves should come with a concomitant reduction in effort, to avoid this annoying problem of simply increasing the intensity of fishing in some particular area outside the reserve.
Conclusion
In conclusion, every reserve program needs to fund and develop an evaluation program. It seems to me that it is something that is just fundamental. If you're going to pass legislation to create them in the first place then it would be incredibly valuable to also pass legislation that allows you to evaluate their effectiveness. The response variables that you measure just can't be pulled out of the sea. They have to be linked to the objectives of that reserve.
Multiple MPAs and less-protected reference areas provide the most rigorous and informative assessment of effectiveness. And evaluation sampling needs to be initiated as early as possible; preferably before establishment of a reserve but at least as soon as it is established. Then, with all this evaluation, if all these things are designed and created in a way that they vary in certain criteria--like size or fishing effort--then you can compare the responses based on those different treatment levels and adapt how you design those based on their relative effectiveness. That's the whole idea behind adaptive management, which hopefully would expedite the evolution of a more effective design in the future as you continue to change and expand the program.