If you haven’t already, please read the summary posting as an introduction. This is the third of four postings on the MPAC study and covers MPAC’s Study 2. My second posting, covering Study 1, is here. And my fourth posting, covering the Lansink critique, is here.
Details of Study #2
I fear that this part will be a difficult one for most people to follow, not to mention being lengthy. Feel free to skip it. But I think it is important to document what this Study contains, and MPAC made no effort to make understanding it easier. I recommend you print out Study 2’s 5 pages (pdf pages 26 to 30) and have them at hand as you read this.
The purpose of Study 2 is to “study the effect of proximity to industrial wind turbines on residential sale prices.” In summary, Study 2 finds that “With the exceptions noted above, no distance variables entered any regression equations for any of the other market areas.” Say what?
It seems that people who are in the business of estimating real estate prices tend to fall into one of two camps. First are those who make their living providing services to the people who actually own the properties, with real estate brokers being the most obvious examples. These people tend to focus on one property at a time and generally use comps or repeat sales to obtain their estimates. Second are those who make their living providing services to people who don’t actually own the property. Academics and mass appraisers (like MPAC) are the most obvious examples. These people tend to focus on many properties at a time and generally use statistical techniques like multiple regression analysis to obtain their estimates. The second class tends to think in terms of rejecting the null hypothesis – you assume there is no difference between two sets (in this case close-in prices and far-away prices) unless you have “statistical significance”. As a snarky aside, getting to statistical significance in real estate can be quite a challenge, given the wide variance among prices, and can be even more difficult when your sponsor/boss doesn’t want you to do so.
So of course MPAC used their main tool, regression equations that run multiple regression analyses. They created three new variables based on distance from an IWT and entered these into regression equations to see if the new variables were statistically significant. If they aren’t statistically significant they don’t “enter” into the regression equations. As for the exceptions (which we’ll get to shortly), out of 30 possibly significant variables, only 4 were significant and 3 of them were positive! Whew!
So right off the bat MPAC is using a tool that doesn’t provide the answers the actual owners of potentially affected properties really care about. A binary statistical significance indicator does not provide an answer to the “how much” and “how likely” questions a homeowner is going to have. In this case, MPAC has skipped through the study so opaquely that I can’t even have much confidence in my critique. There are just too many omissions, too many unexplained leaps, too many dangling statements.
There are just 5 pages in Study 2. The first of these (page 25 of the study) lists the three new distance variables and sets their criteria for statistical significance at either 5% or 10%. For those unfamiliar with that concept, the significance level is a measure of how easily chance alone could produce the difference you see between two groups that are in fact just part of the same larger population. In this case, a 5% significance level means that, if the close-in homes and the far-away homes really did sell for the same prices, a difference as large as the one observed would turn up by chance only 5% of the time. Loosely speaking, MPAC wants to be about 95% sure before it will admit that the close-in prices are different from the far-away prices. What if the evidence only gets you to 80% sure that your home value will drop? Not significant, from MPAC’s perspective.
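For readers who like to see the idea in code, here is a minimal sketch of what a significance level measures, using an exact permutation test. This is not MPAC's method (they use regression), and the prices are made-up numbers in thousands of dollars chosen purely for illustration. The function name `permutation_p_value` is my own.

```python
from itertools import combinations

# HYPOTHETICAL sale prices in thousands of dollars; NOT taken from the MPAC study.
close_in = [180, 200, 210]
far_away = [205, 220, 230, 240, 260]


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(close, far):
    """One-sided exact permutation test.

    Pools all sales, then checks every possible way of labelling
    len(close) of them as "close-in". Returns the share of labellings
    in which the far-away mean exceeds the close-in mean by at least
    as much as it does in the real labelling.
    """
    pooled = close + far
    observed = mean(far) - mean(close)
    hits = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(close)):
        chosen = set(idx)
        fake_close = [pooled[i] for i in idx]
        fake_far = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point noise on ties.
        if mean(fake_far) - mean(fake_close) >= observed - 1e-9:
            hits += 1
        total += 1
    return hits / total


p_value = permutation_p_value(close_in, far_away)
print(f"p-value: {p_value:.3f}")
```

With these invented numbers only 2 of the 56 possible labellings show a gap as large as the real one, so the result would pass a 5% test. Widen the spread of prices or shrink the gap a little and it no longer does, even though the close-in homes still sell for less on average.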
The second page (page 26) is dominated by Table 9. For MPAC’s purposes Ontario is divided into 130 “market areas”. These areas presumably have some common basis that allows them to be treated as a unit for their regression equations. Unfortunately I couldn’t find where the areas were or how many homes were in each. Of the 130 MPAC found 15 that had large enough turbines in them to be of interest. These 15 are listed in Table 9, along with the numbers of sales within each of the 3 distance variables for both pre-construction and post-construction. MPAC didn’t bother adding them up either horizontally or in total, but I did. The numbers inside the grid add up to 3136, which would be the total sales within 5 km in all the areas. But if you add up their numbers along the bottom you come up with 3143. It turns out that their 142 should be 139 and their 1584 should be 1580. Now this isn’t much of an error, except that any pre-teen with a spreadsheet and 10 minutes wouldn’t have made it.
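The cross-check is simple arithmetic. A minimal sketch using only the figures quoted above (I am not reproducing the individual cells of Table 9 here, and the variable names are mine):

```python
# Figures quoted in the text above for Table 9 of the study.
grid_total = 3136           # sum of the individual cells in the grid
reported_totals_sum = 3143  # sum of MPAC's printed totals along the bottom

# Printed total -> total recomputed from the cells above it.
corrections = {142: 139, 1584: 1580}

discrepancy = reported_totals_sum - grid_total
explained = sum(printed - recomputed for printed, recomputed in corrections.items())

print(f"discrepancy: {discrepancy}, explained by the two bad totals: {explained}")
```

The two bad column totals account for the whole gap between the grid and the bottom row.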
At the bottom of page 26 they introduce pre-construction and post-construction periods and note that only two of the 15 have enough sales to test both distances and periods. Most of the remaining 13 have “sufficient sales within 1 KM to test the value impact within that distance”. They also state that the “sales period to develop valuation ranges from December 2008 to December 2011” and that Table 10 provides a summary.
The third page (page 27) is dominated by Table 10. It lists the 10 market areas that presumably have “sufficient sales within 1 KM to test the value impact within that distance”. 2 of these have enough sales to test both distance and periods while the other 8 have enough sales to test just the distance. For each of the 10 areas MPAC lists square footage etc and median adjusted prices. Are these the prices for the entire area or just within 1 km? MPAC doesn’t say. What is the criterion for “sufficient”? MPAC doesn’t say. Nor does MPAC include what should obviously be included: both sets of figures, one for the entire area and one for within 1 km. I suspect the figures they do give are for the entire area, in which case they are useless for our purposes, at least without the close-in comparison.
Presuming the criterion for inclusion in Table 10 is the 1 km test mentioned on page 26, one has to wonder how 26RR010 and 31RR010 got into it, as Table 9 shows they had zero sales within 1 km. Snark alert: maybe the missing 7 sales from Table 9 took place in these areas? And if 1 km isn’t the criterion, what is? MPAC never says.
At the bottom of page 27 they mention that some sales at the 5 km distance were in urban as opposed to rural market areas and thus were eliminated. They don’t say how many, nor what their effects on the regressions might be. They also reiterate their statistical significance levels.
On the fourth page (page 28) they present two more tables, 11 and 12. Table 11 lists the 8 market areas that had sufficient sales (within 1 km?) to test the distance variables while Table 12 lists the 2 market areas that had sufficient sales to test both distance and periods. These tables made absolutely no sense to me until I noticed Appendix F.
For all 10 areas they entered the 3 distances and ran their regressions. In Appendix F they list all the “excluded” variables, in this case all the distance-related variables that didn’t get to statistical significance. They apparently are called “excluded” since, being “insignificant” they don’t enter into MPAC’s final pricing calculations. If you look at the “sig” column you will not see any value less than .100, or the 10% significance level MPAC mentioned on pages 25 and 27. I assume by omission (and that’s all I can do here) that any of the 3 distance variables that are NOT listed in Appendix F are in fact significant.
On my first pass through Appendix F I came up with 6 omitted, and thus assumed significant, variables. Two of the omissions were for zero sales, for areas that shouldn’t even be there by the < 1 km criterion. But maybe the < 1 km variable was never even entered on the exclusion listing in Appendix F, so maybe I had erroneously assumed it was not excluded when in fact it didn’t exist in the first place. So maybe the criterion for inclusion in Table 10 wasn’t sufficient sales less than 1 km out, but rather sufficient sales less than 5 km out. Just a typo, right? At least Table 11 now is consistent with Tables 9 and 10.
Finally! Out of the 30 tests (10 areas times 3 tests) I count 4 that are significant. Those 4 make up the “non-DNE” entries in Table 11. MPAC provided absolutely no guidance or explanation about any of this, apparently writing for a very small audience.
Table 12 shows the 2 areas that had enough sales to test both distance and periods. You’d think that they’d be creating 6 variables for each of them instead of the 3 variables the other 8 areas received. Looking at Appendix F all you see is the same 3 as everyone else got. And all of those variables were excluded. But Table 12 shows 2 of the variables being significant for 26RR010. Perhaps Appendix F was based on a 5% significance level and Table 12 was based on 10%. Who knows?
I can only guess that the dollar amounts in Tables 11 and 12 are the effects of being in those areas upon the prices. So, in the Kingston area (05RR030), if you live within 1 km of an IWT, you can expect the value of your home to increase by $36,435! Very impressive – 5 digit accuracy, especially with a sample size of 7.
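To put a rough number on how little 7 sales can pin down, here is a small sketch that computes an approximate 95% interval around an average. The 7 dollar figures are entirely made up (MPAC publishes nothing of the sort), and the calculation is a plain textbook t-interval, not whatever MPAC's regression software does.

```python
import math
import statistics

# HYPOTHETICAL price effects in dollars for 7 sales; NOT MPAC's data.
effects = [95000, -20000, 60000, 10000, 85000, -15000, 40000]

n = len(effects)
mean_effect = statistics.mean(effects)
std_err = statistics.stdev(effects) / math.sqrt(n)

# Two-sided 95% critical value of Student's t with n - 1 = 6 degrees of freedom.
T_CRIT_DF6 = 2.447

low = mean_effect - T_CRIT_DF6 * std_err
high = mean_effect + T_CRIT_DF6 * std_err

print(f"mean effect: ${mean_effect:,.0f}")
print(f"approx. 95% interval: ${low:,.0f} to ${high:,.0f}")
```

With a spread like this, the interval runs from below zero to well above the average, so quoting the average to the last dollar says nothing about how well it is known.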
Finally, thank goodness, we come to the fifth page (page 29). It is the Summary of Findings and contains more words than the rest of the Study put together. This section mostly lists the significant variables and adds some fairly cryptic commentary.
Some Commentary
As I read through and dissected this Study I couldn’t escape the sense that MPAC didn’t want to put much effort into it. Any narrative or explanations or even public-friendly conclusions are absent. The tables that are included are ok, once you take the time to figure them out, but what about all the stuff they should have included but didn’t? Things like the median prices in the areas represented by the 30 variables. Or an Appendix F1 that shows the included variables, allowing us to see the t-scores etc for ourselves. Etc., etc.
These missing items cause this Study to be terribly opaque. I hope my explanation above is accurate, but I can’t be sure due to all the missing items. Maybe the Study reaches valid conclusions, but I sure can’t verify that. Perhaps MPAC thinks we should just trust them to be an honest pursuer of the truth. Sorry, that no longer flies, if it ever did. You have to wonder, is there some reason other than laziness or stinginess that this Study seems so empty? In addition to the opacity the Study includes several cryptic items that MPAC never explains. For example, from the summary, what do these sentences actually mean?
“Upon review of the sales database, it was determined that the IWT variables created for this study were highly correlated with the neighbourhood locational identifier. This strong correlation resulted in coefficients that did not make appraisal sense, and thus have been negated for the purposes of this study.”
If you look at the excluded variables in Appendix F you notice that most of them are named “NBxxxx”. Probably those are neighborhood identifiers that somehow overlay the market areas. MPAC never mentions how many there are or what the criteria are for forming one. But pretty obviously the areas around an IWT could easily coincide with their neighborhoods. So what gets negated? Some of the coefficients? All of them? MPAC provides no further information.
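Here is a minimal sketch of what “highly correlated” could look like when a distance indicator nearly coincides with a neighbourhood indicator. MPAC does not say how it measured the correlation, so this is my assumption of the simplest case: two 0/1 columns, invented for 20 imaginary sales, compared with an ordinary Pearson correlation. The variance inflation factor shown is the standard two-predictor formula, 1 / (1 - r²).

```python
import math

# HYPOTHETICAL 0/1 indicators for 20 sales; NOT MPAC's data.
# Six sales fall in one neighbourhood; five of those six are within 1 km of an IWT.
in_neighbourhood = [1, 1, 1, 1, 1, 1] + [0] * 14
within_1km_of_iwt = [1, 1, 1, 1, 1, 0] + [0] * 14


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(in_neighbourhood, within_1km_of_iwt)

# With two predictors, the variance of each coefficient is inflated by this factor.
vif = 1 / (1 - r ** 2)

print(f"correlation: {r:.3f}")
print(f"variance inflation factor: {vif:.2f}")
```

When the two columns overlap this much, the regression has little to go on in deciding which one deserves the credit, and the individual coefficients can swing wildly. That may be what MPAC means by coefficients that “did not make appraisal sense”, but that is my reading, not something the Study explains.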
As an aside, I found it interesting to scan over the other excluded variables to see what sorts of things MPAC puts into their regressions. Many of them make no sense and they seem to vary greatly from market to market. I can’t help but think of a bunch of regression-heads sitting at their desks hurriedly making up variables and desperately running regressions in an effort to get the ASRs closer to one (ASRs are covered in Study 1).
I’ll leave (thankfully, believe me) this Study behind with the final thought that it seems so slapped together, so opaque, so disjointed that perhaps even MPAC themselves weren’t sure what significance it holds. Unfortunately, the wind industry won’t care about any of that, and will use this study to continue harming Ontario residents.