Introduction
The consequences of the house price collapse and subsequent financial crisis in the United States were severe and obvious. It is vital that steps be taken to prevent another such episode. The role of regulation in the crisis is a matter of debate. However, most would agree that financial institutions require some degree of regulation, and that accurate information in regulatory filings is important to good regulatory oversight and the efficient functioning of financial markets.
Most mortgage lenders are required to file information on their mortgage applications under the Home Mortgage Disclosure Act (HMDA). The government then compiles the data and reports to the public whether the information for each mortgage application contains detected errors. This provides an opportunity to evaluate whether lenders with accuracy problems in their filings were more prone to engage in activities that were costly to society. A number of lenders were bailed out under the Troubled Asset Relief Program (TARP), which was costly to the United States[2] and was associated with unsound lending practices. This article explores what factors predict whether lenders will have a high incidence of detected errors in their HMDA filings, and whether these lenders were more likely to be bailed out under TARP.
Why Errors Occur
There are a number of reasons why inaccurate regulatory filings can indicate problematic lending practices. One reason is that the inaccuracies may result from deliberate data falsification. While falsified data are not easy to detect because the perpetrators attempt to hide what they are doing, several papers have found indications of data falsification on the part of the lender or the borrower. For instance, Garmaise found that reported borrower asset amounts had a higher frequency just above multiples of $100,000 than just below.[3] DiLellio and Forsyth found that some reported incomes had high frequencies compared to others, suggesting that income numbers had been selected for reporting.[4] These “clusters” of reported incomes were higher when the loans did not have to meet the more rigorous paperwork requirements of Fannie Mae and Freddie Mac. DiLellio and Forsyth also found that a number of indicators associated with potentially dishonest reporting were related to higher reported income on mortgage applications, compared with typical area income.[5] Piskorski, Seru, and Witkin found evidence that some investment banks misrepresented information reported to purchasers of residential mortgage backed securities, including occupancy status and presence of second liens.[6] These papers look at whether reported numbers are “suspicious” in some way. We take a new approach by looking directly at regulatory filing errors that were detected by the government, and our results have additional implications because the government was aware of these mistakes.
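To make the idea of a “suspicious” reported number concrete, the following is a minimal sketch of a bunching check in the spirit of the asset-amount finding described above. It is not the procedure used in any of the cited papers; the function name, the window width, and the sample amounts are hypothetical and chosen only for illustration.

```python
def bunching_counts(amounts, multiple=100_000, window=5_000):
    """Count reported amounts just above vs. just below a round multiple.

    'Just above' means within `window` dollars at or above a multiple;
    'just below' means within `window` dollars under the next multiple.
    Both thresholds are illustrative assumptions.
    """
    above = 0
    below = 0
    for amount in amounts:
        # Ignore amounts too small to sit near the first multiple.
        if amount < multiple - window:
            continue
        remainder = amount % multiple
        if remainder < window:
            above += 1
        elif remainder >= multiple - window:
            below += 1
    return above, below


# Hypothetical reported asset amounts (not real data).
sample_amounts = [100_000, 101_000, 99_000, 250_000, 300_500, 199_999, 2_000]
above, below = bunching_counts(sample_amounts)
print(f"just above: {above}, just below: {below}")
```

With honest reporting one would expect the two counts to be of similar size; a pronounced excess just above the threshold is the kind of pattern that invites further scrutiny, although it is not proof of falsification on its own.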
There are other reasons besides income falsification why financial institutions may have inaccurate regulatory filings. While these reasons may not involve deliberate misbehavior by lenders, they are nonetheless problematic, and worthy of regulatory concern. One possibility is that the employees have low skills, possibly due to poor education or training. There are also cultural reasons. Management may have a sloppy attitude towards accuracy, and this attitude could filter down to employees, including those responsible for making loans and those reporting to regulators. This sloppiness could also allow borrowers to misstate income. Another possibility is that managers are dishonest and focused on enriching themselves. While they may not explicitly tell employees to falsify data, they may give employees strong monetary incentives to do so and look the other way when dishonest practices become prevalent. Forsyth and Harjoto found that employee pay was more tightly tied to mortgage volume in the states with the biggest house price run-ups,[7] and that employees with stronger incentives to underwrite mortgages were more likely to transfer mortgages out of the bank.[8]
Finally, lenders that make small loans face a bigger reporting burden in comparison to the dollar amount of loans they make. An extreme example would be payday lenders, who make loans for amounts as small as $100. If they were required to have extensive reporting on each loan they would go out of business due to overhead costs. Therefore, we expect lenders that make smaller loans to have more inaccuracies in filings, due to stronger pressure from the expense of those filings.
Lender Characteristics, Filing Mistakes, and Bail-Outs
In the first stage of our analysis we examine which lender characteristics are associated with regulatory filing mistakes in the HMDA data. These characteristics are of interest for a number of reasons. For instance, they can point regulators to which types of lenders need additional inspection, they can shape policy regarding allowing lenders to grow in size or become geographically dispersed, and they can create indicators for which lenders may be involved in income falsification, thereby deserving more scrutiny.
In the second stage of our analysis we examine whether lenders with a high number of government-detected regulatory filing errors were more likely to get bailed out under TARP. In doing so, we verify that HMDA mistakes were not of a “trivial nature,” if for no other reason than they indicate something problematic about the lender. Furthermore, Alan Greenspan admitted that he did not think that it was necessary to carefully regulate banks.[9] His belief was that banks would not make bad loans because it was not in their self-interest. The second stage of this analysis measures how one aspect of that mistaken belief about regulation led to costly consequences.
Predicting Reporting Errors
a. The Data and Hypotheses
For each mortgage application, the HMDA data contain information on a lender’s regulator, the reported income of the borrower, the loan amount, the census tract of the property, and the rate spread on the interest rate for high-interest rate loans under the Home Ownership and Equity Protection Act (HOEPA). (For a more detailed description of the data set, see DiLellio and Forsyth, 2013.[10]) The HMDA data also contain a variable indicating whether the mortgage information contained a “validity edit failure” or “quality edit failure.” We collected area adjusted gross income (AGI) from the IRS for comparison purposes.[11] Finally, we collected an indicator for whether the lender primarily made subprime loans.[12] We calculated values for lenders for each region they were in, because the accuracy of a lender could vary depending on the location. A region was defined as a metropolitan statistical area.[13]
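The lender/region aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the article: the record layout (lender identifier, region, edit-failure flag), the function name, and the sample records are all hypothetical. The two indicators correspond to the 50 percent and 100 percent coding error indicators summarized in Table 1.

```python
from collections import defaultdict


def error_rates(applications):
    """Aggregate application-level edit failures to the lender/region level.

    `applications` is an iterable of (lender_id, region, has_edit_failure)
    tuples; this layout is an assumption made for the example.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for lender_id, region, has_edit_failure in applications:
        key = (lender_id, region)
        totals[key] += 1
        errors[key] += int(has_edit_failure)

    summary = {}
    for key, n in totals.items():
        fraction = errors[key] / n
        summary[key] = {
            "applications": n,
            "fraction_errors": fraction,
            "errors_50_plus": fraction >= 0.5,   # half or more flagged
            "errors_100": errors[key] == n,      # every application flagged
        }
    return summary


# Hypothetical records (not real HMDA data).
sample = [
    ("A", "X", True), ("A", "X", False),
    ("A", "Y", True),
    ("B", "X", False), ("B", "X", False), ("B", "X", True),
]
for key, row in sorted(error_rates(sample).items()):
    print(key, row)
```

Counting at the lender/region level rather than the lender level is what allows the same lender to appear as accurate in one area and error-prone in another.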
Table 1 lists summary statistics for the data. We expect that different regulators will have different levels of strictness regarding regulatory filings, so we include the lender’s regulator. This variable allows for an assessment of which regulators were more exacting. A larger lender, with more total applications, is hypothesized to have fewer coding errors because it can spread overhead costs over more loans, and will therefore be more willing to incur those overhead costs in the form of accurate regulatory reporting. Lenders that are spread out over more areas (census tracts) are hypothesized to have a harder time imposing systems that aid accuracy. This can be due to bigger coordination problems across different loan offices, and uneven practices and regulations across different areas.[14] As already discussed, lenders with smaller loans are expected to have more errors due to overhead costs. To the extent that subprime lenders, which charge higher interest rates, try to disguise the riskiness of their loans, we expect their reporting accuracy to be lower, although they may receive extra regulatory scrutiny, forcing them to be more accurate.
Table 1
Sample Summary Statistics
There are 137,336 observations. An observation corresponds to a lender/region in 2006.
Variable | Mean | Standard Deviation |
Office Of The Comptroller of the Currency Indicator (OCC) | 0.139 | 0.346 |
Federal Reserve Indicator (FRS) | 0.076 | 0.265 |
Federal Deposit Insurance Corporation Indicator (FDIC) | 0.107 | 0.309 |
Office of Thrift Supervision Indicator (OTS) | 0.132 | 0.339 |
National Credit Union Administration Indicator (NCUA) | 0.074 | 0.262 |
Housing And Urban Development (State Regulated) Indicator (HUD) | 0.471 | 0.499 |
Total Lender Applications | 65.0 | 349.0 |
Total U.S. Lender Census Tracts | 5,899.5 | 9,999.4 |
Average Loan Amount ($000) | 169.1 | 115.7 |
Average AGI ($000) | 56.8 | 43.7 |
Subprime Lender Indicator | 0.151 | 0.358 |
Average HOEPA Rate Spread (%) | 0.763 | 1.365 |
Average Reported Income/Area AGI | 1.645 | 0.995 |
Average Loan Amount/Area AGI | 3.148 | 1.908 |
Fraction Coding Errors | 0.311 | 0.429 |
Lender/Region With 50 Percent Coding Errors or Greater Indicator | 0.290 | 0.454 |
Lender/Region With 100 Percent Coding Errors Indicator | 0.265 | 0.442 |
Lender/Region TARP Bailout Indicator | 0.081 | 0.273 |
Jiang, Nelson, and Vytlacil found evidence that loans with high reported income compared to area AGI were more likely to have falsified income.[15] Holding all other loan characteristics constant, borrowers with higher reported income are expected to be less likely to default, but they found the opposite. Borrowers with high reported income compared to area income were more likely to default. Simply put, their results confirm that if a lender is reporting borrower income that is out of line with their area, it is more likely that reported income is falsified.
Similarly, if a mortgage has a high loan amount compared to area AGI, this can increase the likelihood that reported income has been falsified in order to qualify for the loan. Alternatively, it may indicate that the lender is more willing to take on risk in the form of larger loan amounts, which can lead to more defaults. In this case, the variable should have a similar measured effect as the subprime indicator or HOEPA rate spread variable, which also measure the willingness of a lender to take on risk in the form of higher default probabilities.
Strikingly, the number of reporting errors is quite large. On average, 31.1 percent of reported mortgages had a reporting error in at least one variable. In 29.0 percent of cases, 50 percent or more of the mortgages reported for a lender in a region had an error. Even more remarkable, in 26.5 percent of cases every single mortgage reported for a lender in a region had an error. Figure 1 sheds additional light on the incidence of errors. Lenders below roughly the 45th percentile of errors had no errors, or a negligible number of them. However, the incidence of errors rises rapidly after that. Lenders above roughly the 75th percentile of errors had at least one error on every reported mortgage.
Figure 1
HMDA Filing Errors
b. Regression Analysis
Table 2 reports regressions that predict whether a lender has coding errors on 50 percent or more of its reported mortgages in a region, and on 100 percent of its reported mortgages. As expected, lenders that are spread out over more census tracts have more errors, while lenders that have a high volume of loans in a region have fewer.
Table 2
Predicted Coding Errors
These are OLS regressions where the fraction of coding errors for a lender/region is the dependent variable. State indicators are included but not reported. Clustered t-statistics are in parentheses. “***”, “**”, and “*” denote significance at the 1%, 5%, and 10% levels respectively. Clusters are by lender. There are 137,336 observations.
Independent Variable | 50%+ Coding Errors | 100% Coding Errors |
OCC | -0.1072 (-2.24)** | -0.1162 (-2.40)** |
FRS | -0.3113 (-6.44)*** | -0.3318 (-6.72)*** |
FDIC | -0.0463 (-1.25) | -0.0572 (-1.52) |
OTS | 0.0112 (0.22) | 0.0045 (0.08) |
HUD | -0.5226 (-14.80)*** | -0.5618 (-15.82)*** |
Ln(Lender # of Census Tracts) | 0.0328 (5.92)*** | 0.0342 (6.15)*** |
Ln(Lender # of Regional Applications) | -0.0606 (-18.67)*** | -0.0585 (-17.60)*** |
Ln(Average Lender Regional AGI) | 0.1642 (8.59)*** | 0.1621 (8.54)*** |
Lender Subprime Lender Indicator | -0.0204 (-0.83) | -0.0128 (-0.53) |
Average Lender Regional HOEPA Rate Spread | 0.0010 (0.17) | -0.0012 (-0.20) |
Ln(Average Lender Regional Loan Amount) | -0.0362 (-1.68)* | -0.0433 (-2.02)** |
Average Lender Regional Reported Loan Income/Area AGI | 0.0085 (2.36)** | 0.0075 (2.25)** |
Average Lender Regional Reported Loan Amount/Area AGI | 0.0131 (3.83)*** | 0.0138 (4.11)*** |
Adjusted R2 | .32 | .38 |
One surprising result is that higher area AGI predicts more errors. If area income is an indicator of the education level of employees, the sign should be negative. That is, more educated employees should make fewer errors. Also, riskier loans during the housing run-up tended to be made in lower-income areas that presumably had less educated employees. As can be seen, neither the subprime lender indicator nor the HOEPA rate spread had a significant relationship with errors. This may indicate that subprime lenders also received more regulatory scrutiny, which would offset their tendency to have inaccurate regulatory filings.
As predicted, lenders with smaller loan amounts had more errors, with 10 percent significance for 50 percent or more coding errors and 5 percent significance for 100 percent coding errors. As expected, lenders with higher reported income compared to area income have more errors, with 5 percent significance. This indicates that income falsification may be related to greater inaccuracy in regulatory filings, especially given the previously discussed findings of Jiang et al. with respect to this variable.[16] Similarly, lenders with large loan amounts compared to area income also have more coding errors, with 1 percent significance. Since the subprime indicator and HOEPA rate spread variable, which measure risk-taking, are insignificant, the loan amount variable appears to be picking up possible income falsification rather than risk-taking, because its measured effect is closer to that of the income variable.
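Several regressors in Table 2 enter in logarithms, so their coefficients are easiest to read as the effect of a proportional change. In a linear regression, a coefficient b on Ln(x) implies that multiplying x by a factor k changes the predicted dependent variable by b times ln(k), holding the other regressors fixed. The short sketch below works through a doubling for two coefficients taken from the first column of Table 2; the function name is ours and purely illustrative.

```python
import math


def effect_of_scaling(coefficient, factor):
    """Change in the predicted dependent variable when a logged regressor
    is multiplied by `factor`, in a linear regression."""
    return coefficient * math.log(factor)


# Coefficients from Table 2, 50%+ coding errors column.
doubling_applications = effect_of_scaling(-0.0606, 2.0)
doubling_tracts = effect_of_scaling(0.0328, 2.0)

print(f"doubling regional applications: {doubling_applications:+.4f}")
print(f"doubling census tracts:         {doubling_tracts:+.4f}")
```

Under this reading, doubling a lender's regional applications lowers the predicted dependent variable by about 0.042, while doubling the number of census tracts raises it by about 0.023.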
Predicting TARP Bailouts
Referring back to Table 1, 8.1 percent of all lenders in a typical region received a TARP bailout.[17] Our main hypothesis is that lenders with coding errors were more likely to get bailed out. However, it is important to include other variables in the analysis that can predict a TARP bailout.[18]
a. Factors Other Than Coding Errors that Predict a TARP Bailout
Table 3 shows which factors predict whether a lender was bailed out as part of TARP. (The data is averaged over all regions for each lender.) Specification (1) has the fraction of regions with 50 percent or more errors for a lender as an explanatory variable. Specification (2) uses 100 percent errors. Lenders that are spread out over more census tracts are more likely to get bailed out. This is not surprising because they are potentially represented in more congressional districts. (For a paper on how politicians influence mortgage market regulation, see Mian, Sufi, and Trebbi, 2010.)[19] Also, under the Community Reinvestment Act, lenders that wished to merge across state lines were required to show that they were serving community needs, and may have felt more pressure to accept risky loans and misrepresent them. The more mortgage applications a lender received, the less likely it was to receive a bailout, with 5 percent significance. This is consistent with larger lenders being more diversified and having economies of scale. Lenders that were in higher income areas were more likely to get bailed out. This may indicate a stronger political influence of high income areas. The subprime lender indicator and the HOEPA rate spread both had insignificant coefficients.
Table 3
Predicted Lender Bailouts
These are probit maximum likelihood estimates where an indicator that a lender received a TARP bailout is the dependent variable. The fractions of a lender’s regions in each state are included but not reported. Wald Chi-Square values are in parentheses. “***”, “**”, and “*” denote significance at the 1%, 5%, and 10% levels respectively. There are 8,178 observations.
Independent Variable | (1) | (2) |
OCC | 0.6026 (33.15)*** | 0.6040 (33.26)*** |
FRS | 0.8438 (53.47)*** | 0.8493 (53.91)*** |
FDIC | 0.8244 (77.32)*** | 0.8280 (77.86)*** |
OTS | 0.4127 (11.83)*** | 0.4177 (12.13)*** |
HUD | -0.4624 (9.12)*** | -0.4569 (8.68)*** |
Ln(Lender # of Census Tracts) | 0.2526 (12.02)*** | 0.2546 (12.22)*** |
Ln(Lender # of National Applications) | -0.1256 (3.98)** | -0.1288 (4.19)** |
Average Ln(Lender Regional AGI) | 0.3977 (17.33)*** | 0.4039 (17.92)*** |
Lender Subprime Lender Indicator | -0.3745 (2.27) | -0.3800 (2.33) |
Average Lender HOEPA Rate Spread | -0.0196 (0.48) | -0.0180 (0.40) |
Fraction 50%+ Coding Error Regions | 0.3027 (10.80)*** | |
Fraction 100% Coding Error Regions | | 0.2857 (8.78)*** |
Rescaled R2 | .17 | .17 |
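Because Table 3 reports Wald chi-square values rather than t-statistics, the figures in parentheses are always positive and are read against a chi-square distribution. The sketch below converts a Wald value to a p-value, assuming each coefficient is tested individually with one degree of freedom (an assumption on our part; the table note does not state the degrees of freedom). With one degree of freedom the statistic is the square of a standard normal variable, so the upper-tail probability can be computed with the complementary error function from the standard library. The function names are illustrative.

```python
import math


def wald_p_value(chi_square):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    Uses P(Z**2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
    """
    return math.erfc(math.sqrt(chi_square / 2.0))


def stars(p_value):
    """Map a p-value to the significance markers used in the tables."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


# Wald values taken from specification (1) of Table 3.
for label, wald in [
    ("Ln(Lender # of National Applications)", 3.98),
    ("Lender Subprime Lender Indicator", 2.27),
    ("Fraction 50%+ Coding Error Regions", 10.80),
]:
    p = wald_p_value(wald)
    print(f"{label}: p = {p:.4f} {stars(p)}")
```

Under the one-degree-of-freedom assumption, the markers produced this way agree with those printed in Table 3 for these three rows.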
b. Coding Errors and TARP Bailouts
We now turn to our variable of interest. Looking at specification (1), the more regions a lender had with 50 percent or more errors, the more likely it was to be bailed out, with 1 percent significance. This is true for 100 percent errors in specification (2) as well. If a lender had 50 percent or more errors in every region rather than in none, the estimated probit index would rise by 0.3027. Because this is a probit coefficient, the corresponding increase in the probability of a bailout is not 0.3027 itself; it depends on the lender's other characteristics. With 100 percent errors, the coefficient of 0.2857 is of a similar magnitude.
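Since Table 3 reports probit estimates, translating a coefficient such as 0.3027 into a change in bailout probability requires a baseline probability. The following is a minimal sketch of that conversion, not a calculation from the article's data: it takes a baseline probability, finds the probit index that produces it, shifts the index by the coefficient, and maps the result back to a probability. The 8.1 percent baseline is borrowed from the lender/region bailout rate in Table 1 purely as an illustrative assumption, and the function name is hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def probit_probability_change(baseline_probability, coefficient, delta_x=1.0):
    """Change in predicted probability when a regressor moves by `delta_x`,
    starting from a lender whose predicted probability is
    `baseline_probability`, in a probit model."""
    baseline_index = STANDARD_NORMAL.inv_cdf(baseline_probability)
    new_probability = STANDARD_NORMAL.cdf(baseline_index + coefficient * delta_x)
    return new_probability - baseline_probability


# Moving from no high-error regions to all high-error regions (delta_x = 1),
# using the specification (1) coefficient and an assumed 8.1% baseline.
change = probit_probability_change(0.081, 0.3027)
print(f"change in bailout probability: {change:.3f}")
```

At this assumed baseline the implied increase is on the order of five to six percentage points. A different baseline would give a different figure, which is why probit results are often summarized with marginal effects evaluated at stated covariate values.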
Conclusion
There are a number of potential reasons why financial institutions have errors in regulatory filings. Reasons include poor employee education or training, a culture of sloppiness fostered by management, or pressure from overhead costs. Errors can also occur because employees have powerful incentives to make loans, with inadequate internal controls to prevent misreporting, or because of outright falsification. All of these reasons can be associated with poor quality loans. Indeed, we find evidence that suggests that purposeful falsification of data may be associated with HMDA filing errors for at least some loans.
The cost to society of ignoring inaccurate and error-prone filings can be large. We find that lenders that had high levels of errors in their HMDA filings were also significantly more likely to get bailed out under TARP. Apparently Alan Greenspan’s lack of concern for regulation led to serious consequences.[20]
[1] We thank ProPublica for providing data on bank bailouts. We wish to thank the Graziadio School of Business and Management for funding from Funds for Excellence and the Julian Virtue Award. We would like to thank Miryam Farzam for excellent research assistance.
[2] Not all lenders paid back their TARP money, and the need for TARP money imposed costs on society because it reduced confidence in the financial institution, reduced confidence in the economy, and created monetary costs associated with the bailout.
[3] Garmaise, M. (2014). “Borrower misrepresentation and loan performance.” Journal of Finance, Forthcoming.
[4] DiLellio, J. & Forsyth, J. (2014). “Government-sponsored enterprises and income falsification on mortgage applications.” International Journal of Business, Accounting, and Finance, 8 (1), 34-48.
[5] DiLellio, J. & Forsyth, J. (2013). “Income falsification on mortgage applications during the housing bubble.” Working paper. Retrieved from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2323102.
[6] Piskorski, T., Seru, A., & Witkin, J. (2013). “Asset quality misrepresentation by financial intermediaries: Evidence from RMBS market.” Working Paper. Retrieved from Social Science Research Network, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2215422.
[7] Forsyth, J. & Harjoto, A. (2011). “Examining bank employee compensation and residential mortgage loan volume at the state level.” International Journal of Business, Accounting, and Finance, 4 (1), 18-32.
[8] Forsyth, J. & Harjoto, A. (2009). “The impact of mortgage loans transferred on bank employee compensation.” Global Business & Finance Review, 14 (1), 77-85.
[9] http://economix.blogs.nytimes.com/2008/10/23/greenspans-mea-culpa/, accessed April, 2013.
[10] DiLellio & Forsyth, 2013.
[11] The IRS data was collected by zip code, which had to be matched to the census tract in the HMDA data.
[12] http://www.huduser.org/portal/datasets/manu.html. The last time this indicator was available was the end of 2005, and we assumed our lenders in 2006 had the same status.
[13] For rural census tracts outside of a metropolitan statistical area, within the same state we put them into a “rural” region.
[14] This would actually be an argument for larger lenders that are more concentrated.
[15] Jiang, W., Nelson, A., & Vytlacil, E. (2010). “Liar’s loan? Effects of loan origination channel and loan sale on delinquency.” Indiana University Research Paper, 2009-06-02.
[16] Ibid.
[17] ProPublica collected data on lender bailouts as part of TARP. See http://projects.propublica.org/bailout/list/simple.
[18] The other factors could be correlated with coding errors. Therefore they should be included so that coding errors do not pick up the effect of other variables.
[19] Mian, A., Sufi, A., & Trebbi, F. (2010). “The Political Economy of the U.S. Mortgage Default Crisis.” American Economic Review, 100 (5), 1967-98.
[20] Leonhardt, D. “Greenspan’s Mea Culpa.” Economix, Explaining the Science of Everyday Life, Retrieved from http://economix.blogs.nytimes.com/2008/10/23/greenspans-mea-culpa/?_php=true&_type=blogs&_php=true&_type=blogs&_r=1& 10/23/2008.