Bringing it all together

The first step in using the LBD or IDI for analysis is to bring together various items of data, organised by a small number of key variables such as firm and year.

Of course, nothing in life is simple: sometimes even these key variables are missing, and we have to see whether we can piece them together some other way.

The basic units of observation in the LBD are the enterprise (enterprise_nbr) and the financial year (dim_year_key). Sub-annual data are annualised to the financial year of the firm, which is then allocated to the 31 March year-end that has the greatest overlap with it. Information from the plant (PBN) is aggregated up to the enterprise level. Data on individual staff, shareholders and so on at a firm are taken from individual-level data and aggregated up to the PBN or enterprise.
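The allocation rule for financial years can be illustrated with a short Python sketch. It is only an illustration of "greatest overlap", not the procedure Stats NZ actually runs: the function names are hypothetical, the result is returned as a plain calendar year rather than in the format of dim_year_key, and ties (which can occur around leap years) are broken in favour of the earlier March year purely as an assumption.

```python
from datetime import date


def overlap_days(start_a, end_a, start_b, end_b):
    """Number of days shared by two inclusive date ranges."""
    latest_start = max(start_a, start_b)
    earliest_end = min(end_a, end_b)
    return max(0, (earliest_end - latest_start).days + 1)


def march_year_for(fy_start, fy_end):
    """Return the March year (the year of the 31 March year-end) that
    overlaps most with the financial year fy_start..fy_end.

    Assumption: on a tie, the earlier March year wins.
    """
    candidates = range(fy_start.year, fy_end.year + 2)
    return max(
        candidates,
        key=lambda y: overlap_days(
            fy_start, fy_end, date(y - 1, 4, 1), date(y, 3, 31)
        ),
    )


# A June balance date: nine of the twelve months fall in the year to March 2015.
print(march_year_for(date(2014, 7, 1), date(2015, 6, 30)))   # 2015
# A December balance date: nine months fall in the year to March 2016.
print(march_year_for(date(2015, 1, 1), date(2015, 12, 31)))  # 2016
```

A firm whose financial year already ends on 31 March is simply allocated to that same year.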

How do you fix a broken key?

Because the data are created for specific purposes, such as tax filing, rather than for researchers to play with, they do not always behave as one would want. A particular example is the enterprise identifier that sits on the Business Register. This is based on legal entities, not economic ones. When a sole proprietor decides to incorporate their business, while continuing to employ the same staff in the same location to produce the same goods and services, this business may be represented in the LBD as two firms: one exiting, one entering. We might argue that there is in fact only one ongoing firm, which has simply changed legal status. Fabling, R. (2011) 'Keeping it Together: Tracking Firms in New Zealand’s Longitudinal Business Database' provides a simple method for repairing broken firm IDs, making use of existing plant migration data from work Stats NZ has done maintaining true longitudinal plant-level IDs. The repaired identifiers are referred to below as PENTs.
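To make the idea concrete, here is a deliberately simplified Python sketch of an ID repair driven by plant migration. It is not the rule set in Fabling (2011), which is not reproduced here. The input layout (a list of (pbn, year, enterprise_nbr) tuples), the function name and the linking rule are all assumptions made for illustration: two enterprise numbers are joined only when a plant moves from one to the other, the old enterprise is never seen after the move, and the new enterprise is never seen before it.

```python
def repair_enterprise_ids(plant_years):
    """Map each enterprise_nbr to a repaired ID (hypothetical rule).

    plant_years: list of (pbn, year, enterprise_nbr) tuples.
    """
    first_year, last_year = {}, {}
    for _pbn, year, ent in plant_years:
        first_year[ent] = min(year, first_year.get(ent, year))
        last_year[ent] = max(year, last_year.get(ent, year))

    # Union-find structure: every enterprise starts as its own root.
    parent = {ent: ent for ent in first_year}

    def find(ent):
        while parent[ent] != ent:
            parent[ent] = parent[parent[ent]]  # path compression
            ent = parent[ent]
        return ent

    # Build each plant's history in year order.
    by_plant = {}
    for pbn, year, ent in plant_years:
        by_plant.setdefault(pbn, []).append((year, ent))

    for history in by_plant.values():
        history.sort()
        for (y0, old), (y1, new) in zip(history, history[1:]):
            # Link only when the old ID exits as the new ID enters.
            if old != new and last_year[old] == y0 and first_year[new] == y1:
                parent[find(new)] = find(old)

    return {ent: find(ent) for ent in parent}


records = [
    ("P1", 2008, "E1"), ("P1", 2009, "E1"),  # sole proprietor
    ("P1", 2010, "E2"), ("P1", 2011, "E2"),  # same plant after incorporation
    ("P2", 2008, "E3"), ("P2", 2009, "E3"),  # unrelated firm
]
print(repair_enterprise_ids(records))
# {'E1': 'E1', 'E2': 'E1', 'E3': 'E3'}
```

Under this rule a plant sold from one continuing firm to another does not trigger a link, because the seller still appears in later years.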

Whether it’s important to use PENTs in place of enterprise numbers depends on the task at hand. If your interest is in creating aggregates or examining cross-sectional statistics, the shift to PENTs may not add much to your analysis. But using PENTs becomes important if you're interested in business dynamics (PENT repairs reduce the apparent firm exit rate in the data from 15.8% to 14.3%; Fabling 2011) or if you want to track individual firms over time, as is common for causal analysis of business performance.

The Employer Monthly Schedule (EMS) provides the key link between individual information (identified by the snz_uid variable) and enterprise information (identified by enterprise_nbr). Employers file the EMS with Inland Revenue to report on their employees. As such, it is indexed by IR’s employer and employee unique identifiers, snz_employer_ird_uid and snz_ird_uid, respectively. EMS employee data can then be linked to the wide range of other individual data in the IDI through a concordance between the snz_ird_uid (in the EMS) and the snz_uid used elsewhere in the IDI.

Source: Fabling, R. and D. C. Maré (2015a). 'Addressing the absence of hours information in linked employer-employee data'. Working Paper 15-17, Motu Economic and Public Policy Research.
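The EMS linking chain described above can be sketched in a few lines of Python. The identifier names (snz_employer_ird_uid, snz_ird_uid, snz_uid, enterprise_nbr) come from the text; everything else is an assumption for illustration. In particular, the text does not say how the employer identifier is mapped to enterprise_nbr, so the employer_to_enterprise lookup here is hypothetical, as are the example values and the month field. In practice this would be a join between database tables rather than dictionary lookups.

```python
def link_ems(ems_rows, ird_to_uid, employer_to_enterprise):
    """Attach snz_uid and enterprise_nbr to each EMS row.

    Unmatched identifiers yield None, mirroring a left join.
    """
    linked = []
    for row in ems_rows:
        linked.append({
            **row,
            "snz_uid": ird_to_uid.get(row["snz_ird_uid"]),
            "enterprise_nbr": employer_to_enterprise.get(
                row["snz_employer_ird_uid"]
            ),
        })
    return linked


# Made-up example data; no real identifiers are used.
ems_rows = [
    {"snz_employer_ird_uid": 9001, "snz_ird_uid": 501, "month": "2015-04"},
    {"snz_employer_ird_uid": 9001, "snz_ird_uid": 502, "month": "2015-04"},
    {"snz_employer_ird_uid": 9002, "snz_ird_uid": 503, "month": "2015-04"},
]
ird_to_uid = {501: 11, 502: 12}            # employee concordance
employer_to_enterprise = {9001: "EN001"}   # hypothetical employer lookup

for row in link_ems(ems_rows, ird_to_uid, employer_to_enterprise):
    print(row["snz_uid"], row["enterprise_nbr"])
```

Keeping unmatched rows, rather than dropping them, makes it easy to count how many EMS records fail to link at each step.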