Data linkage and income in Australia

Data integration maximises the value of public data by combining data from two or more sources. It can help policy makers and researchers gain a much better understanding of Australian families, communities, industry, and the economy, and improve the development and delivery of government services in areas such as health, education, infrastructure, and other community services.

Data integration is more efficient and less costly than undertaking new surveys, and can help to reduce the burden on Australian households and businesses to provide data that is already available.

In this blog, Associate Professor Nicholas Biddle, Associate Director, ANU Centre for Social Research and Methods, discusses some important insights from analysing socio-demographic and household information alongside a person’s income. 

Australia has a unique social security system that targets benefits to the poor to a much greater extent than most other OECD countries, was affected much less by the Great Recession than other countries, and has a number of population groups that have a unique policy focus and interest, including the Australian Aboriginal and Torres Strait Islander population.

Geographically, Australia is a country with a large land mass but a sparsely distributed population living mainly in urban areas (particularly the large State/Territory capital cities) on or near the coast. All of this means that the determinants of income in Australia and recent inequality dynamics are likely to differ from those in other countries, as is the policy response to this distribution of income.

Historically, our understanding of the distribution of income in Australia has been based on four main sources: repeated cross-sectional personal income tax records; longitudinal household surveys; repeated cross-sectional household surveys; and the five-yearly Census. All of these sources of data have produced remarkable insights. However, each has its own limitations, and there is much that we do not know.

Until recently, there was no data in Australia with a large sample of individuals that includes socio-demographic and household information alongside a person’s income, as well as how that person’s income is changing through time. On November 14th, A/Prof Biddle gave a presentation as part of the ANU Centre for Social Research and Methods seminar series that summarised a validation and exploratory analysis of a dataset that met these criteria.

The opening up of access to data from the Multi-Agency Data Integration Project Basic Longitudinal Extract (BLE2011) provides an opportunity to create new knowledge about important aspects of income in Australia. This database has information from four data sources linked at the individual level:

  • The Medicare Enrolments Database (MEDB) and Medicare Benefits Schedule (MBS) data;
  • Personal Income Tax (PIT) data;
  • Social Security and Related Information (SSRI) data; and
  • the 2011 Census of Population and Housing (Census).
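As a rough illustration of what "linked at the individual level" means, the sketch below merges toy extracts that share a person identifier into one record per person. The field names, identifiers, and values are all invented for illustration; this is not the BLE2011 schema or the linkage method actually used, which relies on far more careful matching and privacy protections.

```python
# Hypothetical toy extracts keyed by a de-identified person ID.
# All field names and values are invented for illustration only.
census = {
    "p1": {"occupation": "Accountant", "household_id": "h1"},
    "p2": {"occupation": "Solicitor", "household_id": "h2"},
}
tax = {
    "p1": {"taxable_income": 95000},
    "p2": {"taxable_income": 120000},
}
social_security = {
    "p1": {"benefit_received": 0},
}


def link_records(*sources):
    """Merge per-person fields from each source into one record per person."""
    linked = {}
    for source in sources:
        for person_id, fields in source.items():
            # Create the person's record on first sight, then add the fields.
            linked.setdefault(person_id, {}).update(fields)
    return linked


linked = link_records(census, tax, social_security)
```

A person who appears in only some sources simply ends up with fewer fields, which is one reason linked data needs to be validated against other sources.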

Some important insights emerged from this initial analysis.

A/Prof Biddle showed that our understanding of income, and of who is at the very top of the income distribution, is strongly influenced by whether we look at the household or the individual level. Around one-fifth of people who were in the top 1% of the income distribution based on their own income were not at the top of the income distribution based on their household income.
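The individual-versus-household comparison can be sketched with a toy example. The code below ranks ten invented people by their own income and by their household's total income, then checks how many of the top group on one measure are missing from the top group on the other. It uses the top 20% rather than the top 1% only because the toy sample is so small, and it sums household income without any equivalisation; both are assumptions made for illustration, not the method used in the analysis.

```python
from collections import defaultdict


def top_share_ids(values, share):
    """Return the IDs in the top `share` fraction when ranked by value."""
    n = max(1, round(len(values) * share))
    ranked = sorted(values, key=values.get, reverse=True)
    return set(ranked[:n])


# (person_id, household_id, own_income): invented values for illustration.
people = [
    ("a", "h1", 500), ("b", "h1", 0),
    ("c", "h2", 320), ("d", "h2", 280),
    ("e", "h3", 100), ("f", "h3", 50),
    ("g", "h4", 90), ("h", "h4", 70),
    ("i", "h5", 60), ("j", "h5", 20),
]

own_income = {pid: income for pid, _, income in people}

# Sum income within each household, then assign that total to each member.
household_total = defaultdict(int)
for _, hid, income in people:
    household_total[hid] += income
household_income = {pid: household_total[hid] for pid, hid, _ in people}

top_own = top_share_ids(own_income, 0.2)
top_household = top_share_ids(household_income, 0.2)

# Share of the top group by own income that is not in the top group
# by household income.
share_missing = len(top_own - top_household) / len(top_own)
```

In this toy data, person "a" has the highest individual income but lives in a household whose combined income is lower than that of the two-earner household "h2", so "a" drops out of the top group on the household measure.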

Those with extremely high incomes (the top 1%) are much more likely to obtain their income from sources other than wages and salaries. Less than half of their income comes from wages and salaries, a much lower share than for those with high incomes (the 81st to 98th percentile) or middle-high incomes (the 51st to 80th percentile), for whom over 70% of income comes from wages and salaries.

The largest occupation category for those at the very top of the income distribution is Professionals. There are slightly fewer Managers, but combined these two occupation categories make up the vast majority of those with extremely high incomes. The top five individual occupations in the top 2 per cent of the income distribution are: CEOs and Managing Directors; GPs and Resident Medical Officers; Accountants; Solicitors; and Advertising, Public Relations and Sales Managers.

Our narrative around the top of the income distribution is that the people in it are CEOs. One of the benefits of using the BLE2011 is that the large samples and linkage to Census data allow us to look at this in detail. In reality, those at the very top of the household income distribution are as likely to be accountants, lawyers, and doctors as they are to be CEOs and other managers.

There is some instability at the top of the income distribution. If you were in the top 1% of the income distribution in one financial year, then there is a roughly 2/3 chance that you will still be at the top of the income distribution in the subsequent year. Most people don’t drop down very far though.
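A year-to-year persistence rate of this kind can be computed as the share of one year's top group that is still in the top group the following year. The sketch below does this for ten invented people, using the top 30% rather than the top 1% purely because the toy sample is tiny; the names and incomes are hypothetical and the result is constructed, not taken from the BLE2011.

```python
def top_share_ids(values, share):
    """Return the IDs in the top `share` fraction when ranked by value."""
    n = max(1, round(len(values) * share))
    ranked = sorted(values, key=values.get, reverse=True)
    return set(ranked[:n])


def persistence_rate(year1, year2, share):
    """Share of year 1's top group that is also in year 2's top group."""
    top_y1 = top_share_ids(year1, share)
    top_y2 = top_share_ids(year2, share)
    return len(top_y1 & top_y2) / len(top_y1)


# Invented incomes for ten people in two consecutive financial years.
income_y1 = {"a": 900, "b": 800, "c": 700, "d": 600, "e": 500,
             "f": 400, "g": 300, "h": 200, "i": 100, "j": 50}
income_y2 = {"a": 950, "b": 850, "c": 650, "d": 900, "e": 500,
             "f": 400, "g": 300, "h": 200, "i": 100, "j": 50}

rate = persistence_rate(income_y1, income_y2, 0.3)
```

Here person "c" falls out of the top group because "d" overtakes them, yet "c" is still ranked fourth, which mirrors the point that most people who leave the top do not drop very far.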

The chance of falling out of the top of the income distribution is not the same for everyone. You are more likely to be in that 1/3 of the top 1% who move out of the top of the income distribution if you are older; male; Indigenous; not employed; have low education; were born in Australia or arrived before the 1980s; or do not speak English well. Further analysis of the BLE2011 is likely to reveal other relevant characteristics.

Ultimately, we get a richer picture of socioeconomic status if we are able to combine information from multiple data sources. This can be across different studies, or by linking datasets at the individual level. There are both methodological and privacy challenges with such data linkage. We should keep these in mind when using such data, validate against other data sources, and make sure that we are open and transparent about how the data is accessed and the stringent safety requirements built into data access. However, good public policy and government accountability require us to make use of all the information that we have.

DISCLAIMER: We want to foster discussion on data issues that are relevant to all Australians, by collaborating with experts in the private, civil society, research and government sectors to publish their posts and linking to their articles on our website. The views expressed in the posts and comments are not necessarily those of the Office of the National Data Commissioner. We encourage you to add your views in the comments section of the post.

