[Figure: Salary plotted against Years of Experience]

An Introduction to Hierarchical Modeling

This visual explanation introduces the statistical concept of Hierarchical Modeling, also known as Mixed Effects Modeling, among other names. It is an approach for modeling nested data. Keep reading to learn how to translate an understanding of your data into a hierarchical model specification.

Nested Data

You'll frequently encounter nested data structures when doing analytical work. These are instances in which each observation is a member of a group, and you believe that group membership has an important effect on your outcome of interest. As we walk through this explanation, we'll consider this example:

Estimating faculty salaries, where the faculty work in different departments.

As you could imagine, the group (department) that a faculty member belongs to could determine their salary in different ways. In this example, we'll consider faculty who work in the Informatics, English, Sociology, Biology, and Statistics departments.

A Linear Approach

Let's imagine that you're trying to estimate faculty salaries based on their number of years of experience. A simple linear model could be used to estimate this relationship:

$$\hat{y} = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n$$

In the above equation, you would estimate the parameters (beta values) for your variables of interest. These are known as the fixed effects because they are constant (fixed) across individuals. In our case, we would simply use years of experience to predict salary:

$$\widehat{\text{salary}}_i = \beta_0 + \beta_1 \cdot \text{experience}_i$$

While this provides some information about the observed relationship, it is clear that there is variation in salary by department. The methods introduced below allow us to capture that variation in different ways.
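As a rough illustration of the simple linear model above, the sketch below fits an intercept and slope by ordinary least squares. The data values and the `fit_line` helper are hypothetical, made up for this example, and are not the dataset or code behind this project.

```python
# Hypothetical toy data: (years of experience, salary in thousands).
# These values are invented for illustration only.
data = [(0, 50.0), (2, 56.0), (4, 62.0), (6, 68.0), (8, 74.0)]


def fit_line(points):
    """Ordinary least squares for y = b0 + b1 * x (closed form)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0, b1 = fit_line(data)
print(f"salary_hat = {b0:.1f} + {b1:.1f} * experience")
```

Because this model has a single intercept and a single slope, it ignores which department each faculty member belongs to.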

Random Intercepts

It may be the case that each department has a different starting salary for its faculty members, while the annual rate at which salaries increase is consistent across the university. If we believe this to be the case, we would want to allow the intercept to vary by group. We could describe a mixed effects model that allows intercepts to vary by group:

$$\hat{y}_i = \alpha_{j[i]} + \beta x_i$$

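In the above equation, the vector of fixed effects (constant slopes) is represented by $\beta$, while the set of random intercepts is captured by $\alpha$. So, individual $i$ in department $j$ would have the following salary, where the department-specific intercept $\alpha_{j[i]}$ is written as $\beta_{0j[i]}$: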

$$\widehat{\text{salary}}_i = \beta_{0j[i]} + \beta_1 \cdot \text{experience}_i$$

This strategy allows us to capture variation in the starting salary of our faculty. However, there may be additional information we want to incorporate into our model.
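To make the structure of a varying-intercept model concrete, here is a minimal sketch that estimates one intercept per department and one shared slope. Note the assumption: it uses plain least squares with a separate intercept for each department (no pooling), which is a simplification. A true mixed effects fit would treat the intercepts as draws from a common distribution and shrink them toward the overall mean. The data and the `fit_varying_intercepts` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (department, years of experience, salary in thousands).
rows = [
    ("Informatics", 0, 70.0), ("Informatics", 5, 80.0), ("Informatics", 10, 90.0),
    ("English", 0, 50.0), ("English", 4, 58.0), ("English", 8, 66.0),
]


def fit_varying_intercepts(rows):
    """Least squares for y = intercept[dept] + slope * x (no pooling)."""
    groups = defaultdict(list)
    for dept, x, y in rows:
        groups[dept].append((x, y))

    # Per-department means of experience and salary.
    means = {
        dept: (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
        for dept, pts in groups.items()
    }

    # Shared slope: pool the within-department deviations from each mean.
    sxy = sum((x - means[d][0]) * (y - means[d][1]) for d, x, y in rows)
    sxx = sum((x - means[d][0]) ** 2 for d, x, _ in rows)
    slope = sxy / sxx

    # Each department's line passes through that department's point of means.
    intercepts = {d: my - slope * mx for d, (mx, my) in means.items()}
    return intercepts, slope


intercepts, slope = fit_varying_intercepts(rows)
for dept, b0 in intercepts.items():
    print(f"{dept}: salary_hat = {b0:.1f} + {slope:.1f} * experience")
```

The fitted lines are parallel: departments differ only in where they start, not in how quickly salaries grow.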

Random Slopes

Alternatively, we could imagine that faculty salaries increase at different rates depending on the department. We could incorporate this idea into a statistical model by allowing the slope to vary, rather than the intercept. We could formalize this with the following notation:

$$\hat{y}_i = \beta_0 + \beta_{j[i]} x_i$$

Here, the intercept ($\beta_0$) is constant (fixed) for all individuals, but the slope ($\beta_{j[i]}$) varies depending on the department ($j$) of an individual ($i$). So, individual $i$ in department $j$ would have the following salary:

$$\widehat{\text{salary}}_i = \beta_0 + \beta_{1j[i]} \cdot \text{experience}_i$$

While this strategy allows us to capture variation in the change in salary, it is clearly a poor fit for the data. We can, however, describe group-level variation in both slope and intercept for a better fitting model.
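The varying-slope structure can be sketched in the same spirit: one shared intercept and one slope per department. As an assumption for illustration, the code below solves the plain least-squares problem in closed form (no pooling) rather than fitting a true mixed effects model, which would also shrink the department slopes toward a common value. The data and the `fit_varying_slopes` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (department, years of experience, salary in thousands).
rows = [
    ("Informatics", 0, 60.0), ("Informatics", 5, 75.0), ("Informatics", 10, 90.0),
    ("English", 0, 60.0), ("English", 4, 64.0), ("English", 8, 68.0),
]


def fit_varying_slopes(rows):
    """Least squares for y = b0 + slope[dept] * x with one shared intercept.

    Assumes every department has at least one nonzero x value.
    """
    # Accumulate the sums each department contributes to the normal equations.
    stats = defaultdict(lambda: {"n": 0, "sx": 0.0, "sy": 0.0, "sxy": 0.0, "sxx": 0.0})
    for dept, x, y in rows:
        s = stats[dept]
        s["n"] += 1
        s["sx"] += x
        s["sy"] += y
        s["sxy"] += x * y
        s["sxx"] += x * x

    # For a given b0, the best slope in a department is (sxy - b0 * sx) / sxx.
    # Substituting that into the normal equation for b0 gives a closed form.
    num = sum(s["sy"] - s["sxy"] * s["sx"] / s["sxx"] for s in stats.values())
    den = sum(s["n"] - s["sx"] ** 2 / s["sxx"] for s in stats.values())
    b0 = num / den

    slopes = {d: (s["sxy"] - b0 * s["sx"]) / s["sxx"] for d, s in stats.items()}
    return b0, slopes


b0, slopes = fit_varying_slopes(rows)
for dept, b1 in slopes.items():
    print(f"{dept}: salary_hat = {b0:.1f} + {b1:.1f} * experience")
```

Every fitted line is forced through the same starting salary, so the departments can differ only in their rate of increase.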

Random Slopes + Intercepts

It's reasonable to imagine that the most realistic situation is a combination of the scenarios described above:

Faculty salaries start at different levels and increase at different rates depending on their department.

To incorporate both of these realities into our model, we want both the slope and the intercept to vary depending on the department of the faculty member. We can describe this with the following notation:

$$\hat{y}_i = \alpha_{j[i]} + \beta_{j[i]} x_i$$

Thus, the starting salary for faculty member $i$ depends on their department ($\alpha_{j[i]}$), and their annual raise also varies by department ($\beta_{j[i]}$):

$$\widehat{\text{salary}}_i = \beta_{0j[i]} + \beta_{1j[i]} \cdot \text{experience}_i$$
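When both the intercept and the slope vary by department, the simplest way to see the structure is to fit a separate line for each department. The sketch below does exactly that, under a stated assumption: fitting each department independently (no pooling) is a stand-in for a mixed effects model, which would instead estimate the department intercepts and slopes jointly and pull them toward shared averages. The data and helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (department, years of experience, salary in thousands).
rows = [
    ("Informatics", 0, 70.0), ("Informatics", 5, 85.0), ("Informatics", 10, 100.0),
    ("English", 0, 50.0), ("English", 4, 54.0), ("English", 8, 58.0),
]


def fit_line(points):
    """Ordinary least squares for y = b0 + b1 * x (closed form)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def fit_per_department(rows):
    """Fit an independent intercept and slope for every department."""
    groups = defaultdict(list)
    for dept, x, y in rows:
        groups[dept].append((x, y))
    return {dept: fit_line(pts) for dept, pts in groups.items()}


coefs = fit_per_department(rows)
for dept, (b0, b1) in coefs.items():
    print(f"{dept}: salary_hat = {b0:.1f} + {b1:.1f} * experience")
```

With independent fits, departments with few faculty get noisy estimates. Sharing information across groups is one reason to prefer a hierarchical model over this approach.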

In order to implement any of these methods, you'll need to have a strong understanding of the phenomenon you're modeling, and how that is captured in the data. And, of course, you'll need to assess the performance of your models (not described here).

About

This project was built by Michael Freeman, a faculty member at the University of Washington Information School.

All code for this project is on GitHub, including the script to create the data and run regressions (done in R). Feel free to issue a pull request for improvements, and if you like it, share it on Twitter. Layout inspired by Tony Chu.