Technology and Learning Research (AARE)

Baking the Rasch Model Cake with Associate Professor Michael Carey

Various academics Season 1 Episode 2

Today we're chatting about a topic that's crucial for anyone in the field of education research, or anyone curious about quality survey construction and robust survey scale validation. We're talking about Rasch model analysis with Associate Professor Michael Carey. He uses a cake baking metaphor to explain, in plain English, how the Rasch model technique is applied to survey design and validation. He also refers to some recently published STEM research as examples of how a large 77-item preservice teacher TPACK self-audit instrument was developed and validated:


The development and validation of a self-audit survey instrument that evaluates preservice teachers’ confidence to use technologies to support student learning. Michael D. Carey, David A. Martin, and Natalie McMaster. 2024. International Journal of Research & Method in Education. https://doi.org/10.1080/1743727X.2024.2341714


Assessing primary school preservice teachers' confidence to apply their TPACK in specific categories of technologies using a self-audit survey. David A. Martin, Michael D. Carey, Natalie McMaster and Madeleine Clarkin. 2024. The Australian Educational Researcher. https://doi.org/10.1007/s13384-023-00669-x


Nat: Hi everybody. Welcome to our Technology and Learning podcast. I'm Nat McMaster, and I'm a member of the AARE Technology and Learning Special Interest Group. Today, we're chatting about a topic that's crucial for anyone in the field of education research, or anyone curious about data accuracy in surveys. We're talking about Rasch model analysis. Joining us today is Associate Professor Michael Carey, a seasoned expert in survey design and validation using Rasch analysis. Hi, Michael! It's fantastic to have you on the podcast.

Michael: Thanks for inviting me along.

Nat: So let's start at the beginning. Rasch analysis—what is it, and why is it so important to survey design? And I believe you have a cake baking metaphor you use to help explain it to newbies like myself.

Michael: I came up with this cake baking metaphor to help explain it to students and people just getting into it, because it's pretty daunting stuff to read about and do. It is easy to lose sight of why you are doing it unless you take the time to figure it out, and I think the metaphor helps with that. So, in a nutshell, Rasch model analysis is a statistical technique named after the Danish mathematician Georg Rasch, and it's used to improve the quality and precision of surveys and also tests. I use a tool called Winsteps, created by Rasch model pioneer Mike Linacre. I use Winsteps for standard two-facet data – persons and items – and another package called FACETS for many-facet data, where extra facets such as raters come into play. So basically, simple cakes or complex cakes, depending on the complexity of your data.

So think of Rasch model analysis as your baking guide for making the perfect survey "cake." It helps to ensure that each question or item on a survey measures exactly what it's intended to measure, and does so reliably across different respondents. It's a tool that helps to ensure that the questions in a survey (the survey items) and the responses from the people taking them are reliable and truly reflect what they are supposed to measure.

So, imagine you’re baking a cake, and you have a recipe that tells you exactly what ingredients you need, how much of each to use, and the steps to follow to ensure your cake turns out perfectly every time. The Rasch model is a recipe that can assist greatly when designing and evaluating surveys.

When designing surveys, we want to make sure that the questions we ask and the way people answer them give us clear, consistent, and meaningful information. The Rasch model provides a set of standards, or criteria, to help us check if our "baking" (in this case, our survey) is on point.

Nat: So Michael, what are the key "ingredients" according to the Rasch model?

Michael: There are five ingredients:

1. Each survey item counts equally: Just like each ingredient in a cake has its place, each item in a survey should contribute fairly to the overall measurement. No single question should overpower the others. A well-constructed survey should contain a range of items that are easier or more difficult for respondents to endorse, so that you can capture a range of responses depending on the participants' abilities, confidence levels or attitudes – whatever psychometric trait is being measured about the respondent.

2. Then there has to be a predictable progression: Imagine if adding more flour to your cake predictably made it fluffier, up to a point. Similarly, as someone's ability or agreement with a concept increases, their likelihood of choosing higher-agreement or higher-level responses in a survey should also increase in a predictable way. The basic assumption of the Rasch model is that the likelihood of someone endorsing a statement (the item) is determined by two things: how capable, confident or knowledgeable that person is, and how challenging or complex the survey item is. The further a person's ability or confidence sits above the difficulty of endorsing the item, the more likely the person is to endorse it. Essentially, the model assumes there's one main trait or skill being measured across all the items in the group, and each response tells us something about both the item and the person answering it.
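Show note: in symbols, the standard dichotomous form of the model Michael is describing looks like this (a general formula from the Rasch literature, not notation used in the episode):

```latex
% Probability that person n endorses item i, where theta_n is person n's
% ability or confidence and delta_i is item i's difficulty (both in logits):
P(X_{ni} = 1) = \frac{e^{\,\theta_n - \delta_i}}{1 + e^{\,\theta_n - \delta_i}}
```

When theta_n equals delta_i the endorsement probability is exactly 0.5, and it rises predictably as ability climbs above difficulty – that is the "predictable progression". For a multi-point confidence scale like the one discussed later in the episode, the rating-scale extension of the model adds ordered category thresholds to this same expression.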

Nat: Wow. So Rasch analysis allows you to see both the participants' responses to an item and the difficulty of endorsing that item as measures on the one scale. That's great.

Michael: Yeah, that's exactly right Nat. Both metrics need to be viewed together. There is a graphic plot called a Wright map that does a beautiful job of showing exactly that. The third ingredient is that it has to be fit for purpose: Not all flour is the same for every cake; similarly, not all survey items fit well in every survey. The Rasch model helps to identify which questions fit our specific "baking" purpose and which might need to be replaced or altered. This is done with an analysis of item infit and outfit. I won't get into that today – it is rather technical and requires considerable experience in kneading the dough.
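Show note: a Wright map places person measures and item difficulties on one shared logit scale, with well-targeted items sitting opposite the people they suit. As a rough illustration only, here is a toy Python sketch with invented measures (Winsteps draws the real map from estimated values):

```python
# A toy text-mode Wright map: persons and items on one shared logit scale.
# All measures below are invented for illustration, not values from the study.

person_measures = [-1.2, -0.5, 0.1, 0.4, 0.9, 1.6]   # respondent confidence
item_measures = {
    "I can use keyboard shortcuts": -1.0,             # easy to endorse
    "I can program a robot using sensors": 1.4,       # hard to endorse
}

def wright_map(persons, items, lo=-2.0, hi=2.0, step=0.5):
    """Print persons (as X's) and items side by side, binned by logit."""
    print(f"{'PERSONS':>10} | logit | ITEMS")
    level = hi
    while level >= lo:
        xs = "X" * sum(level <= m < level + step for m in persons)
        names = ", ".join(n for n, m in items.items() if level <= m < level + step)
        print(f"{xs:>10} | {level:+5.1f} | {names}")
        level -= step

wright_map(person_measures, item_measures)
```

Rows where a cluster of X's faces an item of similar difficulty are the well-targeted parts of the survey; items stranded far above or below every respondent are candidates for revision.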

Nat: So what are the last two ingredients, according to the Rasch model?

Michael: Fourth, it has to make sense across the board: Just like you would want your cake recipe to work in any oven, the responses to your survey should make sense across different groups of people. The model helps ensure that items are understood and interpreted similarly by everyone. This is done in another analysis called person infit and outfit.
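Show note: item fit and person fit both rest on the same mean-square residual statistics. The toy numpy sketch below, with invented measures and responses, illustrates what those statistics are (Winsteps reports them directly; this is not its implementation):

```python
import numpy as np

# Toy infit/outfit mean-squares for a dichotomous Rasch model.
# theta (person measures) and delta (item difficulties) would normally come
# from an estimation run; here they are invented for illustration.

theta = np.array([-1.0, -0.3, 0.2, 0.8, 1.5])   # person measures (logits)
delta = np.array([-1.2, -0.4, 0.5, 1.1])        # item difficulties (logits)
X = np.array([[1, 0, 0, 0],                     # observed 0/1 endorsements
              [1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 1, 1, 0],
              [1, 1, 1, 1]])

P = 1 / (1 + np.exp(-(theta[:, None] - delta[None, :])))  # expected scores
W = P * (1 - P)                                 # model variance of each response
Z2 = (X - P) ** 2 / W                           # squared standardised residuals

item_outfit = Z2.mean(axis=0)                   # unweighted: outlier-sensitive
item_infit = (W * Z2).sum(axis=0) / W.sum(axis=0)     # information-weighted
person_outfit = Z2.mean(axis=1)
person_infit = (W * Z2).sum(axis=1) / W.sum(axis=1)

print("item infit  :", np.round(item_infit, 2))  # values near 1.0 fit the model
print("item outfit :", np.round(item_outfit, 2))
print("person infit:", np.round(person_infit, 2))
```

Mean-squares well above 1 flag noisy, unpredictable responses; well below 1, responses that are too predictable to add information.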

Nat: So the model checks for uniformity in how items are interpreted across respondents. Is there more to consider?

Michael: Yes. Finally, it has to deal with one thing at a time: If you are testing whether a cake is sweet enough, you would not want the taste of chocolate to overpower the sugar. Similarly, the Rasch model checks that a survey measures one concept at a time, clearly and accurately. Analyses of the survey's category structure and dimensionality take care of this.

Nat: So Michael, how have you applied the Rasch model to validate the survey we have been using in our latest studies?

Michael: So for our listeners – we've just recently published an article in the International Journal of Research & Method in Education that became available online just today, actually. It's called "The development and validation of a self-audit survey instrument that evaluates preservice teachers' confidence to use technologies to support student learning".

So this 77-item self-audit scale has been applied to two studies on preservice technologies courses to date. One, published in The Australian Educational Researcher this year, is called "Assessing primary school preservice teachers' confidence to apply their TPACK in specific categories of technologies using a self-audit survey", with David Martin, yourself Nat, and Madeleine Clarkin. Another study using the scale with a larger cohort is being submitted soon.

Nat: So how did you specifically use this method in our study?

Michael: In our study, we developed a self-audit survey to measure how confident preservice teachers felt about using various technologies in their teaching. Using Rasch analysis, we were able to refine our questions, remove those that didn't fit, and ensure our survey provided clear, actionable data on where these future teachers might need more support or training.

Nat: So that sounds incredibly valuable. Could you walk us through the process of applying Rasch analysis to a survey?

Michael: Certainly. It starts with developing a set of items or questions based on the construct you want to measure—in our case, preservice teacher confidence to use technologies to support student learning. Once these are administered, Rasch analysis evaluates each item based on the responses, checking whether they align well with the construct. It looks at aspects like the difficulty of endorsing each item (how hard it is for respondents to give an item a higher rating – in our survey, higher confidence) and whether all respondents understand and interpret the items consistently. It's about ensuring that each ingredient—each survey question—mixes well to create a consistent and meaningful outcome "cake".

In our survey validation, four standard procedures of Rasch model analysis were used after that: 

First, I analysed the Person and Item Fit for the Whole Model:

In Rasch analysis applied to surveys, "person" is the label applied to the survey respondents, and "items" are the survey items that have been developed to investigate a construct such as ours: preservice teacher confidence to use technologies to support student learning. So think of this like checking whether all the ingredients and steps in your recipe contribute to making the cake as expected. If an ingredient does not mix well, or a step does not lead to the expected outcome (like the batter not being the right consistency), it is not fitting in with the recipe. In the whole-model fit, you're ensuring that every person's response and every survey item work together harmoniously to create a "delicious" overall result.

Then, I analysed the fit for each item in the survey within each of the ten technologies components. Originally there were 10 components, such as "General computer skills" and "Coding and robotics skills", each consisting of ten specific technology skills – for example, "I can use keyboard shortcuts" for the General computer skills component, or "I can program a robot to perform a complex task using sensors" for Coding and robotics. Participants responded to each item on a 4-point Likert-type confidence scale.

So, in cake baking terms, this is like examining each ingredient individually to make sure it is fresh and appropriate for your cake. If the flour is lumpy or the eggs are too small, they won't work well for your recipe. Similarly, you check each survey item to make sure it measures what it is supposed to. If an item is too easy or too hard to endorse, or if people interpret it in different ways or find it ambiguous, it might not be a good fit for your survey "cake." The Rasch model is very effective at finding issues with your survey items (the ingredients).

Nat: So, what do you do when you find issues with the survey items?

Michael: You simplify the recipe.

Imagine your original cake recipe calls for 20 ingredients, but after some testing, you realise that only 10 of them really enhance the flavour (or represent the construct). The fit analysis helps to simplify the survey design and make it more reliable by reducing the items to those that are most impactful, making your survey shorter and sweeter, just like a simplified recipe.

The fit analysis also measures the reliability of the items and the respondents (or persons). Reliability is a measure of consistent quality in the design – an indication of whether the instrument would reproduce the same ordering of persons and items on another occasion.

Ensuring your cake tastes great every time you make it is like making sure a survey reliably measures the same things each time it's given. The fit analysis helps identify and remove any unreliable items (like a bad batch of baking powder) that could lead to inconsistent results. From the 100 items in our original preservice teacher confidence survey scale, we removed 23 items, leaving the 77-item instrument. The fit analysis identified some ambiguously worded questions, some that were just too easy for respondents to endorse, and others that were faulty in construction by being double-barrelled.
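Show note: the reliability Michael mentions is usually reported as Rasch person (and item) separation reliability. In general form (a standard formula, not a figure from the study):

```latex
% Proportion of observed variance in the estimated measures that is not
% attributable to measurement error (SE = standard error of each measure):
R = \frac{\mathrm{SD}^2_{\mathrm{observed}} - \overline{SE^2}}{\mathrm{SD}^2_{\mathrm{observed}}}
```

An R near 1 means the survey separates respondents (or items) into distinct levels; an R near 0 means the differences between measures are mostly noise.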

Nat: So, after removing faulty items, what’s next?

Michael: Then I analysed the survey category structure. We used a 4-point confidence scale, so it had no middle category. There is literature that suggests, for good reason, that a middle category like "unsure" or "neutral" shouldn't be used. Sometimes you do need an "unsure" category – for example, when working with younger respondents who genuinely are unsure about what the item is asking. But the main reason to avoid it in survey design is that it can disorder or interfere with the ordered nature of categories in ordinal data, and it often becomes a null, throw-away category – wasted ingredients. Or sometimes the category descriptors are just not well ordered, or not named in a way that is meaningful for the respondents. And if this happens, you want to know about it.

So, imagine your cake recipe has different levels of sweetness depending on how much sugar you add. You want to make sure that each level, from 'just a hint' to 'super sweet,' makes sense and has a clear difference from the others. The category structure analysis ensures that each response option on your survey scale is distinct and ordered properly, just like the sugar levels. It checks that the steps from 'not sweet at all' to 'extremely sweet' are even and meaningful.
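Show note: one common category-structure diagnostic is whether the average measure of the respondents who chose each category advances monotonically from the lowest category to the highest. A toy Python illustration with invented data (Winsteps tabulates this automatically):

```python
import numpy as np

# Category-structure check: the average person measure within each response
# category should advance monotonically from category 1 up to category 4.
# Measures and responses below are invented for illustration.

person_measure = np.array([-1.5, -0.8, -0.2, 0.4, 1.1, 1.8])  # logits
responses = np.array([1, 1, 2, 3, 3, 4])  # one item's 4-point confidence ratings

for cat in range(1, 5):
    chosen = person_measure[responses == cat]
    avg = chosen.mean() if chosen.size else float("nan")
    print(f"category {cat}: n={chosen.size}, average measure = {avg:+.2f}")

# A category whose average measure is out of order, or that is barely used,
# behaves like the null, throw-away category described above.
```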

Finally, I looked at the survey's Dimensionality using Principal Components Analysis (PCA).

PCA is like making sure that your cake is really a cake – a single construct or latent trait, like the one we were measuring: preservice teacher confidence to use technologies to support student learning – and not part cake, part pie. You want to confirm that all the parts of your recipe are coming together to create one coherent dessert (or construct), not a mash-up of different desserts. In survey terms, you are checking that your survey items are all measuring the same underlying idea or trait – both at the whole-model level, which represents the construct or latent trait, and for the more measurable and definable components and items below it, like General computer skills and "I can use keyboard shortcuts" – not a mix of different traits.

Imagine you're baking a cake and you have a variety of ingredients on your kitchen counter, but you're not sure which ones will make your cake taste the best. Principal Components Analysis (PCA) in the context of survey development and design works a bit like figuring out which ingredients are essential to making your cake delicious and which ones might not make much difference, or could even spoil the flavour.

Think of PCA as the process of tasting each ingredient to decide if it should go into your cake. Just as you identify key ingredients like flour, sugar, and eggs that are essential for the cake, PCA helps to identify the main themes or constructs a survey should focus on. This ensures that every item in the survey adds value, just like every ingredient in the cake serves a purpose.

When you find ingredients that complement each other well, like vanilla and sugar, you group them together in your recipe. Similarly, PCA groups survey items that measure similar things together, ensuring that each part of the survey effectively captures a specific aspect (or component) of what you’re trying to understand, like different flavours in a cake.
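Show note: in Rasch validation the dimensionality check is typically a PCA of the standardized residuals – whatever is left over once the single Rasch dimension has been removed. A toy numpy sketch with simulated responses (the "first contrast eigenvalue below about 2" rule of thumb is a Winsteps convention; nothing here comes from the study's data):

```python
import numpy as np

# Dimensionality check: PCA of standardized Rasch residuals. If the survey is
# "all one cake", the residuals should look like noise, with no large first
# contrast. All data below are simulated for illustration.

rng = np.random.default_rng(0)
theta = rng.normal(0, 1, 60)          # person measures (logits)
delta = rng.normal(0, 1, 10)          # item difficulties (logits)
P = 1 / (1 + np.exp(-(theta[:, None] - delta[None, :])))
X = (rng.random(P.shape) < P).astype(float)    # simulated 0/1 responses

Z = (X - P) / np.sqrt(P * (1 - P))    # standardized residuals
Zc = Z - Z.mean(axis=0)               # centre each item's residual column
eigvals = np.linalg.eigvalsh(np.cov(Zc, rowvar=False))[::-1]  # descending

# Rule of thumb: a first contrast below ~2 eigenvalue units (i.e. weaker than
# two items' worth of shared variance) is consistent with one construct.
print("first contrast eigenvalue:", round(eigvals[0], 2))
```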

Nat: So what are some of the common pitfalls or challenges when using Rasch analysis?

Michael: One of the biggest challenges is ensuring that your initial set of items is well-constructed. Poorly formulated items can lead to misinterpretations and data that don't accurately reflect the respondents' true feelings or abilities. Additionally, Rasch analysis requires time and patience to understand the statistical models, which can be a steep learning curve for many researchers – but it is well worth the investment.

At the end of the day, using the Rasch model and these various analyses helps to ensure that each part of the survey works together as expected to give you a clear, accurate picture of what is being measured. By following these standards, the Rasch model ensures that when we "bake" a survey, the result is just what we aimed for—clear, reliable, and meaningful information that helps us understand the trait being examined. This way, we can trust the insights gained from these measures and use them confidently to support our preservice teachers better. Rasch analysis has significantly improved the quality of the instruments we use to measure educational outcomes. By ensuring that our tools are accurate and reliable, we can better understand and support preservice teachers' needs, tailor educational programs, and ultimately enhance teaching and learning through technology.

Nat: Fantastic. Thank you Michael for shedding light on such a complex yet crucial topic. And for me, the cake baking metaphor has been really helpful in understanding the Rasch model.

Michael: Thanks, Nat. I enjoyed baking with you. Happy baking.

Nat: Fantastic. Thanks Michael, bye for now.