A Peek Behind the Curtain – Part II: Test Validity
Dependability is something we all look for in our everyday lives. For example, think of your favorite restaurant. You go back to this restaurant again and again because the food and service are consistently good. Let’s say you have a real craving for steak, so you order a nice filet mignon. It comes out and it’s everything you had hoped it would be. The next time you have a craving for steak, you want to come back to the restaurant and have the same experience. You get exactly what you want time after time. We bring the same mentality to test development here at eHarmony Labs. When we create a new survey or assessment, we want to make sure that it measures the trait we set out to assess and that it produces consistent results.
There are two aspects that make a test dependable: reliability and validity. Reliability, as discussed in a previous blog post, shows that all of the questions are measuring the same underlying trait. Validity shows how well the questionnaire measures the trait it was designed to measure. Let’s think of test dependability in terms of throwing darts again. If you are aiming for the bull’s-eye and hit the bull’s-eye with all three darts, your throws would be considered both valid and reliable (and we should play doubles). In terms of a psychological test, this means all of the items are consistently measuring the same trait, and that trait is the one the test was designed to measure, or what we were aiming for.
An important aspect of dependability is the relationship between reliability and validity. A test has to be reliable in order to be valid. If a test were developed to assess the amount of extraversion in a person, every single item would have to be assessing extraversion for the test to be considered valid. However, it is possible to have a reliable test that is not valid. For instance, if our extraversion assessment were getting consistent results, but an examination of its validity showed that it was actually measuring another trait, like agreeableness, then the test would be reliable but not valid. It would be measuring the wrong thing! The big question is: how can we test whether our questionnaires are really measuring what we want them to measure?
Let’s imagine we have created a new questionnaire that is designed to assess how satisfied a person is in their current relationship. We might include questions assessing things like how close they feel to their partner, their degree of affection, and how well they resolve conflicts. A simple way of determining whether this new test is measuring relationship satisfaction would be to compare it to other well-established tests that also measure relationship satisfaction, like the Dyadic Adjustment Scale (DAS) (Spanier 1976) or the Couples Satisfaction Index (CSI) (Funk & Rogge 2007). We would expect scores on the new test to correlate highly with scores on these well-established tests. The more closely people’s answers on the new questionnaire track their answers on the established questionnaires, the stronger the evidence that our new measure is valid. On the flip side, we would also expect our new relationship satisfaction test not to be highly correlated with a test measuring a different trait, like self-esteem, although the two may show small similarities. If both comparisons come out as expected, we have statistical evidence that our new relationship satisfaction assessment is valid, in that it is measuring what it is supposed to.
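To make those two comparisons concrete, here is a minimal Python sketch. It is only an illustration, not the actual procedure used at eHarmony Labs: the function name pearson_r, the variable names, and every score below are made up for the example, and it assumes the comparison is done with a plain Pearson correlation on total scores.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each list.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(ss_x * ss_y)


# Made-up total scores for eight hypothetical respondents (NOT real data).
new_test = [12, 18, 25, 31, 22, 15, 28, 35]     # our new satisfaction test
established = [14, 20, 24, 33, 21, 13, 30, 34]  # an established satisfaction scale
self_esteem = [20, 31, 18, 24, 33, 27, 22, 25]  # a test of a different trait

# Same trait, different test: we hope for a high correlation.
convergent_r = pearson_r(new_test, established)
# Different trait: we hope for a correlation near zero.
discriminant_r = pearson_r(new_test, self_esteem)

print(f"new test vs. established test: r = {convergent_r:.2f}")
print(f"new test vs. self-esteem test: r = {discriminant_r:.2f}")
```

With these invented numbers, the first correlation comes out high and the second comes out close to zero, which is the pattern we would hope to see from a valid test. In a real study, the sample would be far larger, and what counts as "high" or "low" is a judgment call that depends on the traits involved.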
What would happen if we didn’t test for validity in our assessments? To put it simply: chaos. All of our further research, and ultimately our matching models, depend on these basic concepts. Without testing for validity, our matching system would not be effective! Because we make sure our assessments are measuring the traits they are supposed to, we can trust that the findings of studies using these scales reflect the underlying traits. We can then build the matching system on those findings to help you find your ideal match and have a successful relationship of your own.
As you can see, carefully examining the dependability of each test used at eHarmony Labs is extremely important. Just as you want dependability in the restaurants you frequent, we want dependability in our measures of underlying psychological traits. If you order your filet mignon, and you’re brought a plate of ground beef, I’m sure you would notice quite a difference! While it is harder to get a dependable measure of psychological traits, by using these simple tests of validity and reliability we can ensure we are consistently and correctly assessing each person’s personality, values, and interests.
Read more from Jonathan Beber at eHarmony Labs.