Posts Tagged ‘decisiontree’
Just a quick thought. It just occurred to me that some of the USMLE questions involving pedigree analysis in genetics are actually typical decision tree questions. The probability that a certain individual, A, has a given disease (e.g., Huntington’s disease) purely by random chance is simply the disease’s prevalence in the general population. But what if you considered the following questions:
- How much genetic code do A and B share if they are third cousins?
- If you suddenly knew that B has Huntington’s disease, what is the new probability for A?
- What is the disease probability for A’s children, given how much genetic code they share with B?
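For the first question, here is a minimal sketch of the standard "count the meioses" rule for full cousins. The function name `cousin_relatedness` is made up for illustration, and the sketch assumes no inbreeding and that the cousins share both members of an ancestral couple. It gives the expected shared fraction only; it says nothing by itself about the risk for any particular disease gene.

```python
from fractions import Fraction


def cousin_relatedness(degree: int) -> Fraction:
    """Expected fraction of genetic code shared by full n-th cousins.

    Assumptions (illustrative only): no inbreeding, and the two cousins
    share BOTH members of an ancestral couple.
    """
    if degree < 1:
        raise ValueError("degree must be 1 (first cousins) or higher")
    # n-th cousins each sit (n + 1) generations below the shared couple.
    generations_up = degree + 1
    # Each generation halves the expected share, on both sides of the pedigree.
    per_ancestor = Fraction(1, 2) ** (2 * generations_up)
    # Two shared ancestors (the couple), so the two paths add.
    return 2 * per_ancestor


print(cousin_relatedness(1))  # first cousins: 1/8
print(cousin_relatedness(3))  # third cousins: 1/128
```

Notice the decision tree flavor: you multiply the halves along each path of the pedigree and add across the two paths.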
When I’d initially written about decision trees, it did not at all occur to me at the time how this stuff was so familiar to me already!
Apply a little Bayesian strategy to these questions and your mind is suddenly filled with all kinds of probability questions ripe for decision tree analysis:
- If the genetic test I utilize to detect Huntington’s disease has a false-positive rate x and a false-negative rate y, now what is the probability for A?
- If the pre-test likelihood is m and the post-test likelihood is n, now what is the probability for A?
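The first of these questions can be sketched with Bayes' rule. This is only an illustration: the function name and the numbers in the usage line are made up, `prior` stands for whatever probability A had before the test, and x and y are the false-positive and false-negative rates from the question.

```python
def posterior_given_positive(prior: float,
                             false_positive_rate: float,
                             false_negative_rate: float) -> float:
    """Probability that A has the disease after a POSITIVE test (Bayes' rule)."""
    sensitivity = 1.0 - false_negative_rate           # P(test+ | disease)
    true_positive = prior * sensitivity                # branch: disease, then test+
    false_positive = (1.0 - prior) * false_positive_rate  # branch: no disease, then test+
    # Add the two branches that end in "test+", then take the diseased share.
    return true_positive / (true_positive + false_positive)


# Hypothetical numbers: prior 1%, x = 5%, y = 10%.
print(round(posterior_given_positive(0.01, 0.05, 0.10), 3))  # roughly 0.154
```

Even with a fairly good test, a low prior keeps the post-test probability modest, which is exactly the kind of thing a tree makes visible.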
I find it truly amazing how so many geneticists and genetic counselors accomplish such complex calculations using decision trees without even realizing it! Don’t you :-) ?
Copyright © Firas MR. All rights reserved.
The Monty Hall Paradox
One of the 3 doors hides a car. The other two hide a goat each. In search of a new car, the player picks a door, say 1. The game host then opens one of the other doors, say 3, to reveal a goat and offers to let the player pick door 2 instead of door 1. Is there an advantage if the player decides to switch? (Courtesy: Wikipedia)
Hola amigos! Yes, I’m back! It’s been eons and I’m sure many of you may have been wondering why I was MIA. Let’s just say it was academia as usual.
This post is unique as it’s probably the first where I’ve actually learned something from contributors and feedback. A very critical audience and pure awesome discussion. The main thrust was going to be an analysis of the question, “If you had to pick an answer in an MCQ randomly, does changing your answer alter the probability of success?” and it was my hope to use decision trees to attack the question. I first learned about decision trees and decision analysis in Dr. Harvey Motulsky’s great book, “Intuitive Biostatistics”. I do highly recommend his book. As I pondered over the question, I drew a decision tree that I extrapolated from his book. Thanks to initial feedback from BrownSandokan (my venerable computer scientist friend from yore :P) and Dr. Motulsky himself, who was so kind as to write back to just a random reader, it turned out that my diagram was wrong and so was the original analysis.
The problem with the original tree (that I’m going to maintain for other readers to see and reflect on here) was that the tree in the book is specifically for a math (or rather logic) problem called the Monty Hall Paradox. You can read more about it here. As you can see, the Monty Hall Paradox is a special kind of conditional probability problem, in which knowing something for sure changes the probabilities behind your guesstimates. It’s a very interesting problem, and has bewildered thousands of people, me included. When it was originally circulated in a popular magazine, “nearly 1000 PhDs” (cf. Wikipedia) wrote back to say that the solution put forth was wrong, prompting numerous psychological studies to understand human behavior. A decision tree for such a problem is conceptually different from a decision tree for our question and so my original analysis was incorrect.
So what the heck are decision trees anyway? They are basically conceptual tools that help you make the right decisions given a couple of known probabilities. You draw a line to represent a decision or event, and explicitly label it with its corresponding probability. To find the probability of a number of decisions (or lines) in sequence, you multiply their individual probabilities along that path. When several different paths lead to the same outcome, you add the path probabilities together. It takes skill and a critical mind to build a correct tree, as I learned. But once you have a tree in front of you, it’s easier to see the whole picture.
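To make the multiply-and-add idea concrete, here is a small sketch. The nested-list layout and the name `outcome_probabilities` are my own assumptions for illustration, not something taken from a book or library. A tree is a list of (probability, subtree) branches and a leaf is just an outcome label.

```python
from collections import defaultdict
from fractions import Fraction as F


def outcome_probabilities(tree, path_prob=F(1), totals=None):
    """Walk a tree of (probability, subtree) branches; leaves are outcome labels.

    Probabilities are MULTIPLIED along a path and ADDED across paths
    that end in the same outcome.
    """
    if totals is None:
        totals = defaultdict(F)  # missing outcomes start at Fraction(0)
    if isinstance(tree, str):          # reached a leaf
        totals[tree] += path_prob      # add this path to the outcome's total
        return totals
    for branch_prob, subtree in tree:
        outcome_probabilities(subtree, path_prob * branch_prob, totals)
    return totals


# Hypothetical example: blindly guessing on two 5-option MCQs.
two_guesses = [
    (F(1, 5), [(F(1, 5), "at least one right"), (F(4, 5), "at least one right")]),
    (F(4, 5), [(F(1, 5), "at least one right"), (F(4, 5), "none right")]),
]
for outcome, prob in outcome_probabilities(two_guesses).items():
    print(outcome, prob)  # "at least one right" 9/25, "none right" 16/25
```

The function does nothing clever. It just keeps the bookkeeping honest, and the hard part is still drawing a tree that actually matches the problem.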
Let’s just ignore decision trees completely for the moment and think in the usual sense. How good an idea is it to change an answer on an MCQ exam such as the USMLE? The Kaplan lecture notes will tell you that your chances of being correct are better if you don’t. Let’s analyze this. If every question has 1 correct option and 4 incorrect options (the total number of options being 5), then any single try on a random choice gives you a probability of 20% for the correct choice and 80% for the incorrect choice. The odds are higher that on any given attempt, you’ll get the answer wrong. If your choice was correct the first time, it still doesn’t change these basic odds. You are still likely to pick the incorrect choice 80% of the time. Borrowing loosely from the concept of “regression towards the mean” (repeated measurements of something yield values closer to said thing’s mean), we can apply similar reasoning to this problem. Since the outcomes in question are categorical (binary, to be exact), the measure of central tendency used is the Mode (defined as the most commonly or frequently occurring thing in a series). In a categorical series (cat, dog, dog, dog, cat), the mode is ‘dog’. Since the Mode in this case happens to be the category “incorrect”, if you pick a random answer and repeat this multiple times, you are more likely to pick an incorrect answer! See, it all makes sense :) ! It’s not voodoo after all :D !
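A quick way to check the arithmetic is to enumerate every branch. This is only a sketch under a strong assumption: both the first pick and the switch are pure random guesses, with no knowledge of the material and no new information in between. The function name `stay_vs_switch` is hypothetical.

```python
from fractions import Fraction


def stay_vs_switch(n_options: int = 5):
    """Exact chance of being right if you keep a random first pick,
    versus switching to a random different option (pure guessing)."""
    correct = 0  # which option is correct does not matter, by symmetry
    stay = Fraction(0)
    switch = Fraction(0)
    for first in range(n_options):
        p_first = Fraction(1, n_options)           # random first pick
        if first == correct:
            stay += p_first
        others = [o for o in range(n_options) if o != first]
        for second in others:                      # random switch to another option
            if second == correct:
                switch += p_first * Fraction(1, len(others))
    return stay, switch


print(stay_vs_switch(5))  # both come out to 1/5
```

Under these assumptions, switching blindly neither helps nor hurts: either way you are wrong 80% of the time. Real exam-takers are not guessing blindly, of course, so this does not settle the Kaplan advice by itself.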
Coming back to decision analysis, just as there’s a way to prove the solution to the Monty Hall Paradox using decision trees, there’s also a way to prove our point on the MCQ problem using decision trees. While I study to polish my understanding of decision trees, building them for either of these problems will be a work in progress. And when I’ve figured it all out, I’ll put them up here. A decision tree for the Monty Hall Paradox can be accessed here.
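In the meantime, here is a sketch of the Monty Hall tree written out as code rather than as a drawing. It assumes the standard rules: the car is equally likely to be behind any door, the host always opens a goat door that the player did not pick, and the host chooses at random when he has two goat doors to choose from. The function name is made up for illustration.

```python
from fractions import Fraction


def monty_hall_tree(doors=(1, 2, 3), pick=1):
    """Enumerate every branch of the three-door game and return
    (P(win by staying), P(win by switching)) as exact fractions."""
    stay = Fraction(0)
    switch = Fraction(0)
    for car in doors:
        p_car = Fraction(1, len(doors))             # first branch: where the car is
        # The host may only open a door that is neither the pick nor the car.
        host_options = [d for d in doors if d != pick and d != car]
        for opened in host_options:                 # second branch: host's door
            p_path = p_car * Fraction(1, len(host_options))
            # With three doors, exactly one door is left to switch to.
            switched_to = [d for d in doors if d not in (pick, opened)][0]
            if pick == car:
                stay += p_path
            if switched_to == car:
                switch += p_path
    return stay, switch


print(monty_hall_tree())  # staying wins 1/3 of the time, switching 2/3
```

The key branch is the host's: when your first pick is wrong, he has no choice about which door to open, and that is where the extra information comes from.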
To end this post, I’m going to complicate our main question a little bit and leave it out in the void. What if on your initial attempt you have no idea which of the answers is correct or incorrect but on your second attempt, your mind suddenly focuses on a structure flaw in one or more of the options? Assuming that an option with a structure flaw can’t be correct, wouldn’t this be akin to Monty showing the goat? One possible structure flaw could be an option that doesn’t make grammatical sense when combined with the stem of the question. Does that mean you should switch? Leave your comments below!
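To get the discussion going, here is one way to sketch that scenario. Every assumption is mine and should be questioned: there are 5 options, the first pick is a pure guess, exactly one flawed option turns up, the flaw is always in an option you did not pick (just as Monty never opens your door), and if you switch, you switch at random among the options that remain. The function name is hypothetical.

```python
from fractions import Fraction


def flaw_like_monty(n_options: int = 5):
    """(P(right if you stay), P(right if you switch)) when one wrong option,
    never your own pick, is ruled out after a random first guess."""
    correct = 0
    stay = Fraction(0)
    switch = Fraction(0)
    for first in range(n_options):
        p_first = Fraction(1, n_options)
        if first == correct:
            stay += p_first
        # The flawed option is neither your pick nor the correct answer.
        flawed_candidates = [o for o in range(n_options) if o not in (first, correct)]
        for flawed in flawed_candidates:
            p_flaw = Fraction(1, len(flawed_candidates))
            remaining = [o for o in range(n_options) if o not in (first, flawed)]
            for second in remaining:               # random switch among what is left
                if second == correct:
                    switch += p_first * p_flaw * Fraction(1, len(remaining))
    return stay, switch


print(flaw_like_monty(5))  # staying: 1/5, switching: 4/15
```

Under these particular assumptions switching does come out ahead, but the result hinges on the flaw never being in your own pick. If the flawed option could just as easily have been the one you chose, the tree is a different one.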
Hope you’ve found this post interesting. Adios for now!