All these organizations came up with their own systems for rating quality of evidence and strength of recommendations, most of which were not well thought through. It was chaos. Nobody could understand these systems. If you understood one, you’d turn around and be faced with another.
Hey, Brad.
Hey, Matt, good to see you on a Saturday afternoon.
Yeah, yeah. Good to see you. This is fun because we’re going to hear from Gordon Guyatt again today. Part two of the interview, Electric Boogaloo, as it were.
Part two.
Yeah. So more about evidence based medicine. When I first heard the interview, I was really undervaluing the importance of evidence based medicine. But is it right that there’s this article in the British Medical Journal that came out ranking it?
Yeah. So the British Medical Journal is obviously quite prestigious and one of the top general medicine journals in the world. In, I think it was, 2007, they surveyed their readers, and over eleven thousand people responded. They basically had an opportunity to vote on the top medical discoveries in the last hundred and fifty years. And evidence based medicine made the top 10. In fact, it was ranked as number seven. It came in ahead of the computer and diagnostic imaging. And if I recall, number one was clean water and sewage disposal, which makes a lot of sense.
Yeah, but yeah, evidence based medicine did quite well.
Wow, that’s really incredible. You know, I sort of was thinking, oh, well, it was a really important event just in research or statistics or medical research, but it sounds like in medicine in general it was a big game-changer.
Yeah, I think up until EBM, evidence based medicine, became a part of practice for many physicians, a lot of decision making for patients was probably based on just clinical experience: I’ve seen three patients like you, and this intervention seemed to work in two, so I’ll suggest that you take the same intervention. EBM goes further. It does take into account clinical experience, but it’s really driven by what the totality of evidence is for a particular clinical question or scenario.
And Dr. Guyatt will get into this: in addition to the clinical experience and the best systematic review evidence, what are the values and preferences of the patient or the client in front of you?
Yeah. Now, is that where GRADE comes in?
Well, so GRADE. Good question. So Doctor Guyatt is the co-chair of the GRADE Working Group, which is essentially an international group that has developed the methods for looking at the certainty or the quality of evidence, based on systematic reviews of the evidence for a particular research question or clinical scenario, on an outcome by outcome basis. So if you’re interested, for example, in whether vitamin D reduces the risk of a bone fracture, there are randomised controlled trials.
There might be observational studies. You do a systematic review of the literature, and you find out what the best estimate of the risk reduction is. And you’d also look at what the estimates are for potential harm from vitamin D. Maybe if you take too much, it might be harmful. And then you come up with an estimate. So let’s say there is a 10 percent risk reduction for those that take vitamin D supplementation, and there’s a confidence interval that surrounds that. So let’s say it ranges anywhere from a two percent risk reduction to upwards of a 15 percent risk reduction.
Where GRADE comes in is that it moves beyond what that risk reduction is. It tells you, well, how certain are we in that actual risk reduction? So it ranges from very low certainty to high certainty. So if you have high certainty, you’re kind of in a position where you can make a causal inference. If you have very low certainty, it means like, well, it might work or it might not work. It could be calcium. It could be the potassium.
It could be many other things. We’re not really sure about vitamin D, even though we had this estimate, that says a 10 percent risk reduction. It doesn’t mean that we have a lot of certainty in it.
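As a rough illustration of the first two pieces in the vitamin D example (the estimate and its confidence interval), the following Python sketch computes a relative risk reduction and an approximate 95 percent confidence interval from trial counts. The counts are hypothetical, chosen only to land near the 10 percent figure mentioned above, and the log-scale normal approximation is an assumed, commonly taught method rather than anything specified in the interview. It says nothing about the certainty of the evidence, which is the separate question GRADE addresses.

```python
import math


def relative_risk_reduction(events_treated, n_treated,
                            events_control, n_control, z=1.96):
    """Return (estimate, lower, upper) for the relative risk reduction.

    Uses the standard normal approximation on the log risk ratio.
    All inputs are simple counts; z=1.96 gives a ~95% interval.
    """
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    risk_ratio = risk_treated / risk_control

    # Standard error of log(risk ratio) for two independent groups.
    se_log_rr = math.sqrt(
        1 / events_treated - 1 / n_treated
        + 1 / events_control - 1 / n_control
    )
    rr_low = math.exp(math.log(risk_ratio) - z * se_log_rr)
    rr_high = math.exp(math.log(risk_ratio) + z * se_log_rr)

    # A risk ratio of 0.9 is a 10% relative risk reduction, so the
    # interval bounds swap when converting from ratio to reduction.
    return 1 - risk_ratio, 1 - rr_high, 1 - rr_low


# Hypothetical counts: 900 fractures among 10,000 people on vitamin D
# versus 1,000 fractures among 10,000 people on placebo.
estimate, lower, upper = relative_risk_reduction(900, 10_000, 1_000, 10_000)
print(f"risk reduction {estimate:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

With these made-up counts the point estimate is a 10 percent reduction, with an interval of roughly 2 to 17 percent. Whether that estimate deserves high or very low certainty is not something the arithmetic can answer.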
OK, so you come up with a risk reduction and then you come up with how certain you are. It makes me think of my fifth grade. Is it a weird story? It makes me think of my fifth grade teacher, Mr Otte, who had a mild vendetta against the weatherman. And I don’t know if it was just a bad weatherman in that particular county or whatever, but I remember really specifically he was telling us that they had like a sixty five percent chance of being right.
So, like, a weatherman would come on the news and would say there’s a 65 percent chance of rain, right? And he told us this and then he pointed to the gray board and he was like, sixty five percent is a D minus. It’s a D minus every time. So, you know, I always think of him just pointing at the board with his middle finger that had a little piece missing and talking about how basically weathermen got off easy.
There was no assessment of the certainty of it. It was just sort of like, here’s a number. It’s probably going to rain, like maybe sixty five percent chance. Yeah.
And that’s a great analogy, because I think a lot of people just stop with sixty five percent. Does the sixty five percent mean sixty five percent of the province or the state is going to be covered in rain? I mean, you know, there’s a whole bunch of things that go into coming up with that number. Sure. Yeah. It’s kind of like when it comes to evidence. Again, let’s say vitamin D. Yeah.
It’s kind of like a three or four step process. Yeah. Step one: what is the best estimate? So that would be the pooled estimate, based on systematic reviews ideally. Two: what’s the confidence interval, or the range, of that estimate? So is it give or take, let’s say, five or 10 percent? Number three: what’s the certainty of that evidence? So, for example, the GRADE approach. Now, there are alternatives to the GRADE approach.
And I think we’ll get into that a little bit along the way. But yeah, those are kind of the three core steps. Unfortunately, a lot of people stop at step one. Sure. And Dr. Guyatt also gets into essentially step four. So you have the best estimate for vitamin D, let’s say, in terms of the potential risk reduction. And let’s say you also have estimates for the potential increase in harm if you take too much vitamin D or take it for too long.
And he gets into and it’s really interesting, he gets into, OK, so you have the benefits and you have the harms. But then how do you actually move to a recommendation for a patient or a client or for the public? If you were making guideline recommendations, dietary guideline recommendations?
That’s fascinating to me. I sort of thought that GRADE was more about looking at the totality of evidence and ranking it. This seems much more complicated. And I love this phrase that you use quite a bit, the totality of evidence. Right. It also sounds kind of epic, which is why I like it. But you’ve got all of these studies, and they’re ranked by importance as well, to be as accurate as you can about that figure.
And then, you know, how certain are you about that figure? But then also, I hadn’t thought at all about values and preferences, like, is it even worth it? Certainty of evidence, you know, might be a factor there if you’re going to go spend thirty five bucks on an umbrella real quickly before you walk to work.
Yeah. So really, this concept of values and preferences has us move from that certainty of evidence for a particular outcome to making a decision. It’s interesting. Sixty five percent, that’s pretty high, right? Obviously, everyone would take their umbrella, especially if they had one. Yeah, of course. You know, in epidemiology or clinical epidemiology or nutritional epidemiology, we rarely see estimates that are above, let’s say, 30 percent, and 30 percent is a big estimate.
Like, if you have a 30 percent absolute risk reduction, that’s a big deal by epidemiological standards. So we’re usually working with numbers anywhere from, let’s say, one or two percent absolute risk reductions to maybe upwards of a thirty or forty percent absolute risk reduction. But that’s rare. And so, you know, if the chance of rain is 30 percent or 20 percent or 10 percent, whether to take an umbrella or not probably would depend on having more information around certainty and what that chance of rain actually means.
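To make the point about small absolute numbers concrete, here is a minimal Python sketch of how the same relative risk reduction translates into different absolute risk reductions depending on baseline risk. The baseline risks and the 10 percent relative reduction are hypothetical values picked for illustration, and the function name is an assumption, not something taken from the interview.

```python
def absolute_risk_reduction(baseline_risk, relative_risk_reduction):
    """Absolute risk reduction = baseline risk x relative risk reduction.

    Both arguments are proportions between 0 and 1.
    """
    for value in (baseline_risk, relative_risk_reduction):
        if not 0 <= value <= 1:
            raise ValueError("inputs must be proportions between 0 and 1")
    return baseline_risk * relative_risk_reduction


# Hypothetical baseline risks of the outcome without the intervention.
for baseline in (0.02, 0.10, 0.40):
    arr = absolute_risk_reduction(baseline, 0.10)
    print(f"baseline risk {baseline:.0%}: "
          f"absolute risk reduction {arr:.1%}")
```

The lower the baseline risk, the smaller the absolute benefit from the same relative effect, which is one reason values and preferences matter when deciding whether an intervention is worth it.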
Yeah, that makes sense. That makes a lot of sense. You know, the theme of this podcast could be complexity, right? There’s so much data to go through. There are so many different interpretations, potentially, based on the filter you use to interpret it, whether it’s truly evidence based medicine or whether it’s, as you were saying, your anecdotal clinical experience, which sounds like it’s a little more what they used to do before there were guidelines for the medical literature.
Yeah, there are a lot of moving pieces to good decision making, to evidence based decision making. And there are more moving pieces in nutrition than there are in, let’s say, general medicine, where there’s evidence for a drug versus a placebo for reducing the risk of bone fracture. But in nutrition, it’s harder to have a higher degree of certainty, given the fact that we often have just observational data and given the fact that vitamin D is one of many different vitamins or minerals that work synergistically.
Yeah, I think it’s a really good point to talk about nutrition in sort of a different light from a research perspective, just because, as you say, it’s so synergistic, everything is so interwoven. And I think to me that means all the more attention should be spent on the totality of evidence and the quality of evidence.
Yeah. And this episode, Matt, just to reiterate: we’ll get more history on evidence based medicine, like what it is and how it came to be, which is kind of fun. Yeah. Doctor Guyatt’s going to talk about the history of guidelines and how it used to be. They used to be basically just, you know, typically a bunch of older, probably white men sitting around a table.
Yeah. Coming up with what they thought was best for their patients and then writing those recommendations under the auspices of some organization. And he talks a bit about the evolution of making guideline recommendations and then the evolution of GRADE: how it started with this idea of critical appraisal and then moved to evidence based medicine. And then GRADE came along and basically was a refinement of evidence based medicine, through a growing international group of methodologists who came up with methods to look at the certainty of evidence from systematic reviews and then to move from the systematic review evidence base to making guideline recommendations for clients, patients, and members of the public, and where values and preferences come in when making those recommendations. So that’s kind of what Dr. Guyatt talks about.
Yeah, he’s such an interesting guy to listen to. And of course, it’s a really interesting topic. I’m really glad that we could have this content on the podcast, and we hope that you guys enjoy this, the second part of the interview with Dr. Guyatt.
30 to 40 years ago now, people started to produce guidelines. And so, for our audience, guidelines are ideally based on the best systematic reviews of the PICO questions of interest. And then they move from that evidence base to making recommendations for clinicians or members of the public. They’re formal, structured recommendations. Before, you had groups of largely self-appointed experts who would write review articles telling people what to do. And then organizations started to make formal guidelines.
So you would have the leading American endocrine society making up guidelines, and the American Thoracic Society, and the gastroenterologists, and the American College of Surgeons.
They all started making guidelines. And although you provided an approach that captures some of what we now think of as trustworthiness, the initial approaches of all of these organizations were almost always what we now call GOBSAT, and GOBSAT is good old boys sitting around the table. So it was the experts in the field, many of whom were receiving large amounts of money from the makers of the drugs that they were recommending.
They were typically older white males, and they’d be sitting around the table together making these formal recommendations. But it was still a change. It was now formal recommendations endorsed by particular societies, as opposed to just experts writing these narrative reviews that people would pick up.
30 years ago or so, people started to get the idea that maybe we should be thinking about the quality of the evidence behind our recommendations, and maybe we should start thinking about how strong the recommendations were. The result was that all these organizations came up with their own systems for rating quality of evidence and strength of recommendations, most of which were not well thought through. It was chaos. Nobody could understand these systems. If you understood one, you’d turn around and be faced with another.
So now we’re hopefully moving from how evidence based medicine came to be, and then realizing how to apply these principles to optimize the utility for clinical practice. And now we’re moving into GRADE, and your co-founding of that group, which was about twenty five years ago. A guy who trained with me at McMaster then went back to Norway. It was his idea. His name was Andy Oxman. And Andy and I continued to work together after he left McMaster, and we are still good buddies.
I was visiting Norway when Andy introduced the idea to me in the summer of 1999. In summers in Norway, the sun basically never goes down.
And so that puts you in a frame of mind where you are liable to sit and discuss or argue for most of the night with a cherished colleague, which Andy and I did, about this notion of how we could optimally rate quality of evidence and go from evidence to recommendations. That was the genesis of the group that ultimately became GRADE. We first met in the year 2000 and, in 2004, published in the BMJ the first publication laying out the GRADE system.
And Dr. Schünemann was also, and is, a prime mover in GRADE?
Yes. We involved a number of people from the beginning, including Holger, but he decided his passion was in systematic review methodology and particularly guideline methodology. And so he very much took to GRADE, and GRADE has been the center of his career subsequently.
Andy was the first chair of what became the GRADE Working Group. The term GRADE was an idea that Holger came up with, to give us a name. In 2008, Andy decided he didn’t want to chair the group anymore, and Holger and I took over as co-chairs. When we started meeting in 2000, we met probably three times a year until 2004, when we first published the paper, and went through probably hundreds of examples. One of the big contributions is we identified that risk of bias was not the only problem that could make evidence less trustworthy.
And we decided that with inconsistent results, where some studies show an effect and some studies don’t, even if they were all low risk of bias studies, you had a problem with inconsistency. And we decided that if the sample sizes of the available studies were all small and yielded wide confidence intervals, imprecision was a problem. And then there is the patient before you. So, for instance, in my practice as an internist, I have lots of folks over 90. In the randomized trials that might guide my practice,
there were very few enrollees over 90.
And the question arises whether the results in younger patients can be applied to older patients. And so we have issues of what GRADE has called indirectness of the evidence, and, finally, publication bias. So the bottom line is one of the big contributions of GRADE was to identify these categories of limitations in evidence. Yes, the focus at the beginning with Dave Sackett was risk of bias, but it became clear that one could define what we have classified as five categories of limitations that can lower quality or certainty. And a very important distinction to make in moving from risk of bias to the trustworthiness of evidence or the quality of evidence is that quality of evidence is about the body of literature for a particular research question.
So based on systematic reviews, we need to look at risk of bias among studies, indirectness among studies, imprecision among studies, publication bias.
Well, imprecision is about the pooled estimate of all the studies, not about the individual studies, as is the consistency across the studies.
So you’re absolutely right. It’s based on a body of evidence, ideally a systematic review and, if appropriate, as it often is, a meta-analysis that allows the most sophisticated evaluation of the evidence. Right.
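The idea of rating a body of evidence against several categories of limitations can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the GRADE Working Group’s procedure: it assumes the usual convention that randomized trial evidence starts at high certainty and observational evidence starts at low, that each serious concern lowers the rating by one level and each very serious concern by two, and it leaves out the criteria for rating up. The function and dictionary names are hypothetical, and in practice each judgment is made by people weighing the evidence for one outcome, not by arithmetic.

```python
LEVELS = ["very low", "low", "moderate", "high"]

# The five categories of limitations discussed in the interview.
DOMAINS = (
    "risk_of_bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication_bias",
)


def rate_certainty(study_design, concerns):
    """Return a certainty label for one outcome's body of evidence.

    study_design: "randomized" or "observational".
    concerns: dict mapping a domain name to 0 (no serious concern),
              1 (serious) or 2 (very serious). Missing domains count as 0.
    """
    if study_design not in ("randomized", "observational"):
        raise ValueError(f"unknown study design: {study_design!r}")

    # Assumed starting points: high for trials, low for observational data.
    level = 3 if study_design == "randomized" else 1

    for domain, steps in concerns.items():
        if domain not in DOMAINS:
            raise ValueError(f"unknown domain: {domain!r}")
        if steps not in (0, 1, 2):
            raise ValueError("each concern must be 0, 1 or 2")
        level -= steps

    # The rating cannot fall below "very low".
    return LEVELS[max(level, 0)]


# Hypothetical judgment: trials with wide confidence intervals, run in
# people much younger than the patients the question is about.
print(rate_certainty("randomized", {"imprecision": 1, "indirectness": 1}))
```

The point of the sketch is only that the rating attaches to the whole body of evidence for an outcome, and that several different kinds of limitation, not just risk of bias, can pull it down.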
Versus when you’re looking at a single study, let’s say a single randomized trial or a single cohort study, you’re just looking at risk of bias or the validity of that study, then saying, OK, well, what is the size of the estimate? And is it applicable to my patient in front of me, for example, or to the population?
Yes, but two things. First of all, you’re not only looking at risk of bias, you’re also looking at precision with respect to the individual study. You may also be looking at indirectness. If the study is about young people and your patient is over 90, there are still indirectness issues in individual studies. And second, you shouldn’t be using the individual study anyway. One of the fundamental principles of EBM is we need systematic summaries of the best evidence. If you rely on any individual study, you may well be misguided.
Right. So it’s really interesting how Dave Sackett’s critical appraisal, or critical appraisal at the bedside, those concepts then get developed more as you come along and work with him and others, and then they creep into or bleed into GRADE, along with additional concepts when it comes to the body of evidence.
So it’s an evolution of the thinking. But I started to look at it and I said, first, this is not optimally organized. And second, it puts insufficient emphasis on the whole process of going from the evidence to the management decisions with the patients. When I realized that, that’s where the Users’ Guides to the Medical Literature were built, because now we had three different elements. We said, OK, first the focus is on the risk of bias, then on what the results are, where issues of precision came in.
And then how do you apply the results to the patient, which is where issues of directness come in?
And then those three concepts are things that we evaluate when we use GRADE.
So I think the first Users’ Guides were published in 1993. By 2000, when GRADE is now starting to percolate, we had published, as you mentioned, I think twenty five papers in the Users’ Guides, and that moved the process along. We published our first users’ guide about guidelines in 1995. At that time, one of the great things about doing the Users’ Guides for me was that I was the editor of the series, and when somebody wanted to do one about guidelines, I had to learn about guidelines.
In 1995 I didn’t know about guidelines, so I thought, now we have to write a users’ guide on guidelines, better learn about guidelines. Right. And it was only at that time that we hit on what became fundamental. Everybody thinks of it as evidence and focuses on the evidence. But as it turns out, it is equally focused on people’s values and preferences. What happened was, OK, so we’re starting to teach people, OK, let’s use the literature. It’s not a reader’s guide anymore.
It’s a users’ guide. It’s for people who are using the literature to guide their practice.
So that’s a fundamental shift from the readers’ guides to a users’ guide. That’s right.
And so now we start doing this and now we say, OK, let’s look at the results and see how we use them in practice. That seems utterly obvious now, but it wasn’t then. But we start to look back. There are these benefits and there are these harms, and we have our certainty or quality of the evidence, and the benefits and the harms and their magnitude. Now, what do we do? How do you weigh these benefits and harms? You’re going to have to make a decision.
You have to say, do the benefits outweigh the harms or do the harms outweigh the benefits or is it a close call? And how do you make those decisions? And then a new light bulb goes off. It has to do with values and preferences.
How do you weigh these benefits and harms? It’s values and preferences. And then you start thinking, whose values and preferences? And it quickly became evident to us, and I think to most folks, that the values should be the values and preferences of the patients. And then, I think it was in 2000, we said now there are three principles of evidence based medicine. We need systematic reviews of the best evidence.
We need rules or guides to decide what’s more trustworthy, less trustworthy evidence. And the ironic third principle of evidence based medicine: evidence never tells us what to do, ever. It’s always evidence in the context of values and preferences. And 20 years after we first said that, people still mischaracterize EBM, and somehow they’ve missed 20 years of vivid writing on our part saying values and preferences are core to evidence based practice.
Thanks for listening. If you’d like to hear more episodes of Methodology Matters, a podcast on evidence based nutrition, please head over to MethodologyMatters.Being.Com.
If you’d like more information on the GRADE Working Group, please visit GradeWorkingGroup.org. That’s Gradeworkinggroup.org.
And if you’d like to learn more about Dr. Guyatt and his work, you can find him at ClarityResearch.ca or on Wikipedia.
Thanks for tuning in. We’ll see you on the next episode of Methodology Matters.