October 10, 2010

Methodology aims to evaluate cataract surgical outcomes fairly, simply

The Twin Bridges protocol was devised to measure and help improve cataract surgeon skill.


John B. Pinto

All eye surgeons are strivers and achievers. After years of striving to be the best students in high school, then to be among the top few graduates from university, and then to hold their own in medical school and postgraduate training, it seems appropriate that the most striving and confident surgeons would be clamoring for a cataract report card.

And yet, so far as we have been able to determine, no consensus has developed around the best approach to measuring cataract outcomes or surgical performance compared with peers. Interestingly, a search of the Internet reveals far fewer U.S. citations than international ones.

The U.S. has some of the finest medical training programs in the world and likely graduates the planet’s best-prepared eye surgeons. That said, compared to their peers, half of U.S. cataract surgeons perform below the median.

In this column, I describe the results of a modest pilot study comparing the peer-to-peer outcomes of a small cohort of cataract surgeons and propose a simple “surgeon scoring” methodology for the profession to adopt and hopefully improve upon over time.

This study resulted from efforts over several years, working in collaboration with selected client practices and an advisory group consisting of Drs. Dick Lindstrom, Lisa and Amir Arbisser, Paul Koch, and Trevor Elmquist to develop a simple cataract outcomes screening tool.

The “Twin Bridges” name for this project comes from an old Wild West town in the middle of Montana where most of the advisory group first started discussing this project during annual client fly-fishing sessions.

We believe there will be several potential applications of the Twin Bridges protocol once it is fully developed beyond our early efforts. The protocol could be used to:

  • Augment a practice’s ongoing quality assurance activities
  • Screen surgeon job candidates
  • Screen out associate or partner candidates who are not meeting internally agreed-upon surgical outcomes standards
  • Act as part of the vetting process for merger-consolidation activity
  • Provide doctors nearing retirement with objective evidence that it may be time to discontinue operating
  • Provide surgeons with objective skills verification for their personal development
  • Validate new surgical maneuvers, equipment or implants
  • Establish the basis and starting baseline for any needed peer-to-peer surgical performance coaching
  • Use favorable data in a practice’s patient communication to give candidates for surgery confidence that their doctor is objectively better than average
  • Use favorable data to verify the quality of care with payers and professional referral sources

That said, everyone who has been involved with this project understands that significant controversy could arise out of this initial, exploratory effort, which may be why similar efforts to grade surgical outcomes and skills have been so scant in the past. The potential controversies with our efforts are twofold.

In its present format, as you will see, the Twin Bridges methodology is simple and highly reductionist, nothing more than averaging two metrics: the percentage of a surgeon’s cases in which the patient’s 1-month postoperative spherical equivalent falls within 1 D of the preoperative planned spherical equivalent, and the percentage of cases achieving 20/40 or better Snellen best corrected visual acuity at 1 week postop. (A third measure recommended by the advisory group, the percentage of patients not needing a vitrectomy, was originally employed but skewed nearly all scores sharply upward. The vitrectomy rate in this 34-surgeon, 1,700-case cohort was just one-tenth of a percent.)

The benefit of the approach we used is simplicity and the ready availability of this data in nearly every surgical patient’s chart. As a result, adding surgeons to the data pool will be easy as our exploration of this metric continues. However, by being this reductionist, we are probably shaving points from some surgeons and unduly adding points to others. In particular, it would be fair to criticize the Twin Bridges protocol for only assessing the outcomes of uncomplicated cases; the protocol does not measure a surgeon’s mastery in handling complicated cases.

Our methodology

Fifty-case cataract data sets were received from 37 surgeons; of these, 34 data forms were completed sufficiently to be used in this study. All data were masked as to surgeon or practice of origin. The cases submitted were limited to the following parameters for qualification:

  • Cases selected were those in which pre- and postoperative vision were not compromised by corneal, vitreous or macular pathology (eg, corneal dystrophy, macular degeneration, visually significant diabetic retinopathy, epiretinal membrane, etc.).
  • All patients with a diagnosis of glaucoma were omitted.
  • Cases selected were all orthodox, senile cataracts in patients 55 years of age and older, not coded as complex cases (66982) and not implanted with a premium/multifocal or toric IOL.
  • The only cases submitted were those in which the postoperative care was provided by the practice’s internal providers; co-managed cases in which the postop care was provided by an outside or referring doctor were omitted.
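For readers who keep their case logs electronically, the inclusion criteria above can be expressed as a simple pre-screening function. This is only an illustrative sketch; the field names and record layout are invented here (the study itself used paper data forms), and only the 66982 complex-case code comes from the protocol as described:

```python
def qualifies(case):
    """Return True if a case meets the Twin Bridges inclusion criteria.

    The `case` dict keys below are illustrative stand-ins, not the
    study's actual data-form fields:
      pathology       -- set of comorbidity flags (corneal, vitreous, macular)
      glaucoma        -- True if any glaucoma diagnosis is present
      age             -- patient age in years
      cpt_code        -- billed CPT code ("66982" marks a complex case)
      iol             -- "standard", "multifocal", or "toric"
      postop_internal -- True if postop care stayed within the practice
    """
    return (
        not case["pathology"]             # no vision-compromising comorbidity
        and not case["glaucoma"]          # all glaucoma diagnoses omitted
        and case["age"] >= 55             # orthodox senile cataract, 55+
        and case["cpt_code"] != "66982"   # not coded as a complex case
        and case["iol"] == "standard"     # no premium/multifocal or toric IOL
        and case["postop_internal"]       # co-managed cases omitted
    )
```

A practice could run its full surgical log through such a filter and submit only the qualifying cases.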

It was decided that an ideal scoring system would be a simple 100-point scale (much like the familiar Parker points for wine, perhaps revealing too much about what we are doing in Montana besides fly fishing).

From the raw data, we generated each surgeon’s score as follows:

  • An “A” value was calculated as the percentage of the 50 cases with a spherical equivalent within 1 D of the preop plan at 1 month postop.
  • A “B” value was calculated as the percentage of the 50 cases with BCVA of 20/40 or better at 1 week postop.
  • The total Twin Bridges score is simply the average of these two percentage values.
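The scoring arithmetic above is simple enough to express in a few lines of Python. This is a sketch under stated assumptions: the field names are invented for illustration, and refractive error is taken as an absolute difference from the preop plan:

```python
def twin_bridges_score(cases):
    """Average of two percentages, per the Twin Bridges protocol.

    Each case is a dict with (illustrative) keys:
      se_error_d -- absolute difference between the 1-month postop
                    spherical equivalent and the preop planned
                    spherical equivalent, in diopters
      bcva_week1 -- Snellen denominator for best corrected acuity at
                    1 week postop (20/40 -> 40; lower is better)
    """
    n = len(cases)
    # "A" value: % of cases within 1 D of the preop refractive plan
    a = 100.0 * sum(c["se_error_d"] <= 1.0 for c in cases) / n
    # "B" value: % of cases at 20/40 or better BCVA at 1 week
    b = 100.0 * sum(c["bcva_week1"] <= 40 for c in cases) / n
    # Twin Bridges score: the simple average of the two percentages
    return (a + b) / 2.0

# Example: 3 of 4 cases within 1 D (75%), all 4 at 20/40 or better (100%)
cases = [
    {"se_error_d": 0.25, "bcva_week1": 20},
    {"se_error_d": 0.50, "bcva_week1": 40},
    {"se_error_d": 1.50, "bcva_week1": 25},
    {"se_error_d": 0.75, "bcva_week1": 30},
]
print(twin_bridges_score(cases))  # 87.5
```

A real 50-case data set would simply be a longer list passed to the same function.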

Here are the preliminary results from our study.

Surgeon age

The 34 surgeons in this cohort ranged in age from 37 to 61 years. The average age was 52 years. It is not unusual today for eye surgeons to remain active until their early 70s or later.

As a side point, this is in line with the distribution curve of eye surgeons in the U.S., who, like the baby boom generation, are clustered within a decade or so of retirement. This, combined with quickly rising demand for ophthalmic services and a reduction in residency slots, is leading us into an era in which the country will be underserved. For the typical practice, this will bring benefits such as more patients to see, but also more difficulty with surgeon recruiting and real challenges to practice succession planning. (See Figure 1.)

Figure 1.

Distribution of surgical times

Thirty-one surgeons supplied intraoperative times. Most reporting practices provided a finite time in minutes for each case; a couple of practices submitted an average time. The average surgical time was 11.5 minutes. The fastest two surgeons averaged 6 minutes and the slowest surgeon, 23 minutes. The average time frames are in line with observed contemporary goals to be generating three or more cases per hour in the operating room. (See Figure 2.)

Figure 2.

Case volumes

This group was wide-ranging in monthly volumes, from 15 to 240 cases. The group average was 66 cases per month. As a cohort, even considering median figures, this group could be characterized as more experienced than the typical U.S. eye surgeon. When this study is expanded, we may see somewhat lower Twin Bridges scores with less-seasoned doctors. Interestingly, as evident in the data below, high monthly volumes do not necessarily result in the highest scores when applying the Twin Bridges protocol. (See Figure 3.)

Figure 3.

Range and distribution of scores

Scores ranged from 86 to 100 points. The average score was 95.6 points. One of the questions we have is whether this is too charitable a scoring approach and whether it might be better to raise the bar such that average scores are a bit lower. It is fun to think about this in terms of the Parker system for wine, a scale of 50 to 100 points:

  • 96 to 100: an extraordinary wine of profound and complex character displaying all the attributes expected of a classic wine of its variety.
  • 90 to 95: an outstanding wine of exceptional complexity and character.
  • 80 to 89: a barely above-average to very good wine displaying various degrees of finesse and flavor as well as character with no noticeable flaws.
  • 70 to 79: an average wine with little distinction except that it is soundly made. In essence, a straightforward, innocuous wine.
  • 60 to 69: a below-average wine containing noticeable deficiencies, such as excessive acidity and/or tannins, an absence of flavor, or possibly dirty aromas or flavors.
  • 50 to 59: a wine deemed unacceptable. (See Figure 4.)

Figure 4.

Surgical assertiveness distribution

Several years ago, I developed and published in Ocular Surgery News a method for scoring surgeon assertiveness: the percentage of cases with a preop, pre-Brightness Acuity Test BCVA of 20/40 or better. Five years ago, the average figure was 33%. In an early 2010 OSN update of the 2005 baseline study, the average came in at a significantly more assertive 54%, which is in line with observations in the field during client site visits. For this Twin Bridges study cohort of surgeons, the range was a wide 16% to 94%; at 59%, the average was slightly more assertive than this year’s update study. (See Figure 5.)
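The assertiveness figure itself is a single percentage, and can be sketched in the same illustrative style as before (the `bcva_preop` field name is an invention for this example, not a data-form field):

```python
def assertiveness(cases):
    """% of cases with preop, pre-Brightness Acuity Test BCVA of 20/40
    or better.

    bcva_preop is the Snellen denominator (20/40 -> 40; lower is better);
    operating on eyes that still see relatively well preop reads as a
    more "assertive" surgical posture.
    """
    return 100.0 * sum(c["bcva_preop"] <= 40 for c in cases) / len(cases)

# Example: 2 of 4 cases at 20/40 or better preop -> 50%
print(assertiveness([{"bcva_preop": 30}, {"bcva_preop": 40},
                     {"bcva_preop": 60}, {"bcva_preop": 80}]))  # 50.0
```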

Figure 5.

Correlations between Twin Bridges score, surgeon parameters

While the number of surgeons and cases reviewed for this pilot study are not yet enough to draw firm conclusions or to control for the key variables, the following graphs suggest potential correlations between the Twin Bridges score, surgical time, cases per month and surgeon age.

Monthly case volumes vs. Twin Bridges score

In looking at the left half of the graph, one sees what one would expect: If you do more surgery, you get better at it and your score improves. However, looking at the right half of the graph, we can seemingly reach only one of three conclusions: higher-volume surgeons are doing poorer work, we are not appropriately scoring surgical quality, or we do not yet have enough data points to draw any conclusions.

Holding the data at arm’s length and applying my experience with surgeons and their behavior as workloads increase, I believe that what will emerge as this study expands is that as a surgeon becomes more volumetrically successful, up to about 100 or so cases per month, his or her outcomes continue to improve, but that after this, whether due to fatigue or rushing, outcomes start to erode.

This is fairly sensitive territory, especially among those surgeons who aspire to higher volumes. Your impressions and feedback would be most welcome. (See Figure 6.)

Figure 6.

Intraoperative time vs. Twin Bridges score

As a layman, I have often heard moderately paced surgeons castigate their faster colleagues, saying that the latter’s brusquer pace makes for sloppier or at least less gentle work. It is interesting, even in this smaller data cohort, to see this opinion potentially borne out in the graph below.

The slowest 10 surgeons had an average Twin Bridges score of 95.8; the fastest 10 surgeons had an average score of 94.3. Given the compressed nature of the present scoring system, in which nearly all of the 34 surgeons measured scored 92 or higher, these differences seem material. However, additional data points will be needed to validate this early impression. (See Figure 7.)

Figure 7.

Surgeon age vs. Twin Bridges score

Unfortunately, when looking at age vs. score, we do not yet have sufficient data sets to control for case volumes or intraoperative time. However, by expanding the height of the y-axis in the graph (and squinting rather hard), it seems possible to recognize that the generational middle third of surgeons have slightly better, and less volatile, scores than younger surgeons, and that older surgeons are holding their own just fine. This will be interesting to return to when more data is available, especially from surgeons in their 60s and 70s, who are becoming increasingly common. (See Figure 8.)

Figure 8.

Intraoperative time vs. monthly case volumes

Practice makes perfect — or it at least makes faster. The graph shows a strong correlation between monthly volumes and surgical tempo. (See Figure 9.)

Figure 9.

Surgeon age vs. monthly case volumes

No surprises here. As might be expected, surgical volumes rise as careers mature, and then perhaps decline again. (See Figure 10.)

Figure 10.

Intraoperative time vs. surgeon age

The overall tempo for this cohort of surgeons was faster than the average for all surgeons. In this small group, however, there appeared to be no meaningful correlation between age and tempo; there were relatively faster and slower surgeons at every career stage. (See Figure 11.)

Figure 11.

The top 10 and bottom 10 surgeons in this cohort

One prefers to think that all surgeons are above average. All are certainly trained, licensed and vetted daily by ordinary market forces. Although more work remains, we might be able to say one day that the best outcomes are generated by providers who are a bit slower (more cautious, gentle or tissue-sparing?), a bit older (more experienced or conservative?) and performing somewhat more moderate case volumes. We shall see. For now, here is the breakdown:

The bottom 10 surgeons (based on Twin Bridges score only)

  • Had an average Twin Bridges score of 90.8
  • Had an average age of 49.2 years
  • Averaged 76 cases per month
  • Had an average intraoperative surgical time of 10.2 minutes
  • Had an average preop BCVA of 20/40 or better comprising 55% of cases

The top 10 surgeons (based on Twin Bridges score only)

  • Had an average Twin Bridges score of 99.3
  • Had an average age of 51.3 years
  • Averaged 60 cases per month
  • Had an average intraoperative surgical time of 13.9 minutes (n = seven for this average, due to data gaps)
  • Had an average preop BCVA of 20/40 or better comprising 60% of cases (n = seven for this average, due to data gaps)

Of the six surgeons performing more than 100 cases per month, three are in the bottom 10 group and one is in the top 10 group.

In conclusion

Everyone reading this issue of OSN is a professional of one kind or another. As a professional consultant, I would probably be a little defensive if someone came up with a 100-point scorecard for the way I advise surgeons. The professional practice administrators reading this might similarly resist any effort to pin a national numeric grade on their job performance. Seen in this light, cataract surgeons reading this report could, quite reasonably, be in a mood to throw some rocks at it.

However, we are entering a likely durable era of health care cost constraints, one in which fee-for-service may increasingly be replaced by fee-for-outcomes. We could anticipate that cost containment efforts in the future may include payer policies allowing only the most necessary surgery, provided by only the most competent individuals. There are certainly precedents in other advanced countries for only a subset of trained surgeons to be allowed to operate.

Ophthalmology has long lent itself to numbers. Visual function is graded numerically, as are many eye diseases. We hope that this modest study, for all its limitations, will stimulate OSN’s readers to join us in considering methods that are robust, accurate, reproducible, simple to calculate from pre-existing chart data and believable by surgeons themselves. Far better that ophthalmology should develop these measures from the bottom up than one day have them imposed from the top down.

I would like to express my deep appreciation to the surgeons and administrators of participating client practices, and to the surgeon review group for their most generous efforts to date. That said, please consider any errors, omissions or oversights in this preliminary report to be mine alone. We look forward to your comments and suggestions for improvement. With your help and a bit more effort, I am hopeful that we will be able to derive a concise, simple approach to fairly and simply evaluating cataract surgeon skills and procedural outcomes.

  • John B. Pinto is president of J. Pinto & Associates Inc., an ophthalmic practice management consulting firm established in 1979. He is the author of John Pinto’s Little Green Book of Ophthalmology; Turnaround: 21 Weeks to Ophthalmic Practice Survival and Permanent Improvement; Cash Flow: The Practical Art of Earning More From Your Ophthalmology Practice; The Efficient Ophthalmologist: How to See More Patients, Provide Better Care and Prosper in an Era of Falling Fees; The Women of Ophthalmology; and his new book, Legal Issues in Ophthalmology: A Review for Surgeons and Administrators. He can be reached at 619-223-2233; e-mail: pintoinc@aol.com; Web site: www.pintoinc.com.