Baldrige Excellence Framework: Examiners' Frequently Asked Questions

What is the potential impact of having several “what” questions in the overall questions in category 2?

The “what” questions set the context for the item, and in many cases, for other areas to address. If an organization doesn’t give responses to them, that certainly could be grounds for an OFI—as could a response to “What are your key strategic objectives?” that doesn’t appear to address the strategic challenges given in the Organizational Profile. As always, score the item holistically: consider the responses to the questions (“how” as well as “what” questions) as a group.

Why does 2.1b(1) have only “what” questions?

2.1b(1) asks the organization to identify its key strategic objectives. This set of questions provides context for other questions in the item and in other items. The responses are evaluated for their relationship to other item questions or key factors, and for their presence or absence. So why aren’t these questions in the Organization Profile? The Organizational Profile is often an organization’s first Baldrige-based assessment. The questions in 2.1b(1) are too difficult for a first assessment, and they would be out of context without the rest of the questions in 2.1.

How much benefit of the doubt do you give a subunit for things that are mandated by the parent? Is the parent a stakeholder?

To answer in reverse: yes, the parent is a stakeholder. The significance of that relationship will vary from organization to organization. Having a parent organization can be a mixed blessing. The parent may provide resources, support, and processes that the subunit needs; the parent may also require the subunit to use a corporate process that is less than ideal. Ultimately, the applicant is responsible for the efficacy and outcomes of the processes it uses. Therefore, sometimes a subunit will deserve a strength for something the parent prescribes, and sometimes the subunit has to accommodate a challenging process. Before you say that this isn't fair, keep in mind that examiners can’t exclude parts of the Criteria from consideration just because the parent has a strategy or process that requires a subunit to do something a certain way. The applicant is being evaluated against the standard of excellence. If the applicant uses a less-than-optimal process, it should do everything it can to optimize it, including working upstream with the parent organization. But also keep in mind the relative importance of that process. Whether you write an OFI, and how strong that OFI is, should reflect how important that part of the operation is to success and sustainability.

How should I handle responses that say “available on site”?

This type of response has grown in use, but it is risky for the organization because it calls for “benefit of the doubt.” You should give benefit of the doubt when the organization has provided sufficient evidence that it's warranted. Therefore, the applicant needs to provide enough evidence of its processes and results for you to confidently give the organization benefit of the doubt. For example, a blanket statement that comparative data or segmented data are available on-site without any evidence that the organization tracks and uses such data would likely not warrant benefit of the doubt. However, if such data are presented across several charts/graphs or even several items, and the organization states that additional data are available on-site, then benefit of the doubt may be warranted. Whether benefit of the doubt is appropriate should be a team discussion topic during consensus.

I was taught to write comments using the “NERD” formula. Is that the right approach?

“NERD” (Nugget—Examples—Relevance—Done) is a handy way to remember the three elements (NER) to include in a comment. But writing every comment in the NER order may (1) put the applicant to sleep and (2) more important, make your comments less effective. Consider whether the order N-R-E; R tucked inside the N, then E; or even E tucked inside the N, followed by R, makes for a stronger message. See Baldrige case study feedback reports for examples. And you aren’t necessarily Done at that point! Read the comment for effectiveness and clarity.

In writing OFI comments, is it OK to use “although”/”while”/”however” (e.g., “Although the applicant has a systematic approach to X, it does not show evidence of Y”)?

Starting an OFI comment with a long “although/while” statement focused on a strength can send a mixed message to the applicant (“Although you do A, B, C, D, you don't do E .... Is this a strength or an OFI?”). Instead, we recommend making the comment immediately actionable by pinpointing the OFI. For example, instead of “The applicant uses comparative data to assess its performance. However, it is not evident how organizations and performance dimensions are selected for comparisons,” we recommend something like this: “It is not evident how the applicant selects organizations and performance dimensions to include in the comparative data it uses.” That way, the OFI is upfront and clear, and the other information is simply the background for the OFI.

Why are national examiners asked to write "around 6" comments? Why not more, or fewer?

We ask you to write "around 6" comments to focus the organization on its most important strengths and OFIs without overwhelming the organization. (This is different from the practice in some Baldrige-based programs, which may ask for as few as 3 and as many as 12 comments.)

Why shouldn't we use OFIs to predict what will happen if the organization doesn't improve something (e.g., "failure to do X will result in Y"?

Baldrige feedback at the national level is actionable, but nonprescriptive and nonpredictive. Use “may” instead of “will”: “Doing X may help the organization do Y” or “Not doing X may result in Y,” or something similar.

When should I "double" a strength or OFI?

"Double" a strength when it is particularly significant to the organization's current success. "Double" an OFI when it represents a significant, current vulnerability for the applicant. A feedback report for an applicant at any level of maturity should contain some doubled strengths and OFIs to guide the applicant's reading of the feedback.

How many evaluation factors (ADLI, LeTCI) should I include in a comment?

It depends. :-) Generally, focus a comment on the one or two factors that are most significant for the applicant in light of its maturity level and key factors. For example, applicants at a high level of maturity may not benefit from reading that their approaches are systematic. On the other hand, this might be a significant strength for an applicant at a low level of maturity. Read and draw on the language in the scoring guidelines to create a useful, actionable comment for the applicant that helps it take the next step in maturity.

If an organization’s results for several measures are 100%, is a comparison required?

You probably wouldn’t give an organization an OFI for not providing comparisons for measures that are consistently performing at 100%. But access to those comparisons would give you a better idea of the significance of the strength: Are competitors also at 100%, or are they floating around 75%? If achieving 100% appears to be no big deal, then the strength, it may not be quite as important as a scenario where most other organizations struggle to achieve 75%.

If an applicant doesn't respond to a Criteria question, or responds with a low level of maturity, should I automatically write an OFI on it?

It depends. :-) The relative importance of each Criteria question will vary from organization to organization, and even within an organization over time based on its key factors. If you automatically write OFIs for each "missing" response, you are mistakenly treating the Criteria as a checklist. As a result, the organization may get opportunities for improvement that it sees as unimportant or irrelevant. How do you as an examiner deal with this? By seeing the organization as your customer. Consider what the organization says in its response. Using your and the team’s industry expertise, consider the key factors, the relative importance of the question to the organization's success and sustainability, and the organization's overall maturity level. For example, consider an organization that does not produce a traditional product or service, but instead creates knowledge and information. Supplier management may be less important to this organization than to one that needs to obtain materials and resources from many suppliers to assemble a product. To emphasize the fact that the Criteria questions vary in their importance, we use the term "questions" rather than "requirements" for the Criteria. We still say that award applicants should respond to all questions, and if they believe that one or more are not applicable, they should briefly explain why. You as an examiner may not agree with the applicant’s claim, and if you conclude (based on industry knowledge and key factors) that a particular question is important to the applicant, an OFI (with a clear statement of relevance) is certainly in order.

Do the Criteria specifically address timeliness/speed and effectiveness of decision making?

There are numerous places where the Criteria enable or imply timeliness and effectiveness of decision making. (1) Category 1: creating a successful organization; organizational agility; focus on action; setting the vision and values (to enable empowered and aligned workforce decision making). (2) Category 2: addressing change; agility; flexibility; strategic objectives with timelines; action plans with timelines; resource allocation; action plan modification. (3) Category 4: data, information, and systems to enable decision making; measurement agility; performance analysis and review; responding to rapidly changing organizational needs.

How would I write an OFI related to agility without being prescriptive?

You would relate the OFI to a Criteria question, such as those found in 1.1c(1), 2.1a(1), 2.2b, 4.1a(4), and 6.1a(3). The Criteria don't treat agility like a process. It's a characteristic (related to a core value). The Criteria ask how organizations do the things that enable organizational agility. So an OFI would be around the relevant process and how it could improve in creating, addressing, or enabling agility. That said, note that this is a prime example of how key factors can influence the relative importance of Criteria questions. Some industries and organizations have a much greater need for agility than others.

Where in the Criteria would I comment on how the organization determines who its competitors are?

P.2a(1) asks how many and what types of competitors an organization has, but the Criteria don’t ask how competitors are determined. As the notes make clear, competition may exist for customers, resources, and visibility, for example. An OFI might relate to Criteria questions that touch on this issue, such as 2.1a(3), on potential blind spots in strategic planning (see the notes to 2.1a[3]), and 3.2a, on market segments.

Let’s say an organization’s total number of employees meets the criterion for a small business (500 or fewer), but the organization has a large volunteer base that brings the total over 500. Should examiners treat it as a small business (and apply the Considerations for Small Organizations)?

Keep in mind that the “Considerations …” sheets are essentially reminders of the potential impact of certain key factors. Not all small organizations are the same, and you shouldn't assume that all those considerations automatically apply to every small organization you review. In this case, having a large volunteer workforce, especially relative to the size of the organization, will certainly affect your expectations for that organization. Some elements from the Considerations for Small Organizations may still apply, and others may not. Many other key factors will affect your determination of which should apply, including these: Is the organization part of a large, well-resourced parent organization? What type of work do the volunteers perform? How significant is that work? The possibilities are endless, and the Considerations for Small Organizations is by no means comprehensive or prescriptive.

Do responses to 1.2c(2), Community Support, have to be something beyond the organization’s normal operations or mission? If community support is a normal part of operations, does there need to be an external volunteer activity linked to the mission?

1.2c(2) elicits what an organization is doing above and beyond normal operations to support and strengthen its key communities. In fact, a note in the Business/Nonprofit Criteria speaks to this issue for nonprofit organizations. Does this support have to be a volunteer effort? No. Does it have to be related to the organization’s mission? No. Businesses supporting local schools, health care systems providing reading tutors, and schools hosting after-school sports camps are all examples that may not be directly related to mission.

What is the difference between “blend and correlate” data (4.2b[1]) and “integrate” data (4.1a[1])?

The difference in context has led to the use of slightly different terms. In 4.2b(1), the topic is managing organizational knowledge. Blending data from different sources refers to the need to handle, analyze, and use data and information of varying types, such as data tables, text, or even video. The question asks how you glean information/findings from such sources, correlate (determine the relationships among) data, and combine them into accurate and actionable knowledge. In 4.1a(1), the context is performance measures. The question is how the organization selects, collects, aligns, and integrates data and information. Here the meaning is integrating/incorporating data from different datasets/data sources into a new, single dataset that might provide deeper insights and knowledge than analyzing them separately would.

How should diversity be carried into areas/categories outside of category 5, Workforce?

First, note that diversity denotes more than race, ethnicity, religion, gender, and national origin. It also includes age, skill characteristics, ideas, thinking, academic disciplines, and perspectives. Also note that diversity in the Criteria is relative, not absolute. The Criteria ask about diversity in relation to the organization’s hiring and customer (student, patient) communities. Because the workforce does all the work, having an appropriately diverse workforce benefits the organization in everything it does, across all categories. It also impacts workforce engagement, customer engagement, community engagement, organizational learning, innovation, and agility.

6.1b in the Health Care Criteria includes questions about patient expectations and preferences that are not in the Business/Nonprofit and Education Criteria. Why aren’t these questions in the other two versions?

The questions are built into category 3 in all versions, but this extra section is included in the Health Care Criteria because patient requirements and preferences are particularly critical for the design and delivery of key work processes (patient care), often on a patient-by-patient basis.

For a health care applicant, where would I expect to see information on processes and results for patient safety? For example, does the safe working environment referred to 6.2c(1) include patients as well as the workforce? Or should processes for patient safety be described as key work processes in 6.1? Similarly, where do results for patient safety belong?

You would expect to see patient satefy results in 7.1b(1), to show the effectiveness of processes described to ensure patient safety in 6.1 (as a health care service process requirement and a patient expectation). 6.2c(1) deals with workplace safety (as it does in the Business/Nonprofit and Education Criteria). Results for patient satisfaction with safety might be in 7.2a(1), and, as noted in the Criteria Commentary, 7.4a(1) may include results related to leaders’ efforts to create and promote a culture of patient safety. Also remember that no matter where an organization reports information on processes or results, you should consider it wherever it is relevant to your evaluation.

What is the Baldrige definition of strategy?

When the Criteria glossary doesn’t include a term, that means the Criteria usage doesn’t significantly differ from the common, dictionary definition. “Strategy” is one of those terms: “a careful plan or method for achieving a particular goal, usually over a long period of time; a plan, method, or series of maneuvers or stratagems for obtaining a specific goal or result” (Merriam-Webster). At a higher level, someone well acquainted with the Baldrige Criteria might say, “an organization’s approach to addressing the future.”

What is the Baldrige definition of transformational change?

When the Criteria glossary doesn’t include a term, that means the Criteria usage doesn’t significantly differ from the common, dictionary definition. “Transformational change” is one of those terms: change that disrupts the status quo in an organization, forcing people out of their comfort zones, and likely causes a change in cultural norms for the organization. It is generally organization-wide and enacted over some time, but it is not the same as looking back over many years of evolutionary change and realizing there has been a transformation. Transformational change is leadership driven. Drivers might be a change in the business model, organizational strategy, or work systems due to dramatic regulatory changes (think health care today), disruptive innovation in the marketplace (think digital cameras), or new market opportunities (think joint ventures).

Are goals (targets) required in an application?

The Criteria ask, “What are your goals …?” in several places: 1.2b(1) related to regulatory and legal compliance, 2.1b(1) related to strategic objectives, and 5.1b(1) related to workforce environment improvement goals. Goals are not specifically asked for in category 7, but your assessment of the associated results for these areas will consider how well an organization is meeting or exceeding its stated goals.

If an organization has a goal of top 10% performance, and several measures show performance just below the top 10%, is that an OFI or a strength?

First, it is fair to give an OFI on failure to achieve a stated goal. Strengths for meeting a goal are not always appropriate unless the goals are anchored to objective high performance, such as top-decile performance. Consistent performance around top 10% is very good performance and likely worthy of strength comments. However, other factors besides achieving or not achieving the top 10%, such as trend data (have the results been improving or not?), competitor performance (are the results better or worse than competitors’?), and the stated importance of achieving the top 10% (are the measures critical ones for the organization?) should influence your feedback.

Does innovation mean internal process improvement (for efficiency/effectiveness) or external opportunity (strategic challenges and objectives)?

Yes to both. Innovation can be found in any aspect of an organization or its operations, from specific processes, to products and services, to work systems, to business models. The Baldrige definition encompasses all this: “making meaningful change to improve products, processes, or organizational effectiveness and create new value for stakeholders.” Innovation, though, is more than incremental process improvement. Process innovation means that the process is novel—brand new or new in its application to that type of business/industry.

What if the organization is not in an “innovative industry”? How do you score (and weight) 6.1d?

The nature of the industry and the role innovation plays in sustainability vs. strategic advantage will affect the extent to which you expect to see a robust process for pursuing opportunities for innovation and how much influence that has on your scoring. Obviously, by including innovation management in the overall questions, the Criteria are saying that pursuit of opportunities for innovation is important to success and long-term sustainability. However, your expectations for what that looks like will vary from organization to organization. For example, in the 2015 Casey Comprehensive Care Center for Veterans Case Study, you might not expect many opportunities for innovation in the cemetery administration’s products and services. But the organization might be able to improve its operational processes, enhance efficiency, and improve effectiveness through innovation.

How is the Baldrige definition of "innovation" different from the dictionary definition?

Elements that may differ include those in caps: Innovation is making meaningful change to IMPROVE products, processes, or organizational effectiveness AND CREATE NEW VALUE for stakeholders. In addition, an innovation may not be completely new--just NEW TO ITS PROPOSED APPLICATION. The outcome is a DISCONTINUOUS IMPROVEMENT in results, products, or processes.

How is innovation reflected in scoring?

The extent of innovation ("making meaningful change to improve products, processes ... and create new value for stakeholders") is included in the Learning dimension beginning at the 50-65% scoring range. (Some Baldrige-based programs break up the definition of innovation, with two different types parsed into two scoring ranges. The national program DOES NOT use this approach to scoring.)

If an organization doesn't show evidence of innovation, or doesn't show much innovation, can it score in the 50-65% range (or higher) for a process item?

Yes, it can. Innovation is only one element of Learning, which is only one of the four evaluation factors you take into account in scoring an item. The other elements in the Learning evaluation factor are evaluation and improvement, and organizational learning. No single evaluation factor, or as in this case, single element within an evaluation factor, should be used as a gate to prohibit the possibility of scoring in any particular range. As always, consider the applicant’s responsiveness holistically and choose the range for the item that is the most descriptive of the organization's performance, using the lens of the organization's key factors in making this judgment.

In writing results comments, is there a mathematical algorithm for determining when to write that "few," "many," and "most" results meet a certain question or evaluation factor?

No. A mathematical approach doesn't take the importance of results into account. Instead, look at the results that are relevant to the area to address you are writing about and to the organization's key factors, and make a holistic determination of how well they respond to the question and the evaluation factors (LeTCI).

Can I use results to determine the effectiveness of a process?

To a limited degree. Note the definition of “effective”: “How well a process or a measure addresses its intended purpose. Determining effectiveness requires (1) evaluating how well the process is aligned with the organization’s needs and how well is it deployed, or (2) evaluating the outcome of the measure as an indicator of process or product performance.” Performance certainly is an indicator that something is working well or not so well, but other factors also impact performance. You should not assume that unfavorable results come only from ineffective processes, any more than you would assume that favorable results automatically mean that processes are systematic, well deployed, regularly evaluated and improved, and well integrated and aligned. If results performance were due only to the maturity and effectiveness of processes, there would be no need to evaluate anything other than results.

The Criteria ask about customer requirements and expectations (3.2b[1]), but not what these are. 7.2a(2) asks for customer engagement results. How do I determine if what is presented in 7.2a(2) is responsive to what is important to customer engagement?

You would refer to P.1b(2), which asks for key customer requirements and expectations. Engagement comes from meeting these requirements and exceeding expectations, and from other aspects of building relationships (3.2b).

What does “holistic scoring” mean?

In scoring, we ask you to determine the scoring range that is “most descriptive” of performance. We chose the term “holistic” for this determination deliberately, with attention to the dictionary definition—the idea that the whole is more than merely the sum of its parts. An analogy might be the blind men/elephant story, where each man is aware of one part of a complex whole. Depending on what part of the elephant each man observes, he comes up with a different description of a very complex animal—and none of these descriptions is accurate. Holistic scoring is not an exact science; nor is it meant to be. It is interpretive. Total consistency of individual scoring at Independent Review is not the goal. Variable IR scoring—by examiners from a variety of backgrounds—leads to rich discussion during Consensus Review. It is this discussion among the examiners that leads to a more complete understanding of the applicant and thus more accurate scoring. If scoring were completely consistent at IR, we would not need CR. (Some Baldrige-based programs use "scoring calibration" guidelines and "gates" to block higher scores. The national program DOES NOT use this approach to scoring.)

What is the added value of holistic scoring to the organization? Why not use a more precise method of arriving at a score?

The added value is increased accuracy—scores that reflect the whole applicant. It’s a matter of aiming for validity, not just reliability. In holistic scoring, no one evaluation factor should serve as a gate that keeps the score out of a higher range. Using a somewhat mathematical formula for scoring (where one factor is a gate) can result in a lower, less accurate score. Also, if a formulaic approach were possible, we wouldn’t need Consensus Review.

Give an example in which approach is scored lower than deployment, learning, and integration.

Here’s an example: an approach is responsive to the overall questions (50–65%). It is well deployed, with no significant gaps, with systematic evaluation and improvement and organizational learning used as key management tools, and it is integrated with current and future organizational needs (70–85%). The organization might well score 70–85% for this item. This scenario may not be very common, but it is certainly possible. Your score should be the result of a holistic assessment of all four factors to determine which range best describes the applicant’s maturity level. The approach element may be useful as an indicator of where to begin the conversation of which range to choose, but not as a barrier to higher levels of scoring.

Give an example of how using one evaluation factor as a gate might result in a less accurate score.

Often, the evaluation factors that seem to (wrongly) hold back scores are deployment and learning. For example, an organization may have a systematic approach that is integrated with organizational needs, but deployment to a remote location or a recently acquired unit is in the early stages. Some examiners may (wrongly) keep the applicant out of a higher range because of these minor cases of lack of deployment. Similarly, an approach may be effective and systematic, well deployed, and integrated with organizational needs, but there is no innovation associated with it. To allow this factor to depress the score is also inaccurate. Comments support and supplement the score. Together, they tell the applicant where it stands. (Unlike some Baldrige-based programs, the national program DOES NOT ask you to identify "blocking OFIs" that can capitate scores.)

Do you do interrater and intrarater reliability tests among teams to reduce variance in scoring and in comments?

No. We expect, and even want, some variation across team members during Independent Review. This process requires judgment and interpretation, and Consensus Review completes the process of figuring out where the organization stands with regard to scoring. The concern comes when the variation is excessive. We do pay attention to variation and have been working to decrease it through training.

In numerous instances in the Criteria (such as 2.2[b], 4.1a[4], and 4.2b(3), there are only overall questions and no multiple questions. Won’t this lower scores, since organizations never respond to multiple questions?

There is a misunderstanding here. The bolded overall questions ARE multiple questions. They are the most important and/or foundational of the multiple questions, so we labeled them “overall,” but they are still multiple questions. In areas to address with only bolded overall questions, an organization that meets the overall question also meets the multiple questions. This might increase scores—except for the fact that examiners don’t score organizations question by question. They look at the whole item and all the scoring factors, not just the “approach” factor.

To score in the 70–85% range for an item, does an organization need to be responsive to all the multiple questions in that item?

No. “Fully responsive to the multiple questions” reflects the approach description in the 90–100% range. That means that an organization scoring 70–85% would probably have some gaps. The significance of those gaps will impact where within the range the score falls, but you should not expect an organization to be fully responsive to the multiple questions to score in the 70–85% range. Of course, you would look at the organization’s performance on all the evaluation factors and choose the most descriptive range. You wouldn’t choose the score based only on the approach descriptor.

BALDRIGE PERFORMANCE EXCELLENCE PROGRAM