KDOQI Clinical Practice Guidelines for Chronic Kidney Disease: Evaluation, Classification, and Stratification




THE OVERALL AIM of the project was to develop a classification of the stages of chronic kidney disease, irrespective of the underlying cause of the kidney disease, and a clinical action plan for the evaluation and treatment of chronic kidney disease. This classification could then be transformed to an “evidence model” for future development of additional practice guidelines regarding specific diagnostic evaluations and therapeutic interventions (Executive Summary).

The Work Group sought to develop an “evidence base” for the classification and clinical action plan, derived from a systematic summary of the available scientific literature on: the evaluation of laboratory measurements for the clinical assessment of kidney disease; association of the level of kidney function with complications of chronic kidney disease; and stratification of the risk for loss of kidney function and development of cardiovascular disease.

Two products were developed from this process: a set of clinical practice guidelines regarding the classification and action plan, which are contained in this report; and an evidence report, which consists of the summary of the literature. Portions of the evidence report are contained in this report. The entire evidence report is on file with the National Kidney Foundation.

Overview of Process

Development of the guideline and evidence report required many concurrent steps:

Creation of Groups

The Co-Chairs of the KDOQI Advisory Board selected the Work Group Chair and Director of the Evidence Review Team, who then assembled groups to be responsible for the development of the guidelines and the evidence report, respectively. These groups collaborated closely throughout the project.

The Work Group consisted of “domain experts,” including individuals with expertise in nephrology, epidemiology, laboratory medicine, nutrition, social work, pathology, gerontology, and family medicine. In addition, the Work Group had liaison members from the National Institute of Diabetes and Digestive and Kidney Diseases and from the National Institute on Aging. Midway through the project, at the request of the KDOQI Advisory Board, the Work Group expanded the target population to include children and invited additional members with expertise in pediatric nephrology. The first task of the Work Group members was to define the overall topic and goals, including specifying the target condition, target population, and target audience. They then further developed and refined each topic, literature search strategy, and data extraction form (described below). The Work Group members were the principal reviewers of the literature, and from these detailed reviews they summarized the available evidence and took the primary roles of writing the guidelines and rationale statements.

The Evidence Review Team consisted of nephrologists (one senior nephrologist and three nephrology fellows) and methodologists from New England Medical Center with expertise in systematic review of the medical literature. They were responsible for coordinating the project, including coordinating meetings, refinement of goals and topics, creation of the format of the evidence report, development of literature search strategies, initial review and assessment of literature, and coordination of all partners. The Evidence Review Team also coordinated the methodological and analytic process of the report, and defined and standardized the methodology of performing literature searches, of data extraction, and of summarizing the evidence in the report. They performed literature searches, retrieved and screened abstracts and articles, created forms to extract relevant data from articles, and tabulated results. Throughout the project, and especially at meetings, the Evidence Review Team led discussions on systematic review, literature searches, data extraction, assessment of quality of articles, and summary reporting. In addition, a member of the Evidence Review Team (BCA) at Johns Hopkins Medical Institutions assisted Dr. Coresh in analysis of NHANES III data.

Development of Topics

The goals of the Work Group spanned a diverse group of topics, which would have been too large for a comprehensive review of the literature. Based on their expertise, members of the Work Group focused on the specific questions listed in Table 8 and employed a selective review of evidence: a summary of reviews for established concepts (review of textbooks, reviews, guidelines, and selected original articles familiar to them as domain experts) and a review of primary articles and data for new concepts.

Refinement of Topics and Development of Materials

The Work Group and Evidence Review Team developed (a) draft guideline statements; (b) draft rationale statements that summarized the expected pertinent evidence; (c) mock summary tables containing the expected evidence; and (d) data extraction forms requesting the data elements to be retrieved from the primary articles to complete the tables. The development process included creation of initial mock-ups by the Work Group Chair and Evidence Review Team followed by iterative refinement by the Work Group members. The refinement process began prior to literature retrieval and continued through the start of reviewing individual articles. Refinement occurred regularly by e-mail, telephone, and in-person communication with local experts, and with all experts during in-person meetings of the Evidence Review Team and Work Group members.

Data extraction forms were designed to capture information on various aspects of the primary articles. Forms for all topics included study setting and demographics, eligibility criteria, causes of kidney disease, numbers of subjects, study design, study funding source, population category (see below), study quality (based on criteria appropriate for each study design, see below), appropriate selection and definition of measures, results, and sections for comments and assessment of biases.

The various steps involved in development of the guideline statements, rationale statements, tables, and data extraction forms were piloted on one of the topics (bone disease) with a Work Group member at New England Medical Center. The “in-person” pilot experience allowed more efficient development and refinement of subsequent forms with Work Group members located at other institutions. It also provided experience in the steps necessary for training junior members of the Evidence Review Team to develop forms and to efficiently extract relevant information from primary articles. Training of the Work Group members to extract data from primary articles subsequently occurred by e-mail as well as at meetings.

Relevance and Appropriateness of Study Designs

Throughout the process of refinement of topics, the types of study design that would be relevant and appropriate to answer the questions posed in Table 8 were carefully considered.

Classification of Stages

Defining the stages of severity was an iterative process, based on expertise of the Work Group members and synthesis of evidence developed during the systematic review. After defining the stages of severity, it was necessary to estimate the prevalence of each stage (albuminuria or proteinuria as a marker of kidney damage, decreased GFR, kidney failure) in the general population. The ideal study design to assess prevalence would be a cross-sectional study of a population representative of the general population. Criteria for evaluation of cross-sectional studies to assess prevalence are listed in Table 150.

Data from NHANES III were fortunately available for some of these analyses. The methods for analysis of data from NHANES III are described in Appendix 2. In addition, articles from studies of community screening were included. For these studies, the relevant result is the estimate of prevalence, expressed as a percent, with the absolute number of individuals derived by extrapolation to the US population, where possible.
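The prevalence result described above can be illustrated with a minimal sketch. It assumes a simple unweighted proportion, which is a simplification: analyses of survey data such as NHANES III generally rely on sample weights (see Appendix 2 for the methods actually used). The function names and all numbers below are hypothetical and chosen only for illustration.

```python
def prevalence_percent(cases: int, sample_size: int) -> float:
    """Prevalence expressed as a percent of the study sample (unweighted)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return 100.0 * cases / sample_size


def extrapolate_count(prevalence_pct: float, population: int) -> int:
    """Absolute number of individuals implied by a prevalence in a reference population."""
    return round(prevalence_pct / 100.0 * population)


# Hypothetical values for illustration only; not taken from NHANES III.
pct = prevalence_percent(cases=45, sample_size=1500)
count = extrapolate_count(pct, population=200_000_000)
print(f"{pct:.1f}% of the sample, or about {count:,} individuals by extrapolation")
```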

Evaluation of Laboratory Tests

Evidence was required to assess the performance of diagnostic tests (prediction equations for GFR, spot urine samples for protein-to-creatinine and albumin-to-creatinine ratios, and new urinary markers of kidney damage) for the evaluation of severity of chronic kidney disease. The ideal study design for diagnostic test evaluation would be a cross-sectional study of a representative sample of patients who are tested using the “gold” (criterion) standard as well as the newer test. Criteria for evaluation of studies of diagnostic tests are listed in Table 151.⁶⁵⁵

For these studies, the relevant result is the measure of performance (bias and precision) of the new test.
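As a rough illustration of bias and precision, the sketch below uses one common convention: bias as the mean difference between the new test and the criterion standard, and precision as the standard deviation of those differences. This report does not state the formulas here, so both definitions are assumptions, as are the function name and the paired values.

```python
from statistics import mean, stdev


def bias_and_precision(new_test, criterion):
    """Return (bias, precision) for paired measurements.

    Assumed definitions (not specified in the report):
      bias      = mean of (new test - criterion standard)
      precision = sample standard deviation of those differences
    """
    if len(new_test) != len(criterion) or len(new_test) < 2:
        raise ValueError("need at least two paired measurements of equal length")
    diffs = [n - c for n, c in zip(new_test, criterion)]
    return mean(diffs), stdev(diffs)


# Hypothetical paired GFR values (mL/min/1.73 m²), for illustration only.
estimated_gfr = [52.0, 61.0, 38.0, 75.0]   # from a prediction equation
measured_gfr = [50.0, 60.0, 40.0, 70.0]    # from the criterion standard
bias, precision = bias_and_precision(estimated_gfr, measured_gfr)
print(f"bias = {bias:.2f}, precision (SD of differences) = {precision:.2f}")
```

A smaller standard deviation of the differences indicates a more precise test under this convention; other definitions (for example, median difference or a percentage within a tolerance) are also in use.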

Association of Level of GFR With Complications

The appropriate study to assess the association of level of GFR with complications would be a cross-sectional study of a representative sample of patients with chronic kidney disease in whom the level of kidney function is related to the presence or absence or severity of a complication. In addition, baseline data from a longitudinal study would be appropriate. Principles of cross-sectional studies to assess associations are described in Table 152.

For some complications, data from NHANES III were available. However, the NHANES III database includes relatively few patients with severely decreased GFR (15 to 29 mL/min/1.73 m²); therefore, it was also desirable to use cross-sectional studies, baseline data from longitudinal studies, and case series of patients with decreased GFR. Data from baseline assessments of patients enrolled in the Canadian Multicentre cohort study of patients with chronic kidney disease were used for Figures 28, 29, 36, 37, 38, 40, and 42.²⁸⁸ Data from all 446 patients enrolled from 1994 to 1997 were available.

Studies that provided data for various levels of kidney function were preferred; however, if data were sparse, studies that provided only the mean level of kidney function were included. Members of the Work Group provided individual patient data that were used for some analyses.

Stratification of Risk (Prognosis)

The appropriate study to assess the relationship of risk factors to loss of kidney function and development of cardiovascular disease would be a longitudinal study of a representative sample of patients with chronic kidney disease with prospective assessment of factors at baseline and outcomes during follow-up. Because it can be difficult to determine the onset of chronic kidney disease and cardiovascular disease, prospective cohort studies were preferred to case-control studies or retrospective studies. Clinical trials were included, with the understanding that the selection criteria for the clinical trial may have led to a non-representative cohort. Criteria for evaluating studies of prognosis are described in Table 153.⁶⁵⁶

Of particular importance is multivariable analysis to control for confounding by factors other than the variables of interest (for example, confounding by age in studies of factors related to cardiovascular disease events). Because of the well-known association between diabetes and cardiovascular disease, diabetic and nondiabetic patients were considered separately. The association between diabetic kidney disease and other diabetic complications was evaluated using reviews of cross-sectional studies and selected primary articles of cohort studies. The association between nondiabetic kidney disease and cardiovascular disease was evaluated using several strategies: reviews and selected primary articles of incidence rates of cardiovascular disease in patients with nondiabetic kidney disease; reviews and selected primary articles of cross-sectional studies of the prevalence of risk factor levels in patients with nondiabetic kidney disease; and a systematic search for cohort studies of the relationship between albuminuria or proteinuria and decreased GFR with subsequent cardiovascular disease events in nondiabetic individuals.

Literature Search

The Work Group and Evidence Review Team decided in advance that a systematic process would be followed to obtain information on topics that relied on primary articles. In general, only full journal articles of original data were included. Review articles, editorials, letters, or abstracts were not included (except as noted). Though reports of formal studies were preferred, case series were also included. No systematic process was followed to obtain textbooks and review articles.

Studies for the literature review were identified primarily through Medline searches of English language literature conducted between February and June 2000. These searches were supplemented by relevant articles known to the domain experts and reviewers.

The Medline literature searches were conducted to identify clinical studies published from 1966 through the search dates. Separate search strategies were developed for each topic. Development of the search strategies was an iterative process that included input from all members of the Work Group. Search strategies were designed to yield approximately 1,000 to 2,000 titles each. The text words or MeSH headings for all topics included kidney or kidney diseases or kidney function tests. The searches were limited to studies on humans and published in English and focused on either adults or children, as relevant. In general, studies that focused on hemodialysis or peritoneal dialysis were excluded. The Medline search strategies are included in the Evidence Report.

Medline search results were screened by clinicians on the Evidence Review Team. Potential papers for retrieval were identified from printed abstracts and titles, based on study population, relevance to topic, and article type. In general, studies with fewer than 10 subjects were not included (except as noted). After retrieval, each paper was screened to verify relevance and appropriateness for review, based primarily on study design and ascertainment of necessary variables. Some articles were relevant to two or more topics. A goal was set of approximately 30 articles per topic. In many cases, the goal was exceeded. Domain experts made the final decision for inclusion or exclusion of articles. All articles included were abstracted and contained in the evidence tables.

Table 154 details the literature search and review for each topic. Overall, 18,153 abstracts were screened, 1,110 articles were reviewed, and results were extracted from 367 articles.
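The yield at each stage of the screening process follows directly from the three totals reported above. The short calculation below uses only those totals; the helper name `percent` is an illustrative choice.

```python
def percent(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`, as a percent."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Totals reported for the literature search and review.
abstracts_screened = 18_153
articles_reviewed = 1_110
articles_extracted = 367

print(f"Reviewed per abstract screened:  {percent(articles_reviewed, abstracts_screened):.1f}%")
print(f"Extracted per article reviewed:  {percent(articles_extracted, articles_reviewed):.1f}%")
print(f"Extracted per abstract screened: {percent(articles_extracted, abstracts_screened):.1f}%")
```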

Format for Evidence Tables

Two types of evidence tables were prepared. Detailed tables contain data from each field of the components of the data extraction forms. These tables are contained in the evidence report but are not included in the manuscript. Summary tables describe the strength of evidence according to four dimensions: study size, applicability depending on the type of study subjects, results, and methodological quality (see the table, Example of Format for Evidence Tables). Within each table, studies are ordered first by methodological quality (best to worst), then by applicability (most to least), and then by study size (largest to smallest).
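The three-key ordering of studies within a summary table can be sketched as a single sort. The encoding below is an assumption made for illustration: quality and applicability are represented as integer ranks on a three-level scale (1 = best or most applicable), and the `Study` class and example entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Study:
    name: str
    quality: int        # hypothetical rank: 1 = best, 3 = worst
    applicability: int  # hypothetical rank: 1 = most, 3 = least applicable
    size: int           # number of subjects


def order_for_summary_table(studies):
    """Order by quality (best first), then applicability (most first), then size (largest first)."""
    return sorted(studies, key=lambda s: (s.quality, s.applicability, -s.size))


# Hypothetical studies, for illustration only.
studies = [
    Study("A", quality=2, applicability=1, size=500),
    Study("B", quality=1, applicability=2, size=100),
    Study("C", quality=1, applicability=1, size=50),
    Study("D", quality=1, applicability=1, size=300),
]
for study in order_for_summary_table(studies):
    print(study.name, study.quality, study.applicability, study.size)
```

Negating the size puts larger studies first while the two rank fields still sort in ascending order, so one tuple key covers all three criteria.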

Study Size

The study (sample) size is used as a measure of the weight of the evidence. In general, large studies provide more precise estimates of prevalence and associations. In addition, large studies are more likely to be generalizable; however, large size alone does not guarantee applicability. A study that enrolled a large number of selected patients may be less generalizable than several smaller studies that included a broad spectrum of patient populations.


Applicability

Applicability (also known as generalizability or external validity) addresses the issue of whether the study population is sufficiently broad so that the results can be generalized to the population of interest at large. The study population is typically defined by the inclusion and exclusion criteria. The target population was defined to include patients with chronic kidney disease and those at increased risk of chronic kidney disease, except where noted. A designation for applicability was assigned to each article, according to a three-level scale. In making this assessment, sociodemographic characteristics were considered, as were the stated causes of chronic kidney disease and prior treatments. If a study is considered to be not fully generalizable, reasons for lack of applicability are reported in the detailed tables on file at the NKF.

GFR Range

For all studies, the range of GFR (or creatinine clearance [CCr]) is represented graphically when available. The mean or median GFR is represented by a vertical line, with a horizontal bar approximating the 95% coverage interval. Studies without a vertical or horizontal line did not provide data on the mean/median or range, respectively. When data were available, the range was calculated as: Range = mean GFR ± 1.96 × (standard deviation).
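The range formula above can be written as a small function. This is a minimal sketch: the function name and example values are hypothetical, and the multiplier 1.96 is the one given in the text.

```python
def gfr_range(mean_gfr: float, sd: float, z: float = 1.96):
    """Approximate 95% coverage interval: mean GFR ± z × (standard deviation)."""
    if sd < 0:
        raise ValueError("standard deviation cannot be negative")
    half_width = z * sd
    return mean_gfr - half_width, mean_gfr + half_width


# Hypothetical study: mean GFR 40 with SD 10 (mL/min/1.73 m²), for illustration only.
low, high = gfr_range(mean_gfr=40.0, sd=10.0)
print(f"Range: {low:.1f} to {high:.1f} mL/min/1.73 m²")
```

The interval is symmetric about the mean. For a study with a low mean and a large standard deviation, the computed lower bound can fall below zero; how such cases were displayed is not stated here.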

When sufficient data were not available, the range was estimated from the full range of GFR levels reported, from the median GFR, or from available graphs. For studies that reported creatinine clearance instead of GFR, the mean and range of creatinine clearance were used to estimate GFR. For studies that reported neither GFR nor creatinine clearance, the mean level of serum creatinine (±standard deviation and/or range) is listed as text (eg, SCr = 3.4 ± 0.3 mg/dL).


Results

In principle, the study design determined the type of results obtained. For studies of prevalence, the result is the percent of individuals with the condition of interest. For diagnostic test evaluation, the result is the strength of association between the new measurement method and the criterion standard. In addition to evaluating the size of correlations and regression coefficients, bias and precision of GFR estimate equations were also considered. For studies of the association between the level of GFR and complications, the result is the direction and strength of the association. In addition to examining continuous relationships (correlations and regressions), the prevalence of complications for levels of GFR corresponding to stages of chronic kidney disease was estimated. For studies of prognosis, the result is the factor and the direction and strength of the association between the risk factor and outcome. Associations were represented according to the following symbols:

The specific meaning of the symbols is included as a footnote for each table.

For studies that provided only single point estimates (such as the mean value) of complications, those values are presented instead of data on association with level of GFR. Studies that reported strength of association of an outcome with GFR are listed and ranked separately from those that simply reported mean levels, with shading used to visually distinguish them.


Methodological Quality

Methodological quality (or internal validity) refers to the design, conduct, and reporting of the clinical study. Because studies with a variety of types of design were evaluated, a three-level classification of study quality was devised:

Summarizing Reviews and Selected Original Articles

Work Group members had wide latitude in summarizing reviews and selected original articles for topics that were determined, a priori, not to require a systematic review of the literature. The use of published or derived tables and figures was encouraged to simplify the presentation.

Translation of Evidence to Guidelines


This document contains 15 guidelines. The format for each guideline is outlined in Table 155.

Each guideline contains one or more specific “guideline statements,” which are presented as “bullets” that represent recommendations to the target audience. Each guideline contains background information, which is generally sufficient to interpret the guideline. A discussion of the broad concepts that frame the guidelines is provided in the preceding section of this report. The rationale for each guideline contains definitions and classifications of markers of disease (if appropriate) followed by a series of specific “rationale statements,” each supported by evidence. The guideline concludes with a discussion of limitations of the evidence review and a brief discussion of clinical applications, implementation issues and research recommendations regarding the topic.

Strength of Evidence

Each rationale statement has been graded according to the level of evidence on which it is based (see the table, Grading Rationale Statements).

Limitations of Approach

While the literature searches were intended to be comprehensive, they were not exhaustive. Medline was the only database searched, and searches were limited to English language publications. Hand searches of journals were not performed, and review articles and textbook chapters were not systematically searched. In addition, search strategies were generally restricted to yield a maximum of about 2,000 titles each. This approach required the exclusion of some topics from searches. However, important studies known to the domain experts that were missed by the literature search were included in the review. In addition, essential studies identified during the review process were also included.

Exhaustive literature searches were hampered by limitations in available time and resources that were judged appropriate for the task. The search strategies required to capture every article that may have had data on each of the questions frequently yielded upwards of 10,000 articles. The difficulty of finding all potentially relevant studies was compounded by the fact that in many studies, the information of interest for this report was a secondary finding for the original studies.

Due to the wide variety of methods of analysis, units of measurements, definitions of chronic kidney disease, and methods of reporting in the original studies, it was often very difficult to standardize the findings for this report.