Development of Official Written Certification Examinations
The ACES Int'l Certified Damage Prevention Specialist™ Official Written Certification Examinations were developed by leading Industry Subject Matter Experts as a Norm-based knowledge assessment tool. The questions in the ACES Int'l Certified Damage Prevention Specialist™ Official Written Certification Examination question pool cover several competency areas and were developed using advanced, scientific methods of test instrument development, supervised by an ACES Int'l certification program written examination developer. In addition, ACES Int'l provides ongoing real-time examination test instrument monitoring to the Program's Certification Administrators (Examination Session Monitors) and Training Providers after each individual written certification examination session. These real-time electronic feedback messages ensure that:
A comparison is made between the population sample (current written certification examination session) and the total population (all previous written certification examination sessions)
Any examination session anomalies (e.g., all students in the exam session got the answer to the same question wrong) are identified and the Certification Administrator and Training Providers are made aware of them
An individual examination session and total population examination session "history" is recorded for study and review
Recorded total population examination sessions are studied for individual test question validity and reliability
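The session-versus-population comparison and the anomaly check listed above could be sketched roughly as follows. This is an illustration only, not ACES Int'l's monitoring system: the function names, the 0/1 response layout, the sample data and the 0.30 gap threshold are all assumptions made for the example.

```python
def item_p_values(responses):
    """Proportion of examinees answering each item correctly.

    responses: one list per examinee, with 1 = correct and 0 = incorrect.
    """
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [sum(r[i] for r in responses) / n_examinees for i in range(n_items)]


def flag_anomalies(session, population_p, gap=0.30):
    """Return (item_index, session_p, population_p, reason) for suspicious items.

    gap is a hypothetical threshold for "session differs from all prior sessions".
    """
    flags = []
    session_p = item_p_values(session)
    for i, (p_s, p_pop) in enumerate(zip(session_p, population_p)):
        if p_s == 0.0:
            flags.append((i, p_s, p_pop, "all examinees incorrect"))
        elif abs(p_s - p_pop) >= gap:
            flags.append((i, p_s, p_pop, "differs from population"))
    return flags


# Hypothetical session: 4 examinees, 4 items.
session = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
]
# Hypothetical proportion correct per item across all previous sessions.
population_p = [0.80, 0.65, 0.70, 0.30]

flags = flag_anomalies(session, population_p)
for item, p_s, p_pop, reason in flags:
    print(f"item {item}: session p={p_s:.2f}, population p={p_pop:.2f} ({reason})")
```

In this made-up session, item 1 is flagged because nobody answered it correctly, and item 3 because the session did far better on it than the population has historically.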
ACES Int'l Certified Damage Prevention Specialist™ Official Written Certification Examinations are a secure knowledge measurement tool. All ACES Int'l Certification Administrators must read and sign "Examination Guideline" forms which outline the responsibilities of Examiners (Certification Administrators) and provide for the following:
Examination Security Requirements
Appropriate Examination Room Preparation
Use of Test Manipulatives
Coding of Identifying Information on Student Answer Documents, Forms, and Written Examination Booklets Including Special Codes
Identifying and Reporting Examination Irregularities
Handling Emergencies During Examination Sessions
Secure Examination Handling Procedures and Storage Requirements
Student Identity Protection
Special Procedures and Accommodations for Examinees with Identified Special Needs or Disabilities
Certification Fee Processing Requirements and Payments
Handling and Returning Examination Materials After Testing
Unofficial Test Results Sharing
In addition, ACES Int'l maintains an examination session database, which is made available to Examination Review Committee Members, who can then use this data to scientifically verify individual examination questions for validity, reliability and deviation. Here are some brief explanations of the types of Basic Measurement Concepts that ACES Int'l may use in its Official Written Certification Examination Development and Assessments:
Ability: A characteristic indicative of an individual's competence in a particular field. The word "ability" is frequently used interchangeably with aptitude, although some psychologists may use "ability" to include "aptitude" and "achievement".
Achievement/Ability Comparison (AAC): The relationship between an individual's score on an examination and the scores of other examinees of similar ability or backgrounds.
Aptitude: A combination of characteristics, whether innate or acquired, that are indicative of an individual's ability to learn or to develop proficiency in some particular area if appropriate education or training is provided.
Calibrated Difficulty Level: A scale value expressing the difficulty level of a test item. This value is different from the conventional difficulty index. The origin of this scale is arbitrary, but the lower the value of the item, the easier that item is.
Correlation: The degree of relationship between two sets of test scores. A correlation of 0.00 denotes a complete absence of linear relationship. A correlation of plus or minus 1.00 indicates a perfect (either positive or negative) relationship. Correlation coefficients are used to estimate test reliability and validity.
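As a rough illustration of the correlation entry, the sketch below computes the Pearson product-moment coefficient, which is one common correlation measure; the source does not say which coefficient ACES Int'l uses. The function name and the score lists are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for the same five examinees on two forms.
form_a = [10, 20, 30, 40, 50]
form_b = [12, 22, 32, 42, 52]       # rises in step with form_a
form_b_reversed = [52, 42, 32, 22, 12]  # falls as form_a rises

r_positive = pearson_r(form_a, form_b)           # close to +1.00
r_negative = pearson_r(form_a, form_b_reversed)  # close to -1.00
print(round(r_positive, 2), round(r_negative, 2))
```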
Criterion-Referenced (Content-Referenced) Examination: A term that describes examinations that are designed to provide information about the specific knowledge or skills possessed by an examinee. These types of examinations cover a relatively small unit of content and are closely related to instruction. Criterion-Referenced scores have meaning in terms of what the examinee knows or can do, rather than in (or in addition to) their relation to the scores made by some norm group. Most frequently, the meaning is given in terms of a cut-off score, for which people who score above that point are considered to have scored adequately (show "competence" or "master" the material), while those who score below it are thought to have inadequate scores ("not competent").
Deviation Score (x): The score for an individual minus the mean score for the group; i.e., the amount a person deviates from the mean.
Diagnostic Test: A test used to "diagnose" or analyze an individual's specific areas of weaknesses or deficiencies, and if possible, to suggest a cause.
Difference Score: Difference between 2 scores for the same individual.
Difference Score Reliability: Reliability of the distribution of differences between two sets of scores.
Difficulty Index: The percent of students who answer an item correctly, designated as p. (Also, at times defined as the percent who respond incorrectly, designated as q).
Discrimination Index: The extent to which an item differentiates between high-scoring and low-scoring examinees. Discrimination indices generally range from -1.00 to +1.00. Usually, the higher the discrimination index, the better the item is considered to be. Items with negative discrimination indices are generally items that need to be rewritten.
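The difficulty and discrimination indices can be illustrated with a short sketch. The discrimination calculation shown here is the upper-lower group method (proportion correct in the top-scoring group minus proportion correct in the bottom-scoring group), with a 27% group size; that choice, the function names and the data are assumptions for the example, since the source does not specify a method.

```python
def difficulty_index(item_responses):
    """p: proportion of examinees answering the item correctly (1 = correct)."""
    return sum(item_responses) / len(item_responses)


def discrimination_index(item_responses, total_scores, fraction=0.27):
    """Upper-lower discrimination: p(high group) - p(low group).

    fraction is the share of examinees placed in each extreme group
    (27% is a common convention, assumed here).
    """
    n = len(total_scores)
    group_size = max(1, round(n * fraction))
    # Examinee indices ordered from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    low = order[:group_size]
    high = order[-group_size:]
    p_high = sum(item_responses[i] for i in high) / group_size
    p_low = sum(item_responses[i] for i in low) / group_size
    return p_high - p_low


# Hypothetical data: ten examinees' total scores and their result on one item.
total_scores = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
item = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]

p = difficulty_index(item)                    # 0.6
q = 1 - p                                     # proportion answering incorrectly
d = discrimination_index(item, total_scores)  # about 0.67
print(p, round(q, 2), round(d, 2))
```

A negative value of `d` would mean that low scorers answered the item correctly more often than high scorers, which is the situation the entry describes as needing a rewrite.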
Distracter: An incorrect answer choice in a multiple-choice item, also called a "foil".
Equivalent Forms: Any of 2 or more forms of an examination that are closely parallel with respect to content and the number and difficulty of the items included in these examinations. Equivalent forms should yield very similar average scores and measures of variability for a given population. These forms are also called "parallel" forms.
Error of Measurement: The amount by which the score actually received (an observed score) differs from a hypothetical true score.
Frequency: The number of times a given score (or set of scores in an interval grouping) occurs in a given distribution.
Frequency Distribution: A tabulation of scores from low to high or from high to low which shows the number of individuals who obtain each score or fall within each score interval.
Gain Score: The difference between a post-test score and a pre-test score.
Item Analysis: The process of examining examinees' responses to examination items in order to judge the quality of each item. The difficulty and discrimination indices are frequently used in this process.
Item Difficulty: see Difficulty Index
Item Discrimination: see Discrimination Index
Latent-Trait Scale: A scaled score obtained through one of several mathematical approaches collectively known as "Latent-Trait Procedures" or "Item Response Theory". The particular mathematical values assigned are arbitrary, but higher scores indicate more knowledgeable people or more difficult items.
Local Percentile: see Percentile
Mastery Level: The cutoff score on a criterion-referenced or mastery examination. Those who score at or above the cutoff score are considered to have "mastered" the material; those who score below the cutoff score are considered to be "Non Masters". "Mastery" when used in this reference is an arbitrary judgment. A cutoff score can be determined by several different methods or designs. Each method or design results in a different cutoff score.
Mastery Examination: An examination designed to determine whether an individual has mastered a given unit of instruction or a single knowledge or skill; an examination giving information on what an examinee knows, rather than on how his or her performance relates to that of some norm group or set of norm groups.
Mean: The arithmetic average of a set of test scores. You can determine the mean by adding all scores in the distribution and dividing by the total number of scores.
Median: The middle score in a distribution set or set of ranked scores; the point or score that divides a group of scores into two equal halves; also known as the "50th percentile". In all cases, half of the scores are above the median and half of the scores are below the median.
Mode: The score or value that occurs most frequently in a distribution set.
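The three measures of central tendency defined above can be computed with Python's standard `statistics` module. The score list is made up for illustration.

```python
import statistics

# Hypothetical raw scores from one examination session.
scores = [70, 75, 80, 80, 85, 90, 94]

mean_score = statistics.mean(scores)      # sum of scores / number of scores
median_score = statistics.median(scores)  # middle score of the ranked list
mode_score = statistics.mode(scores)      # most frequently occurring score

print(mean_score, median_score, mode_score)
```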
N: The mathematical symbol used to represent the number of cases in a group.
National Percentile: see Percentile
Normal Curve Equivalents (NCEs): The normalized standard score with a mean of 50 and a standard deviation of 21.06. see Standard Score. The standard deviation of 21.06 was chosen so that NCEs of 1 and 99 are equivalent to percentiles of 1 and 99. There are approximately 11 NCEs to each stanine. see Stanines.
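A minimal sketch of the NCE conversion, assuming the usual approach of turning a percentile rank into a normal-curve z-value and rescaling it to a mean of 50 and a standard deviation of 21.06. The function name is hypothetical.

```python
from statistics import NormalDist


def percentile_to_nce(percentile_rank):
    """Convert a percentile rank (strictly between 0 and 100) to an NCE."""
    # z-value below which the given share of a normal distribution falls.
    z = NormalDist().inv_cdf(percentile_rank / 100)
    return 50 + 21.06 * z


for pr in (1, 25, 50, 75, 99):
    print(pr, round(percentile_to_nce(pr), 1))
```

Percentile ranks of 1, 50 and 99 map to NCEs of approximately 1, 50 and 99, which is the property the 21.06 standard deviation was chosen to give; ranks in between do not map to themselves.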
Normal Distribution: A distribution of scores or other measures that in graphic form has a distinctive bell-shaped curving appearance. In a normal distribution, the measures are distributed symmetrically about the mean. Cases are concentrated close to the mean and decrease in frequency, according to a precise mathematical equation, the farther cases are plotted from the mean. The assumption that many mental and psychological characteristics are distributed normally is very useful in examination development processes of ACES Int'l.
Norms: The distribution of examination scores of some specified group called a norm group.
Norms vs. Standards: Norms are very different from Standards. Norms are indicators of what a group having similar characteristics did when confronted with the same examination items as those taken by a group in the norms group. Standards, very dissimilarly, are arbitrary judgments of what a group should be able to do, given a set of examination items.
Norm-Referenced Examination: Any examination where the score acquires additional meaning by comparing it to the scores of people in an identified norm group. An examination can be both norm and criterion-based or "referenced". Most standardized Certification Examinations are referred to as "Norm-Referenced".
Objectives: Stated, desired outcomes of education or training.
Out-of-Level Examination: The activity of administering an examination level that is different from the one designed or designated for an individual of a particular age or background.
p-Value: The proportion of individuals in an identified norm group who answer an examination item correctly; usually referred to as the difficulty index. see Difficulty Index
Percentile: A reference point on the norms distribution below which a certain percentage of the scores fall. For example, if 75% of the scores fall below a raw score of 48, then the score of 48 is at the 75th percentile. The term "local percentile" would indicate that the norm group is obtained locally. The term "national percentile" indicates that the norm group represents a national or total population group.
Percentile Band: An interpretation of an examination score that takes into account certain measurement errors. These bands, which are most useful in delineating significant differences between subtests in battery profiles, most often represent the range from one standard error of measurement below the obtained score to one standard error of measurement above it. For example, if an examinee had a raw score of 45, and if the standard error of measurement were 5, the percentile band would run from the percentile rank for a score of 40 to the percentile rank for a score of 50. Given this scenario, we would be 68% confident that the student's true percentile rank falls within this band. (see Standard Error of Measurement and True Score.)
Percentile Rank: The percentage of scores falling below a certain point on a score distribution. (Percentile and Percentile Rank are sometimes used interchangeably).
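The percentile rank and percentile band entries can be worked through in a few lines. The sketch reuses the raw score of 45 and standard error of measurement of 5 from the example above; the norm group (40 scores from 21 to 60) and the function names are invented for illustration.

```python
def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores falling below the given score."""
    below = sum(1 for s in norm_scores if s < score)
    return 100 * below / len(norm_scores)


def percentile_band(score, sem, norm_scores):
    """Percentile ranks one SEM below and one SEM above the obtained score."""
    return (
        percentile_rank(score - sem, norm_scores),
        percentile_rank(score + sem, norm_scores),
    )


# Hypothetical norm group: one examinee at each raw score from 21 to 60.
norm_scores = list(range(21, 61))

rank_45 = percentile_rank(45, norm_scores)          # 60.0
band_45 = percentile_band(45, 5, norm_scores)       # (47.5, 72.5)
print(rank_45, band_45)
```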
Profile: A graphic representation of several or more scores expressed in comparable units of measurement for an individual or a group. This method of presentation allows for easy identification of relative strengths or weaknesses across different examinations or sub-examinations.
Quartile: One of three points that divide the scores in a distribution into 4 groups of approximately equal size. The 1st Quartile, the 25th percentile, separates the lowest 1/4 of the group from the rest; the 2nd Quartile, the 50th percentile or median, divides the 2nd 1/4 of the cases from the 3rd; and the 3rd Quartile, the 75th percentile, separates the top 1/4 of the group from the rest.
Raw Score: The number of correct responses an individual gave on the examination, expressed as a numerical value. Even though raw scores are of value, they should never be used to make comparisons between performances on dissimilar examinations, unless other information is obtained concerning the other examination completed. Raw scores can lead to false assumptions unless they are interpreted in light of other factors, including the total number of questions on an examination, norm group performance and the average or mean score for a given examination.
Readiness Test: A measure used to determine whether an individual's present level of skills is sufficient to undertake a new learning activity or to build a greater knowledge base.
Regression Effect: The tendency of post-test scores to be closer to the mean of their distribution than the corresponding pretest scores were. Examinees who score very high on a pretest generally score lower on the post-test, regressing toward the mean of the post-test distribution, while examinees who score very low on a pretest generally score higher, rising toward that mean.
Reliability: The degree or extent to which examination scores are consistent; the extent to which examination scores are relatively free from random errors of measurement and are dependable. Reliability is most commonly expressed in the form of a reliability coefficient. The higher the reliability coefficient, the fewer the random errors in the scores and thus the more reliable the examination.
Reliability Coefficients: Estimates of reliability obtained from the correlation between scores on 2 similar or equivalent forms of an examination, from the correlation between 2 sets of scores on the same examination, or from internal-consistency estimates or theory. Commonly used estimates of reliability include the Kuder-Richardson formulas; the version known as KR-21 involves the number of items in the examination, the mean of the examination scores and the variance of the examination scores.
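The Kuder-Richardson variant that needs only the number of items, the mean and the variance is commonly labeled KR-21. A minimal sketch of that formula follows; the function name and the example figures (50 items, mean 40, variance 25) are hypothetical, and nothing here is taken from ACES Int'l's actual procedures.

```python
def kr21(n_items, mean_score, variance):
    """Kuder-Richardson Formula 21 reliability estimate.

    KR-21 = (k / (k - 1)) * (1 - M * (k - M) / (k * variance))
    where k is the number of items, M the mean total score and
    variance the variance of the total scores.
    """
    if n_items < 2 or variance <= 0:
        raise ValueError("need at least 2 items and a positive variance")
    k = n_items
    return (k / (k - 1)) * (1 - (mean_score * (k - mean_score)) / (k * variance))


reliability = kr21(n_items=50, mean_score=40, variance=25)
print(round(reliability, 3))  # about 0.694
```

KR-21 treats all items as equally difficult, so it is usually read as a conservative, quick estimate rather than an exact value.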
Reliability of Difference Scores: see Difference Score Reliability
Scaled Score: A mathematical scaling of raw scores.
Scaled Score Band: The scaled score plus and minus one standard error of measurement on the scale score metric. see Standard Error of Measurement and True Score.
Standard Deviation (S.D.): A measure of the variability, or "spread", of a distribution of scores. The more closely the scores cluster around the mean, the smaller the standard deviation. In normal distributions of scores, about 68.3% of the scores fall between one standard deviation below the mean and one standard deviation above the mean. The standard deviation (or S.D.) is computed as the square root of the average squared deviation of each score from the mean.
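The computation from squared deviations can be sketched as follows. The example divides by N (the population form); dividing by N - 1 instead gives the sample form. The score list is made up, and the standard library's `statistics.pstdev` is used only as a cross-check.

```python
import math
import statistics


def standard_deviation(scores):
    """Population standard deviation: root of the mean squared deviation."""
    mean = sum(scores) / len(scores)
    squared_deviations = [(s - mean) ** 2 for s in scores]
    variance = sum(squared_deviations) / len(scores)
    return math.sqrt(variance)


# Hypothetical scores with mean 5 and squared deviations summing to 32.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

sd = standard_deviation(scores)          # 2.0
sd_check = statistics.pstdev(scores)     # same value from the standard library
print(sd, sd_check)
```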
Standard Error of Measurement (SEM): The amount by which an observed score is expected to fluctuate around the true score. According to this model, the obtained score will not differ from the true score by more than plus or minus one standard error about 68% of the time, and will not differ by more than plus or minus 2 standard errors about 95% of the time.
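In classical test theory the SEM is conventionally estimated from the standard deviation of the scores and a reliability coefficient as SD × √(1 − reliability). The source does not state this formula, so the sketch below is offered as the standard textbook relationship; the function names and the numbers are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical estimate: SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)


def score_band(observed, sem, n_sems=1):
    """Range of n_sems standard errors either side of an observed score."""
    return (observed - n_sems * sem, observed + n_sems * sem)


sem = standard_error_of_measurement(sd=10, reliability=0.91)  # approximately 3
band_68 = score_band(45, sem)             # roughly 42 to 48 (about 68% of the time)
band_95 = score_band(45, sem, n_sems=2)   # roughly 39 to 51 (about 95% of the time)
print(round(sem, 2), band_68, band_95)
```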
Standard Score: A term for scores that have been converted from their raw scores to be more comparable and easier to interpret. A basic type of Standard Score is called a "z-score" which is expressed as the deviation of a score from the mean score of the group in relation to the standard deviation (SD) of the scores of the group. Standard scores are most often linear-equated z-scores with different means and standard deviations. see z-Score.
Standards: see Norms vs. Standards
Stanines: A nine-point normalized standard score scale with a mean of 5 and a standard deviation of 2, with only the integers of 1 to 9 occurring. Score percentage at each stanine is 4, 7, 12, 17, 20, 17, 12, 7 and 4 respectively.
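The stanine percentages translate into cumulative cut points of 4, 11, 23, 40, 60, 77, 89 and 96. The sketch below assigns a stanine from a percentile rank using those cut points. How a rank that lands exactly on a cut point is treated is an assumption here (it goes to the higher stanine), and the function name is hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate

# Percentage of scores in stanines 1 through 9.
STANINE_PERCENTAGES = [4, 7, 12, 17, 20, 17, 12, 7, 4]

# Cumulative upper boundaries for stanines 1..8: [4, 11, 23, 40, 60, 77, 89, 96].
CUT_POINTS = list(accumulate(STANINE_PERCENTAGES))[:-1]


def stanine(percentile_rank):
    """Map a percentile rank (0-100) to a stanine from 1 to 9."""
    # Count how many cut points are at or below the rank, then shift to 1-based.
    return bisect_right(CUT_POINTS, percentile_rank) + 1


for pr in (2, 10, 50, 80, 99):
    print(pr, stanine(pr))
```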
T-Score: A standard score with a mean of 50 and a standard deviation of 10.
True Score: A score without error... the "Holy Grail" of examination developers. It is a hypothetical value that can never be realized by testing because a test score always involves some measurement error. A "true" score can be approximated as the average of an infinite number of measurements from exactly the same tests, assuming no practice effect or change in the examinee during testing. The SD of this infinite number of scores is known as the standard error of measurement. see Standard Error of Measurement.
Validity: How well the examination measures what it is intended to measure. It is a form of evidence that an examination is performing as expected. There are 3 basic types of validity: Content validity; Criterion-related validity; and Construct validity.
Variability: The dispersion or "spread" of examination scores also expressed as a standard deviation (SD). see Standard Deviation.
Variance: The square of the standard deviation (SD).
Weighting: The process of assigning different values or "weights" to items or scores in order to make a final decision. Conversion of all scores to a common scale or metric is necessary prior to assigning weights.
z-Score: A type of standard score with a set mean of zero and a standard deviation (SD) of one. see Standard Score
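The z-score and T-score definitions can be tied together with a short sketch: a z-score is the deviation score divided by the standard deviation, and a T-score is a linear rescaling of z to a mean of 50 and a standard deviation of 10. The group scores and function names are hypothetical, and the population standard deviation is used.

```python
import statistics


def z_score(score, group_scores):
    """Deviation from the group mean in standard deviation units."""
    mean = statistics.fmean(group_scores)
    sd = statistics.pstdev(group_scores)
    return (score - mean) / sd


def t_score(score, group_scores):
    """Linear transformation of z to mean 50, standard deviation 10."""
    return 50 + 10 * z_score(score, group_scores)


# Hypothetical group with mean 75 and standard deviation 10.
group = [60, 70, 70, 70, 75, 75, 85, 95]

z_high = z_score(95, group)   # 2 standard deviations above the mean
t_high = t_score(95, group)   # T-score of about 70
z_low = z_score(60, group)    # 1.5 standard deviations below the mean
t_low = t_score(60, group)    # T-score of about 35
print(z_high, t_high, z_low, t_low)
```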
If your Industry is in need of personnel Certification Program development, ACES Int'l has the knowledge and experience to help you.
07/22/06
All rights reserved