Uploaded by Kajela Hambisa

Assessment(1)

Instructor:
Faculty of Educational and Behavioral Sciences, BDU
Course Objectives
At the conclusion of this course you are expected to:
• Understand concepts related to student learning assessment
• Develop techniques for assessing the performance of students
based on sound principles and educational objectives
• Analyze items to increase the fit for purpose of classroom
assessment tools.
• Interpret assessment results to understand the implications and
thereby make appropriate decisions.
Course Objectives contd...
• Conduct self-assessment of your teaching in classrooms in
view of student learning and standards of teacher
professionalism.
• Adhere to professional assessment ethical standards in
assessing student learning, handling records, using or
communicating assessment results and making decisions.
Chapter 1
Assessment: Concept, Purpose, and Principles
Definitions
Test
A procedure in which a sample of an individual’s behavior is obtained,
evaluated and scored (AERA et al., 1999)
A process of presenting a series of questions that students must answer
(Nitko, 1996) – achievement test
Measurement
A set of rules for assigning numbers to represent objects, traits,
attributes, or behaviors (Reynolds, 2006)
A process of quantifying or assigning a number to performance.
(Nitko, 1996).
Definitions
Assessment
Any systematic procedure for collecting information that can be used to
make inferences about the characteristics of people or objects (AERA et al.,
1999)
It is a general term that includes all the different ways teachers gather
information in their classroom (Nitko, 1996; Airasian, 1996).
Evaluation
The process of making judgments about pupils’ performance, instruction, or
classroom climate (Airasian, 1996)
Frames of Reference for Evaluating
Assessment Results
1. Criterion-referenced: Performance vs. Criteria
 In criterion-referenced interpretation, meaning is provided by describing
what the student can and cannot do.
2. Norm-referenced: One's Performance vs. Others'
 In norm-referenced interpretation, meaning is provided by comparing the
student’s performance to the performance of others or to the
typical performance for that student.
3. Growth-referenced: Present vs. Past Performance
 In growth-referenced interpretation, the present performance of a student
is compared to his/her prior performance.
4. Ability-referenced: Performance vs. Potential
 In ability-referenced interpretation, a student's performance is interpreted in
light of that student's maximum performance (potential).
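The four frames of reference can be illustrated by interpreting one score in four ways. The sketch below is only an illustration: the class name `ScoreContext`, the cutoff, the peer scores, and the "expected" score standing in for potential are all hypothetical values, not figures from the course.

```python
from dataclasses import dataclass


@dataclass
class ScoreContext:
    score: float               # the student's current score
    cutoff: float              # hypothetical mastery criterion
    peer_scores: list[float]   # scores of the comparison (norm) group
    prior_score: float         # the same student's earlier score
    expected_score: float      # hypothetical estimate of the student's potential


def interpret(ctx: ScoreContext) -> dict[str, str]:
    """Return one interpretation of the same score per frame of reference."""
    below = sum(1 for s in ctx.peer_scores if s < ctx.score)
    percent_below = 100 * below / len(ctx.peer_scores)
    gain = ctx.score - ctx.prior_score
    return {
        # Criterion-referenced: performance vs. a fixed criterion
        "criterion": "meets criterion" if ctx.score >= ctx.cutoff else "below criterion",
        # Norm-referenced: performance vs. the performance of others
        "norm": f"scored higher than {percent_below:.0f}% of the group",
        # Growth-referenced: present vs. past performance
        "growth": f"changed by {gain:+.0f} points since the prior assessment",
        # Ability-referenced: performance vs. estimated potential
        "ability": ("at or above expected level"
                    if ctx.score >= ctx.expected_score else "below expected level"),
    }


ctx = ScoreContext(score=70, cutoff=75, peer_scores=[50, 55, 60, 65, 80],
                   prior_score=58, expected_score=68)
result = interpret(ctx)
for frame, meaning in result.items():
    print(f"{frame}: {meaning}")
```

The same score of 70 reads as a shortfall against the criterion but as a gain against the student's own past, which is why the frame of reference should be stated whenever a result is reported.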
Purposes/functions of educational
assessment
What do you think are the purposes of assessment?
Purposes/functions ...
Figure 1. Basic Teaching Model (components: Instructional Objectives, Entering Behaviors, Instructional Procedures, Performance Assessment; linked by a feedback loop)
Purposes/functions ...
Instructional Functions

Helpful to determine what to teach, how to teach it, and how
effective instruction has been

Tests encourage clarification of meaningful objectives

Tests provide feedback to the teacher and to the learner

Properly constructed tests can motivate learning

Tests can facilitate learning

Tests are a useful means of overlearning
Purposes/functions ...
Administrative Functions
 Tests provide a mechanism of ‘quality’ control – policy decisions on
curriculum and instructional practices
 Tests are useful for placement decisions – assigning individuals to
various categories that represent different educational tracks or
levels ordered in some way (remedial, regular, honors)
 Tests are useful for classification decisions – assigning individuals
to different categories that are not ordered in any way (learning
disabled, emotionally disturbed, etc.)
Purposes/functions ...
Administrative Functions
 Tests are useful for selection decisions - to determine
those who are/are not likely to succeed in subsequent
learning tasks
 Tests are useful for accreditation/certification. They are
useful to provide formal credit for demonstrated
knowledge or proficiency.
Purposes/functions ...
Counseling and guidance functions
 To provide information that promotes self understanding and
help students plan for the future – to select careers that best
match to a student’s abilities and interests
Types of Assessment on the Basis
of Purpose
 Preliminary (prognosis): before instruction, during the
first days of school; provides a base for expectations
throughout the school year
 Concerned with students’ skills, attitudes, and physical
characteristics, and essential to guide our
interactions with others and with students.
 Formative -During Instruction
 Based primarily on continuous informal assessments
such as oral questions and observation. It is also based on
formally developed assessments such as quizzes,
seatwork and homework.
 The purpose is to
 know whether or not students have achieved sufficient
mastery of skills and whether further instruction over
these skills is appropriate.
 to determine what adjustments to instruction should
be made.
Cont…
 Types of Formative Assessment
 Observations during in-class activities and of students'
non-verbal feedback during lecture
 Homework exercises as review for exams and class
discussions
 Question and answer sessions, both formal and informal
 Conferences between the instructor and student at various
points in the semester
 In-class activities where students informally present their
results
 Student feedback collected by periodically answering
specific questions about the instruction and their
self-evaluation of performance and progress
 Summative-After instruction
 Based primarily on formally developed assessments like quizzes,
tests, project works, term papers, lab works
 Purpose is to
 To certify student achievement and assign end-of-term grades
 For promoting and sometimes grouping students
 To determine whether teaching procedures should be changed
before the next school year.
Types of Summative Assessment techniques
 Examinations (major, high-stakes exams)
 Final examination (a truly summative assessment)
 Term papers (drafts submitted throughout the semester would
be a formative assessment)
 Projects (project phases submitted at various completion points
could be formatively assessed)
 Portfolios (could also be assessed during its development as a
formative assessment)
 Diagnostic-before or, more typically, during
instruction.
 When it is implemented
 before instruction it is used to anticipate conditions
that will negatively affect learning.
 during instruction, it is used to establish underlying
causes for a student failing to learn a skill/ recurrent
learning difficulties.
Assumptions and principles of educational assessment
 Psychological and educational constructs exist
 Construct is a trait or characteristic that a test is designed to measure (e.g.,
achievement)
 Psychological and educational constructs can be measured
 According to Cronbach (1990) “if a thing exists, it exists in some amount. If
it exists in some amount, it can be measured.” Assessment experts believe
that educational and psychological constructs can be measured
 Although we can measure constructs, our measurement is not perfect
 Some degree of measurement error is inherent in all measurement.
Cont…
 The assessment of student learning begins with
educational values
Educational values should drive not only what we choose to
assess but also how we do so.
 Assessment is most effective when it reflects an
understanding of learning as multidimensional,
integrated, and revealed in performance over time.
it involves not only knowledge and abilities but values,
attitudes, and habits of mind that affect both academic
success and performance beyond the classroom.
 Assessment works best when the programs it seeks to
improve have clear, explicitly stated purposes.
 Assessment is a goal-oriented process. It entails comparing
educational performance with educational purposes and
expectations
Cont…
 Assessment works best when it is ongoing not
episodic.
 Assessment is a process whose power is cumulative.
Though isolated, "one-shot" assessment can be better
than none, assessment is most powerful when it is ongoing.
 Assessment serves as a means to gather
information to make decisions – not an end in
itself.
 Through
assessment, teachers meet
responsibilities to students and to the public
Cont…
 Basic Assumptions of Assessment
 The quality of students’ learning is directly, although not
exclusively, related to the quality of teaching.
 To improve their effectiveness, teachers need first to make their
goals and objectives explicit and then to get specific
feedback on how well they are achieving them.
 To make decisions about students’ learning achievement, the use of
various classroom assessments is vital.
 Teachers should understand that almost all assessment
techniques have their own weaknesses and strengths.
 The use of a combination of assessment techniques increases
the validity and reliability of the data obtained.
 To improve their learning, students need to receive appropriate
and focused feedback early and often; they also need to learn
how to assess their own learning.
Cont…
 Systematic inquiry and intellectual challenge are
powerful sources of motivation, growth, and renewal
for teachers, and classroom assessment can provide
such challenge.
 Classroom assessment does not require specialized
training; it can be carried out by dedicated teachers
from all disciplines
 By collaborating with colleagues and actively involving
students in classroom assessment efforts, teachers
(and students) enhance learning and personal
satisfaction
Cont…

Clarification of what to assess/evaluate must be given priority in the
evaluation process

An assessment/evaluation procedure must be selected because of its
relevance to the identified characteristic or behavior

There are different ways to measure any given construct.
Comprehensive assessment/evaluation requires a variety of
techniques of evaluation. No single mechanism is adequate to
appraise the learners’ progress toward all of the important learning
outcomes
Assumptions and principles of educational assessment

Proper utilization of assessment/evaluation mechanisms
requires an awareness of their limitations

Assessment/evaluation is a means to an end and not an end in
itself. The results obtained from an evaluation procedure should
lead to various sorts of educational decisions.
Continuous assessment
 It is the daily process by which teachers gather information
about learners’ progress in achieving the learning targets
 It makes use of formal/structured (test, exam, assignment etc.)
and informal/less structured (observation, oral questioning etc.)
mechanisms of assessment
 It is meant to be integrated with teaching in order to improve
learning and to help shape and direct the teaching-learning
process.
Continuous assessment
 The assessment is continuous because:
• it occurs at various times as a part of instruction,
• may occur following a lesson,
• usually occurs following a topic and
• frequently occurs following a theme.
Why continuous assessment? Benefits
 It provides regular information about teaching, learning and the
achievement of learning objectives and competencies.
 It also allows teachers to assess, in a classroom environment,
performance-based activities that cannot be assessed, or are difficult to
assess, in an examination (project work, model development,
etc.).
 It is also a powerful diagnostic tool that enables pupils
to understand the areas in which they are having
difficulty and to concentrate their efforts in those
areas.
 It also allows teachers to evaluate the effectiveness of
their teaching strategies.
The role of educational objectives in
assessment
Objectives /instructional objectives / educational
objectives
Relevance refers to whether the objective is based on the needs of the
society and the learner.
Feasibility (realism of objectives) refers to whether the objectives are
achievable or not.
 In terms of content
1. Objectives should be appropriate in level of difficulty and to students’ prior learning
experiences.
2. Objectives should be “real” in the sense that they describe behaviors that
the teacher actually intends to act on in the classroom situation.
Cont…
 Objectives represent what we hope students will learn or
accomplish
 Objectives are stated desirable outcomes of education.
 They give direction for education
 They help teachers to plan instruction, guide students’
learning and provide criteria for evaluating learning
outcomes
Cont…
Objectives should be made specific and measurable, including those concerning attitude and
appreciation
Example
Poor: Students will learn to love science.
Better: After a visit to a local hospital, pharmacy students will
better appreciate the importance of scientific
experimentation.
 objective should describe the overt behavior expected and
the content
Example
Poor: Students will know the world capitals
Better: Students will be able to recall the capitals of the
countries in eastern Africa.
Cont…
 In terms of form
 objectives should be stated in the form of expected student
behavior, not in terms of the teacher’s activities.
Example
 Poor: The teacher will describe the major events in the
Ethio– Italian war.
 Better: The student will recall the military event that
directly led to the outbreak of war between Ethiopia and
Italy
Poor: "The teacher will show the students how to solve
quadratic equations."
Better: "The student will be able to solve quadratic
equations."
Cont…
Objectives should be stated in behavioral or performance
terms
Example
Poor: The student will see the importance of education
(implicit).
Better: The student will be able to identify three major
benefits of education (instructional objective).
Objectives can be general or specific
 General objectives are broader in scope.
 They do not explicitly indicate what a student will be able to
do.
Begin each general objective with a verb (e.g., knows,
applies, interprets);
Example :
The student will be able to understand Newton’s second law
Cont…
 Specific objectives explicitly indicate what a student will be
able to do
 clearly express our instructional intent;
 precisely specify the students’ performance we are willing
to accept as evidence that general objectives have been
attained;
 select appropriate assessment techniques.
The student will be able to state Newton’s second law
2. Don't state instructional objectives in terms of the
learning process
Example
Poor: "The student will study a diagram showing the
human circulatory system."
Better: "The student will identify the parts of the human
circulatory system."
3. Don't include two objectives in one statement
Example
Poor: "The student will be able to list and describe
the fundamental causes of World War II."
Better: "The student will be able to describe the
fundamental causes of World War II."
4. Specific objectives should be directly relevant to the general
objective from which they are derived. For example, consider
the following general objectives:
i. Students will know basic terms….
"The students will write the textbook definition of each term."
ii. Students will understand basic terms
"The students will paraphrase the definition of basic terms in
their own words."
A = The audience for whom the objective is written. It should
be referred to as the learner or the student, not as the learners or
the students.
B = The Behavior or the type of change the learner is
expected to acquire. This should be an overt, observable
behavior.
C = The condition under which the behavior will be
demonstrated.
D = The degree of proficiency or the amount of learning
behavior the learner should display
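The A, B, C, D components can be treated as four fields that together form one objective statement. The sketch below is a minimal illustration: the class name `Objective`, the sentence template, and the sample condition and degree are assumptions made for the example. Only the behavior is borrowed from the eastern Africa capitals example in these slides.

```python
from dataclasses import dataclass


@dataclass
class Objective:
    audience: str    # A: who the objective is written for (singular)
    behavior: str    # B: overt, observable behavior
    condition: str   # C: condition under which the behavior is demonstrated
    degree: str      # D: degree of proficiency expected

    def statement(self) -> str:
        # Hypothetical template: condition first, then audience, behavior, degree.
        return f"{self.condition}, {self.audience} will {self.behavior} {self.degree}."


obj = Objective(
    audience="the student",
    behavior="recall the capitals of the countries in eastern Africa",
    condition="Given a blank map",           # assumed condition, for illustration
    degree="with at least 80% accuracy",     # assumed degree, for illustration
)
text = obj.statement()
print(text)
```

Keeping the four parts separate makes it easy to see which component is missing when an objective is too vague to assess.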
Bloom’s Taxonomy of objectives
 Three domains
The cognitive domain – emphasis on understandings, awareness,
insights.
The affective domain – emphasis on attitudes, appreciations, etc
The psychomotor domain – emphasis on practical skills
Each of these have
Knowledge
 Objectives at the knowledge level require the students to
remember or recall information such as facts, terminology,
problem-solving strategies, and rules.
Example
 The student will be able to name each state capital
Define
Select
Identify
State
Outline
Recite
Recall
Match
List
Name
Comprehension
Objectives at this level require some level of understanding. Students are
expected to be able to translate, restate what has been read, see connections
or relationships among parts of a communication (interpretation), or draw
conclusions or consequences from information (inference).
defend
summarize
predict
estimate
convert
distinguish
discriminate
explain
Infer
extend
paraphrase
Example
 the student will be able to explain how interest rates affect unemployment
Application
Objectives written at this level require the student to use previously acquired
information in a setting other than the one in which it was learned.
change
employ
organize
transfer
modify
compute
prepare
use
produce
solve
demonstrate
develop
relate
operate
Example
the student will be able to apply multiplication of double digits in applied
math problems
Analysis
Objectives written at the analysis level require the student to identify
logical errors (e.g. point out a contradiction or an erroneous
inference) or to differentiate among facts, opinions, assumptions,
hypotheses, and conclusions
break down
differentiate
illustrate
distinguish
subdivide
relate
point out
diagram
deduce
outline
separate out
Example
 The student will distinguish the different approaches to establishing
validity and illustrate their relationship to each other
Synthesis
Objectives written at the synthesis level require the student to
produce something unique or original.
compile
create
categorize
compose
rewrite
design
summarize
devise
formulate
Example
 Given a short story, the student will write a different but
plausible ending
Evaluation
Objectives written at this level require the student to form judgments
and make decisions about the value or worth of methods, ideas,
people, or products that have a specific purpose.
appraise
contrast
interpret
criticize
compare
justify
support
defend
conclude
validate
Example
 The student will judge the quality of validity evidence for a
specified assessment instrument
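The verb lists for the six cognitive levels can be used as a rough lookup when checking the level of a draft objective. The sketch below copies the verbs from the lists above. The names `BLOOM_VERBS` and `levels_for` are hypothetical, and the lookup is only a first check, because several verbs appear under more than one level.

```python
# Verbs as listed for each cognitive level, ordered from simple to complex.
BLOOM_VERBS = {
    "knowledge": ["define", "select", "identify", "state", "outline",
                  "recite", "recall", "match", "list", "name"],
    "comprehension": ["defend", "summarize", "predict", "estimate", "convert",
                      "distinguish", "discriminate", "explain", "infer",
                      "extend", "paraphrase"],
    "application": ["change", "employ", "organize", "transfer", "modify",
                    "compute", "prepare", "use", "produce", "solve",
                    "demonstrate", "develop", "relate", "operate"],
    "analysis": ["break down", "differentiate", "illustrate", "distinguish",
                 "subdivide", "relate", "point out", "diagram", "deduce",
                 "outline", "separate out"],
    "synthesis": ["compile", "create", "categorize", "compose", "rewrite",
                  "design", "summarize", "devise", "formulate"],
    "evaluation": ["appraise", "contrast", "interpret", "criticize", "compare",
                   "justify", "support", "defend", "conclude", "validate"],
}


def levels_for(verb: str) -> list[str]:
    """Return every cognitive level whose list contains the verb."""
    verb = verb.strip().lower()
    return [level for level, verbs in BLOOM_VERBS.items() if verb in verbs]


print(levels_for("solve"))        # a verb listed under one level
print(levels_for("distinguish"))  # a verb listed under two levels
print(levels_for("understand"))   # not an observable verb in these lists
```

A verb that maps to two levels, or to none, signals that the content and context of the objective, not the verb alone, must settle its level.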
Affective
 Levels of affective domain from simple to complex
1. Receiving- one is expected to be aware of or to passively attend to
certain stimuli or phenomena.
listen
attend
share
look
notice
be aware
control
hear etc…
Example: The student will listen actively when the teacher
explains the difference between formative and summative
evaluation.
2. Responding- One is required to comply with given expectations
by attending or reacting to certain stimuli.
follow
play
Practice
participate
discuss
applaud
comply
obey
 Example: The student will submit the assignments on the
deadline
3. Valuing- Display behaviour consistent with a single belief
or attitude in situations where one is neither forced nor
asked to comply.
help
express
act
argue
display
debate
organize
convince
prefer
Example: The student will express his support or
opposition on the nation’s stand against religious
fundamentalism.
4. Organisation- Commitment to a set of values. This level
involves 1) forming a reason why one values certain things
and not others, and 2) making appropriate choices
between things that are and are not valued.
select
formulate
balance
abstract
decide
systematize
compare
define
 Example: By the end of the class, the student will
reconcile his spiritual and economic views on helping
beggars
5. Characterization- All behaviour displayed is
consistent with one’s value system. One has developed
a consistent philosophy of life.
display
manage
exhibit
require
internalize
avoid
resolve
revise
resist
Example: During peer evaluation time, the student
will evaluate his team mates objectively.
Psychomotor Domain
 Levels of psychomotor domain from simple to complex
1. Imitation-The learner observes and then imitates an
action. These behaviours may be crude and imperfect.
repeat
hold
follow
place
grasp
balance
Example: The student will assemble the mobile phone
after observing the technician’s demonstration.
2. Manipulation-Performance of an action with written or
verbal directions but without a visual model or direct
observation.
Example: The student will assemble the mobile phone
listening to the instruction given by the technician
3. Precision-Requires performance of some action
independent of either written instructions or a visual
model.
 Accurately
with control
proficiently
Independently
without error
with balance
Example: The student will be able to assemble the
mobile phone at least 3 times appropriately given 5
chances.
4. Articulation-Requires the display of coordination of a
series of related acts by establishing the appropriate
sequence and performing the acts accurately, with control
as well as with speed and timing.
 harmony
speed
confidence
coordination
proportion
integration
timing
stability
smoothness
Example: The student will be able to assemble the phone
perfectly in less than 5 minutes.
5. Naturalization-A high level of proficiency is
necessary. The behaviour is performed with the least
expenditure of energy and becomes routine, automatic,
and spontaneous.
naturally
effortlessly
professionally
routinely
with ease
automatically
with perfection
spontaneously
UNIT TWO
ASSESSMENT STRATEGIES, METHODS AND TOOLS
2.1 Assessment strategies include:
 Quizzes, Tests, Examinations
 Anecdotal Records
 Interview
 Teacher Observation
 Performance Task
 Exhibitions/Demonstrations
 Checklists, Scales or Charts
 Classroom Presentations
 Diagnostic Inventories
 Peer Evaluation
 Self-Evaluation
 Portfolios
 Rubrics
 Simulation
 Student Journals
 Student-led Conferences
 Quizzes, tests, examinations
 A quiz, test, or examination requires students to
respond to prompts in order to demonstrate their
knowledge (orally or in writing) or their skills
(e.g., through performance).
 Quizzes are usually short; examinations are
usually longer.
 anecdotal records: objective narrative records of student
performances, strengths, needs, progress and negative/positive
behavior
 Interviews
• An interview is a face-to-face conversation in which teacher
and student use inquiry to share their knowledge and
understanding of a topic or problem, and can be used by the
teacher to explore the student’s thinking; assess the student’s
level of understanding of a concept or procedure; and gather
information, obtain clarification, determine positions, and
probe for motivations.
• Teacher observations: regular, first-hand observations of
students, documented by the teacher
 Performance tasks
 During a performance task, students create, produce, perform, or present
works on "real world" issues. The performance task may be used to assess a
skill or proficiency, and provides useful information on the process as well
as the product.
 Exhibitions/Demonstrations
• An exhibition/demonstration is a performance in a public setting, during
which a student explains and applies a process, procedure, etc., in concrete
ways to show individual achievement of specific skills and knowledge
• checklists, scales or charts: identification and recording of students'
achievement can be through rubric levels, letter grade or numerical value,
or simply by acceptable/unacceptable
 Classroom presentations
 A classroom presentation is an assessment strategy that
requires students to verbalize their knowledge, select and
present samples of finished work, and organize their thoughts
about a topic in order to present a summary of their learning. It
may provide the basis for assessment upon completion of a
student’s project or essay.
 Diagnostic inventories: student responses to a series of
questions or statements in any field, either verbally or in
writing. These responses may indicate an ability or interest in a
particular field.
 Peer evaluation: assessment by students about one another's
performance relative to stated criteria and program outcomes
 self-evaluations: student reflections about her/his own
achievements and needs relative to program goals
 portfolios: collections of student work that exhibit the
students' efforts, progress and achievements in one or more
areas
 rubrics: a set of guidelines for measuring
achievement. Rubrics should state the learning
outcome(s) with clear performance criteria and a
rating scale or checklist. Using one assessment for
a multitude of purposes is like using a hammer for
everything from brain surgery to pile driving.
 Simulations: the use of problem-solving, decision-making and
role-playing tasks.
 Diary/ student journals: personal records of, and responses to
activities, experiences, strengths, interests and needs
 Student-led conferences: where the student plans,
implements, conducts and evaluates a conference
regarding their learning achievements. The purpose of the
conference is to provide a forum in which students can
talk about their school work with parents/carers and
demonstrate their growth towards being self-directed
lifelong learners.
2.2. Planning for Assessment tools
(planning and preparing an achievement test)
Steps involved
1. Defining the purpose of the test
The purpose of the test will determine the kind of test to be used. This in
turn will determine the score reporting and interpretation, breadth and
depth of the test coverage, item difficulty, item size etc.
2. Preparing table of specification (test blueprint)
At this stage the number and types of items to be constructed are decided
based on the stated instructional objectives and the content delivered.
3. Selecting appropriate item format
Item format is selected considering the following
preconditions
a) The purpose of the test
b) The time available to prepare and score the test
c) The number of pupils to be tested
d) The physical facilities available for reproducing the test
e) Age and other characteristics of students
f) Your skill in constructing the different types of items.
4. Writing and piloting the initial draft of the test
At this stage, the teacher writes the test items, improves them
using comments from self and colleagues, and tries them out
on a sample of students.
2.3. Preparing Table of
specification
Preparing table of specification involves:
1. Listing down all specific instructional objectives
treated in the class
2. Listing down all content areas to be covered in the
class
3. Preparing a two-way grid/chart that shows how
many questions are to be drawn from each content
area and objective listed.
 When deciding on the relative distribution of
questions for each content and objective area, one
must consider:
1. the amount of content contained
2. the amount of instructional time devoted
3. its role as a prerequisite for future learning
4. other opportunities to evaluate
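One of the considerations above, instructional time, can be turned into a simple proportional allocation of items. The sketch below is one possible approach and not a prescribed procedure: the function name `allocate_items`, the hours per content area, and the total of 30 items are hypothetical. Leftover items after rounding down go to the areas with the largest fractional shares.

```python
def allocate_items(weights: dict[str, float], total_items: int) -> dict[str, int]:
    """Split total_items across content areas in proportion to their weights."""
    total_weight = sum(weights.values())
    exact = {area: total_items * w / total_weight for area, w in weights.items()}
    counts = {area: int(share) for area, share in exact.items()}  # round down
    leftover = total_items - sum(counts.values())
    # Hand out remaining items to the areas with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda a: exact[a] - counts[a], reverse=True)
    for area in by_remainder[:leftover]:
        counts[area] += 1
    return counts


# Hypothetical instructional hours per content area.
hours = {"Perception": 4, "Memory": 6, "Learning": 10}
plan = allocate_items(hours, 30)
print(plan)
```

The resulting counts are only a starting point. The other considerations (amount of content, prerequisite role, other opportunities to evaluate) may justify adjusting them by hand.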
Table of specification developed for General Psychology
Final Exam taken out of 60%
Contents: Perception, Memory, Learning, Emotion, Motivation, Personality, Total
Objectives: Know., Comp., App., Anal., Synth., Eval., Total
Know. and Comp.: 2, 1, 2, 2, 5, 1, 4, 1, 3, 8, 13
App.: 2, 3, 10
Anal.: 7, 6, 28, 2, 2, 8
Synth.: 1, 4, 2, 3
Eval. and Total: 5, 6, 21, 1, 15, 12, 60
 The use of Table of specification
 Generally, the use of table of specification or test blue
print in test development will help ensure
 that only those objectives actually pursued in
instruction will be measured
 that each objective will receive the appropriate relative
emphasis in the test.
 that by using subdivisions based on content and
behaviours, no important objectives will be overlooked
or misrepresented.
2.4 Selecting and developing
assessment methods and tools
2.4.1. Assessment made in the course of teaching
1. Classwork and Homework
 Classwork: tasks that are given during the teaching-learning
process
 Homework: tasks assigned to students by their
teachers to be completed outside of class
2. Observation
 Refers to watching the learner while performing the
necessary skills (performance tasks)
 Observational techniques / tools include:
 Anecdotal records
 Checklist
 Rating scale
 Socio-metric techniques
 Anecdotal records:
 Anecdotal records provide the least structured
method of recording behavioral observation.
 An anecdotal record is merely a brief description of some observed
behavior which appeared significant for evaluation
purposes.
 Checklist
 Checklist is a prepared list of statements relating to
behavior, trait and performance in some area or a
product of some performance. Each statement in the
list is checked in some way to indicate presence or
absence of a particular quality. It is frequently used to
evaluate aspects of pupil’s interests, attitudes,
activities, skills, and personal characteristics.
A student’s class participation checklist
Class participation              Yes   No
Listening to the lecture
Answering easy questions
Answering all questions
Answering difficult questions
Taking lecture notes
Leading group discussion……..
 Rating Scale
 It is a device for systematically recording observers’
judgments concerning the degree to which a quality or
trait is present.
Rating scale for a student’s class participation
Class participation              Never seen (1)   Sometimes (2)   Usually (3)   Always (4)
Listening to the lecture
Answering easy questions
Answering all questions
Answering difficult questions
Taking lecture notes
Leading group discussions …….
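Because each rating-scale category carries a number from 1 to 4, one student's ratings can be summarized as a total and a mean. The sketch below assumes the four category labels from the scale above. The ratings themselves are hypothetical, and the names `SCALE` and `summarize` are introduced only for this illustration.

```python
# Numeric value attached to each category of the rating scale.
SCALE = {"never seen": 1, "sometimes": 2, "usually": 3, "always": 4}

# Hypothetical ratings of one student by one observer.
ratings = {
    "Listening to the lecture": "always",
    "Answering easy questions": "usually",
    "Answering difficult questions": "sometimes",
    "Taking lecture notes": "always",
}


def summarize(ratings: dict[str, str]) -> tuple[int, float]:
    """Return the total and the mean of the numeric ratings."""
    points = [SCALE[label] for label in ratings.values()]
    return sum(points), sum(points) / len(points)


total, mean = summarize(ratings)
print(f"total = {total}, mean = {mean:.2f}")
```

A checklist could be summarized the same way by counting the "Yes" marks, although it records only presence or absence rather than degree.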
 Socio-Metric Technique
 Socio-Metric Technique is a method for evaluating the
social relationships existing in a group. Each group
member is asked to indicate those individuals he/she
would prefer as associates for some group situation or
activity. The number of choices each pupil receives
serves as an indicator of his/her social acceptance.
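Counting the choices each pupil receives is a simple tally. The sketch below uses made-up pupil names and choices. The function name `choices_received` is hypothetical, and the rule of ignoring self-choices is an assumption of this example.

```python
from collections import Counter

# Hypothetical data: each pupil names the classmates preferred as associates.
choices = {
    "Abebe": ["Sara", "Hana"],
    "Sara": ["Hana", "Abebe"],
    "Hana": ["Sara", "Abebe"],
    "Dawit": ["Sara", "Hana"],
}


def choices_received(choices: dict[str, list[str]]) -> Counter:
    """Count how many times each pupil is chosen by others."""
    counts = Counter({name: 0 for name in choices})  # start everyone at zero
    for chooser, chosen in choices.items():
        counts.update(c for c in chosen if c != chooser)  # ignore self-choices
    return counts


received = choices_received(choices)
for name, n in received.most_common():
    print(name, n)
```

Pupils who receive no choices remain in the tally with a count of zero, so they are not overlooked when the results are read.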
Strengths and weaknesses of
observation
 Strengths
 It enables skills to be seen live
 It enables mistakes to be easily identified so learners
can learn more
 It is reliable since evidence has been seen/first hand
information is best gathered through observation
 Weaknesses
 Timing has to be arranged to suit all learners; it is
time-consuming.
 The assessor might not be objective in his/her decisions
2.4.2. Periodic Assessment method
Test /examination
Types of tests
Tests can be categorized in different ways based on different
criteria
1. The kind of answer required
On the basis of this criterion, tests are classified as
selection (selected-response) tests and supply
(constructed-response) tests.
 Selection type items (selected-response item format)
require the student to select the correct answer from
the given alternatives, e.g. multiple choice, true/false,
matching
 Supply type items (constructed-response item
format) require the student to write the answer for the
item, e.g. short answer, completion, essay
2. The nature of item scoring
 Objective and subjective
 Objective test items have clear and unambiguous
scoring criteria. Because of this, different
scorers can give the same result to a student’s
answer
 Subjective test items may be scored differently by
different scorers because the scoring criteria are
unclear or ambiguous
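The difference between objective and subjective scoring can be made concrete by checking how often two scorers give the same mark to the same answers. The sketch below computes simple exact agreement. The function name `exact_agreement` and all the marks are hypothetical, and exact agreement is only one of several possible indices of scorer consistency.

```python
def exact_agreement(scores_a: list[int], scores_b: list[int]) -> float:
    """Proportion of answers on which two scorers gave the same mark."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both scorers must mark the same set of answers")
    same = sum(1 for a, b in zip(scores_a, scores_b) if a == b)
    return same / len(scores_a)


# Hypothetical marks: five multiple-choice items scored right (1) or wrong (0).
objective_rate = exact_agreement([1, 0, 1, 1, 0], [1, 0, 1, 1, 0])

# Hypothetical marks: four essays scored out of 10 by two teachers.
essay_rate = exact_agreement([8, 6, 7, 9], [7, 6, 5, 9])

print(objective_rate, essay_rate)
```

Lower agreement on the essay marks in this made-up data reflects the point above: the less explicit the scoring criteria, the more the result depends on who scores.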
3. Degree of standardization
 Standardized and non standardized
 Standardized tests
 Are constructed by test specialists with
curriculum experts and teachers
 Can be used to compare the performance of one
school's students with that of students in other schools
 It is administered at the same time in different
places, e.g. General School Leaving Examination
 Non standardized tests / teacher made tests
 Is constructed by classroom teachers
 Used to gather information about the progress of
students in the classroom
 It is administered at any time
 Can not be used to compare students of different
schools
 4: Number of testees
 Individual and group test
 Individual test
 Is designed to be administered to one person at a
time.
 It requires more training, time, money, and
experience to administer than a group test
 Group test
 Is administered to a group of persons at the same
time and place
5. Nature of responses required by the item (language
emphasis of the response)
Verbal, non-verbal and performance tasks
 Verbal: mostly uses written language to respond to
items
 Non-verbal:
 De-emphasizes the role of reading and writing
language
 The responses to the items are presented in the form
of pictures, figures, musical records, etc.
 Performance tasks
 Require students to perform a task rather than
to answer questions.
6. The speed with which students complete the test items
 Speed and power
 Speed test
 Requires students to complete the items as fast as
possible within a short period of time.
 The items are easy
 Only the most exceptional students complete all
the items in the time allowed
 Power test
 Allows enough time to complete the exam
 Focuses on the amount of knowledge and
comprehension students possess rather than on
speed
 Compared with speed tests, power tests are
more difficult.
7. Scheme of interpretation of results
 Norm referenced test and Criterion referenced
test
8. Attributes / behavior being measured
 Cognitive tests and non-cognitive tests
 Cognitive tests
 Are used to measure intelligence, reasoning ability,
and academic achievement.
 Have a correct or best answer (achievement tests
and aptitude tests are included under cognitive
tests)
 Non-cognitive tests
 Are used to measure non-academic or affective
behaviors
 E.g. personality tests such as emotional adjustment
tests, and tests used to measure interpersonal
relationships, motivation, interest, attitude, etc.
9. In terms of content
 Survey and mastery
 A survey test seeks a broad approximation of students'
achievement by measuring attainment of a sample of
the objectives in one or more levels of a curriculum.
 A mastery test is usually employed to get more detailed
information about students' achievement over a short
range of objectives.
 Selected Response Item Format
True/False, matching, multiple choice and interpretive
exercise
Advantages
 Reduces the time needed to mark students' responses
 Speedy assessment
 Wider sampling of content areas
 Provision of automatic feedback to students, particularly
when used in computer-based assessment
 Questions can be pre-tested in order to evaluate their
effectiveness
Limitations
 A significant amount of time is required to construct
good questions
 Writing questions that test higher-order skills requires
much more effort than writing constructed-response ones
 Cannot easily and directly assess written expression,
creativity and performance
 Problem of guessing
True/False (Alternative Response) Item Format
 A statement is given and students express their agreement or
disagreement with the truthfulness (correctness) of the statement by
choosing one of two mutually exclusive options. The options can be
given as: True or False, Correct or Incorrect, Yes or No, Right or
Wrong, Valid or Invalid, etc.
 Students may also be required to judge whether the converse
statement is correct. Hence, the additional
options become Converse True (CT) or Converse False (CF).
 E.g. T  F  CT  CF. All Ethiopians are Africans.
 First, the student has to judge whether the direct
statement “All Ethiopians are Africans” is True or False.
 Second, he/she is also required to judge whether the
converse statement is True (CT) or False (CF) by
mentally reversing the direct statement from “All
Ethiopians are Africans” to “All Africans are Ethiopians”.
Advantages
1. Are comfortable for young children or pupils.
2. Can cover a larger amount of subject matter in a
given testing period than any other selected-response
item format.
3. Misspelling and illegible or untidy handwriting
are not an issue in this item format.
4. Are appropriate when there are not 3 or 4 plausible
(equally attractive) distracters for a multiple choice item.
5. If carefully constructed, they measure the higher
mental processes of understanding, application, and
interpretation. For instance, here is an example
developed at the application level of the cognitive
domain:
T F 1. If X + 3X = 9, then the value of X is 2.
 To answer the above item, the student must work
through the appropriate mathematical algorithm.
 Limitations
1. Highly susceptible to guessing (a student has a 50%
chance of guessing each item correctly).
2. Highly susceptible to cheating.
3. It is difficult to write statements that are
unequivocally true or false.
4. True/False items tend to rely heavily on rote
memorization of isolated facts, thereby trivializing the
importance of understanding those facts.
 Guidelines for constructing True/False items
1. Test significant content of the course and avoid trivial
statements.
T F Benjamin Bloom was of Jewish origin.
 What is the significance of this item for an educational
assessment and evaluation course?
2. Write items that can be classified unequivocally as
either true or false. If a statement is an opinion or an arguable
theory, attribute it to its source. Look
at the following example:
 Poor: T F. Females' feeling of inferiority results from
their lack of a penis.
 Better: T F. According to Freud, females'
feeling of inferiority results from their lack
of a penis.
3. Avoid taking statements verbatim from the
textbook, because:
 It pushes students toward rote learning rather
than grasping the gist.
 The context in which the statement is
used in the textbook may not be available
in the exam, so the item becomes ambiguous.
 For example:
 Poor: T F The square of the hypotenuse of a right
triangle equals the sum of the squares of the other two
sides.
 Better: T F If the hypotenuse of an isosceles
right triangle is 7 inches, each of the two equal sides
must be more than 5 inches.
4. Test only a single major idea in each item, and avoid
compound statements unless the item measures a
cause-effect relationship.
 Poor: T F Ethiopia is the oldest origin of human
civilization and currently it is among the middle income
countries of the world.
 Better: T F Ethiopia is the origin of human civilization.
 Better: T F Currently Ethiopia is grouped among the
middle income countries of the world.
5. Avoid trick questions. Trick questions cheat students.
 E.g. writing an item with a deliberately misspelled word.
 Poor: T F The largest sports kit producer is Addidos.
 Better: T F The largest sports kit producer is Adidas.
6. Avoid using absolute terms like “always,”
“all,” or “never” in false statements and relative
terms like “usually,” “often,” or “many” in true
statements.
 T F. All Americans are educated.
 T F. Most Americans are educated.
7. Avoid negatively worded statements. If a negative is
unavoidable, put the negative word in italics or bold, or underline it.
 Poor: T F Animals do not produce their own food.
 Better: T F Animals produce their own food.
8. Put the items in random order and make the number of
true and false statements approximately equal. This
minimizes the problem of response sets.
9. Avoid long, drawn-out statements and complex
sentences with many qualifiers. This helps students
understand what is asked and reduces the language
barrier.
10. Make the length of true and false statements
approximately equal. Avoid making true items
consistently longer than false ones.
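Guideline 8 above can be checked mechanically. The following is a minimal sketch in Python and is not part of the original slides: the item bank, the function names `shuffle_items` and `true_share`, and the seed value are all hypothetical. It puts keyed True/False items in random order and reports the share of items keyed True, which should be close to one half.

```python
import random

# Hypothetical item bank: (item label, keyed answer). Labels are placeholders.
ITEMS = [
    ("Item 1", True), ("Item 2", False), ("Item 3", True),
    ("Item 4", False), ("Item 5", True), ("Item 6", False),
]


def shuffle_items(items, seed=None):
    """Return a new list with the items in random order; the input is not modified."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled


def true_share(items):
    """Proportion of items whose keyed answer is True."""
    if not items:
        raise ValueError("need at least one item")
    return sum(1 for _, key in items if key) / len(items)


test_form = shuffle_items(ITEMS, seed=2024)
print([label for label, _ in test_form])
print(f"Share keyed True: {true_share(test_form):.2f}")
```

A share far from 0.50 suggests rewriting some statements so that true and false keys are roughly balanced before the test is assembled.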
Matching exercise
 A matching exercise typically consists of a list of
questions or premises to be answered along with a list
of responses.
 The examinee is required to make an association
between each item in the question column (premises) and an item in the
response column (alternatives).
 Advantages
 It is compact
 It measures a large number of objectives stated at the
knowledge level in a minimum amount of time (item
sampling is higher)
 Limitations
 It is generally restricted to the measurement of factual
information based on rote learning.
 It is difficult to find homogeneous material.
Guidelines for Constructing Matching Exercise
1. Use homogeneous material in each list of a matching
exercise.
Column A                    Column B
1. Abebe Bikila             A. Atlanta
2. Fatuma Roba              B. Barcelona
3. Mirutse Yefitir          C. Gura
4. Mamo Wolde               D. Mexico
5. Derartu Tulu             E. Moscow
6. Ras Alula Abba Negga     F. Munich
2. Include directions that clearly state the basis for the matching.
3. Keep the lists of items to be matched unequal in length.
4. Put the questions or premises (typically longer than the
responses) in a numbered column at the left, and the response
choices in a lettered column at the right.
5. Arrange the list of responses in alphabetical or numerical order
if possible, in order to save reading time. Look at Column B under
Table 4: the names of the Olympic cities are arranged in
alphabetical order of their initial letters.
6. Limit the list of premises to between 5 and 10. Furthermore,
put all the responses to be matched on the same page, as this
helps to prevent noise when students flip
back and forth through the test booklet.
Multiple Choice
The multiple choice item (MCQ) consists of two distinct parts:
1. The first part, which contains the task or problem, is called the stem of the item. The stem
may be presented either as a question or as an incomplete statement. The form makes no
difference as long as it presents a clear and specific problem to the examinee.
2. The second part presents a series of options or alternatives. Each option represents a possible
answer to the question. In the standard form, one option is the correct or best answer, called
the keyed response, and the others are misleads or foils, called distracters. The number of
options differs from one test to another. An item must have at least three answer choices
to be classified as a multiple choice item. The typical pattern is to have four or five choices to
reduce the probability of guessing the answer. In a good item, all the
options look like plausible answers, at least to those examinees who do not know the
answer.
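The effect of the number of options on guessing can be made concrete. The sketch below is illustrative only and rests on one assumption not stated in the slides: an examinee who does not know the answer picks blindly, with every option equally likely. The function names and the 40-item test length are hypothetical.

```python
from fractions import Fraction


def chance_of_guessing(n_options):
    """Probability of hitting the keyed response by blind guessing (assumed equally likely options)."""
    if n_options < 2:
        raise ValueError("an item needs at least two options")
    return Fraction(1, n_options)


def expected_chance_score(n_items, n_options):
    """Expected number of items answered correctly by blind guessing alone."""
    return n_items * chance_of_guessing(n_options)


# 2 options corresponds to True/False; 3 to 5 options to multiple choice items.
for k in (2, 3, 4, 5):
    per_item = chance_of_guessing(k)
    total = float(expected_chance_score(40, k))
    print(f"{k} options: chance per item = {per_item}, "
          f"expected chance score on 40 items = {total:.1f}")
```

Under this assumption, moving from two options to four halves the score a student can expect from guessing alone, which is the reason the slides give for using four or five choices.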
Multiple...
 Knowledge of terminology
 Knowledge of specific facts
 Knowledge of principles
 Knowledge of methods and procedures
 Advantages of Multiple Choice Items
 It is flexible
 It avoids the problem of spelling errors by students
 Multiple choice items have greater reliability per
item
 The need for homogeneous material is minimized
or avoided
 It is relatively free from response sets
 Using a number of plausible alternatives makes
the results useful in diagnosing students' learning
errors.
 Cont…
 It is free from the common weaknesses of other item
types
 Students must know the correct answer; recognizing that a statement is false is not enough
 Example
T F The African Union was established in Algeria.
The African Union was founded in _______.
A. South Africa
B. Kenya
C. Ethiopia
D. Algeria
 Limitations of Multiple Choice Items
1. It is limited to the measurement of verbal material
2. It is unsuitable for measuring the synthesis and evaluation levels of the
cognitive domain
3. Difficulty of getting plausible distracters
Suggestions for Constructing Multiple Choice Items
 A clearly stated problem
 The identification of plausible alternatives, and
 Removing irrelevant clues to the answer
 The stem of the item should be meaningful by itself and
should present a definite problem.
 State the stem of the item in positive form wherever
possible
Cont…
 Put as much of the wording as possible in the stem of
the item
 All the alternatives should be grammatically consistent
with the stem of the item
 Each item should contain only one correct or clearly
best answer
 Make the distracters plausible and attractive to
the uninformed
 The correct answer should appear in each of the
alternative positions an approximately equal
number of times, but in random order
Cont…
 Use “none of the above” and “all of the
above” with care as alternatives; it is better to use “A & B”
or “B & C”
 Make certain each item is independent of the
other items in the test
 Avoid giving clues to the correct answer. Items containing
 irrelevant clues
 specific determiners
 correct answers that are consistently longer
 grammatical inconsistencies between the stem and the
wrong alternatives
tend to be easier than items
without these faults
Subjective Items Format
 Essay items enable students to select, organize,
interpret, and present ideas in their own ways.
 They give the student more freedom to respond
than objective test formats do.
A. Restricted response questions
 They limit both the content and the form of the pupil's
response
 Example
In a short paragraph, explain why multiple choice
items are widely used.
Cont…
 Extended Response Essay Questions
 They allow the student to determine the length and
complexity of the response
 Most useful at the synthesis or evaluation levels of the
cognitive taxonomy
 Used to determine whether students can organize, integrate,
express, and evaluate information, ideas, or knowledge
 Examples
 Evaluate the effects of globalization.
 Identify as many different ways to generate electricity as
you can. Give the advantages and disadvantages of each
and how each might be used to meet the electrical
requirements of a medium-sized city.
Cont…
 Advantages of Essay Items
 Most effective in assessing complex learning
outcome
 Relatively easy to construct
 Emphasize essential communication skills in
complex academic disciplines
 Guessing is eliminated
Cont…
 Limitations of Essay Tests
 Difficult to score
 Scores are unreliable
 The limited sampling they provide
 Bluffing
 Suggestions for Constructing Essay Questions
 Restrict the use of essay questions to those learning outcomes
which cannot be satisfactorily measured by objective items
 Formulate questions that will call forth the behavior specified in
the learning outcomes
Cont..
 Phrase each question so that the pupils’ task is
clearly indicated
 Indicate an approximate time limit for each
question
 Avoid the use of optional questions
Suggestions for Scoring Essay Items
 Prepare an outline of the expected answer in advance
 Use the scoring method that is most appropriate
Two methods:
I. The point method
II. The rating method
Cont…
A. The point method
 The criteria may include content, organization, word
selection, accuracy/reasonableness, completeness,
originality and so on
B. The rating method
 The teacher is generally more interested in the overall quality
of the answer than in specific points.
General recommendations for scoring essay items
 Decide on provisions for handling factors that are
irrelevant to the learning outcomes being measured,
e.g. handwriting, language …
 Evaluate all answers to one question before going on to the
next question
 Evaluate the answers without looking at the pupil's name.
 If important decisions are to be based on the results, obtain
two or more independent ratings
 Assembling the Classroom Test
 Prepare exams early
 Extra items make it easy to eliminate those items
found to be defective
i. Recording of Test Items
 It is desirable to write each item on a separate index card
 The card should contain information concerning
 the instructional objectives,
 the specific learning outcome
 Space should also be reserved on the card for item
analysis information
Cont…
 ii. Review of Test Items
a. Reviewing the items after they have been set aside for a few
days, and
b. Asking a fellow teacher to review and criticize the items
iii. Arranging of Items in the Test
 the types of items used
 the learning outcomes measured
 the difficulty of the items and
 the subject matter measured
Cont…
Based on types of test items
 True-false
 Matching items
 Supply type (short answer and completion)
 Multiple-choice
 Interpretive exercises
 Essay questions
Based on learning outcomes
For example, the items in the multiple-choice section might be
arranged in the following order:
 knowledge of terms
 knowledge of specific facts
 knowledge of principles
 application of principles
Cont…
iv. Preparation of Directions for the Test
 purpose of the test
 time allowed for completing the test
 basis for answering
 procedure for recording the answers
v. Reproducing the Test
 The test items should be spaced and arranged
 It is desirable to proofread the entire test before it is
administered. Charts, graphs and other pictorial material must
be checked
Cont…
 Administering of the Tests
 Adequate working space
 Quiet room
 Appropriate light
 Ventilated room
 Comfortable seat and so on.
Test anxiety
Excessive test anxiety can be caused by:
 threatening pupils with tests,
 warning pupils to do their best “because this test is important”,
 telling pupils they must work fast to complete the test on time,
 threatening terrible consequences if they fail the test.
Cont…
 Other psychological factors to be considered by the
teacher are:
 Time of testing, if tests are administered just before
“The big game” or “the big holiday”, the results may
not be representative.
 Individual pupil fatigue, the onset of illness, or worry
about a particular problem may prevent maximum
performance.
 Things we need to avoid during test administration
 Do not talk unnecessarily before letting students start working
 Keep interruptions during the test to a minimum
 Avoid giving hints to pupils who ask about individual items
 Discourage cheating
 Avoid activities that do not fit with test administration
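Item Analysis Procedures
 Steps of item analysis
1. Arrange the scored test papers in order from the
highest to the lowest score, or in reverse order.
2. From the arranged test papers, form two groups: an
upper group and a lower group.
If there are 40 students or fewer, all are included in the
analysis. If there are more than 40, take
N × 27/100 students as the upper group and N × 27/100 as the lower group.
Difficulty level (P):
$$P = \frac{R_U + R_L}{T}$$
where $R_U$ and $R_L$ are the numbers of students in the upper and lower groups who answered the item correctly, and $T$ is the total number of students in the two groups. P is reported as a percentage in the guidelines that follow.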
Cont...
 When P levels are less than about 25%, the item is considered relatively
difficult
 When P levels are above 70%, the item is considered relatively
easy
 Test construction experts try to build tests in which most
items have P levels between 20% and 80%, with an average P level of
50%.
Discrimination power of the item
$$D = \frac{R_U - R_L}{\frac{1}{2}T}$$
 If the value of D
 is ≥ 0.40, the item is very good
 is 0.30 – 0.39, the item is reasonably good but subject to
improvement
 is 0.20 – 0.29, the item is moderately good but needs revision
 is < 0.20, the item is poor and needs serious revision or rejection
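The difficulty and discrimination formulas can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names are hypothetical, and the sample counts (5 of 10 correct in the upper group, 3 of 10 in the lower group) are taken from the keyed option in the distracter table in these slides. The rating bands follow the D thresholds listed above.

```python
def difficulty_index(r_upper, r_lower, total):
    """P = (Ru + RL) / T, returned as a proportion between 0 and 1."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (r_upper + r_lower) / total


def discrimination_index(r_upper, r_lower, total):
    """D = (Ru - RL) / (T / 2)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (r_upper - r_lower) / (total / 2)


def rate_discrimination(d):
    """Map D onto the rating bands given in the slides."""
    if d >= 0.40:
        return "very good"
    if d >= 0.30:
        return "reasonably good"
    if d >= 0.20:
        return "moderately good"
    return "poor"


# Counts of correct answers: 5 of 10 in the upper group, 3 of 10 in the lower group.
p = difficulty_index(5, 3, 20)
d = discrimination_index(5, 3, 20)
print(f"P = {p:.0%}, D = {d:.2f} ({rate_discrimination(d)})")
```

Here P is kept as a proportion and only formatted as a percentage for display, so the same value can be compared against the 25% and 70% guidelines.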
Cont….
Evaluating the effectiveness of distracters
 Distracter effectiveness is determined by inspection
or observation
 A good or effective distracter is one that attracts
more students from the lower group than from the upper
group
             Alternatives
Group        A*   B    C    D
Upper (10)   5    4    0    1
Lower (10)   3    2    0    5
(* marks the keyed response)
Cont…
 Option “B” is a poor distracter because it attracts more
pupils from the upper group
 Option “C” is completely ineffective as a distracter
because it attracted no one
 Alternative “D” is functioning as intended
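The inspection described above can also be written as a small check. The sketch below uses the counts from the table in these slides; the function name `judge_distracters` and the verdict wording are hypothetical. One assumption is labeled in the code: a distracter chosen equally often by both groups is classed as poor, since the slides only call a distracter good when it attracts more of the lower group.

```python
KEY = "A"  # keyed response, marked A* in the table
UPPER = {"A": 5, "B": 4, "C": 0, "D": 1}  # choices made by the upper group (10 students)
LOWER = {"A": 3, "B": 2, "C": 0, "D": 5}  # choices made by the lower group (10 students)


def judge_distracters(upper, lower, key):
    """Classify every option except the key by comparing group counts."""
    verdicts = {}
    for option in upper:
        if option == key:
            continue
        u, l = upper[option], lower[option]
        if u + l == 0:
            verdicts[option] = "ineffective: attracted no one"
        elif l > u:
            verdicts[option] = "effective: attracts more of the lower group"
        else:
            # Assumption: a tie between the groups is treated as poor.
            verdicts[option] = "poor: attracts as many or more of the upper group"
    return verdicts


for option, verdict in judge_distracters(UPPER, LOWER, KEY).items():
    print(option, "->", verdict)
```

The output agrees with the reading of the table given above: B is poor, C is ineffective, and D works as intended.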
INTERPRETATION OF SCORES
 Once test items are scored, the teacher should
organize their results and give meaning to the scores.
 A student’s score has no meaning by itself.
 It has meaning when compared with other students’
score or compared with a certain criterion
 In this part we will address some of the concepts of
basic descriptive statistics, such as measure of central
tendency, variability, and relationship.
Interpretation
 Table 1: Scores of 25 grade nine students on a mathematics
mid-exam, out of 40%
25  24  26  28  28
28  26  27  20  27
26  25  31  26  14
26  32  28  36  27
27  29  29  26  27
Cont…
 The following scores are geography final examination
results out of 60 for 40 students
30  17  38  46  29  20  16  39  26  29
50  27  36  29  30  51  27  36  22  31
16  28  28  18  32  39  44  49  56  13
22  21  24  27  39  12  10  12  10  20
The Use of Summary Statistics
 The statistic used to report the typical score is called a
measure of central tendency. It includes the mean, median,
and mode.
 To describe the amount of score differences among
students, use a measure of variability. It includes the range,
standard deviation, and variance.
 Use measures of relationship
Measures of Central Tendency
 Describe points on a distribution that represent the
average or typical values.
Cont…
 The Mode
 The mode is the score value that is obtained most
often. It is the most frequently occurring value.
 Example
The following are the scores of 10 students on a 25-item
spelling test:
12, 18, 16, 20, 13, 19, 19, 19, 20, 17
The mode is 19, which occurs three times.
 The Median (Mdn)
 The median is the point that divides the ordered
or ranked scores in a distribution into two equal parts.
 It is determined by arranging the scores in order of
magnitude and selecting the value that separates the scores
into two equal parts.
Cont…
 Example
 To calculate the median of the above scores,
arrange them in order:
12, 13, 16, 17, 18, 19, 19, 19, 20, 20
 Since the number of scores is even, the median is the
average of the two middle scores:
$$\text{Mdn} = \frac{18 + 19}{2} = 18.5$$
The Mean
 The mean is the average score, obtained by adding all the
scores and dividing the sum by the total number of scores.
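The three measures of central tendency can be computed for the spelling-test scores used in the examples above. This is a short sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

# Scores of 10 students on the 25-item spelling test from the example.
scores = [12, 18, 16, 20, 13, 19, 19, 19, 20, 17]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # average of the two middle scores when N is even
mean = statistics.mean(scores)      # sum of the scores divided by the number of scores

print(f"Mode = {mode}, Median = {median}, Mean = {mean}")
```

The sum of the ten scores is 173, so the mean is 173 / 10 = 17.3, slightly below the median of 18.5 because the two low scores (12 and 13) pull the mean down.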
Measures of Variability
 The Range
 The range is the difference between the highest score
and the lowest score in the distribution
 Standard Deviation
 The standard deviation indicates the average
distance of the scores from the mean. It is the
most common and useful measure of variability.
 More formally, the standard deviation is the square
root of the variance ($S^2$).
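The measures of variability can be sketched for the same ten spelling-test scores. One assumption is made here because the slides do not say which variance formula is intended: the population form (dividing by N) is used via `statistics.pvariance` and `statistics.pstdev`. If the sample form (dividing by N − 1) is wanted, `statistics.variance` and `statistics.stdev` can be used instead.

```python
import statistics

# Scores of 10 students on the 25-item spelling test from the earlier example.
scores = [12, 18, 16, 20, 13, 19, 19, 19, 20, 17]

score_range = max(scores) - min(scores)  # highest score minus lowest score

# Assumption: population formulas (divide by N); see the note above.
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)      # square root of the variance

print(f"Range = {score_range}")
print(f"Variance = {variance:.2f}")
print(f"Standard deviation = {std_dev:.2f}")
```

With a mean of 17.3, the squared deviations sum to 72.1, which gives a population variance of 7.21 and a standard deviation of about 2.69.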