In this set of musings I am considering primarily the role of summative assessment as operationalised in day-to-day assessment in the classroom. I appreciate that it is formative assessment that makes a real difference to pupil learning, and I shall return to formative assessment (AfL) in a future post. For the moment, this is really me thinking out loud, and trying out an idea.
Back in 2010, in my OUP “Assessment in Music Education” book (http://amzn.to/1el9PON) I wrote this:
Reasons to assess
In earlier chapters we have looked at purposes of assessment, and considered differences between formative and summative purposes of assessment. To some extent, these can be used as the basis for answering the question ‘why assess’? But other areas also impinge upon this, with varying impact upon teachers and pupils, which means that reasons to assess include:
- To help pupils improve in what they are doing
- To measure what pupils have done
- To measure what pupils have learned
- To provide evidence for pupil progress
- To indicate how much effort a pupil has put into their work
- To ensure that standards rise
- To provide indications as to how well pupils will do in the future
- To provide information for parents concerning their children’s progress
- To check that specific topics have been learned
- For a teacher to evaluate the efficacy of a Unit of Work
- To measure how well pupils have been taught
- To provide statistical data
- To measure how effective a teacher is
- To compare the results from different schools
- To show that music is important in contributing to general education
- To compare different geographical areas in terms of educational outcome
- To evaluate the effectiveness of educational interventions
- To evaluate the efficacy of governmental strategies in education
This is a complex and disparate set of reasons for assessment! Many of these, and many more besides, will impinge at some point upon the work of the music teacher. What is also apparent from this list is that there are tensions between different uses of assessment.
I haven’t changed my mind on this. So when people ask me about assessing without levels, I want to ask them to what use the assessment data will be put? Because for me this is the key question. The list above is very detailed; what I tend to hear on a day-to-day basis from teachers is:
i) to record attainment
ii) to show progress
iii) to report to parents
iv) because I have to (derr!)
Now i-iii of these are worthy and logical, but I worry a bit about iv! But let us look at the first of these for a while.
This can be done for a variety of purposes. I hope that amongst them is the notion of helping pupils improve at whatever it is that they are doing. So, what is attainment in music? It should involve a scaled judgement of quality: of how much of a set of pre-defined assessment constructs exists in the work of a given pupil. This is no easy feat. It requires documenting the constructs, writing assessment criteria, giving these clear grade boundaries, and being able to accord a differentiated mark in the heat of classroom activity.
Let us take this apart a bit. Suppose that KS3 pupils are doing a unit on “spooky music”. The task has been to compose a piece in groups, and perform it to the rest of the class. The piece should be ‘spooky’ in some way, and the pupils have been asked to provide a programme note. So, what should be assessed? Here are some ideas:
- Original ideas
- Organisation of ideas
- Development of ideas
- Structure
- Intentionality
- Instrumental skills
- Atmospheric effectiveness
- Group work
- Musicality
- Use of instruments
- Understanding of topic
- Clarity of performance in realising intentions
- Programme note
…and lots more besides!
But let us suppose that time is not a problem (I know, don’t start throwing things at me!), and that an enterprising teacher turns each of these into an assessment criterion. Let us also suppose that the teacher decides to use that old research favourite, the 5-point Likert scale, to differentiate. So, each construct in the list above will now figure on an assessment grid, with a 5-point marking scale. To save some time, rather than construct statements for each, I am going to use generic 5-point scale statements. These are:
- Well above expected standard
- Above expected standard
- Expected standard
- Working towards expected standard
- Below expected standard
Then I assemble them into some sort of a grid, so I can mark them off:

| Criterion | Well above expected standard | Above expected standard | Expected standard | Working towards expected standard | Below expected standard |
|---|---|---|---|---|---|
| Organisation of ideas | | | | | |
| Development of ideas | | | | | |
| Use of instruments | | | | | |
| Understanding of topic | | | | | |
| Clarity of performance in realising intentions | | | | | |
Or, if I want to involve the pupils, I can use 5-point smiley-icons. Or I could, should I so wish, add some scores, say 1-5 for each column.
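For any teacher keeping records in a spreadsheet or script, the grid above can be sketched as a simple data structure. This is purely a hypothetical illustration: the criterion and scale names come from the lists above, but the functions (`blank_grid`, `mark`, `as_score`) are my own invented names, not anything from the book or the post.

```python
# Sketch of the assessment grid: criteria rated on the 5-point scale.
# Criterion and scale wording comes from the post; the structure is hypothetical.

SCALE = [
    "Well above expected standard",
    "Above expected standard",
    "Expected standard",
    "Working towards expected standard",
    "Below expected standard",
]

CRITERIA = [
    "Organisation of ideas",
    "Development of ideas",
    "Use of instruments",
    "Understanding of topic",
    "Clarity of performance in realising intentions",
]

def blank_grid():
    """One unmarked grid per pupil: each criterion starts unrated."""
    return {criterion: None for criterion in CRITERIA}

def mark(grid, criterion, rating):
    """Record a rating, checking it belongs to the 5-point scale."""
    if rating not in SCALE:
        raise ValueError(f"Unknown rating: {rating}")
    grid[criterion] = rating

def as_score(rating):
    """Optional numeric score: 5 at the top of the scale, down to 1."""
    return 5 - SCALE.index(rating)

# Usage: mark one pupil 'on the fly' during the lesson.
pupil = blank_grid()
mark(pupil, "Use of instruments", "Expected standard")
```

The numeric version simply maps the top of the scale to 5 and the bottom to 1, which is the “add some scores, say 1-5” option mentioned above.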
Now, having done this, one for each child, what would this tell me? It would record attainment, sure. It would be criterion-referenced assessment, and, so long as I was consistent in applying the rating scale, it would show the relative attainment of each pupil in the cohort. But it would not show the pupils how to improve; that would be the role of the teacher. Nor would it show progression; I’ve only done one unit, after all! But it could, if I use this ‘on the fly’ during teaching, show within-unit progression. If I use it to ‘catch the pupils being good’ I needn’t wait until a final (dreadful behaviour-management nightmare) assessment lesson.
If I had time, I could talk to the pupils, and add a criterion for how they revealed their thinking about the music in conversation.
If I then did this for all my units during a year, I could record attainment across a range of units. I would want to show progress through the curriculum, in that the expectations inherent in the intended learning statements would develop during the year. This would, for me, answer one of the bugbears of NC level assessment: progress in some schools can only be upwards, and linear. I want pupils to be able to score differently on different units. If we’re doing the Viennese Waltz, I don’t want them to have to score more than they did in the last unit, which was songwriting, and which they enjoyed!
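The year-long record could be as simple as one set of marks per unit per pupil, with the units standing side by side rather than forming a ladder. Again a hypothetical sketch, not a prescribed system; the unit names are from the post, and each unit carries its own context-specific criteria:

```python
# One pupil's record across the year: each unit keeps its own
# context-specific criteria and ratings. Nothing in the structure
# forces the Viennese Waltz result to exceed the songwriting one.
record = {
    "Spooky music": {"Atmospheric effectiveness": "Above expected standard"},
    "Songwriting": {"Original ideas": "Well above expected standard"},
    "Viennese Waltz": {"Instrumental skills": "Expected standard"},
}

# Reporting is per unit, not a single running total.
for unit, marks in record.items():
    for criterion, rating in marks.items():
        print(f"{unit}: {criterion} -> {rating}")
```

The point of the structure is what it *doesn’t* contain: no overall level, and no requirement that later units score higher than earlier ones.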
So, is this unrealistic? Maybe. Is it too much work? Probably. But what it does have in its favour is that it avoids a single set of generic attainment statements which simply don’t fit Spooky music, the Viennese Waltz, and songwriting alike.
Will I change my mind on this? Probably! But, here are some of my criteria for assessments, also from the OUP book:
- A criterion should have a degree of exclusivity
- It should be specific enough to measure a single item/skill/construct without too many extraneous variables coming into play
- A criterion should, if possible, relate to a singularity
- A criterion should be assessable in some way
- It should be possible to ascribe a rough valuing to the criterion, along the lines of the example above …
- A criterion should have some relationship to the whole
- It should not be evaluating an irrelevant aspect of musical accomplishment, such as one observed which was ‘has tie done up’!
- A series of criteria which deconstruct a whole should, when taken together, go some way towards formation of an overall impression of the whole
- The isolated deconstructed aspects of criteria should not simply be an amorphous mass of unrelated trivia, but should have an overall meaning.
- Just because something is hard to assess does not mean it should be ignored!
- The example of a criterion looking at musical results…is an example of this. It is probably the most important aspect, and so should be assessed in some way. It does rely on professional judgements, but so does neurosurgery!
To these I would add:
- Assessments should not have to show linearity of progression – it should be possible to obtain differentiated results
- Assessment should be context-specific
- The purpose of the summative assessment should be clear to all concerned
- Intended learning statements, written well, can become their own assessment criteria. Pupils need teaching, as well as assessing
- Progression is shown not just in assessment, but in increased challenge in curriculum
- The curriculum should be cumulative, but assessment should be focussed, not vice versa (i.e. a “Well above expected standard” means something different in Y7 to Y9)
What this does mean is that context-specific assessments cannot inherently show progression; a different tool is needed for that. And that is a topic I shall look at in a future posting.
So, as I said at the outset, I am bound to return to this over the coming months, and I crave your indulgence in sharing some of that thinking here.