4. Tool Evaluation
Conference Management
Sdr
As mentioned in chapter 3, part of sdr’s interface (mainly words appearing in the interface and in the online help system) was redesigned using a language-based approach to conceptual design. In this section the evaluation of the new interface is described. The changes to sdr’s interface were made as part of a student project on conceptual design, and the emphasis in the evaluation was therefore on whether users would form correct users’ models of sdr or not, rather than on traditional HCI usability issues.
The results from the evaluation showed that all users who were introduced to the redesigned version of sdr were able to use the tool competently after a short introduction and one week’s unsupervised practice. Furthermore, more than half the users used language belonging to the sphere of the design model to articulate their knowledge about sdr, and five were deemed to have correct users’ models based on the design model. These results are encouraging, as they are based on a redesigned version of sdr in which only labels were changed to reflect the new design model.
In order to detect whether users of sdr had correct users’ models, based on the "Electronic Radio Times design model", linguistic evidence had to be identified, as correct completion of tasks does not necessarily mean that the user has a correct user’s model. In other words, the user might do the right thing for the wrong reasons. Having a correct user’s model, i.e. doing the right thing for the right reason, is important in situations such as error recovery. Possession of a correct user’s model is a prerequisite for effective use of a system. Sasse (1996) has reviewed empirical work on users’ models and concluded that performance results alone may not be reliable indicators of users’ models, and strongly recommends using verbal protocols in addition. It was therefore necessary to look for ways of making the users verbalise their thought processes in a natural way. One way of doing this is to have the user teach someone else about the application (Miyake, 1986). But first, the users were introduced to sdr to give them time to consolidate a user’s model. In the following section the training and evaluation procedure will be described.
Methods for Eliciting Mental Models
In order to compare users’ models, both the existing and the new interface were studied. The new user interface was evaluated with 12 users who had never used sdr before (new users), and the existing user interface was tested with 12 users who had been using the original version of sdr (existing users). The original interface was evaluated as well as the new one to provide a control group with which the new users’ models could be compared. The study was divided into three parts:
Task completion. Users were asked to complete six tasks while thinking aloud. They were told that we would prefer them to work out how to do the tasks themselves, but that if they got irreversibly stuck, they could ask for help. The tasks were scored based on whether the subjects had completed them successfully without help, and the problems users had in completing the tasks were noted.
Mindmaps. Users were given paper copies of the four main windows of sdr, a large piece of paper, a pen and some glue, and asked to glue the windows onto the piece of paper and draw arrows from one window to another if they thought they could get from one window to the other in sdr. The arrows from the mindmaps were listed in tables and tallied to see if there were any differences between the mindmaps of the new users and those of the existing users.
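The tallying of arrows described above can be sketched in code. The window names and arrow data below are hypothetical illustrations, not the actual study data; the point is only to show how each group’s arrows are counted once per user and then compared:

```python
from collections import Counter

# Hypothetical mindmap data: for each user, a list of arrows drawn as
# (from_window, to_window) pairs. Window names are illustrative only.
new_users_arrows = [
    [("Main", "Daily Listings"), ("Daily Listings", "Session Info")],
    [("Main", "Calendar"), ("Calendar", "Daily Listings")],
]
existing_users_arrows = [
    [("Main", "Session Info")],
    [("Main", "Daily Listings"), ("Main", "Session Info")],
]

def tally(arrow_lists):
    """Count how many users drew each arrow (each arrow counted once per user)."""
    counts = Counter()
    for arrows in arrow_lists:
        counts.update(set(arrows))
    return counts

new_counts = tally(new_users_arrows)
old_counts = tally(existing_users_arrows)

# Compare the two groups arrow by arrow.
for arrow in sorted(set(new_counts) | set(old_counts)):
    print(arrow, "new:", new_counts[arrow], "existing:", old_counts[arrow])
```

Listing the union of arrows from both groups makes it easy to spot connections that one group drew but the other did not.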
Teach-back. Users were asked to teach sdr to a contrived co-learner, who they were told was new to sdr. In fact, the co-learner knew sdr well and prompted users to explain sdr functionality and the behaviour of the user interface. The teach-back sessions were transcribed to supply data in which to look for linguistic evidence of users’ models. As mentioned earlier, words are linked together in a semantic network, i.e. words which are closely related will tend to be present at the same time. When looking for evidence of "Electronic Radio Times" based users’ models, not only the actual words "Radio Times" and "Daily Listings" were relevant, but also words closely related to the entire concept of TV and broadcasting.
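Scanning a transcript for such semantically related vocabulary can be sketched roughly as follows. The word list and the sample utterance are hypothetical, and a real analysis would of course involve human judgement of context rather than simple word matching:

```python
import re

# Hypothetical vocabulary from the sphere of TV and broadcasting,
# taken here as evidence of the "Electronic Radio Times" design model.
BROADCAST_WORDS = {"station", "stations", "channel", "programme",
                   "listings", "on", "broadcast"}

def model_evidence(utterance):
    """Return the design-model words found in a transcript utterance."""
    tokens = re.findall(r"[a-z]+", utterance.lower())
    return [t for t in tokens if t in BROADCAST_WORDS]

print(model_evidence("You look in the listings to see which sessions are on"))
# → ['listings', 'on']
```

Such a pass over the transcripts would only flag candidate utterances; each match would still need to be read in context to decide whether it genuinely reflects the design model.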
Existing users performed all three parts in one session. New users did the first part one week and the second and third parts the following week. The first part, the tasks, served as a practice session for the new users, who were also asked to use sdr in the intervening week to familiarise themselves with it. Parts one and three were recorded on videotape and transcribed. The videotapes contain an overlay of two images. One is a frontal image of the user, recorded with a video camera next to the workstation; since a camera is an integral part of a multimedia workstation and a necessary accessory when using sdr, it should not be extraordinarily intrusive to the user. The other image is the screen the user was looking at. By overlaying these two images and recording them onto videotape, it is possible to see and hear the user and, at the same time, see what the user is doing on the screen.
Results
The use of language differed considerably between the two groups: half of the "new" users explicitly used the "Radio Times" metaphor to explain certain features of sdr, and even "new" users who did not explicitly mention the "Radio Times" used language belonging to the "Electronic Radio Times" design model: they referred to different "stations", or said that sessions were "on". "Existing" users, by contrast, said that sessions would be "active". See Table 1 for linguistic examples from the teach-back sessions.
"New" users about sdr: |
"Existing" users about sdr: |
About the Calendar/Daily Listings Window:
|
About the calendar/daily listings:
About the Main Window:
|
Table 1: Statements by "new" and "existing" users about sdr
"Electronic Radio Times" based user models
We have now seen that there were considerable differences in the language that "existing" and "new" users used to explain certain features of sdr. However, we set out to discover whether "new" users would have an "Electronic Radio Times" based user model of sdr. Results indicate that eight of the "new" users had few or no problems teaching sdr to the co-learner, and of these, five had user models clearly based on the "Electronic Radio Times" design model. I suspect that the number could have been seven, had it not been for a software bug which caused some sessions not to appear in the Daily Listings Window, despite the fact that they were ‘on’; this appears to have disturbed the construction of their user models, particularly because the sessions that did not appear in the Daily Listings Window were ‘exciting’ ones like the Nasa shuttle launches and a Canadian News TV station.
The transcripts from the teach-back sessions were analysed for evidence of "Electronic Radio Times" based user models. All users successfully taught the co-learner all the main tasks in sdr, but all of them encountered problems at some point during the session. These problems were divided into major and minor problems; users who taught all tasks without any major problems were categorised as having successful task performance.
The criteria for determining whether users had an "Electronic Radio Times" based user model were:
Five (and potentially seven) "new" users had correct user models clearly based on the "Electronic Radio Times" design model. As mentioned above, this is an encouraging result, as the "Radio Times" based user models were produced solely by changing the words appearing in sdr’s interface.