Young Children’s Search Strategies and
Construction of Search Queries
Glenda Revelle, Allison Druin, Michele Platner, Stacy Weng
Benjamin B. Bederson, Juan Pablo Hourcade, Lisa Sherman
Human-Computer Interaction Lab
University of Maryland
College Park, MD
20742
+1 301 405 0154
glenda@umiacs.umd.edu
http://www.cs.umd.edu/hcil/querykids
This paper describes a quantitative study focused on two questions: (1) Can children understand and use a hierarchical domain structure to find particular instances of animals? (2) Can children construct search queries to conduct complex searches if sufficiently supported, both visually and conceptually, by technology? These two questions have been explored in the context of developing a digital library interface (called “QueryKids”) for children (ages 5-10 years old) that visualizes the querying process and its results. This paper discusses the motivation for our research, the study methods, and the results. The results of this study showed that children were able to search very efficiently, primarily using a “fewest-steps” strategy, with the QueryKids software prototype. In addition, children were able to construct search queries with a high degree of accuracy. Results are discussed in terms of the scaffolding support that QueryKids provides, and its effectiveness in helping children to search efficiently and construct complex search queries.
Children, information retrieval, digital libraries, empirical evaluation, education applications.
Research has shown that the querying process can be difficult for users when the interface is restrictive in syntax or abstract in nature [9,12,16,19]. Graphical interfaces for digital libraries have been shown to help adults search efficiently and effectively [1,7,14,17].
The research concerning children and information search strategies leads us to believe that graphical interfaces can also be supportive of children as technology users [13,26,27]. However, due to the importance of the World Wide Web and the proliferation of search engines for it, children typically must negotiate query tools that are language-based and use abstract logical notations for Boolean searches [13]. While the use of text is not an issue for older children and adults, young children (4-7 years of age) have difficulty when it comes to typing skills, spelling, and syntax comprehension [15,24,26].
In addition, constructing Boolean-type search queries
requires an understanding of the logic of conjunction (intersection, typically
represented as AND in a standard
Boolean search query) and disjunction (union, generally represented as OR in traditional Boolean search
terms). It has long been understood
that even adults have difficulty with these logical concepts, particularly with
disjunction [4]. It has also been well documented that children have difficulty
with these concepts, and that the differential difficulty of disjunction over
conjunction is consistent for children from 5 to 12 years of age [23]. However,
under certain circumstances even children as young as three years have been
shown to utilize disjunctive concepts to perform significantly better than
chance [18]. Although these results
were all established quite some time ago, there has been little or no research
exploring children’s use of computer interfaces to
construct search queries based on these logical concepts. Interestingly enough,
it has been shown that typical interfaces to the Web promote less strategic
thinking concerning searches, and more active browsing [13]. We believe this may be due to the
inappropriate searching interfaces available for young children today.
Therefore, we began a study in the fall of 1999 to better
understand young children’s searching strategies and abilities to construct
Boolean-type search queries. At that
time, we hypothesized that if we provided enough visual and conceptual support
for young children, it might be possible for them to effectively use these
complex search concepts. The empirical study reported here
examined the following questions: (1) Can children understand and use a
hierarchical domain structure to find particular instances of animals? (2) Can children construct search queries if
they are provided with visual and conceptual support? Our research questions
were addressed by observing and documenting children’s searches for animals in
a hierarchical information structure, comparing the use of a paper model and an
interactive computer prototype we now call QueryKids. In the paper that follows, our research methods, results, and
conclusions will be described.
The participants in this study were 106 second and third grade children from Yorktown Elementary School, a public school in Prince George’s County, in the Washington DC metropolitan area. Approximately 52% of the children were Caucasian, 36% were African American, and 22% were Asian or Hispanic. The school serves a lower-middle to middle-class population.
The children were divided into two groups. The first group, a total of 56 participants, used a paper prototype (as described in the next sections). This group was made up of 30 second graders (14 females with a mean age of 8 yrs, 1 mo, and 16 males with a mean age of 8 yrs, 0 mos) and 26 third graders (14 females with a mean age of 9 yrs, 1 mo, and 12 males with a mean age of 8 yrs, 10 mos). The second group, a total of 50 participants, used the computer prototype. This group was made up of 22 second graders (12 females with a mean age of 8 yrs, 0 mos, and 10 males with a mean age of 8 yrs, 1 mo) and 28 third graders (14 females with a mean age of 8 yrs, 10 mos, and 14 males with a mean age of 9 yrs, 0 mos).

Both the paper prototype and the computer prototype were organized to represent four hierarchies (Table 1). At the top level were the names of four parallel “branches”: Animals, Where They Live, How They Move and What They Eat. All 45 animals in the data set could be found under each of these four branches; i.e., the four branches served as alternative ways of accessing the same information. Under the Animals branch heading were the following subcategories: Amphibians, Birds, Fish, Insects, Invertebrate Sea Creatures, Mammals, Reptiles.
The Mammals subcategory was then further subdivided into Cats & Dogs, Rodents, Hooved, Primates, and Marsupials. The second branch, Where They Live, was divided into three subcategories: Land, Water, and Both Land and Water. Likewise, the How They Move branch was subdivided into Fly, Swim, and Walk, Crawl, Hop etc., and What They Eat had the subcategories Eats Animals, Eats Plants, and Eats Both Plants and Animals. Under the lowest subcategories in each branch of the hierarchy were entries for individual animals.
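The structure just described can be written down as a nested mapping. The sketch below is only an illustration of the four-branch hierarchy as listed in the text, not the data model of either prototype; the names `HIERARCHY` and `leaf_paths` are hypothetical. It enumerates the path to each lowest-level subcategory, which is where the individual animal entries sit.

```python
# Illustrative only: the category names come from the text, but the
# nested-dict representation and function name are assumptions.
HIERARCHY = {
    "Animals": {
        "Amphibians": {}, "Birds": {}, "Fish": {}, "Insects": {},
        "Invertebrate Sea Creatures": {},
        "Mammals": {
            "Cats & Dogs": {}, "Rodents": {}, "Hooved": {},
            "Primates": {}, "Marsupials": {},
        },
        "Reptiles": {},
    },
    "Where They Live": {"Land": {}, "Water": {}, "Both Land and Water": {}},
    "How They Move": {"Fly": {}, "Swim": {}, "Walk, Crawl, Hop etc.": {}},
    "What They Eat": {
        "Eats Animals": {}, "Eats Plants": {},
        "Eats Both Plants and Animals": {},
    },
}


def leaf_paths(tree, prefix=()):
    """Yield the path (branch, subcategory, ...) to every lowest-level category."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


if __name__ == "__main__":
    for path in leaf_paths(HIERARCHY):
        print(" > ".join(path))
```

The length of a path corresponds to the number of envelopes (or icon levels) that lie between the top of a branch and the animals filed under it.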
The
paper prototype consisted of a set of hierarchically nested envelopes (see Figure x). The four 15”x12” envelopes at
the top of the four branches of the hierarchy were labeled Animals, Where They Live, How They Move and What They Eat, and
decorated with representative pictures (Figure 1).


Inside each of these
envelopes were smaller envelopes, labeled with the subcategories under each
broad category
(Figure 2).
For the Mammals subcategory there was one more subset of yet smaller envelopes, representing the second level of subcategories. Inside the smallest envelope for each branch of the hierarchy were 5x7 white cards, each of which displayed a color picture of one animal with its common name printed below the picture.
In
addition, there were two cartoon-style illustrations of children on 4 x 6 cards
(Figure 3).
These illustrations represented Dana and Kyle, who were introduced to the
participants as the “search kids”, and were used in searches for groups of
animals. Whenever children were
constructing a search query to find a group of animals (as described in the Procedures section below), they were
asked to place the envelopes representing those groups on top of the Dana and Kyle
cards.

The computer prototype (currently called “QueryKids”) runs on a Sony laptop computer with a USB mouse under Windows 98, and was built as a module of KidPad, a collaborative application for children [3,6]. Like KidPad, it makes use of Jazz [2], a Java package that provides zooming and panning capabilities, and MID [10], a Java package that gives it the ability to obtain input from multiple mice. A Microsoft Access database was used to hold metadata about the 45 animals in the data set.
The prototype consisted of three areas: two browsing areas and a search area. Although children were shown the browsing areas, only the search area was used in this study. The search area displayed four icons representing the four main branches in the hierarchy: Animals, Where They Live, How They Move and What They Eat (Figure 4). Each icon was composed of a text label and a representative picture.

To
move down through each branch, the user clicks on the “shadow” under one of the
four main icons. To specify search
parameters, the user clicks on the icon or icons representing those
parameters. So, for example, to conduct
a search for “birds that live on land and water”, one might first click on the
shadow beneath the Animals icon to
reveal the subcategories, then click on the Birds
icon to make it a search parameter.
Next, one would click on the shadow below the Where They Live icon, revealing its subcategories, and click on Land and Water to add it as a second search parameter (Figure 5).
As search parameters are selected, their icons move to the two children in the upper left corner of the screen. The metaphor, as explained to the children in this study, was that these two children (called Kyle and Dana) are “search kids”, and that you are “giving” them icons of things that you want them to find. When items are given to Kyle and Dana, the software runs a query that automatically performs a union among items selected from subcategories within the same branch, and an intersection among items selected from subcategories across different branches. The subcategories within any one branch have been defined such that they do not overlap (i.e., an intersection would yield an empty set). Thus, the user does not need to distinguish between intersection and union in specifying a query; because of the way the categories and the software searching algorithms have been structured, the “intuitive” result will be delivered most of the time.
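The query rule described above (union within a branch, intersection across branches) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the QueryKids implementation: the miniature `ANIMALS` data set, its category assignments, and the function name `run_query` are hypothetical, and returning nothing for an empty selection is also an assumption.

```python
from collections import defaultdict

# Hypothetical miniature data set: animal -> {branch: subcategory}.
# The assignments are for illustration only.
ANIMALS = {
    "blue jay": {"Animals": "Birds", "Where They Live": "Land"},
    "duck": {"Animals": "Birds", "Where They Live": "Both Land and Water"},
    "frog": {"Animals": "Amphibians", "Where They Live": "Both Land and Water"},
    "lizard": {"Animals": "Reptiles", "Where They Live": "Land"},
}


def run_query(selections):
    """Return the animals matching (branch, subcategory) pairs given to the search kids."""
    # Group the selected subcategories by the branch they belong to.
    by_branch = defaultdict(set)
    for branch, subcategory in selections:
        by_branch[branch].add(subcategory)
    if not by_branch:
        return set()  # assumption: nothing selected means nothing shown

    results = set(ANIMALS)
    for branch, chosen in by_branch.items():
        # Union within a branch: an animal matches if it is in ANY chosen subcategory.
        matches = {name for name, attrs in ANIMALS.items()
                   if attrs.get(branch) in chosen}
        # Intersection across branches: it must match EVERY branch that was used.
        results &= matches
    return results


if __name__ == "__main__":
    print(run_query([("Animals", "Reptiles"), ("Animals", "Amphibians")]))
    print(run_query([("Animals", "Birds"), ("Where They Live", "Land")]))
```

Because subcategories within one branch do not overlap, intersecting two of them would always be empty, so treating same-branch selections as a union loses nothing the user could have meant.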
Any time an icon is added or removed as a search parameter, the results of the search are immediately displayed in miniature in the outlined area to the right of the search kids. This serves as a “query preview” area for searches as they are in progress, and provides immediate, local feedback regarding the results of the search in progress. The user may then click on the display area to zoom in and examine the search results.

For a more complete description of the QueryKids computer prototype and its
design and development, see [5].
The children participated in same-sex and same-grade pairs for both paper and computer prototype research.
The participants in the paper prototype group sat on the floor with the four large envelopes arranged on the floor in front of them. The researchers described the task as being like a “treasure hunt”, and explained that inside each envelope there were smaller envelopes and inside those were index cards with pictures of animals that the children would be trying to find.
In the computer prototype group, participants sat at a desk, in
front of a
Sony laptop with the QueryKids application running. All of the prototype functionality was demonstrated, and children
were allowed a free-play period of a few minutes to experiment with clicking on
icons to see what happened before the experimental procedure began.
For both groups, it was also explained that there were two parts to the research. In the first part, the goal was to find a particular animal, for example, a blue jay. Each child was asked to find four specific animals. The four animals were requested in four different orders, with each animal appearing in each serial position once. The use of these four orders was counterbalanced across prototype condition, grade level and gender groups.
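Four orders in which each animal appears in each serial position exactly once form a Latin square. The text does not say how the orders were constructed; the sketch below uses simple cyclic rotation as one possible construction. Apart from the blue jay, the animal names are placeholders, and `latin_square_orders` is a hypothetical name.

```python
def latin_square_orders(items):
    """Build len(items) presentation orders by cyclic rotation.

    Each item appears exactly once in every serial position across the orders.
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


if __name__ == "__main__":
    # Only "blue jay" is named in the text; the rest are placeholders.
    targets = ["blue jay", "animal B", "animal C", "animal D"]
    for order in latin_square_orders(targets):
        print(order)
```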
In the second part, the task was to find groups of
animals. To help them find groups of
animals, children were introduced to the search
kids, Kyle and Dana. The participants
were told that Kyle and Dana would find groups of animals when given an envelope/icon
representing that group. Each
participant was asked to construct one single-factor search query (e.g., all
insects), one union search query (for example, all reptiles and all amphibians) and one intersection
search query (e.g., all birds that live on land). The single-factor search was always first, the union always
second and the intersection always third.
There were two different sets of specific groups requested for each of
the three searches, and each of the children in a pair received a different
set.
After the experimental procedure, researchers interviewed the children about their reactions to the task. Children were asked if they thought finding the animals was easy or hard, fun or not, and whether there was anything they would change to make it better or easier.
Two major aspects of children’s search behavior were examined in this study: 1) children’s search efficiency when searching for a specific animal within the hierarchical information structure and 2) their ability to construct a search query.
To develop a measure of search efficiency, children’s responses were recorded when they were asked to find each of the four specific animals in the first section of the study. For the paper prototype group, observers recorded each envelope that the child opened in order. For the computer prototype group, the software logged the sequential history of all mouse clicks. Children’s responses were then coded to indicate how many unnecessary envelopes were opened or icons were clicked. In other words, search efficiency was the number of search steps taken above the minimum number necessary to find the requested animal, given the branch of the hierarchy chosen by the child. Thus, the higher the search efficiency score, the less efficient the search.
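As a worked illustration of this measure, the score for one search is the number of steps taken minus the minimum needed within the branch the child chose. The step counts in the example are invented for illustration and are not data from the study; the function names are hypothetical.

```python
def search_efficiency(steps_taken, minimum_steps):
    """Extra steps beyond the minimum for the chosen branch (0 = most efficient)."""
    if steps_taken < minimum_steps:
        raise ValueError("a search cannot take fewer steps than the minimum")
    return steps_taken - minimum_steps


def mean_efficiency(searches):
    """Mean score over (steps_taken, minimum_steps) pairs, e.g. one child's four searches."""
    scores = [search_efficiency(taken, minimum) for taken, minimum in searches]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Invented example: four searches by one child.
    print(mean_efficiency([(2, 2), (3, 2), (2, 2), (4, 3)]))
```

A mean below 1.0 therefore means fewer than one unnecessary envelope or icon per search on average.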
Search efficiency scores were submitted to a 2 (grade) x 2
(gender) x 2 (condition) x 4 (item number) analysis of variance, in which item
number served as a repeated measure.
Results of this analysis indicated a significant difference between
conditions, F(1,96) = 14.75, p < .0001, a significant condition by gender
interaction, F(1,96) = 4.75, p < .05, and a significant difference between
items, F(3,288) = 2.92, p < .05.
Means for the groups involved in these effects are displayed in Table 2.
Examination of these means shows that computer searches were significantly more efficient than paper searches. Tukey post hoc tests on the condition by gender interaction indicated that the females’ searches were significantly more efficient in the computer condition than in the paper condition, while there was no significant difference for the males. In addition, comparison of the means in the item effect indicates that children’s searches became more efficient with each subsequent item, indicating a practice effect. An additional analysis indicated that there were no significant differences in search efficiency for one particular animal vs. another.
To quantify children’s search
query abilities, their responses in the second portion of the study were
examined. Their attempts to formulate
search queries to find groups of animals were scored as shown in Table 3. Search query scores range from 0 to 1, with
1 being the highest possible score.
Search query scores were analyzed using a 2 (grade) x 2 (gender) x 2 (condition) x 3 (query type) analysis of variance, in which query type (single-factor vs. union vs. intersection) served as a repeated measure. Results of this analysis indicated a significant difference between conditions, F(1,94) = 14.96, p < .0001, a significant difference between query types, F(2,188) = 3.12, p < .05, a significant interaction between condition and query type, F(2,188) = 7.15, p < .05, and a significant interaction between gender and query type, F(2,188) = 7.15, p < .001.
Means for the groups
involved in these effects are displayed in Table 4.

Examination of these means shows that overall, search queries were more
accurate in the computer condition than in the paper condition. Tukey post hoc tests on the query type
effect indicated that union queries were significantly more successful than
intersection queries, while neither differed significantly from the success
rate for single-factor searches.
However, this main effect is qualified by two interactions. Post hoc tests on the condition by query
type interaction showed that both single-factor queries and intersection
queries were significantly more accurate in the computer condition than in the
paper condition, but for union queries there was no difference between
conditions. In addition, post hoc comparisons
on the gender by query type interaction demonstrated that for females union
queries were significantly more successful than either single-factor
searches or intersection searches, whereas for males there were no
significant differences between the three query types.
Discussion
In general, children were quite efficient in their searches
for specific animals. The overall
search efficiency mean for the entire sample was 0.48. This means that, on average, children looked
in fewer than one extra envelope, or clicked on fewer than one extra icon,
per search beyond the bare minimum needed to find the animal that they were
looking for. So, for the most part,
children successfully employed a strategy of trying to find each target animal
in as few steps as possible, in an extremely focused and goal-directed
manner. Children’s
ability to use this “fewest-steps” strategy effectively improved over time
within the course of the four trials in this section of the research. In addition, children who used the computer
prototype searched significantly more efficiently than those using the
paper prototype.
The one apparent exception to the use of the fewest-steps strategy occurred in the searches of the females using the paper prototype. Their searches were significantly less efficient (i.e., used significantly more extra steps) than the searches of the girls using the computer prototype or the searches of the boys in either prototype condition. It should be noted, however, that the absolute differences in number of extra steps are small: even for the females using the paper prototype the average search efficiency was only 0.89, still less than one extra envelope opened or icon clicked per search.
Qualitative observations of the children as they engaged in
the search tasks led researchers to suspect that a number of the females who
used the paper prototype were intentionally browsing,
rather than engaging in goal-directed, fewest-steps-type searches. They seemed to enjoy looking through all the
pictures of animals as a goal in itself, sometimes continuing to look at animal
pictures even after the target animal had been found. It is not clear why there was so much less of this intentional
browsing behavior with the computer prototype, but perhaps it was due to the fact
that children were working exclusively within the search area of the QueryKids prototype. This area is clearly structured to support purposeful,
goal-directed searches, whereas other sections of the prototype support
browsing.


The second portion of the study focused on
children’s abilities to construct search queries. Once again, overall, children were strikingly adept at this task. Across the entire sample and all of the
search query types, the average accuracy of constructing a search query was
0.72 of a total 1.00. Moreover, the
children who used the QueryKids computer prototype achieved an 85% accuracy
rate with their search queries, which was significantly higher than
the accuracy of those using the paper prototype.
What accounts for this
surprisingly high level of performance, especially in light of the research
previously cited which has established that children have difficulty with the
underlying logical concepts involved in constructing union and intersection
searches?
We believe that these
positive results stem from several different kinds of support that were
built into the software as “scaffolding” devices. Scaffolding is a well-established
educational technique that often enables children to complete tasks that
otherwise would be beyond their capabilities [25,28], and has been shown to be
an effective learning tool when used by teachers [21]. Recently, scaffolding has begun to be
incorporated as a learning support in educational software [8,11,20], and there
is evidence to suggest that educational software with extensive scaffolding is
more educationally effective than software without such support [22].
There were several kinds of scaffolding support built into the QueryKids prototype. First, the search interface was visually concrete and involved direct physical manipulation of the search elements, both of which were designed to support children in constructing search queries that they would have been unable to accomplish with a typical text-based search tool.
Second, the display of “in-progress” search results on the same screen, while the search query is being formulated, makes it extremely easy for children to see whether their queries have been formulated correctly or not, and to adjust and modify their queries when needed. This immediate, dynamic feedback is one of the major points of difference between the paper prototype and the computer prototype, and probably plays a large role in the significantly better performance of those children using the computer version.
Finally, and perhaps
most importantly, because of the way the information was organized and the
search software was written, children did not need to distinguish between an
intersection search query and a request for a union search. This lightens the
cognitive complexity of the task immensely, allowing children to first focus
solely on identifying the proper parameters to conduct the search they have in
mind.
We believe that the kind of scaffolding described
here could serve as a first step toward helping children learn to understand
and use Boolean search concepts.
Scaffolding is typically designed to be “eased out” as the child becomes
more and more capable of completing the task with fewer supports. In future work, we plan to research
systematic ways of reducing this support to gradually guide children into
constructing queries with the full power of Boolean logic under their control. In addition, we intend to work with younger
children (ages 6-7) to see whether or not the current prototype will
support their search abilities, and to see how their searching strategies may
differ from those of the somewhat older children in this study.
In summary, this study has shown that even young
children are capable of efficient and accurate searching. With the support of a visual query interface
that includes scaffolding for Boolean concepts, children can use a hierarchical
structure to perform searches and construct search queries that
surpass their previously demonstrated abilities using traditional search
techniques.
This work was supported by the
National Science Foundation’s DLI-2, Discovery Channel, Patuxent
Wild Life Refuge, and the Baltimore Learning Community project. We thank Prince
George’s County Public Schools for their cooperation and support. The children and teachers from Yorktown
Elementary School were critical to this research. They included 106 children in the 2nd and 3rd
grades along with their teachers. Also,
thanks to Delfin Barral for allowing us to use his illustrations to represent
Dana and Kyle in the paper prototype.
Ongoing inspiration and intellectual discussion have come by way of Ben
Shneiderman, Catherine Plaisant, Anne Rose, Joseph JaJa, and the KidStory
research project supported by the i3, ESE.
1. Ahlberg, C., Williamson, C., & Shneiderman, B. (1992). Dynamic queries for information exploration: An implementation and evaluation. Proceedings of ACM CHI'92, ACM Press, pp. 619-626.
2. Bederson, B. B., Meyer, J., & Good, L. (2000). Jazz: An Extensible Zoomable User Interface Graphics Toolkit in Java. In Proceedings of User Interface and Software Technology (UIST 2000) ACM Press, (in press).
3. Benford, S., Bederson, B. B., Åkesson, K., Bayon, V., Druin, A., Hansson, P., Hourcade, J. P., Ingram, R., Neale, H., O'Malley, C., Simsarian, K., Stanton, D., Sundblad, Y., & Taxén, G. (2000). Designing Storytelling Technologies to Encourage Collaboration Between Young Children. In Proceedings of Human Factors in Computing Systems (CHI 2000) ACM Press, pp. 556-563.
4. Bruner, J. S., Goodnow, J. J., & Austin, G. A. (1956). A study of thinking. New York: Wiley.
5. Druin, A., Bederson, B., Hourcade, J. P., Sherman, L., Revelle, G., Platner, M., Weng, S. (Submitted) Designing a digital library for young children: An intergenerational partnership. CHI 2001, ACM Press.
6. Druin, A., Stewart, J., Proft, D., Bederson, B. B., & Hollan, J. D. (1997). KidPad: A Design Collaboration Between Children, Technologists, and Educators. In Proceedings of Human Factors in Computing Systems (CHI 97) ACM Press, pp. 463-470.
7. Greene, S., Marchionini, G., Plaisant, C., & Shneiderman, B. (1997). Previews and overviews in digital libraries: Designing surrogates to support visual information seeking. Technical Report CS-TR-3838, UMIACS-TR-97-73, University of Maryland.
8. Guzdial, M. (1995). Software-realized scaffolding to facilitate programming for science learning. Interactive Learning Environments, 4(1), pp. 1-44.
9. Hertzum, M. & Frokjaer, E. (1996). Browsing and querying in online documentation. ACM Transactions on Computer-Human Interaction (TOCHI), 3(2), pp. 139-161.
10. Hourcade, J. P. & Bederson, B. B. (1999). Architecture and Implementation of a Java Package for Multiple Input Devices (MID). Tech Report #CS-TR-4018, Computer Science Department, University of Maryland, College Park, MD, USA.
11. Jackson, S. L., Krajcik, J., & Soloway, E. (1998). The design of guided learner-adaptable scaffolding in interactive learning environments. Proceedings of ACM CHI’98, ACM Press, pp. 187-194.
12. Jones, S. (1998). Graphical query specification and dynamic result previews for a digital library. In Proceedings of UIST 98. ACM Press, pp. 143-150.
13. Large, A., Beheshti, J., & Breuleux, A. (1998). Information seeking in a multimedia environment by primary school students. Library and Information Science Research, 20, pp. 343-376.
14. Michard, A. (1982). Graphical presentation of boolean expressions in a database query language: Design notes and an ergonomic evaluation. Behaviour and Information Technology, 1(3), pp. 279-288.
15. Moore, P. & St. George, A. (Spring 1991). Children as information seekers: The cognitive demands of books and library systems. School Library Media Quarterly, 19, pp. 161-168.
16. Nickerson, R. S. (1981). Why interactive computer systems are sometimes not used by people who might benefit from them. International Journal of Man-Machine Studies, 15, pp. 469-483.
17. Plaisant, C., Marchionini, G., Bruns, T., Komlodi, A., & Campbell, L. (1997). Bringing treasures to the surface: Iterative designs for the Library of Congress National Digital Library Program. Proceedings of ACM CHI'97, ACM Press, pp. 518-525.
18. Rawson, L. M., Tamayo, F. M. V., Vehle, M. T. & Willemsen, E. W. (1973). Disjunctive concept utilization in preschool children. The Journal of Genetic Psychology, 122, pp. 211-216.
19. Ray, H. N. (1985). A study on the effect of different data models on casual users’ performance in writing database queries. International Journal of Man-Machine Studies, 23, pp. 249-262.
20. Revelle, G. L., Medoff, L., and Strommen, E. F. (2000). Interactive technologies research at Children’s Television Workshop. In S. M. Fisch & R. T. Truglio (Eds.), “G” is for growing: Thirty years of research on Sesame Street. Mahwah, NJ: Lawrence Erlbaum Associates, pp. 215-230.
21. Rogoff, B. (1990) Apprenticeship in thinking: Cognitive development in social context. New York: Oxford University Press.
22. Shute, R., & Miksad, J. (1997). Computer assisted instruction and cognitive development in preschoolers. Child Study Journal, 27(3), 237-253.
23. Snow, C. E. & Rabinovitch, M. S. (1969). Conjunctive and disjunctive thinking in children. Journal of Experimental Child Psychology, 7, pp. 1-9.
24. Solomon, P. (June 1993). Children’s information retrieval behavior: A case analysis of an OPAC. Journal of American Society for Information Science, 44, pp. 245-264.
25. Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.
26. Walter, V. A., Borgman, C. L., & Hirsh, S. G. (Winter 1996). The Science Library Catalog: A springboard for information literacy. School Library Media Quarterly, 24, pp. 105-112.
27. Watson, J. S. (1998). If you don’t have it, you can’t find it: A close look at students’ perceptions of using technology. Journal of the American Society for Information Science, 49, pp. 1024-1036.
28. Wood, D., Bruner, J., & Ross, G. (1976). The role of tutoring in problem solving. Journal of Child Psychology & Psychiatry & Allied Disciplines, 17(2), pp. 89-100.