Literature Review

Page history last edited by Heather Lowe 14 years ago

Below is a rough draft of a literature review written by Heather Lowe that surveys some of the research related to the current search log project. The review is meant to serve as a springboard from which the group might consider best practices for search engine logging, the questions that can or cannot be answered by search log analysis, and whether supplemental data collection is necessary.


 

1. Introduction:

ARTstor, an initiative created in 2001 by the Andrew W. Mellon Foundation, is a digital archive of images from museums, archives, and other institutions. ARTstor recently released the transaction log of its online service, which dates back to 2004, to researchers for analysis. Because the database is distinctive in its character and user base, little directly repeatable research exists to guide the analysis of a topical image resource with the breadth of ARTstor. The purpose of this paper is to review the relevant research on a variety of topics that may relate to the particular concerns of this type of analysis. First, transaction log analysis (TLA) is defined, and motivations for conducting TLA are discussed along with problematic issues within the method. The following section delves into the methods used to conduct transaction log analysis: how data can be prepared and what types of models and algorithms can be used for analysis. The paper then separates image-based information retrieval from text-based retrieval in order to give a context in which to compare the ARTstor data with the majority of existing transaction log studies. Finally, the paper will outline recommendations for best practices within the realm of museum and visual resources transaction logging.

 

2. Transaction Log Analysis

2.1 Definition

Transaction logs have been an area of growing interest since the late 1960s (Meister & Sullivan, 1967), and researchers have adapted their definitions over this period according to their needs and to technological advancements. More recently, as transaction log analysis has become a widely used process, Jansen (2006) gives a concise definition of the transaction log, confined to web searching, as “an electronic record of interactions that have occurred during a searching episode between a Web search engine and users searching for information on the Web search engine” (p. 408). Though broad, this definition serves to include both large, general search engines such as Google and searching mechanisms within a specific web site or resource. The interactions recorded are “communication exchanges that occur between users and the system” (Jansen, 2006, p. 408). Users may be either software or humans, and separating human actions from mechanized searching is crucial for obtaining accurate results.

The level of information collected in transaction logs varies greatly depending upon the context. For example, some system transaction logs might record all interactions within the system such as log-in and log-off times, queries, results, clicked results, and other details of actions while simple transaction logs may be limited to only a time stamp, IP address and query string. Penniman and Dominick (1980, p.23) classify the granularity with which data can be collected as follows:

  • General session variables: the least amount of data collected limited to broad actions such as databases used, query string, number of results and other simple data.

  • Function/state traces: the transaction log uses predetermined categories to classify the data as it is being recorded.

  • Complete protocol: the highest level of granularity recorded includes all user interactions with a system. In addition to query data, actions captured might be the types of tools used, results viewed, and a time stamp for all of these.

How much data a transaction log records will vary with the character of the overall system. For example, privacy concerns may be more important in some information retrieval settings as compared to others, translating into different levels of information captured about the user.

 

2.2 Reasons for conducting research

Because transaction logs contain an unselfconscious record of search queries, they allow many questions about the searching process to be investigated in a way that cannot be replicated in controlled experiments, where users are aware of being observed. The study of transaction logs is also attractive, in part, because of the scale of information available. The exact scope of research varies with the specific researcher’s goals, but these probes offer promise for improving the design of information systems, training for information technology literacy, and understanding information-seeking behaviors. Several foci emerge from the literature on transaction log analysis: detecting the nature of the queries themselves, drawing out relationships among documents and queries, and determining the character of information seeking within web environments.

2.3.1 Term level analysis

Determining linguistic patterns and associations among words can aid designers in creating relational and ranking algorithms. Term frequencies, arguably the least informative category of analysis, reveal top query terms and have the potential to show variations in searchers’ interests over time. Potentially more informative than simple term frequency is term co-occurrence (Jansen, Spink & Saracevic, 2000; Wolfram, 2000). Algorithms testing term co-occurrence can be particularly helpful in creating an automated suggestion tool for searchers (e.g. when the user types car, a selection box pops up with “car sales, car ratings, used car”).
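As a rough illustration of term frequency and term co-occurrence counting, the sketch below tallies both from a list of query strings and uses the pair counts to propose suggestions. It is a minimal example under stated assumptions: queries are split on whitespace and lowercased, and the function names (term_statistics, suggest) and the sample queries are hypothetical, not taken from any of the cited studies.

```python
from collections import Counter
from itertools import combinations


def term_statistics(queries):
    """Count term frequencies and unordered within-query term pairs."""
    term_counts = Counter()
    pair_counts = Counter()
    for query in queries:
        # Assumption: whitespace tokenization and case folding are sufficient.
        terms = query.lower().split()
        term_counts.update(terms)
        # Count each unordered pair once per query, ignoring repeated terms.
        unique_terms = sorted(set(terms))
        pair_counts.update(combinations(unique_terms, 2))
    return term_counts, pair_counts


def suggest(term, pair_counts, limit=3):
    """Return the terms that most often co-occur with `term`."""
    partners = Counter()
    for (first, second), count in pair_counts.items():
        if first == term:
            partners[second] += count
        elif second == term:
            partners[first] += count
    return [partner for partner, _ in partners.most_common(limit)]


# Hypothetical sample data echoing the "car" example in the text.
sample_queries = ["car sales", "used car", "car ratings", "used car sales"]
term_counts, pair_counts = term_statistics(sample_queries)
```

With the sample data, suggest("car", pair_counts) ranks "sales" and "used" ahead of "ratings", which is the kind of ordering a suggestion box could draw on.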

 

2.3.2 Query level analysis

Another motivation is the investigation of the syntactical structure of queries. This includes whether searchers employ Boolean operators and special characters, and the form and length that queries take. Across many types of information retrieval systems, use of operators is relatively low, with the exception of traditional IR environments: roughly 2% of web users (Hölscher & Strube, 2002) and 8% of web users (Jansen et al., 2000), compared with 37% in traditional IR environments (Siegfried, 1993); see also Bendersky & Croft (2009). A surprising exception to this trend might be the Jones et al. (1998) digital library study, which found that over a quarter of users employed Boolean operators. In fact, low operator use is so consistently observed in web environments that, in a literature review, Markey (2007) questions whether Boolean searching still retains its relevance. However, the appearance of Boolean operators within queries aids the clustering of user or session types in some studies (Bendersky & Croft, 2009; Wen, Nie & Zhang, 2002; Wolfram, Wang, & Zhang, 2009).

Bendersky and Croft (2009) suggest that the form queries take may indicate how much difficulty users have in meeting their informational need. In particular, they argue that long searches (e.g. over five terms in length) are indicative of user struggle. Other studies focus on natural language searches such as those prompted by sites like AskJeeves. Inquiry into the uses of natural language could shed light on the ways in which some search engines treat question indicators (e.g. when, who, where) that normally function as stop words (Wen, Nie & Zhang, 2002).
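The query-level measures discussed in this section (operator use and query length) can be sketched as follows. This is only an illustration: the operator set, the requirement that operators be upper case, and the five-term threshold for a "long" query are assumptions chosen for the example, and the function names are hypothetical.

```python
# Assumed operator set; real systems differ in which operators they support.
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def query_features(query, long_query_threshold=5):
    """Describe one query by length and Boolean operator use."""
    terms = query.split()
    return {
        "n_terms": len(terms),
        # Assumption: only upper-case AND/OR/NOT count as operators.
        "uses_boolean": any(term in BOOLEAN_OPERATORS for term in terms),
        # "Long" follows the text's example of more than five terms.
        "is_long": len(terms) > long_query_threshold,
    }


def boolean_usage_rate(queries):
    """Share of queries that contain at least one Boolean operator."""
    if not queries:
        return 0.0
    flagged = sum(1 for query in queries if query_features(query)["uses_boolean"])
    return flagged / len(queries)
```

Counting operators per query rather than per user is a simplification; the percentages cited above are reported per user or per query depending on the study.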

 

2.3.3 Session level analysis

Modeling of general user behaviors can be achieved with quantitative methods. These models include generalized average characteristics of search sessions, as in the Jansen and Spink (2005) study comparing nine different search engines. Baeza-Yates et al. (2005) use click data and query submission within single sessions to model average session length, reformulation rates, and click rates. Other studies try to cluster sessions into meaningful groups by examining similarities in traits. Wolfram et al. (2009) pinpoint three distinct session types across three searching environments: an academic site (University of Tennessee, Knoxville website), a specific field resource (Healthlink), and a general search engine (Excite). Sessions could be identified as short queries in short sessions (generally using popular search terms), long queries in short sessions, and long queries in long sessions.

 

 

2.3.4 User goals

Stemming from studies focusing on session characteristics, some studies make an effort to automatically identify user goals. Quite often these goals draw from or at least mention the taxonomy of web search laid out by Broder (2002): navigational, search to find a site URL; informational, search to seek some kind of information; and transactional, search to perform an interaction within the space of the internet (e.g. email or shopping). Studies like Jansen, Booth & Spink (2008) and Rose & Levinson (2004) use manual classification to evaluate user goals based on the query itself and the corresponding click data.

One study tried to form a method of automatically identifying user goals: Lee, Liu & Cho (2005) base their algorithm on where the query terms were anchored within a site and how many clicks occur within the results. If the anchor link is a homepage, the query is determined to be navigational; if it is on a content page, the query is determined to be informational. From click-through data, goals were determined to be navigational when the user clicked on just one result. For simplification, the goals were limited to navigational and informational only. Likewise, Baeza-Yates, Calderón-Benavides & González-Caro (2006) focus on automatically identifying whether searches are informational, non-informational or ambiguous through supervised and unsupervised machine learning.
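The click-through part of this idea can be sketched very simply. The example below is a deliberate simplification, not the Lee, Liu & Cho algorithm itself: it looks only at the number of distinct results clicked for a query, ignores the anchor-link evidence, and treats queries with no clicks as "unknown", which is an assumption added here.

```python
def classify_goal(clicked_results):
    """Guess a query's goal from the results clicked for it.

    clicked_results: list of identifiers (e.g. URLs) of clicked results.
    """
    distinct = set(clicked_results)
    if not distinct:
        # Assumption: without clicks there is no evidence either way.
        return "unknown"
    # A single clicked result is read as navigational, as in the text;
    # everything else is treated as informational for simplicity.
    return "navigational" if len(distinct) == 1 else "informational"
```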

 

2.3.5 Query Clustering Using Session Data

Though clustering techniques are often used to determine the nature of different types of search sessions, methods of clustering search terms around a particular topic or group of documents can aid search engines in retrieving materials that might not contain the exact phrasing of the user query. One such study, Wen, Nie & Zhang (2002), designs a method of clustering queries around the documents clicked within results lists, on the reasoning that if two different queries were answered by the same document, those queries are likely related. Other studies focus on the clustering of search results. Wang & Zhai (2007) use OKAPI similarity testing to group search results from a query in a method called star-clustering that breaks the results lists into narrower topical sections.
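The intuition that queries sharing a clicked document are related can be illustrated with a small grouping routine. This is a sketch only: it links queries transitively through shared clicks using a union-find structure, which is a much cruder technique than the clustering Wen, Nie & Zhang describe, and the function name and the (query, document) input format are assumptions.

```python
from collections import defaultdict


def cluster_queries_by_clicks(click_log):
    """Group queries that share at least one clicked document.

    click_log: iterable of (query, clicked_document) pairs.
    Returns a sorted list of sorted query lists.
    """
    parent = {}

    def find(item):
        # Union-find lookup with path halving.
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    def union(first, second):
        root_first, root_second = find(first), find(second)
        if root_first != root_second:
            parent[root_second] = root_first

    first_query_for_doc = {}
    for query, document in click_log:
        find(query)  # register the query even if it shares no clicks
        if document in first_query_for_doc:
            union(first_query_for_doc[document], query)
        else:
            first_query_for_doc[document] = query

    clusters = defaultdict(set)
    for query in list(parent):
        clusters[find(query)].add(query)
    return sorted(sorted(members) for members in clusters.values())
```

Because links are transitive, one very popular document can merge unrelated queries into a single group, which is one reason a similarity threshold or density-based method may be preferable in practice.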

 

2.3.6 Environmental factors’ effect on behavior

 

Though server-side transaction log analysis, in general, does not allow for the use of a true control group to measure against, the effectiveness of certain types of web environmental factors may be measured. Anick (2003) studies the effect of providing terminological feedback on users’ uptake and success rates by comparing users with access to the tool to a control group who were not offered the feedback tool. Though the study finds that only a minority of all searches employ the refinement tool, when the search pool is restricted to any refined search or to users who had used the tool at least once, it is suggested that a significant portion of users found the tool useful.

 

2.3.7 Temporal/longitudinal studies

Yet another class of transaction log studies measures change in behavior or query characteristics during different periods or over a length of time. Beitzel et al. (2004) compare search query characteristics for each hour of the day. However, at least one report (Höchstötter and Koch, 2009) questions the validity of such temporal sampling, arguing that, relative to the scale of web searching, such studies take samples so small that they are likely to be skewed. Instead, Höchstötter and Koch champion the notion of evergreens, terms that repeatedly occur within samples over a long period of time, as a reliable indicator of long-term trends (2009, p. 59-60).

 

2.3.8 Multimedia searching

Studies with a specific emphasis on multimedia searches are far scarcer than studies of general web search engines. The few studies conducted so far show significant signs that multimedia searching differs in practice. Goodrum and Spink (2001) found that, in general, queries for images were longer than those submitted to general search engines and had a higher rate of query reformulation. These findings confirm a previous study (Jansen, Goodrum and Spink, 2000) in which image and audio queries were observed to be longer on average. As observed in other realms of web search study, searches for sexually explicit content were frequent, and many of the most frequently submitted terms could be interpreted as qualifying or corresponding to these queries (Goodrum and Spink, 2001).

The Goodrum, Bejune, & Siochi (2003) study confirmed what information seeking surveys had suggested (Frank, 1999; Bates, 2001; Cobbledick, 1996): that there is a high rate of browsing in seeking images. Jörgensen and Jörgensen (2005) further investigate the methods used to search for images, with particular focus on how queries are reformulated, whether through additions or reductions in terms, querying by example, or the use of terms from the records in results lists.

 

 

2.3.9 Problems

As alluded to in the preceding review of research, transaction log analysis does pose some problems, including the collection of data divorced from its original context. Other problems arise because most transaction logs are captured on the server side of computer interactions. Browser and server caching, firewalls, and proxy servers all reduce the accuracy of server-side logs. Caching, which allows users to access a page from stored memory either on the user’s computer or on a proxy server, prevents certain user actions from being recorded in the log. Yun et al. (2006) confirmed the gap in server-side records by comparing them to logs captured on the client side, revealing a 49% reduction in stored data in the server-side logs. Though client-side logs are more difficult for researchers to obtain, several studies employ this strategy (Hert & Marchionini, 1998; Thatcher, 2006; White & Drucker, 2007).

Another difficulty in transaction logging is the lack of a consistent, unique IP address for each user, which arises from the use of firewalls and proxy servers. This makes it hard to identify session boundaries with any great certainty. Potential remedies may lie in restricting data to sessions that require a unique log-in or in the use of cookies (Yun, 2009, p. 176; Wolfram et al. 2009, p. 900; Jansen et al., 2007). Methods for determining session boundaries from server-side logs in spite of these difficulties are discussed in the methods section.

Beyond its technological limitations, transaction log analysis is criticized for offering only a limited view of the information-seeking process (Jansen, 2006). Many studies therefore combine transaction log analysis with other types of research, including interviews, focus group discussions, and personal logs, to answer questions that fall outside the range of TLA alone; this is the approach suggested by Spink & Jansen (2004, p. 38-9).

 

3. Transaction Log Methods

 

3.1 Overview

 

Jansen is a prominent figure in the articulation of transaction log analysis. His 2006 paper outlines methodologies for carrying out such studies. Jansen breaks the research process into three stages: collection, data preparation, and analysis. Jansen suggests that research considerations should determine which fields are recorded within the transaction log; however, this choice is often constrained, either because the logging should be unobtrusive to the user in order to capture real-life, non-laboratory data or because the transaction log was created long before the current inquiry. Some fields that would be extremely helpful to research, such as a specific user log-in ID, are often unobtainable and in many cases may be considered a privacy concern. While no standard list of transaction log elements exists, some have emerged as common data fields (Wang et al., 2006):

  • Query: search statement or search URL

  • Time stamp: date and time at which the interaction occurred

  • User identification: IP address, randomized number representing a unique IP address, cookie

  • Click through: the order and time at which a user clicked on search results

These fields should be applicable across various types of web interfaces whether general search engines or unique resources.
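The common fields listed above could be represented as a simple record type. The sketch below is illustrative only: the class name, the field names, and the tab-separated line layout (user ID, ISO timestamp, query) are assumptions made for the example and do not describe the ARTstor log or any log format used in the cited studies.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple


@dataclass
class LogRecord:
    """One search interaction with the commonly recorded fields."""
    query: str            # search statement or search URL
    timestamp: datetime   # date and time of the interaction
    user_id: str          # IP address, anonymized number, or cookie
    # (rank of the clicked result, time of the click), in click order
    click_through: List[Tuple[int, datetime]] = field(default_factory=list)


def parse_line(line):
    """Parse a hypothetical tab-separated line: user_id, ISO timestamp, query."""
    user_id, stamp, query = line.rstrip("\n").split("\t", 2)
    return LogRecord(
        query=query,
        timestamp=datetime.fromisoformat(stamp),
        user_id=user_id,
    )


record = parse_line("u42\t2004-03-01T10:15:00\tmonet water lilies\n")
```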

Data preparation, the second stage of transaction log analysis, is crucial to interpretation, but few studies report their models in sufficient detail to allow for replication. Preparing the data for analysis generally begins by moving the data into a relational database. Of the published studies that explain their models for data preparation, Wang et al. (2006) identify four relational models:

Table. Relational models in previous transaction log analysis studies

| Entities | Relationships | Source |
| --- | --- | --- |
| Original query; Cleaned query; Token; Word pair | Binary | Wang, Berry, & Yang (2003) |
| Searching episode (query); Terms; Cooc (term pairs) | Unary; Ternary | Jansen (2006) |
| Query instance (query); Query term; Keyword (token); Query session (derived term) | Multiple relationships may be derived from this model, including popularity | Baeza-Yates et al. (2005) |
| Query; Term; Token | Token serves as the relationship between the query and the term | Wolfram (2006) |
| Query; Click; Webpage; Unique query (derived); Query token (derived); Unique token (derived) | Data elements also serve as relationships | Wang et al. (2006) |

 

As the above table indicates, neither nomenclature nor methodology is consistent across studies. Additionally, depending on their research questions and views of best practice, researchers weight some derivable relationships more heavily than others. For example, Baeza-Yates et al. (2005) include popularity as an important facet of analysis, while Wang et al. suggest that “unique queries are the linguistic expressions of searcher’s information needs and should not be influenced by popularity or frequency of the query occurrences” (2006, p. 3). Such discrepancies among research perspectives may explain why a settled set of best practices for this type of analysis has yet to be developed.

 

3.2 Analysis

Though no set of standards exists for analyzing data that might be applied across all information retrieval systems, the growing body of work on transaction log analysis provides reference points from which to design a study. While each study may vary in its parameters, there are still overlapping areas of relevance for replicating methodology. The following section reports various methodologies for sampling, determining session boundaries, term analysis, query analysis, and manual analysis.

 

3.2.1 Sampling

Though the cost of storage and processing power is seldom the constraint today that it was at the onset of transaction log analysis, a robust method of sampling may still be necessary in some studies. Ozmutlu, Spink, and Ozmutlu (2002) point out that a fixed-interval sampling method is insufficient for sampling from transaction logs because these logs contain patterns of activity. To ameliorate this problem, they propose a Poisson sampling method, finding that it produces a more representative data set.
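One way to picture Poisson sampling is to draw the gaps between sampled records from an exponential distribution, so that sampling points do not line up with any regular pattern in the log. The sketch below makes several assumptions that should be checked against the original study: it samples over record positions rather than over clock time, the mean gap is a free parameter, and the function name is hypothetical.

```python
import random


def poisson_sample(records, mean_gap, seed=None):
    """Sample records at positions separated by exponentially distributed gaps.

    mean_gap: average number of records between sampled records (assumed > 0).
    seed: optional seed so that a sample can be reproduced.
    """
    rng = random.Random(seed)
    rate = 1.0 / mean_gap
    indices = []
    position = rng.expovariate(rate)
    while position < len(records):
        index = int(position)
        # Skip a position that was already chosen when a gap is very small.
        if not indices or index != indices[-1]:
            indices.append(index)
        position += rng.expovariate(rate)
    return [records[i] for i in indices]
```

A fixed-interval sampler would instead take every k-th record, which is what risks synchronizing with periodic activity in the log.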

Gayo-Avello (2009) describes a method of obtaining representative samples for manual classification. The following equation is used to identify a representative sample size:

Figure 1 (see uploaded file in the 'Images and files' link to the right)

Because manual classification of queries is often quite time-consuming, Gayo-Avello reduces the confidence level to 95.5% in order to produce a more manageable sample size (2009, p. 1830).
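Figure 1 is not reproduced in this text. Purely as an illustration, the sketch below uses the standard finite-population sample size formula, n = z²pqN / (e²(N−1) + z²pq), which may or may not match the equation in the figure. The default values are assumptions: z = 2 corresponds to the 95.5% confidence level mentioned above, while the 5% margin of error and p = 0.5 are conventional worst-case choices, not values taken from the source.

```python
import math


def sample_size(population, z=2.0, margin=0.05, p=0.5):
    """Sample size for estimating a proportion in a finite population.

    population: number of items (e.g. queries) in the full log, N
    z: standard score for the confidence level (2.0 is about 95.5%)
    margin: acceptable margin of error, e (assumed value)
    p: expected proportion; 0.5 gives the largest, most cautious sample
    """
    q = 1.0 - p
    numerator = z ** 2 * p * q * population
    denominator = margin ** 2 * (population - 1) + z ** 2 * p * q
    return math.ceil(numerator / denominator)
```

Under these assumptions a log of 10,000 queries calls for 385 manually classified queries, and the required sample grows only slowly as the log gets larger.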

 

3.2.2 Determining session boundaries

The best method for determining session boundaries is possibly the most contested issue within transaction log analysis methodology. Gayo-Avello (2009) provides an excellent overview of the various methods for determining session boundaries; importantly, he notes that information seeking is itself an evolving process with fuzzy start and end points: “Thus, any definition of session should take account of this iterative and evolving nature, in addition to the underlying existence of user goal” (p. 1825). However, given the difficulty of identifying a particular user’s starting and stopping point, a cutoff time is often used to differentiate sessions with the same user ID: 5 minutes (Silverstein et al., 1999; Downey et al., 2007), 15 minutes (Jansen and Spink, 2005), 30 minutes (Downey et al., 2007; Radlinski and Joachims, 2005), and 120 minutes (Montgomery and Faloutsos, 2000). In addition to demarcating user sessions, cutoff time is sometimes seen as a way to discriminate between human and computer users (Jansen, Spink, & Pedersen, 2005; Silverstein et al., 1999).
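A temporal cutoff of the kind described above can be sketched in a few lines. The example assumes the timestamps all belong to a single user ID and starts a new session whenever the gap between consecutive interactions exceeds the cutoff; the function name and the 30-minute default are illustrative choices.

```python
from datetime import datetime, timedelta


def split_sessions(timestamps, cutoff=timedelta(minutes=30)):
    """Split one user's interaction times into sessions.

    A new session starts whenever the gap since the previous
    interaction is longer than `cutoff`.
    """
    sessions = []
    for stamp in sorted(timestamps):
        if sessions and stamp - sessions[-1][-1] <= cutoff:
            sessions[-1].append(stamp)
        else:
            sessions.append([stamp])
    return sessions


# Hypothetical activity for one user: 9:00, 9:04, 9:20 and 12:00.
base = datetime(2004, 1, 1, 9, 0)
stamps = [
    base,
    base + timedelta(minutes=4),
    base + timedelta(minutes=20),
    base + timedelta(hours=3),
]
```

With this sample, a 30-minute cutoff yields two sessions while a 5-minute cutoff yields three, which shows how strongly the reported session counts depend on the threshold a study picks.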

To gain greater accuracy, many studies employ methods beyond temporal limits and rely on contextual clues to identify sessions. He, Göker, and Harper (2002) use the time between certain types of user actions (called GBA, for gap between activities, in their report) to indicate sessions. They use Dempster-Shafer theory, which “assumes that there is a fixed set of mutually exclusive and exhaustive hypotheses or propositions” and gives each of these a probability; Dempster’s rule then combines them (p. 737-738). He, Göker and Harper do this in a way that can be calculated on the fly and therefore suggest its potential use in live searching (p. 738). Özmutlu and Çavdur (2005) replicate these results in a study focusing on an Excite search engine log but warn that there are multiple issues with consistent replication. Within this method, several types of sessions are identified: browsing, generalization, specialization, reformulation, repetition, relevance feedback and unidentified (others) (He, Göker, & Harper, 2002, p. 734). Gayo-Avello cautions that this method may produce inaccurate results because it is based upon shared terminology, which might group homonyms and miss topically related queries (2009, p. 1827).

Murray, Lin and Chowdhury (2006) use a method that groups each user ID’s actions over time, requiring a minimum of 20 queries per user, and then pinpoints gaps of activity within that subset to identify sessions. Wolfram et al. (2009) build upon this method with slightly different requirements suited to their dataset, setting a user snapshot of one day, with the time interval for pinpointing activity gaps based on determinations from earlier work (Wang et al., 2007).

Jansen et al. (2007) test three methods of identifying sessions: IP and cookie; IP and cookie plus a temporal cutoff; and IP and cookie plus contextual evidence of a session shift. Comparing the three, they find that the method using contextual information is the most accurate.

Finally, several studies propose a more holistic approach to determining session boundaries. Shen, Tan and Zhai (2005) propose a method of incorporating the body of search results to compare query similarity using a cosine similarity function. Shi and Yang (2006) investigate a method they call the dynamic sliding window method, which allows for the setting of minimum and maximum temporal cut-offs. Within this window, the sequential queries are then examined for similarity to further refine sessions into topically grouped sessions. Gayo-Avello (2009) introduces his own method of detecting sessions, which he calls the geometric method. He creates a curve between term similarity and session cutoff limit (e.g. point A equals completely dissimilar queries submitted simultaneously, point B equals identical queries submitted at the temporal limit). Individual sessions would be any sequences that fall beneath this curve.

Though emerging session identification techniques differ in their strategy, the trend is toward incorporating both temporal and contextual clues to evaluate the boundaries of sessions. Gayo-Avello tests several of the aforementioned methods and finds his new geometric method to be superior in terms of precision, recall, F-measure, ERR and SER.
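Several of the contextual methods above rest on a measure of similarity between queries or between the texts associated with them. As a minimal sketch of one such measure, the function below computes cosine similarity over simple term-count vectors. It is not the implementation used by any of the cited studies: whitespace tokenization, case folding, and raw term counts as weights are all assumptions.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-count vectors of two texts."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Dot product over the terms the two texts share.
    dot = sum(counts_a[term] * counts_b[term] for term in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(value * value for value in counts_a.values()))
    norm_b = math.sqrt(sum(value * value for value in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

A session detector could compare consecutive queries with this function and treat a low score combined with a long time gap as evidence of a session shift; the thresholds for both would need to be tuned on labeled data.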

 

3.2.3 Term analysis

Jansen and Pooch (2001) define a term “as a string of characters separated by some delimiter such as a space, a colon, or a period” (p. 244). Though this seems straightforward, some considerations must still be made for term analysis to be effective and reflective of the information retrieval system being evaluated. One example is the difficulty of carry-over language: querying syntax that may not be supported by the system (e.g. Boolean operators and special operators such as ? and *). Jansen and Pooch (2001) draw attention to the problem of whether to leave in or throw out Boolean operators, because Boolean operators and special characters may mask what the search engine is actually doing with query terms. However, removing them may be risky because one cannot determine how the user intended such operators to be used (whether as a conjunction or an operator).
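The term definition quoted above translates fairly directly into a tokenizer. The sketch below splits on spaces, colons, and periods and makes operator removal optional, since the text notes that dropping operators is itself a judgment call. The delimiter set follows the quoted definition; the operator list and the decision to treat only upper-case forms as operators are assumptions.

```python
import re

# Delimiters named in the quoted definition: space, colon, period.
DELIMITERS = r"[ :.]+"
# Assumed operator set; a lower-case "and" is kept as an ordinary term.
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def tokenize(query, drop_operators=False):
    """Split a query into terms on spaces, colons, and periods."""
    terms = [term for term in re.split(DELIMITERS, query.strip()) if term]
    if drop_operators:
        terms = [term for term in terms if term not in BOOLEAN_OPERATORS]
    return terms
```

Keeping drop_operators off by default preserves the query as the user typed it, so that the choice to discard carry-over language is made explicitly at analysis time.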

Independence testing and correlation coefficients are generally used to determine the existence and strength of term relationships. One of the earlier studies of web log term co-occurrence is Wolfram (1999). This study utilized existing models of distribution within the field of information retrieval, including Zipf, Mandelbrot Zipf, and Shifted Generalized Waring (p. 440). However, none of these models accurately represented the distribution of co-occurring terms (p. 448). Additionally, Wolfram noticed that the co-occurrence for less frequently used terms was less accurate than that of high frequency terms. Another 1999 study, by Silverstein et al., bypasses existing information retrieval models in favor of correlation and independence tests. Noting cost of storage as a limiting factor in how research could be conducted at the time, Silverstein et al. analyzed only the 10,000 most frequent terms from the query set, limiting data to fields and terms (1999, p. 10). Silverstein et al. use the chi-squared (χ²) statistic to determine whether a correlation exists and augment this with a correlation coefficient (ρ) to determine the strength of the correlation.
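To make the combination of an independence test and a correlation coefficient concrete, the sketch below builds a 2x2 contingency table for two terms over a set of queries and computes the textbook Pearson chi-squared statistic and the phi coefficient. This is an assumption-laden illustration: the exact formulation used by Silverstein et al. may differ, and the function name and whitespace tokenization are hypothetical.

```python
import math


def term_association(queries, term_a, term_b):
    """Return (chi_squared, phi) for the co-occurrence of two terms."""
    both = only_a = only_b = neither = 0
    for query in queries:
        terms = set(query.lower().split())
        has_a, has_b = term_a in terms, term_b in terms
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
        else:
            neither += 1
    total = both + only_a + only_b + neither
    # Product of the four marginal totals of the 2x2 table.
    margins = (both + only_a) * (only_b + neither) * (both + only_b) * (only_a + neither)
    if margins == 0:
        return 0.0, 0.0  # a term that never (or always) occurs carries no signal
    determinant = both * neither - only_a * only_b
    phi = determinant / math.sqrt(margins)
    chi_squared = total * determinant ** 2 / margins
    return chi_squared, phi
```

The chi-squared value indicates whether the two terms depart from independence, while the sign and size of phi indicate the direction and strength of the association.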

A later study by Wang, Berry, and Yang (2003) outlines a different method of weighting term relationships. Rather than using correlation coefficients or chi-squared, they use a mutual information formula, given below, which determines the strength of correlations regardless of the frequency with which the terms occur (p. 746):

 

Figure 2. (see uploaded file in the 'Images and files' link to the right)

 

Here P(w1,w2) is the frequency of both words appearing together regardless of order. P(w1) and P(w2) are the independent frequencies of word 1 and word 2. The team takes this approach because they believe some infrequent word pairs may have very high correlations.
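Figure 2 is not reproduced in this text. As a hedged illustration, the sketch below computes the standard pointwise mutual information, ln( P(w1,w2) / (P(w1) P(w2)) ), using the three quantities defined above. Whether this matches the figure exactly is an assumption, as are the use of the natural logarithm and the estimation of each probability as a fraction of queries.

```python
import math


def mutual_information(queries, word_1, word_2):
    """Pointwise mutual information of two words over a set of queries."""
    total = len(queries)
    count_1 = count_2 = count_both = 0
    for query in queries:
        terms = set(query.lower().split())
        in_1, in_2 = word_1 in terms, word_2 in terms
        count_1 += in_1
        count_2 += in_2
        count_both += in_1 and in_2
    if count_both == 0:
        # The words never co-occur, so the logarithm is undefined;
        # negative infinity is returned as a convention.
        return float("-inf")
    # Equivalent to ln( P(w1,w2) / (P(w1) * P(w2)) ) with P = count / total.
    return math.log(count_both * total / (count_1 * count_2))
```

Because the score is a ratio of observed to expected co-occurrence, a pair that appears only a handful of times can still score highly if the two words rarely appear apart.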

 

3.2.5 Query association

One of the first studies to move beyond keyword association in determining related clusters of queries is Wen, Nie & Zhang (2002). They chose an incremental density-based clustering method because it met the following requirements: no pre-determined number of clusters, the ability to handle incremental data, noise filtration, and timely processing of large data sets (Wen, Nie & Zhang, 2002, p. 66). Several options for assessing similarity were then made available within the analysis tool: keyword, string matching, cross-reference with a single document, hierarchical document cross-reference, and a combination of the linguistic and cross-reference methods.

As an alternative to more traditional associative means, Zhang and Nasraoui (2006) suggest a search frequency-based means of determining relationships between queries. The first step in the process involves relating terms by distance in sequential queries within a session. The more closely searches occur together, the more similar they are presumed to be. In the comparison equation, rather than using TF*IDF (term frequency * inverse document frequency) to weight terms, SF*IDF (search frequency * inverse document frequency) is used (p. 1040). This type of comparison may be useful if applied to the ARTstor search data because of the relationships that exist among artists, works, and styles.

 

3.2.6 Qualitative methods

 

Not all research inquiries utilizing transaction logs can accomplish their goals with automated analysis. Research that involves manual identification frequently includes user goal/task identification and query categorization. Some studies use manual classification as a control group to which automated analysis might be compared (Gayo-Avello, 2009).

Manual classification is often focused on drawing out types of behaviors, as in Hastings (1999), who manually classifies search queries in conjunction with online surveys to determine levels of complexity in searching. With a focus on access identification, Trant (2006) analyzes the Guggenheim’s online collection search log, manually classifying all terms with a frequency of 10 or greater into categories that roughly correspond to types of fields within a museum catalog: artist, title/subject of work, style, object, information, genre, ambiguous, place, materials, and date. However, no discussion is given of why these classifications are used. The study further manually classifies queries deemed unsuccessful (e.g. queries with no returned results). These categories raise some suspicion, as they generally place the fault on the searcher rather than the system: wrong period, within the scope but not in the catalogue, not in the catalogue at the time of search, obscenity, not collection related, spelling error, wrong collection. An example of how this taxonomy might be problematic is that a search for bird in the catalog returns 2 results while a search for birds returns nothing. It is unclear how one should classify this unsuccessful search.

 

4. Context for ARTstor: Image Information seeking

The study of an art-image database’s transaction log is attractive, largely because the search for images presents many difficulties not found within the search for text-based documents. This section presents a background on the problems faced within cataloguing and access of art images.

 

4.1 What makes image retrieval difficult?

Over the last few decades, the ease with which one can access images and add them to various projects has increased the demand for images. Frost et al. (2000) found that users of an academic image collection reported seeking images for “research, teaching, exhibition planning, collections management purposes, and public outreach as specific activities that involve the use of images, and spent considerable time discussing the use (and relative merits) of images in classroom situations” (p. 293). However, finding images for such uses can be problematic. Unlike text, which includes subject-related terms that can be searched within its body, images exist without these embedded subject clues. Furthermore, the metadata attached to an image to alleviate this problem is unlikely to encompass all possible descriptions.

Many scholars have pointed to the difficulty of accurately cataloging many of the more subjective qualities of visual images (Shatford, 1986; Svenonius, 1994; Layne, 1994; Krausse, 1998; Chen & Rasmussen, 1999). Analyses of the cataloging end of image retrieval systems even disagree on the distinctions between the types of descriptions that might be attributed to a visual image. Some scholars (Shatford, 1986) use Panofsky’s distinction among the pre-iconographic, the iconographic and the iconological. The pre-iconographic consists of the “factual (‘ofness’) or expressional (‘aboutness’)” (Chen & Rasmussen, 1999, p. 293). Iconographical description requires an immersion in the symbols and themes of the culture from which the image originates. While this level might require expert cataloguers, it might also be argued that this type of description is important to subject specialists such as art historians or curators. The iconological description relies not only on the ability to interpret the iconographic material within an image but also on in-depth knowledge of cultural significances, artistic media, and the context of a work. While it might make sense to try distinguishing all three levels of interpretation within an image from the Renaissance that served as Panofsky’s model, more contemporary works do not rely on the same conventions and lack such iconography. Instead, Shatford suggests using a faceted, investigative strategy for the description of images, including “general of,” “specific of” and “about,” using the questions who, what, when, and where for each category (1999). While this strategy may not map well onto a museum catalog or image database, it may be helpful to consider these strategies of description when examining search queries.

 

4.2 How do users search for images?

How people search for images has been an area of interest since before the widespread availability of image databases and search engines. Some studies include considerations for visual recall and seeking of images as well as content-based retrieval. Mechanical content-based retrieval practices have little relevance for the current ARTstor search model, since artistic images are often retrieved based on their subtle subject matter. It may be important to note that while the majority of images within the ARTstor repository are those of the visual and performing arts, a growing portion might be considered as pertaining to the general humanities and social sciences. This shift is represented by recently acquired digital collections such as the Eyes of the Nation Collection and the Magnum photo collection, both of which consist predominantly of photographs of historical and social significance.

Within studies of traditional collections of art images, many parallels exist between experts (e.g., those with graduate-level degrees in studio art, art history, museology, etc.) and non-experts. Access by subject and by artist is prevalent in both groups (Frost et al., 2000; Layne, 2002). Other popular search types among experts include provenance and style. Hastings (1999) chooses to separate queries by complexity rather than by access point type, as in the following table (p. 447):

| Level of complexity | Queries | Access points | Computer manipulation |
| --- | --- | --- | --- |
| 1. Least complex | Identification queries: who, where, when | Text field and image in general | Search, sort, display |
| 2. Complex | Queries of the type what are? | Sorted text information and images | Search, select, sort, display, enlarge |
| 3. More complex | Queries of style, subject, how, objects or activities | Style, keywords, complex images | Compare, mark, resolution, and style |
| 4. Most complex | Queries of meaning, subject, and why | Style and subject | Style and subject searches, access to full text secondary sources |

 

While Hastings’ distinctions between the levels of complexity seem overly ambiguous, evaluating searching behaviors or search success may require a standard for complexity in order to answer such questions as “what kind of query is most successful?”

Classifying such complex queries in a useful manner might require manual classification and agreement among researchers. Chen (2001) tries to address the problem by examining the levels of agreement among one, two, and three researchers when applying the classification schemes of two well-known studies of query classification: Enser & McGregor (1992) and Jörgensen (1995). Enser & McGregor divide queries into four groups: unique (e.g., a specific person, Barack Obama), unique with refiners (e.g., a unique place with dates, Trafalgar Square, 1941-1945), nonunique (e.g., dog), and nonunique with refiners (e.g., motorcycles in World War I). Such classifications might make the most sense when applied to general image databases, but may become less useful when applied to art images. Jörgensen’s classification system breaks queries into twelve groups: literal object, people, people-related attributes (e.g., feelings, social status), art historical information, color, visual elements, location, description, abstract concepts, content/story, external relationships, and viewer response. Chen found that three researchers agreed on Enser & McGregor’s classifications about 73% of the time and on Jörgensen’s about 70% of the time. These findings suggest that these classifications may be appropriate for further studies even though they are subjective.
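To make the idea of agreement among coders concrete, the following sketch computes simple percent agreement over Enser & McGregor's four categories. It is an illustration under stated assumptions: the function names are hypothetical, the codings are invented rather than taken from Chen's data, and Chen may have measured agreement differently.

```python
from collections import Counter

# Enser & McGregor's four query categories, as named in the text above.
CATEGORIES = {
    "unique",
    "unique with refiners",
    "nonunique",
    "nonunique with refiners",
}


def full_agreement_rate(labels_by_query):
    """Return the share of queries on which every coder chose the same category.

    labels_by_query maps each query string to the list of labels
    assigned by the coders (one label per coder).
    """
    if not labels_by_query:
        raise ValueError("no coded queries supplied")
    agreed = 0
    for query, labels in labels_by_query.items():
        unknown = set(labels) - CATEGORIES
        if unknown:
            raise ValueError(f"unknown category for {query!r}: {unknown}")
        if len(set(labels)) == 1:
            agreed += 1
    return agreed / len(labels_by_query)


def majority_label(labels):
    """Return the label chosen by more than half of the coders, or None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


# Hypothetical codings by three researchers (illustrative only).
coded = {
    "Barack Obama": ["unique", "unique", "unique"],
    "Trafalgar Square, 1941-1945": ["unique with refiners"] * 3,
    "dog": ["nonunique", "nonunique", "nonunique"],
    "motorcycles in World War I": [
        "nonunique with refiners",
        "nonunique with refiners",
        "unique with refiners",
    ],
}

rate = full_agreement_rate(coded)
fallback = majority_label(coded["motorcycles in World War I"])
```

Percent agreement does not correct for agreement that would occur by chance, so a study relying on it might also report a chance-corrected statistic.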

 

 

5. Recommendations for best practices for transaction logs

As can be seen from the report of current transaction log analysis, much of the research requires more information from a transaction log than the ARTstor data currently provides. Part of the focus of this investigation is to determine how online image databases, whether singular museum collections or large aggregated repositories, might improve their collection and analysis of data on image search queries and behaviors. With more robust systems of collection, real-life queries submitted to such resources might have a greater impact on our understanding of how and why people search for images.

Currently, out of the most common TLA elements, the ARTstor transaction log records user ID, time stamp, and query string. It additionally records a session ID, the originating institution, the type of search, and general information on the collection searched. Wang et al. (2003) recommend a minimum of user identification, time stamp, query, and click-through data. Adding click-through data or first-page results lists would allow for greater investigation of relationships among queries, leading to the potential for better clustering algorithms.
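The elements listed above can be sketched as a single log record, together with a check against the recommended minimum. This is a hedged illustration only: the field names, the class, and the helper function are hypothetical and do not reflect ARTstor's actual log schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class LogRecord:
    """Hypothetical record; field names are assumptions, not ARTstor's schema."""
    # Elements the text says the log currently records.
    user_id: str
    timestamp: datetime
    query: str
    session_id: Optional[str] = None
    institution: Optional[str] = None
    search_type: Optional[str] = None
    collection: Optional[str] = None
    # Click-through data, the proposed addition (None means "not logged").
    clicked_results: Optional[List[str]] = None


# Minimum elements recommended in the text above.
MINIMUM_ELEMENTS = ("user_id", "timestamp", "query", "clicked_results")


def missing_minimum_elements(record: LogRecord) -> List[str]:
    """List the recommended minimum elements that this record lacks."""
    return [
        name
        for name in MINIMUM_ELEMENTS
        if getattr(record, name) in (None, "")
    ]


# Illustrative record without click-through data.
record = LogRecord(
    user_id="u1",
    timestamp=datetime(2009, 3, 1, 14, 5),
    query="rembrandt self portrait",
    session_id="s1",
)
missing = missing_minimum_elements(record)
```

An empty click list is treated here as logged (the user clicked nothing), which is distinct from click-through data not being collected at all.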

Because access to services like ARTstor depends largely upon proxy servers, the use of cookies is essential to aid in any kind of session identification. Though it is currently unclear how ARTstor defines its session ID, specialized session identification might provide a more reliable means of determining session boundaries, as cookies may be unreliable on shared computers.
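One common heuristic for determining session boundaries, when no reliable session ID is available, is a temporal cutoff between consecutive queries from the same user. The sketch below assumes a 30-minute cutoff and a hypothetical tuple layout; neither is ARTstor's definition of a session, and the cutoff would need to be validated against the actual log.

```python
from datetime import datetime, timedelta


def split_sessions(events, max_gap=timedelta(minutes=30)):
    """Group (user_id, timestamp, query) tuples into per-user sessions.

    A new session starts whenever the gap since the same user's
    previous query exceeds max_gap (an assumed cutoff).
    """
    sessions = []
    last_seen = {}  # user_id -> (time of last query, index of current session)
    for user_id, ts, query in sorted(events, key=lambda e: (e[0], e[1])):
        previous = last_seen.get(user_id)
        if previous is None or ts - previous[0] > max_gap:
            sessions.append({"user_id": user_id, "queries": []})
            index = len(sessions) - 1
        else:
            index = previous[1]
        sessions[index]["queries"].append(query)
        last_seen[user_id] = (ts, index)
    return sessions


# Illustrative events; identifiers and queries are invented.
events = [
    ("u1", datetime(2009, 3, 1, 14, 0), "monet"),
    ("u1", datetime(2009, 3, 1, 14, 10), "monet water lilies"),
    ("u1", datetime(2009, 3, 1, 16, 0), "bauhaus chairs"),
    ("u2", datetime(2009, 3, 1, 14, 5), "magnum photos"),
]
sessions = split_sessions(events)
```

A time-based cutoff cannot separate two people who use the same shared computer within the cutoff window, which is the limitation noted above for cookies as well.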

Even with the addition of common elements found in other search logs, some problems may persist in data collection for art image resources. For example, the practice of dynamically creating web pages may have a profound effect on the functioning of algorithms and reliability of data logging. However, the current state of transaction log practice seems insufficient to answer many of the questions that remain about how users search for images.

References

Baeza-Yates, R. et al. (2005). Modeling user search behavior. Proceedings of the Third Latin American Web Congress. IEEE Computer Society: Washington, D.C.: 242 – 251. http://doi.ieeecomputersociety.org/10.1109/LAWEB.2005.23

 

Baeza-Yates, R., Calderón-Benavides, L., González-Caro, C. (2006) The intention behind web queries. In Proceedings of STRING PROCESSING AND INFORMATION RETRIEVAL (SPIRE 2006). Glasgow, Scotland, 98-109.

http://www.springerlink.com/content/y1287823117n33p8/fulltext.pdf

 

Bates, M. (2001). Information needs and seeking of scholars and artists in relation to multimedia materials. Report submitted to the Getty Research Institute in 1999. http://www.gseis.ucla.edu/faculty/bates/scholars.html

 

Bendersky, M., & Croft, W. B. (2009). Analysis of long queries in a large scale search log. In Proceedings of the 2009 workshop on Web Search Click Data (pp. 8-14). Barcelona, Spain: ACM. http://portal.acm.org/citation.cfm?id=1507509.1507511

 

Beitzel, S. M., Jensen, E. C., Chowdhury, A., Grossman, D., & Frieder, O. (2004). Hourly analysis of a very large topically categorized web query log. In Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 321-328). Sheffield, United Kingdom: ACM. http://doi.acm.org/10.1145/1008992.1009048

 

Broder, A. (2002). A taxonomy of web search. SIGIR Forum, 36(2), 3-10 http://portal.acm.org/citation.cfm?id=792550.792552

 

Chen, H-L. (2001) Analysis of Image Queries in the Field of Art History. Journal of the American Society for Information Science and Technology, 52(3):260 – 273; 2001.

 

Chen, H-L., & Rasmussen, E. (1999) Intellectual access to images. Library Trends 48(2), 291-302.

 

Chen, H-M., & Cooper, M.D. (2001). Using clustering techniques to detect usage patterns in a Web-based information system. Journal of the American Society for Information Science and Technology, 52(11), 888–904. Retrieved January 18, 2010 from

 

Cobbledick, S. (1996). The Information-Seeking Behavior of Artists: Exploratory Interviews. Library Quarterly, 66(4): 343-372.

 

Downey, D. et al. (2008). Understanding the relationship between searchers queries and information goals. Proceedings of the 17th Conference on Information and Knowledge Management (pp. 449 – 458) Napa Valley, California: ACM

 

Downey, D., Dumais, S., and Horvitz, E. (2007). Models of searching and browsing: languages, studies, and applications. In, Proceedings of the IJCAI (pp. 1465– 1472.)

 

Enser, P.G.B., & McGregor, C.G. (1992). Analysis of visual information retrieval queries. Report on Project G16412 to the British Library Research and Development Department. London: British Library.

 

Frank, P. (1999). Student artists in the library: An investigation of how they use general academic libraries for their creative needs. The Journal of Academic Librarianship, 25(6): 445–455.

 

Frost, O., et al. (2000). Browse and search patterns in a digital image database. Information Retrieval 1(4): 287 – 313.

 

Gayo-Avello, D. (2009). A survey on session detection methods in query logs and a proposal for future evaluation. Information Sciences, 179(12), 1822-1843. doi:10.1016/j.ins.2009.01.026

 

Goodrum, A., Bejune, M., & Siochi, A.C. (2003). A state transition analysis of image search patterns on the Web. In T.S. Huang, M.S. Lew, N. Sebe, & X.S. Zhou (Eds.), Proceedings of the Second International Conference on Image and Video Retrieval (CIVR ’03): Lecture notes in computer science 2728 (pp. 281–290). Karlsruhe/Berlin: Springer.

 

Goodrum, A., & Spink, A. (2001). Image searching on the EXCITE Web search engine. Information Processing & Management, 37, 295–312.

 

Hastings, S. (1999). Evaluation of image retrieval systems: Role of user feedback. Library Trends, 48(2): 438-452.

 

Hert, C. A., & Marchionini, G. (1998). Information seeking behavior on statistical websites: Theoretical and design implications. In C. M. Preston (Ed.), Proceedings of the 61st ASIS Annual Meeting (pp. 303–314). Medford, NJ: Information Today.

 

Hölscher, C., & Strube, G. (2000). Web search behavior of Internet experts and newbies. International Journal of Computer and Telecommunications Networking, 33(1-6), 337–346.

 

Hochstotter, N., & Koch, M. (2009). Standard parameters for searching behaviour in search engines and their empirical evaluation. Journal of Information Science, 35(1), 45-65. doi:10.1177/0165551508091311  

 

Huang, Z., Ng, J., Cheung, D.W., Ng, M.K., & Ching,W.K. (2001). A cube model and cluster analysis for Web access sessions. In Proceedings of WEBKDD 2001, pp. 47–57.

 

Jansen, B. J. (2006). Search log analysis: What it is, what's been done, how to do it. Library & Information Science Research, 28(3), 407-432. Retrieved January 16, 2010 from http://dx.doi.org/10.1016/j.lisr.2006.06.005.

 

Jansen, B. J., Booth, D. L., & Spink, A. (2008). Determining the informational, navigational, and transactional intent of Web queries. Information Processing & Management, 44(3), 1251-1266 http://dx.doi.org/10.1016/j.ipm.2007.07.015

 

Jansen, B.J., & Spink, A. (2005). How are we searching the World Wide Web?: An analysis of nine search engine transaction logs. Information Processing & Management, 42(1), 248–263.

 

Jansen, J., and Spink, A. (2006). How are we searching the world wide web?: a comparison of nine search engine transaction logs. Information Processing and Management 42 (1):248-263.

 

Jansen, B.J., Spink, A., & Saracevic, T. (2000). Real life, real users, and real needs: A study and analysis of user queries on the Web. Information Processing & Management, 36(2), 207–227.

 

Jansen, B. J., Spink, A., Bateman, J., & Saracevic, T. (1998). Real life information retrieval: a study of user queries on the Web. SIGIR Forum, 32(1), 5-17 Retrieved January 16, 2010 from http://portal.acm.org/citation.cfm?id=281250.281253.

 

Jansen, B. J., Spink, A., Blakely, C., & Koshman, S. (2007). Defining a session on Web search engines. Journal of the American Society for Information Science and Technology, 58(6): 862-871. http://www3.interscience.wiley.com/cgi-bin/fulltext/114130657/PDFSTART

 

Jansen, B.J., Spink, A., & Pedersen, J. (2005). Trend analysis of AltaVista Web searching. Journal of the American Society for Information Science and Technology, 56(6), 559–570.

 

Jones, S., Cunningham, S., & McNab, R. (1998). An analysis of usage of a digital library. Proceeding of Second European Conference on Digital Libraries (pp. 261–277).

 

Jörgensen, C. (1995). Image attributes: An investigation (Indexing systems, retrieval systems, computerized). Unpublished doctoral dissertation. Syracuse University.

 

Jörgensen, C. (1998). Attributes of images in describing tasks. Information Processing & Management, 34(2/3), 161–174.

 

Jörgensen, C., & Jörgensen, P. (2005). Image querying by image professionals. Journal of the American Society for Information Science and Technology, 56(12), 1346-1359. http://dx.doi.org/10.1002/asi.2022

 

Krausse, M.G. (1998). Intellectual problems of indexing picture collections. Audiovisual Librarian, 14(2), 73 – 81.

 

Layne, S. S. (1994). Some issues in the indexing of images. Journal of the American Society for Information Science 45(8):583 – 588.

 

Layne, S. S. (2002). Subject Access to Art Images. In Introduction to Art Image Access: Issues, Tools, Standards, Strategies ed. Murtha Baca, Los Angeles: The Getty Research Institute, 8.

 

Lee, U., Liu, Z., & Cho, J. (2005). Automatic identification of user goals in Web search. In Proceedings of the 14th international conference on World Wide Web (pp. 391-400). Chiba, Japan: ACM. http://portal.acm.org/citation.cfm?id=1060745.1060804

 

Markey, K. (2007b). Twenty-five years of end-user searching, part 2: Future research directions. Journal of the American Society for Information Science and Technology, 58(8):1123-1130.

 

Meister, D., & Sullivan, D. (1967). Evaluation of user reactions to a prototype on-line information retrieval system: Report to NASA by Bunker-Ramo Corporation. Report Number NASA CR-918. Oak Brook, IL: Bunker-Ramo Corporation

 

Montgomery, A.L., & Faloutsos, C. (2000). Trends and patterns of WWW browsing behavior. http://pages.cpsc.ucalgary.ca/saul/personal/other_pubs/web_trends.pdf

 

Murray, G.C., Lin, A., & Chowdhury, A. (2006). Identification of user sessions with hierarchical agglomerative clustering. In Proceedings of the ASIS&T Annual Meeting, 43(1): 1 – 9. http://dx.doi.org/10.1002/meet.14504301312

 

Özmutlu, H.C., & Çavdur, F. (2005). Application of automatic topic identification on Excite Web search engine data logs. Information Processing & Management, 41, 1243–1262.

 

Özmutlu, H. C., Spink, A., & Özmutlu, S. (2002). Analysis of large data logs: An application of Poisson sampling on Excite web queries. Information Processing and Management, 38(4): 473-490. doi:10.1016/S0306-4573(01)00043-7

 

Panofsky, E. (1939). Studies in iconology: Humanistic themes in the art of the Renaissance. New York: Oxford University Press.

 

Penniman, W. D. & Dominick, W. D. (1980). Monitoring and evaluation of on-line information system usage. Information Processing and Management, 16(1): 17-35.

 

Peters, T. (1993). The history & development of transaction log analysis. Library Hi Tech, 42(11), 41–66.

 

Radlinski, F., & Joachims, T. (2005). Query chains: Learning to rank from implicit feedback. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining (pp. 239-248). Chicago, Illinois, USA: ACM. http://portal.acm.org/citation.cfm?id=1081870.1081899

 

Rose, D. E., & Levinson, D. (2004). Understanding user goals in web search. In Proceedings of the 13th international conference on World Wide Web (pp. 13-19). New York, NY, USA: ACM. http://portal.acm.org/citation.cfm?id=988672.988675

 

Shaaban, S., McKechnie, J., & Lockley, S. (2003). Modelling information seeking behavior of AEC professionals on online technical information resources. ITcon, 8, 265 – 281.

 

Shatford, S. (1986). Analyzing the subject of a picture: A theoretical approach. Cataloging & Classification Quarterly 6(3): 39 -62.

 

Shen, X., Tan, B., & Zhai, C. (2005). Implicit user modeling for personalized search. In Proceedings of the CIKM (pp. 824–831).

 

Shi, X., & Yang, C.C. (2006). Mining related queries from search engine query logs. In Proceedings of the 15th International Conference on World Wide Web (pp. 943–944).

 

Siegfried, S., Bates, M., & Wilde, D. (1993). A profile of end-user searching behavior by humanities scholars: The Getty online searching project report no. 2. Journal of the American Society for Information Science, 44(5), 273–291.

 

Silverstein, C., Marais, H., Henzinger, M., & Moricz, M. (1999). Analysis of a very large Web search engine query log. SIGIR Forum 33, 1(Sep. 1999), 6–12.

 

Spink, A., & Jansen, B. J. (2004). Web search: Public searching on the web. Dordrecht, Netherlands; Boston: Kluwer Academic Publishers.

 

Svenonius, E. (1994). Access to nonbook materials: The limits of subject indexing for visual and aural languages. Journal of the American Society for Information Science 45(8), 600-606.

 

Thatcher, A. (2006). Information-seeking behaviours and cognitive search strategies in different search tasks on the WWW. International Journal of Industrial Ergonomics, 36(12), 1055-1068. Retrieved January 23, 2010 from http://dx.doi.org/10.1016/j.ergon.2006.09.012

 

Wang, P., et al (2006). Final report on ALISE/OCLC 2005 research grant: Mining web search behaviors: strategies and techniques for data modeling and analysis. Retrieved January 24, 2010 from

http://www.oclc.org/research/grants/reports/2005/wang-p.pdf

 

Wang, P., Wolfram, D., Zhang, J., Hong, N., Wu, L., Canevit, C., & Redmon, D. (2007). Mining Web search behaviors: Strategies and techniques for data modeling and analysis. In Proceedings of the 2007 Annual Meeting American Society for Information Science and Technology.

 

Wang, P., Berry, M., & Yang, Y. (2003). Mining longitudinal Web queries: Trends and patterns. Journal of the American Society for Information Science and Technology, 54(8): 743-758. Retrieved January 25, 2010 from http://www3.interscience.wiley.com/cgi-bin/fulltext/104525889/HTMLSTART

 

 

Wang, X., & Zhai, C. (2007). Learn from web search logs to organize search results. In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 87-94). Amsterdam, The Netherlands: ACM. http://portal.acm.org/citation.cfm?id=1277741.1277759

 

Webb, E.J., Campbell, D.T., Schwartz, R.D., & Sechrest, L. (2000). Unobtrusive Measures (Revised Edition). Thousand Oaks, California: Sage.

 

Wen, J.R., Nie, J.Y., & Zhang, H.J. (2002). Query clustering using user logs. ACM Trans. Inf. Syst., 20(1), 59-81 Retrieved March 4, 2010 http://portal.acm.org/citation.cfm?id=503104.503108

 

White, R. W., & Drucker, S. M. (2007). Investigating behavioral variability in web search. In Proceedings of the 16th international conference on World Wide Web (pp. 21-30). Banff, Alberta, Canada: ACM. http://portal.acm.org/citation.cfm?id=1242572.1242576

 

Wolfram, D. (1999). Term co-occurrence in Internet search engine queries: An analysis of the Excite data set. Canadian Journal of Information and Library Science, 24(2/3), 12–33. Retrieved February 6, 2010 from http://www.caisacsi.ca/proceedings/1999/Wolfram_1999.pdf.

 

Wolfram, D. (2000). A query-level examination of end user searching behaviour on the excite search engine. In H. Olson, (Ed.). Proceedings of the 28th Annual Conference of the Canadian Association for Information Science. Retrieved February 5, 2010 from http://www.caisaci.ca/proceedings/2000/wolfram_2000.pdf.

 

Wolfram, D., Wang, P., & Zhang, J. (2009). Identifying Web search session patterns using cluster analysis: A comparison of three search environments. Journal of the American Society for Information Science and Technology, 60(5), 896-910 Retrieved February 17, 2010 from http://www3.interscience.wiley.com/journal/121675939/abstract

 

Yun, G. W. (2009). The unit of analysis and the validity of web log data. In B. Jansen, A. Spink & I. Taksa (eds.), Handbook of Research on Web Log Analysis (pp. 165-180). Hershey, PA: Information Science Reference.

 

Yun, G. W., Ford, J., Hawkins, R. P., Pingree, S. & McTavish, F. (2006). On the validity of client-side vs. server-side web log data analysis. Internet Research 16(5), 537-552. Retrieved March 5, 2010 from www.emeraldinsight.com/10.1108/10662240610711003

 

 

 

1 For a better description of Dempster-Shafer theory, please see Shafer, G. 1992. The Dempster-Shafer theory. Encyclopedia of Artificial Intelligence, 2nd edition, Stuart C. Shapiro (ed.). Wiley, pp. 330–331 http://www.glennshafer.com/assets/downloads/articles/article48.pdf

 

2 The Zipf model, though hotly debated as to its significance, is loosely a method of determining the relative obscurity of a word (including its definition) by ranking words from most frequent to least frequent where the least frequent words are deemed most obscure. The Mandelbrot Zipf sought to correct the deviations in very frequent and very infrequent terms. Explanations of these two models can be read in Manin (2009). The Generalized Waring Process is a model that allows for analysis of extremely complex interactions of factors such as those in weather, economics, and accident rates (Xekalaki and Mimoza, 2008).

 

DRAFT PRE-PUBLICATION COPY

ARTstor Transaction Log Analysis Literature Review Heather Lowe
