Abstract. TagCrowd.com is an application created in 2006 by Daniel Steinbock that renders the word frequencies of a submitted text in a “tag cloud.” The cloud TagCrowd creates is not interactive; it is an aesthetic object in itself. Steinbock, a graduate student at Stanford University at the time, developed TagCrowd.com to provide a free and easy way to visualize word frequencies. According to Steinbock, the tag cloud has a number of uses, including “brand clouds” that show companies how their brands are perceived and conference name tags generated from paper abstracts to help with networking.
Description. TagCrowd.com creates tag clouds from user-supplied texts by analyzing word frequencies and rendering the words in alphabetical order, larger or smaller depending on their frequency. The visualization is thus twofold: the words are rearranged into alphabetical order, and at the same time the words that occur more frequently are larger. Visualizing a text like this defamiliarizes it in a familiar way. Although tag clouds are typically used to let people click on frequent tags in a site, Steinbock claims that there are many more uses for clouds, for example “brand clouds” that show companies how their brands are perceived, and “data mining a text corpus.” TagCrowd gives the user control over several parameters of the analysis. For example, a user can select whether she would like to display 50 or 100 words, specify in a “stoplist” the words that should be left out of the analysis, decide whether or not to show the number of occurrences next to each word (adding another layer of visualization), group similar words, and ignore common English words such as “the.” These options provide a measure of control over the analysis.
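The analysis described above can be sketched in a few lines of Python. This is only an illustration of the general technique (count words, drop stoplisted and common words, keep the most frequent, sort alphabetically, scale size by frequency) and not TagCrowd's actual code. The function name, the parameter names, the short list of common words, and the font-size range are all assumptions made for this example, and the “group similar words” option is left out.

```python
import re
from collections import Counter

# Hypothetical list of common English words; TagCrowd's real list is not
# given in this review.
COMMON_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}


def tag_cloud(text, max_words=50, stoplist=(), ignore_common=True,
              show_counts=False, min_size=10, max_size=40):
    """Return (label, font_size) pairs in alphabetical order."""
    # Lowercase the text and pull out words, allowing one inner apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

    # Build the set of words to leave out of the analysis.
    excluded = {w.lower() for w in stoplist}
    if ignore_common:
        excluded |= COMMON_WORDS

    counts = Counter(w for w in words if w not in excluded)
    top = counts.most_common(max_words)
    if not top:
        return []

    highest = top[0][1]
    lowest = top[-1][1]
    spread = (highest - lowest) or 1  # avoid dividing by zero

    cloud = []
    for word, n in sorted(top):  # alphabetical order, as in TagCrowd
        # Linear scaling: the rarest kept word gets min_size, the most
        # frequent gets max_size.
        size = min_size + (max_size - min_size) * (n - lowest) / spread
        label = f"{word} ({n})" if show_counts else word
        cloud.append((label, round(size)))
    return cloud


sample = "The cloud shows the cloud and the text of the cloud text words"
for label, size in tag_cloud(sample, show_counts=True):
    print(f"{label}: {size}pt")
```

Linear scaling is the simplest choice; a tool could equally scale by the logarithm of the count so that one very frequent word does not dwarf the rest.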
TagCrowd.com is only one of the tag cloud tools available free online right now. Another is the tag cloud generator available through IBM’s Many Eyes. Unlike TagCrowd, a user must first register for Many Eyes to use the tool and her data becomes public once it is uploaded. However, these extra restrictions do come with benefits. The cloud is interactive—as one hovers over a word, for example, a box pops up that shows different occurrences of that word in context. The Many Eyes cloud thus does not take one through the text, as typical navigational clouds do, but it does show something about the text. There are also some tag cloud tools available through TAPoR, and these usually work together with other tools to offer a variety of views of the user’s text.
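The pop-up that shows occurrences of a word in context is a form of keyword-in-context display. A minimal sketch of that idea follows; it is not how Many Eyes is implemented, and the function name and the `window` parameter are assumptions made for this example.

```python
import re


def keyword_in_context(text, keyword, window=4):
    """Return (left, match, right) for each occurrence of keyword."""
    tokens = re.findall(r"\w+(?:'\w+)?", text)
    keyword = keyword.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword:
            # Up to `window` words on either side of the match.
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


sample = "The cloud is static. A cloud can mislead."
for left, match, right in keyword_in_context(sample, "cloud", window=2):
    print(f"{left} [{match}] {right}")
```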
Commentary. Steinbock notes that there are a number of uses for the tag cloud. As he has programmed it, however, his cloud has a number of limitations as well. For example, TagCrowd allows the user to upload a Microsoft Word document as the base text, but it does not filter out the junk code in Word or RTF files (see figure 1 below). (Note: a visit to the site on 27 February 2008 revealed that the user can now upload only text files.) The static cloud also does not allow for change. As an experiment, I uploaded an essay I was working on into TagCrowd. Afterward, I edited my essay because I noticed that some words occurred numerous times but should not have. The cloud remains the same after my edits unless I upload the essay again. Jim Bumgardner of ONLamp.com, in his “Design Tips,” claims that dynamism is important to keeping tag clouds “relevant” for users. (N1)
Because our project centers around literary and student texts, I am most interested in the possibilities for tag clouds and literary analysis. There is at least one blogger who sees a new kind of reading in the cloud. Joe Lamantia, writing about what he calls the “text cloud,” claims, “the growing use of text clouds hints at a (potential) deeper cultural shift in the way we go about reading and comprehension: a shift from linear modes based on reading words and sentences, to nonlinear modes based on viewing summaries of content in aggregate as a way of discovering concepts and patterns.” (N2) The text cloud captures the non-linear elements of a text and makes them clear to the reader.
A twist on the text cloud is the word tree tool that IBM’s Many Eyes provides. This tool is also based on word frequency, but it does not stop there. The tree links each word to the words it occurs with. By re-creating the sentences that Lamantia would like to see gone, but in a new and interesting way, the word tree reconstructs the semantics of the text. Like the Many Eyes tag clouds, the word tree places the words back in their context. This is especially useful when looking at seemingly meaningless words. For instance, I loaded “A Young Man’s Opinion” (Pepys 1.230-231) into the word tree and the result was very interesting. If you type the word “she” or “woman” into the application (see figure 2 below), you will find all of the words that occur after it in the text.
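The core idea of a word tree, collecting the words that follow a chosen word and counting how often each branch occurs, can be sketched as below. This is a simplified illustration and not the Many Eyes implementation; the function name, the `depth` parameter, the nested-dictionary layout, and the sample sentences (which are invented, not taken from Pepys) are all assumptions made for this example.

```python
import re


def word_tree(text, root, depth=3):
    """Build a nested dict of the words that follow `root`, with counts."""
    root = root.lower()
    tree = {}
    # Split on sentence-ending punctuation so that branches do not run
    # across sentence boundaries.
    for sentence in re.split(r"[.!?]+", text.lower()):
        tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", sentence)
        for i, token in enumerate(tokens):
            if token != root:
                continue
            node = tree
            # Walk up to `depth` words after the root, adding to the
            # branch count at each level.
            for following in tokens[i + 1:i + 1 + depth]:
                entry = node.setdefault(following, {"count": 0, "next": {}})
                entry["count"] += 1
                node = entry["next"]
    return tree


def print_tree(tree, indent=0):
    """Print the branches alphabetically, indented by depth."""
    for word in sorted(tree):
        print(" " * indent + f"{word} ({tree[word]['count']})")
        print_tree(tree[word]["next"], indent + 2)


sample = "She said yes. She said no. She laughed."
print_tree(word_tree(sample, "she", depth=2))
```

Where a flat cloud would show only that “she” is frequent, the tree shows which words follow it and how often each continuation recurs.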
What Joe Lamantia referred to as “Second Generation Tag Clouds” seems to be coming into being. (N3) These clouds, according to Lamantia, “will fill a gap in our collective visualization toolset.” (N4) The clouds will show not only frequency, but frequency in the context of a network of associations. They will be used not only for branding or an at-a-glance understanding of written materials, but for reading in new ways. It seems to me that the future of the tag cloud lies in an elongation of its reading. As it stands now, the cloud summarizes a text, but its future seems to be illuminating a text and making our readings of it richer.
Someone once claimed that the tag cloud is “the mullet of web 2.0,” and I would like to conclude my commentary by thinking about the mullet comparison. (N5) The hairstyle is one everyone loves to hate and mock; in the popular imagination it now stands for everything “Blue Collar Comedy.” That is, the ridicule of the mullet is linked to classism and regionalism. When commentators fret over the tag cloud’s “popularity,” they are really fretting about the expansion of users' access to the technology for making tag clouds. Thus tag clouds are positioned in much the same way as close reading was in post-World War II America: they have the potential to open up texts to people who may not otherwise have had this kind of access to them.
Notes:
1. Jim Bumgardner, “Design Tips,” ONLamp.com, 8 June 2006, http://www.onlamp.com/pub/a/onlamp/2006/06/08/designing-tag-clouds.html. Accessed 19 February 2008.
2. Joe Lamantia, “Text Clouds: A New Form of Tag Cloud?” JoeLamantia.com, 15 March 2007, http://www.joelamantia.com/blog/archives/tag_clouds/text_clouds_a_new_form_of_tag_cloud.html. Accessed 19 February 2008.
3. Joe Lamantia, “Second Generation Tag Clouds,” JoeLamantia.com, 23 February 2006, http://www.joelamantia.com/blog/archives/ideas/second_generation_tag_clouds.html. Accessed 19 February 2008.
4. Joe Lamantia, “Second Generation Tag Clouds,” JoeLamantia.com, 23 February 2006, http://www.joelamantia.com/blog/archives/ideas/second_generation_tag_clouds.html. Accessed 19 February 2008.
5. The remark is attributed to either Garrick Schmitt or Jeffrey Zeldman, but sources do not agree. See Ian Kennedy, “What I learned at ad tech,” everwas, 2 May 2006, http://everwas.com/2006/05/what_i_learned_at_adtech.html. Accessed 26 February 2008. See also Jeffrey Zeldman, “Tag clouds are the new mullets,” zeldman.com, 19 April 2005, http://www.zeldman.com/daily/0405d.shtml. Accessed 26 February 2008.
Resources for Further Study.
1. Abacci Books Online, http://www.abacci.com/books/default.asp, especially http://www.abacci.com/annotated/, which has a text cloud for the Project Gutenberg texts. Accessed 19 February 2008.
2. IBM's Many Eyes, especially the “tag cloud guide”: http://services.alphaworks.ibm.com/manyeyes/page/Tag_Cloud.html. Accessed 19 February 2008.
3. Andrew Odewahn, O’Reilly Radar, 18 January 2007, http://radar.oreilly.com/archives/2007/01/tag_cloud_of_wh.html. Accessed 19 February 2008.
4. TagCrowd.com
Figure 1: Tag Cloud with Junk in it
Figure 2: Word Tree of "Young Man's Opinion"