Why desktop search will give way to personal information search

February 7, 2008 under information management, pim

A few years ago, we were told that desktop search applications were going to change the face of personal information management. Google Desktop was released in late 2004, only a few months before Apple introduced Spotlight as a key feature of the new version of OS X. When Windows Vista finally shipped, it included a similar feature called Instant Search. Just as Google brings order to the billions of pages on the web, desktop search was supposed to bring order to your files, emails, and photos.

Now, more than 3 years later, have things really changed?

I’ve been focusing on desktop search in my master’s research, and I’ve noticed that not many people are actually using these tools. Even though I am doing research in the area, I often find myself resorting to the tried-and-true hierarchical file system.

Part of the problem is that the search algorithms pretty much suck. Remember web search before Google? When the highest-ranked page was the one that contained the most repetitions of your keywords? Desktop search apps suffer from similar problems. The algorithm doesn’t know that a file I created should rank higher than some sample code that came with Python. It doesn’t realize that a message I received from a mailing list is less relevant than the email from my supervisor. We don’t yet have the equivalent of PageRank on the desktop.
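To make that concrete, here is a rough sketch of the difference between ranking by keyword repetition and ranking with a couple of personal signals mixed in. Everything in it is an assumption for illustration: the `Item` fields, the idea that an indexer could know who created a file or whether a message came from a mailing list, and especially the weights, which are made up. It is not how any real desktop search tool works.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A hypothetical personal information item (a file or an email)."""
    name: str
    text: str
    created_by_me: bool = False      # assumption: the indexer knows the author
    from_mailing_list: bool = False  # assumption: the indexer can spot list mail


def keyword_score(item, query):
    """Naive relevance: count how often each query word appears in the text."""
    words = item.text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def personal_score(item, query):
    """Keyword score adjusted by illustrative, made-up personal weights."""
    score = float(keyword_score(item, query))
    if item.created_by_me:
        score *= 3.0   # boost things I wrote myself
    if item.from_mailing_list:
        score *= 0.25  # demote bulk mail
    return score


def rank(items, query, scorer):
    """Return item names ordered from most to least relevant."""
    ordered = sorted(items, key=lambda i: scorer(i, query), reverse=True)
    return [i.name for i in ordered]


items = [
    Item("python_sample.py", "parser parser demo"),
    Item("my_parser.py", "parser for thesis data", created_by_me=True),
    Item("list_digest.eml", "parser parser parser parser", from_mailing_list=True),
]

print(rank(items, "parser", keyword_score))
print(rank(items, "parser", personal_score))
```

With pure keyword counting, the mailing list digest wins because it repeats the word most often. With the personal weights, my own file comes first. The hard part, of course, is finding signals and weights that actually work for real people.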

Another reason desktop search hasn’t taken over is that the problem has changed. Really, it’s not about desktop search. It’s about personal information search. I mean personal information in the sense of personal information management — the information items (files, emails, IM conversations, bookmarks, etc.) that we use in our day-to-day tasks. What has changed in the past few years is that more and more of this information is stored in web applications. This presents a challenge for desktop search applications. Google Desktop can search GMail, but that’s an exception — most desktop search applications are restricted to searching things on your computer. I need something that will search GMail, my Facebook inbox, my Flickr and Facebook photos, wiki pages, Backpack, etc.

But while web applications are the downfall of desktop search, they are also the reason why we need personal information search. With our important data being stored in so many different places, each with its own particular organization methods, we don’t really have another alternative.

What do you think? Are you using desktop search? What’s your preferred application, and what do you love and hate about it?


Design *for* our brains, not *like* our brains

November 29, 2007 under design, information management, the brain, hci

A few days ago, I came across an article called The Second Coming — A Manifesto by David Gelernter. Gelernter is famous for being a co-inventor of Lifestreams, which was a really cool PIM system based on time-ordered streams of documents.

In The Second Coming, written in 2000, Gelernter writes about a coming revolution in computing:

Computing will be transformed. It’s not just that our problems are big, they are big and obvious. It’s not just that the solutions are simple, they are simple and right under our noses. It’s not just that hardware is more advanced than software; the last big operating-systems breakthrough was the Macintosh, sixteen years ago, and today’s hottest item is Linux, which is a version of Unix, which was new in 1976. Users react to the hard truth that commercial software applications tend to be badly-designed, badly-made, incomprehensible and obsolete by blaming themselves.

He dedicates an entire section of the essay to the problems he sees with the current file-and-folder organizational model:

27. Modern computing is based on an analogy between computers and file cabinets that is fundamentally wrong and affects nearly every move we make. (We store “files” on disks, write “records,” organize files into “folders” — file-cabinet language.) Computers are fundamentally unlike file cabinets because they can take action.

[…]

30. If you have three pet dogs, give them names. If you have 10,000 head of cattle, don’t bother. Nowadays the idea of giving a name to every file on your computer is ridiculous.

31. Our standard policy on file names has far-reaching consequences: doesn’t merely force us to make up names where no name is called for; also imposes strong limits on our handling of an important class of documents — ones that arrive from the outside world. A newly-arrived email message (for example) can’t stand on its own as a separate document — can’t show up alongside other files in searches, sit by itself on the desktop, be opened or printed independently; it has no name, so it must be buried on arrival inside some existing file (the mail file) that does have a name.

I totally agree with the points he makes. These are things I’ve been complaining about for years, too.

Gelernter then goes on to describe (at a very high level) the organizational model that we should be using on computers:

36. File cabinets and human minds are information-storage systems. We could model computerized information-storage on the mind instead of the file cabinet if we wanted to.

37. Elements stored in a mind do not have names and are not organized into folders; are retrieved not by name or folder but by contents. (Hear a voice, think of a face: you’ve retrieved a memory that contains the voice as one component.) You can see everything in your memory from the standpoint of past, present and future. Using a file cabinet, you classify information when you put it in; minds classify information when it is taken out. (Yesterday afternoon at four you stood with Natasha on Fifth Avenue in the rain — as you might recall when you are thinking about “Fifth Avenue,” “rain,” “Natasha” or many other things. But you attached no such labels to the memory when you acquired it. The classification happened retrospectively.)

Our minds are extraordinarily complicated things. Should we really be building software that is modeled on that kind of complexity?

Modeling machines after nature is rarely the best approach. Our airplanes don’t have flapping wings, and cars and bicycles are not “running machines.” You can also think of spoken and written languages as “tools”, ones that have an intimate connection with our thought processes. If languages were modeled on the way the mind works, we would be speaking in sentence fragments, and constantly making up new words to easily refer to concepts and past events. Would anyone argue that languages could be made better by making them more flexible, more malleable, and a better match for our internal thought processes?

To me, modeling computers on our minds is just as much of a red herring as modeling them on file cabinets. Let’s build software for how our brains work, not like how our brains work. The best tools are the ones that support and complement our natural abilities. My brain doesn’t have an internal calendar or to-do list, but those turn out to be remarkably simple and effective constructs that support my goals of accomplishing certain tasks. They are effective because of how simple and straightforward they are, and because they allow my brain to focus on what it does best (which is not remembering absolute times or lists of items).

(Brain photo by Gaetan Lee on Flickr)


There are no little boxes: Everything is deeply intertwingled

October 23, 2007 under information management

The post yesterday on Information R/evolution reminded me of a concept that I ran across not too long ago. Ted Nelson, who coined the word hypertext (among other things), introduced the concept of intertwingularity in his 1974 book “Computer Lib/Dream Machines”:

EVERYTHING IS DEEPLY INTERTWINGLED. In an important sense there are no “subjects” at all; there is only all knowledge, since the cross-connections among the myriad topics of this world simply cannot be divided up neatly.

Hierarchical and sequential structures, especially popular since Gutenberg, are usually forced and artificial. Intertwingularity is not generally acknowledged — people keep pretending they can make things hierarchical, categorizable and sequential when they can’t.

In the comments on my post yesterday, Chris said:

I know we’re not stuck in categories anymore.

But, tags have never struck me as the ‘answer’. I know they’re doing good things. I know they allow interesting ways to view information. But, I’m not sure that tagging information is making it easier for me to get my hands on.

Maybe I just don’t grok it. Maybe because I can’t see the light, I don’t really put enough effort into tagging my own information properly.

Chris is right: tags aren’t “the answer.” Tags are just another way of dividing the world up into neat little boxes. But I think the main point of the video, and the thesis of Everything is Miscellaneous, is that there are no little boxes. Everything is deeply intertwingled.


On a side note: Ted Nelson is one interesting character. He strikes me as being a bit like Richard Stallman: a visionary, and someone who deserves respect for sticking to his ideals, but a bit of a nutbar. There’s an interesting article in the Wired archives about Nelson and his yet-to-be-realized Xanadu hypertext system.

I also noted from the Wikipedia article that Nelson coined the term teledildonics. I laughed out loud when I read the next sentence: “The main thrust of his work…”


Information Revolution

October 22, 2007 under information management

Anand pointed me to another cool video from Michael Wesch, an Assistant Professor of Cultural Anthropology at Kansas State University. It’s a short video called Information R/evolution that explores the consequences of the shift from paper-based information to digital information, with nods to Clay Shirky’s Ontology is Overrated and David Weinberger’s Everything is Miscellaneous. The style of the video is incredibly cool. Check it out:

If you didn’t see his last video — Web 2.0 … The Machine is Us/ing Us, which made the rounds in February — it’s definitely worth checking out as well.


Sometimes it’s okay to be sucky

August 20, 2007 under design, usability, information management, hci

I’ve written before about the paradox of choice — the concept that having more choice does not in fact lead us to be happier and more fulfilled. More choice leads us to worry and waste time making decisions that aren’t really that important in the grand scheme of things. This applies to software too, which is why I’ve always appreciated software that keeps the preferences panel nice and simple.

The paradox of choice isn’t a mystery. Really, it’s a simple trade-off: I get to put less effort into making the decision, which makes up for the fact that I might not get exactly what I want. Lately I’ve been thinking about how this same concept applies to personal information management.

We all know that keeping your digital documents organized is a lot of work, but you make the effort now so that you can find what you need later. At a certain point, you might wonder, “Couldn’t the computer automatically organize this stuff for me?” Desktop search is a step in this direction, but it only works if you can conjure up the right keywords. Sometimes, you don’t know what you’re looking for until you see it (or, in information foraging theory, until you get a whiff of its delicious scent).

What about automated classification? Your computer could recognize that a bunch of documents are similar, and group them together. The problem is that existing systems just aren’t very good at doing this, especially for personal information. Because of this, people have tended to steer clear of the idea of automated classification in PIM software.

But you know what? Maybe it’s okay that the automated classification sucks. If you can design it so that it degrades gracefully, so that its suckiness doesn’t get in your way too much, then maybe it would be a fair trade-off. Like your favourite dog that’s always getting into the garbage, you could just shake your head and chuckle at its stubbornness and naïveté. As with the paradox of choice, I’m willing to put up with something that’s not quite what I wanted, as long as it’s a net win for me overall.


Do you use spatial organization?

July 31, 2007 under information management, hci

Today Anand and I were talking about how people manage files on their (computer) desktop, and we were wondering — how many people actually use some sort of spatial organization? By “spatial organization”, I mean that the physical location of the icons on the desktop carries some sort of meaning for the user. Since he created BumpTop, there’s no doubt that Anand wants to treat his desktop like a physical workspace, putting similar documents close to each other, even creating piles. On the opposite side of the spectrum, here’s what my desktop looks like:

Screenshot of my desktop

One reason I don’t use spatial organization on my desktop is that I don’t like to see the clutter there all the time. Another reason is that Windows doesn’t give you the features to encourage this kind of approach. Maybe if I had BumpTop, my approach would be different.

There are pros and cons to using a real-world metaphor. It’s great to leverage a person’s existing skills, and metaphors can be a great way to help people learn and understand new concepts. As the Spanish philosopher José Ortega y Gasset said: “the metaphor is perhaps one of man’s most fruitful potentialities. Its efficacy verges on magic.” On the other hand, metaphors can be limiting: they discourage understanding the true nature of a new object or concept, and they prevent us from seeing better ways of doing things. In The Anti-Mac Interface, the authors mention an early tractor design that was steered using reins: “the tractor was steered by pulling on the appropriate rein, both reins were loosened to go forward and pulled back to stop, and pulling back harder on the reins caused the tractor to back up.”

I wanted to put the question out to everyone: do you use spatial organization on your computer? Why or why not?


Icons by Picasso

July 7, 2007 under design, information management, the brain, hci

I was cruising around over on the Mozilla Labs site, and found a cool proposal about how to make it easier to keep track of tabs in Firefox. Chromatabs associates a specific colour with each site, and then colours all the tabs from that site in the same colour, which lets you use your innate ability to recognize colour (for most of us, anyways) to easily distinguish between the tabs.
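I don’t know how Chromatabs actually picks its colours, but the basic idea of giving every site a stable colour is easy to sketch. The version below is my own guess at one way to do it: hash the hostname and use the first byte of the hash as a hue. The function name `site_colour` and the fixed lightness and saturation values are assumptions I picked for illustration.

```python
import colorsys
import hashlib
from urllib.parse import urlparse


def site_colour(url):
    """Map a URL's hostname to a stable hex colour string like '#a1b2c3'."""
    host = urlparse(url).hostname or url
    digest = hashlib.md5(host.encode("utf-8")).digest()
    hue = digest[0] / 256.0  # the first byte of the hash picks the hue
    # Fixed lightness and saturation (arbitrary choices) keep tabs readable.
    r, g, b = colorsys.hls_to_rgb(hue, 0.6, 0.7)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


# Two pages on the same site always get the same colour.
print(site_colour("http://digg.com/news"))
print(site_colour("http://digg.com/about"))
print(site_colour("http://slashdot.org/"))
```

Because only one byte of the hash is used, two different sites can occasionally land on the same hue. A real implementation would probably want to handle that.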

This reminded me of an idea I had a long time ago — let’s call it cubist thumbnails. In Picasso’s Violin and Grapes, he represents a violin using its characteristic forms, but without painting the entire object itself. We see the curves and immediately recognize it as a violin even though it’s only fragments of a violin.

Picasso's Violin and Grapes

In the same way, my idea is to capture the essence of a web page in a small thumbnail. Instead of a snapshot of the page itself, a cubist thumbnail would be an abstract icon containing the characteristic colours and forms of a web page. For example:

Digg Google Slashdot

Compare these to screenshots of the same pages:

Digg Google Slashdot

I think it’s much easier to identify the sites using the first set of icons.

For text, it would be possible to do a similar thing by identifying the most common words and phrases used in the text, with a focus on ones that are particular to that document, kind of like Amazon’s Statistically Improbable Phrases. Slashdot might be “nerds news matters microsoft linux cmdrtaco”.
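Here is a minimal sketch of how the text version might work, using a simple TF-IDF-style score: a word scores highly if it appears often in one document but rarely in the others. To be clear, this is not Amazon’s method (I don’t know how they compute their phrases), and the function name, the tiny sample corpus, and the smoothing are all my own assumptions.

```python
import math
from collections import Counter


def distinctive_words(doc, corpus, top_n=3):
    """Return words that are frequent in doc but rare across the corpus."""
    doc_counts = Counter(doc.lower().split())
    n_docs = len(corpus)
    scores = {}
    for word, count in doc_counts.items():
        # Number of documents in the corpus that contain this word.
        containing = sum(1 for other in corpus if word in other.lower().split())
        # Smoothed inverse document frequency: 0 when every document has the word.
        idf = math.log((1 + n_docs) / (1 + containing))
        scores[word] = count * idf
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_n]


slashdot = "news for nerds stuff that matters linux news nerds"
corpus = [
    slashdot,
    "news about sports and stuff",
    "news for cooks that matters",
]

print(distinctive_words(slashdot, corpus, top_n=2))  # ['nerds', 'linux']
```

Notice that “news” drops out entirely even though it is the most common word, because every document in the sample contains it.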


Personal Information Beyond the Desktop

May 18, 2007 under information management, hci

I’ve been reading “Beyond the Desktop Metaphor: Designing Integrated Digital Work Environments”. The title is pretty self-explanatory, I think. The book touches on some interesting projects in the PIM space, such as Lifestreams and Haystack. In the final chapter, they talk about the idea of the “personal information cloud”. More and more of the information that we interact with is on the network, and this doesn’t fit very well into the traditional desktop metaphor, where things have the illusion of having a “physical location” (in a specific folder, or on the desktop, etc.):

As users disperse and “destructure” their personal information, there is less need for the desktop/office metaphor to be the organizer of the information. We believe that the metaphor is being replaced by more abstract and sophisticated organizers, based on over a decade of experience by millions of people with information technology. Thus let us use the term Personal Information Cloud to refer to the “working set” of information that is relevant to the individual and his work.

This sort of concept is almost exactly what I’ve been thinking about for a while. When you have an idea, it’s nice to see that other people are thinking the same way. Especially when it’s really, really smart people. At the same time, it’s a little disappointing to realize that you’re not as original as you think. You know?

Anyways, they go on to list several requirements for this concept of a personal information cloud to be useful:

  1. Personal. It should contain most if not all information that is relevant to the individual and his activities.
  2. Persistent. It should be preserved.
  3. Pervasive. It should be always accessible from a variety of devices, programs, and services, i.e., it “follows the individual”.
  4. Secure. The information should be secure and private at an appropriate level. This is a significant issue when information is not held locally (although having information locally is not in itself assurance of privacy in a networked world).
  5. Referenceable. Each information object in the cloud should ideally have a unique ID (or permalink) and support a protocol for retrieval.
  6. Standardized. The information needs to be in standard formats so that it is usable by a variety of devices, programs, and services.
  7. Semantic. The cloud should be based on an extensible scheme of semantically rich metadata, so that it can be understood by a variety of programs and services in different contexts.

I’ve actually got all of these points scrawled down somewhere in my notebook. I think the ideas of persistence, pervasiveness, and referenceability are especially important, because that’s where most existing PIM solutions are lacking. Most PIM software is intended to be used on a single machine, which is obviously bad for pervasiveness and persistence. And web applications like GMail, although they are pervasive, can’t really be depended on to be persistent. Does Google guarantee that they will never delete my email? What happens if I want to move to a different provider? What if Mountain View sinks into the Pacific Ocean? And finally, referenceability is especially missing from most PIM software. Just like every web page has a URL, imagine if every piece of your personal information did too. As Hans Reiser would tell you, this greatly increases the expressive power of the information system.
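To show what I mean by referenceability, here is a toy sketch in which every item gets its own permanent identifier, so that one item can point at another the way web pages link to each other. The `PersonalInfoStore` class and the choice of `urn:uuid:` identifiers are assumptions of mine for illustration; the book doesn’t prescribe any particular scheme.

```python
import uuid


class PersonalInfoStore:
    """A toy in-memory store where every item gets its own permanent ID."""

    def __init__(self):
        self._items = {}

    def add(self, item):
        """Store an item and return a unique identifier for it."""
        item_id = uuid.uuid4().urn  # looks like "urn:uuid:..."; scheme is an assumption
        self._items[item_id] = item
        return item_id

    def resolve(self, item_id):
        """Look an item up by its identifier, like following a URL."""
        return self._items[item_id]


store = PersonalInfoStore()
email_id = store.add({"type": "email", "subject": "Thesis draft"})
# A note can now refer to the email by ID, regardless of where either one lives.
note_id = store.add({"type": "note", "body": "See the attached email", "links": [email_id]})

linked = store.resolve(store.resolve(note_id)["links"][0])
print(linked["subject"])  # Thesis draft
```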

See also: Aza Raskin’s Death of the Desktop presentation


Constipated Metaphors

May 15, 2007 under information management, hci

I just finished reading “The Psychology of Personal Information Management”, written by Mark Lansdale in 1988. I love reading old papers like this, because they are either comically inaccurate, like the 1950s Popular Science predictions (“in the year 2000, everyone will have a personal robot butler”), or else shockingly prescient (e.g. Vannevar Bush’s “As We May Think”, which basically predicted the web and digital cameras).

In the paper, Lansdale makes an interesting comment about the use of real-world metaphors in designing user interfaces. Many interfaces have been inspired by observing people’s behaviours in the real world: “Hmmm, people organize paper documents into folders, so they must want to do the same for digital documents!” Lansdale makes the point that we must try to understand the psychological reasons behind the action, because the action may be a coping strategy rather than an actual need. In other words, people may only use a strategy because of the limitations of the technology — just because something is done with paper documents doesn’t mean people want to do it with digital documents.

No one would suggest the introduction of unstructured ‘piles’ of documents in a computer environment. (I say this with the thought that somewhere someone probably has, much in the way that someone thought of building planes that flapped their wings.)

This is pretty funny in retrospect, because several people have actually done this. Not to judge those efforts; I just liked Lansdale’s metaphor, and generally agree with his point that real-world metaphors are not necessarily desirable.

Incidentally, I loved this quote from the Register’s article on piles:

Although haemorrhoids give millions of sufferers discomfiture every day, Apple’s “piles” are an intriguing concept which should ease the pain of using such a constipated file and folder UI metaphor.


Google is not enough

May 9, 2007 under information management, the brain, hci

I started at U of T this week, and I’m trying to narrow down my thesis topic, so I’m basically doing a swan dive into a pile of papers related to personal information management. Today I’ve read a couple of interesting ones about how we use keyword search, and why even the most perfect search engine probably won’t ever replace the use of browsing to find information.

In the first paper, Don’t Take My Folders Away!, the researchers surveyed a small group of people to see how they used folders to organize the files on their computer. When the users were asked why they created folders, the answer was generally “to get back my files”. But when they were asked if they would give up their folders and find their information exclusively using a search engine, the answer was a resounding “no”. The conclusion of the paper was that folders are not just a way to organize information — the folder structure itself actually contains information about a project. For example, a folder structure can indicate subprojects and subtasks of a project.

The second paper, “The Perfect Search Is Not Enough”, investigated how people performed searches both on the web and in their personal information. They found that people rarely searched for the specific item they were looking for; instead, they moved in small, local steps, using context to guide them. For example, rather than directly searching for a professor’s phone number (using a query like “david karger phone number”), they would search for the professor’s web page (maybe even by looking it up in a staff directory) and then try to find the information that way. The authors referred to this concept as “orienteering”:

Orienteering involves using both prior and contextual information to narrow in on the actual information need, often in a series of steps, without specifying the entire information need up front.

Together, I think these papers demonstrate that while search is a useful way to find information, most people still need to use some kind of browsing to find the information they are looking for. The orienteering paper suggested that this kind of approach lessens the cognitive burden on the user. I think it really makes sense if you think of a real-life metaphor: you know how to find your way to a certain pub, but you don’t know the address and couldn’t give precise instructions about how to get there. Humans are naturally good at this kind of incremental way-finding, and browsing is a good way to take advantage of that.

(Photo by AlbeJTD on Flickr)

