Social Computing

Vol. 3 No. 9 – November 2005

Social Computing

Articles

Information Extraction: Distilling Structured Data from Unstructured Text

In 2001 the U.S. Department of Labor was tasked with building a Web site that would help people find continuing education opportunities at community colleges, universities, and organizations across the country. The department wanted its Web site to support fielded Boolean searches over locations, dates, times, prerequisites, instructors, topic areas, and course descriptions. Ultimately it was also interested in mining its new database for patterns and educational trends. This was a major data-integration project, aiming to automatically gather detailed, structured information from tens of thousands of individual institutions every three months.

by Andrew McCallum

Social Bookmarking in the Enterprise

One of the greatest challenges facing people who use large information spaces is to remember and retrieve items that they have previously found and thought to be interesting. One approach to this problem is to allow individuals to save particular search strings to re-create the search in the future. Another approach has been to allow people to create personal collections of material—for example, the use of electronic citation bundles (called binders) in the ACM Digital Library. Collections of citations can be created manually by readers or through execution of (and alerting to) a saved search.

by David Millen, Jonathan Feinberg, Bernard Kerr

Threads without the Pain

Much of today’s software deals with multiple concurrent tasks. Web browsers support multiple concurrent HTTP connections, graphical user interfaces deal with multiple windows and input devices, and Web and DNS servers handle concurrent connections or transactions from large numbers of clients.

by Andreas Gustafsson

Fighting Spam with Reputation Systems

Spam is everywhere, clogging the inboxes of e-mail users worldwide. Not only is it an annoyance, it erodes the productivity gains afforded by the advent of information technology. Workers plowing through hours of legitimate e-mail every day also must contend with removing a significant amount of illegitimate e-mail. Automated spam filters have dramatically reduced the amount of spam seen by the end users who employ them, but the amount of training required rivals the amount of time needed simply to delete the spam without the assistance of a filter.

by Vipul Ved Prakash, Adam O'Donnell

Interviews

A Conversation with Ray Ozzie

There are not many names bigger than Ray Ozzie's in computer programming. An industry visionary and pioneer in computer-supported cooperative work, he began his career as an electrical engineer but fairly quickly got into computer science and programming. He is the creator of IBM's Lotus Notes and is now chief technical officer of Microsoft, reporting to chief software architect Bill Gates. Recently, Ozzie's role as chief technical officer expanded as he assumed responsibility for the company's software-based services strategy across its three major divisions.

Curmudgeon

Stop Whining about Outsourcing!

I’m sick of hearing all the whining about how outsourcing is going to migrate all IT jobs to the country with the lowest wages.

The paranoia inspired by this domino theory of job migration causes American and West European programmers to worry about India, Indian programmers to worry about China, Chinese programmers to worry about the Czech Republic, and so on. Domino theorists must think all IT jobs will go to the Republic of Elbonia, the extremely poor, fourth-world, Eastern European country featured in the Dilbert comic strip.

by David Patterson

Kode Vicious

Kode Vicious: The Doctor is In

Dear Kode Vicious, I've been reading your rants for a few months now and was hoping you could read one of mine. It's a pretty simple rant, actually: it's just that I'm tired of hearing about buffer overflows and dont understand why anyone in his or her right mind still uses strcpy(). Why does such an unsafe routine continue to exist at all? Why not just remove the thing from the library and force people to migrate their code? Another thing I wonder is, how did such an API come to exist in the first place?

by George Neville-Neil