January/February 2018 issue of acmqueue

The January/February issue of acmqueue is out now

Semi-structured Data

  Download PDF version of this article PDF

ITEM not available


Originally published in Queue vol. 3, no. 8
see this item in the ACM Digital Library



Andrew McCallum - Information Extraction
Distilling structured data from unstructured text

Natalya Noy - Order from Chaos
There is probably little argument that the past decade has brought the “big bang” in the amount of online information available for processing by humans and machines. Two of the trends that it spurred (among many others) are: first, there has been a move to more flexible and fluid (semi-structured) models than the traditional centralized relational databases that stored most of the electronic data before; second, today there is simply too much information available to be processed by humans, and we really need help from machines.

C. M. Sperberg-McQueen - XML
XML, as defined by the World Wide Web Consortium in 1998, is a method of marking up a document or character stream to identify structural or other units within the data. XML makes several contributions to solving the problem of semi-structured data, the term database theorists use to denote data that exhibits any of the following characteristics:

Adam Bosworth - Learning from the Web
In the past decade we have seen a revolution in computing that transcends anything seen to date in terms of scope and reach, but also in terms of how we think about what makes up “good” and “bad” computing. The Web taught us several unintuitive lessons:


(newest first)

Leave this field empty

Post a Comment:

© 2018 ACM, Inc. All Rights Reserved.