A Conversation with Matt Wells:
When it comes to competing in the search engine arena, IS bigger always better?
Search is a small but intensely competitive segment of the industry, dominated for the past few years by Google. But Google’s position as king of the hill is not insurmountable, says Gigablast’s Matt Wells, and he intends to take his product to the top.
Web Search Considered Harmful:
The top five reasons why search is still way too hard
Nowadays, when you find yourself utterly disgusted by “American Idol,” or any other of the latest “reality” shows on TV, you may decide, “What the heck, time to seek a slightly less horrible form of punishment: let’s get on the Web.”
Searching vs. Finding:
Why systems need knowledge to find what you really want
Finding information and organizing it so that it can be found are two key aspects of any company’s knowledge management strategy. Nearly everyone is familiar with the experience of searching with a Web search engine and using a search interface to search a particular Web site once you get there. (You may have even noticed that the latter often doesn’t work as well as the former.) After you have a list of hits, you typically spend a significant amount of time following links, waiting for pages to download, reading through a page to see if it has what you want, deciding that it doesn’t, backing up to try another link, deciding to try another way to phrase your request, et cetera.
Enterprise Search: Tough Stuff:
Why is it that searching an intranet is so much harder than searching the Web?
The last decade has witnessed the growth of information retrieval from a boutique discipline in information and library science to an everyday experience for billions of people around the world. This revolution has been driven in large measure by the Internet, with vendors focused on search and navigation of Web resources and Web content management. Simultaneously, enterprises have invested in networking all of their information together to the point where it is increasingly possible for employees to have a single window into the enterprise.
Why Writing Your Own Search Engine Is Hard:
Big or small, proprietary or open source, Web or intranet, it’s a tough job.
There must be 4,000 programmers typing away in their basements trying to build the next “world’s most scalable” search engine. It has been done only a few times. It has never been done by a big group; always one to four people did the core work, and the big team came on to build the elaborations and the production infrastructure. Why is it so hard? We are going to delve a bit into the various issues to consider when writing a search engine. This article is aimed at those individuals or small groups that are considering this endeavor for their Web site or intranet.
Building Nutch: Open Source Search:
A case study in writing an open source search engine
Search engines are as critical to Internet use as any other part of the network infrastructure, but they differ from other components in two important ways. First, their internal workings are secret, unlike, say, the workings of the DNS (domain name system). Second, they hold political and cultural power, as users increasingly rely on them to navigate online content.
From IR to Search, and Beyond:
Searching has come a long way since the 60s, but have we only just begun?
It’s been nearly 60 years since Vannevar Bush’s seminal article, ’As We May Think,’ portrayed the image of a scholar aided by a machine, “a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility.”
Search Considered Integral:
A combination of tagging, categorization, and navigation can help end-users leverage the power of enterprise search.
Most corporations must leverage their data for competitive advantage. The volume of data available to a knowledge worker has grown dramatically over the past few years, and, while a good amount lives in large databases, an important subset exists only as unstructured or semi-structured data. Without the right systems, this leads to a continuously deteriorating signal-to-noise ratio, creating an obstacle for busy users trying to locate information quickly. Three flavors of enterprise search solutions help improve knowledge discovery.
Discrimination in Online Ad Delivery:
Google ads, black names and white names, racial discrimination, and click advertising
Do online ads suggestive of arrest records appear more often with searches of black-sounding names than white-sounding names? What is a black-sounding name or white-sounding name, anyway? How many more times would an ad have to appear adversely affecting one racial group for it to be considered discrimination? Is online activity so ubiquitous that computer scientists have to think about societal consequences such as structural racism in technology design? If so, how is this technology to be built? Let’s take a scientific dive into online ad delivery to find answers.
10 Optimizations on Linear Search:
The operations side of the story
System administrators (DevOps engineers or SREs or whatever your title) must deal with the operational aspects of computation, not just the theoretical aspects. Operations is where the rubber hits the road. As a result, operations people see things from a different perspective and can realize opportunities outside of the basic O() analysis. Let’s look at the operational aspects of the problem of trying to improve something that is theoretically optimal already.