Thursday, August 28, 2008

The Internet's Biggest Security Hole

http://blog.wired.com/27bstroke6/2008/08/revealed-the-in.html

Revealed: The Internet's Biggest Security Hole
By Kim Zetter | August 26, 2008 | 8:00:00 PM | Categories: DefCon, Glitches and Bugs, Hacks and Cracks

Two security researchers have demonstrated a new technique to stealthily intercept internet traffic on a scale previously presumed to be unavailable to anyone outside of intelligence agencies like the National Security Agency.
The tactic exploits the internet routing protocol BGP (Border Gateway Protocol) to let an attacker surreptitiously monitor unencrypted internet traffic anywhere in the world, and even modify it before it reaches its destination.
The demonstration is only the latest attack to highlight fundamental security weaknesses in some of the internet's core protocols. Those protocols were largely developed in the 1970s with the assumption that every node on the then-nascent network would be trustworthy. The world was reminded of the quaintness of that assumption in July, when researcher Dan Kaminsky disclosed a serious vulnerability in the DNS system. Experts say the new demonstration targets a potentially larger weakness.
"It's a huge issue. It's at least as big an issue as the DNS issue, if not bigger," said Peiter "Mudge" Zatko, noted computer security expert and former member of the L0pht hacking group, who testified to Congress in 1998 that he could bring down the internet in 30 minutes using a similar BGP attack, and disclosed privately to government agents how BGP could also be exploited to eavesdrop. "I went around screaming my head about this about ten or twelve years ago.... We described this to intelligence agencies and to the National Security Council, in detail."
The man-in-the-middle attack exploits BGP to fool routers into re-directing data to an eavesdropper's network.
Anyone with a BGP router (ISPs, large corporations or anyone with space at a carrier hotel) could intercept data headed to a target IP address or group of addresses. The attack intercepts only traffic headed to target addresses, not from them, and it can't always vacuum in traffic within a network -- say, from one AT&T customer to another.
The method conceivably could be used for corporate espionage, nation-state spying or even by intelligence agencies looking to mine internet data without needing the cooperation of ISPs.
BGP eavesdropping has long been a theoretical weakness, but no one is known to have publicly demonstrated it until Anton "Tony" Kapela, data center and network director at 5Nines Data, and Alex Pilosov, CEO of Pilosoft, showed their technique at the recent DefCon hacker conference. The pair successfully intercepted traffic bound for the conference network and redirected it to a system they controlled in New York before routing it back to DefCon in Las Vegas.
The technique, devised by Pilosov, doesn't exploit a bug or flaw in BGP. It simply exploits the natural way BGP works.
"We're not doing anything out of the ordinary," Kapela told Wired.com. "There's no vulnerabilities, no protocol errors, there are no software problems. The problem arises (from) the level of interconnectivity that's needed to maintain this mess, to keep it all working."
The issue exists because BGP's architecture is based on trust. To make it easy, say, for e-mail from Sprint customers in California to reach Telefonica customers in Spain, networks for these companies and others communicate through BGP routers to indicate when they're the quickest, most efficient route for the data to reach its destination. But BGP assumes that when a router says it's the best path, it's telling the truth. That gullibility makes it easy for eavesdroppers to fool routers into sending them traffic.
Here's how it works. When a user types a website name into his browser or clicks "send" to launch an e-mail, a Domain Name System server produces an IP address for the destination. A router belonging to the user's ISP then consults a BGP table for the best route. That table is built from announcements, or "advertisements," issued by ISPs and other networks -- also known as Autonomous Systems, or ASes -- declaring the range of IP addresses, or IP prefixes, to which they'll deliver traffic.
The router searches the routing table for the destination IP address among those prefixes. If two ASes deliver to the address, the one with the more specific prefix "wins" the traffic. For example, one AS may advertise that it delivers to a group of 90,000 IP addresses, while another delivers to a subset of 24,000 of those addresses. If the destination IP address falls within both announcements, BGP will send data to the narrower, more specific one.
To intercept data, an eavesdropper would advertise a range of IP addresses he wished to target that was narrower than the chunk advertised by other networks. The advertisement would take just minutes to propagate worldwide, before data headed to those addresses would begin arriving to his network.
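As a rough illustration (not from the article; the prefixes and AS numbers below are invented), the longest-prefix rule can be sketched in a few lines of Python: among all table entries that cover the destination, the most specific one wins, which is exactly what a hijacker's narrower advertisement exploits.

```python
# Sketch of BGP longest-prefix matching with illustrative prefixes and ASNs.
import ipaddress

# Hypothetical routing table: (advertised prefix, advertising AS)
bgp_table = [
    (ipaddress.ip_network("203.0.113.0/24"), "AS64500 (legitimate origin)"),
    (ipaddress.ip_network("203.0.113.0/25"), "AS64666 (eavesdropper's more-specific route)"),
]

def best_route(destination, table):
    """Return the matching entry with the longest (most specific) prefix."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, origin) for net, origin in table if dest in net]
    return max(matches, key=lambda m: m[0].prefixlen) if matches else None

# Traffic for 203.0.113.42 now flows to the /25 announced by the eavesdropper.
print(best_route("203.0.113.42", bgp_table))
```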
The attack is called an IP hijack and, on its face, isn't new.
But in the past, known IP hijacks have created outages, which, because they were so obvious, were quickly noticed and fixed. That's what occurred earlier this year when Pakistan Telecom inadvertently hijacked YouTube traffic from around the world. The traffic hit a dead-end in Pakistan, so it was apparent to everyone trying to visit YouTube that something was amiss.
Pilosov's innovation is to forward the intercepted data silently to the actual destination, so that no outage occurs.
Ordinarily, this shouldn't work -- the data would boomerang back to the eavesdropper. But Pilosov and Kapela use a method called AS path prepending that causes a select number of BGP routers to reject their deceptive advertisement. They then use these ASes to forward the stolen data to its rightful recipients.
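The prepending trick relies on ordinary BGP loop prevention: a router drops any advertisement whose AS path already contains its own AS number. A much-simplified sketch of that rule (the AS numbers are again invented):

```python
# Simplified sketch of the loop-prevention rule that AS-path prepending exploits.
ATTACKER_AS = 64666
RETURN_PATH_ASES = (64496, 64497)   # ASes the attacker reserves for forwarding traffic back

def accepts(router_as, as_path):
    """A BGP speaker rejects a route whose AS path already lists its own ASN."""
    return router_as not in as_path

# The hijack announcement is padded with the return-path AS numbers...
hijack_path = [ATTACKER_AS, *RETURN_PATH_ASES]

# ...so those ASes ignore it, keep their legitimate routes, and can carry the
# intercepted traffic onward to the real destination; everyone else accepts it.
for asn in (*RETURN_PATH_ASES, 64510):
    print(asn, "accepts hijack route:", accepts(asn, hijack_path))
```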
"Everyone ... has assumed until now that you have to break something for a hijack to be useful," Kapela said. "But what we showed here is that you don't have to break anything. And if nothing breaks, who notices?"
Stephen Kent, chief scientist for information security at BBN Technologies, who has been working on solutions to fix the issue, said he demonstrated a similar BGP interception privately for the Departments of Defense and Homeland Security a few years ago.
Kapela said network engineers might notice an interception if they knew how to read BGP routing tables, but it would take expertise to interpret the data.
A handful of academic groups collect BGP routing information from cooperating ASes to monitor BGP updates that change traffic's path. But without context, it can be difficult to distinguish a legitimate change from a malicious hijacking. There are reasons traffic that ordinarily travels one path could suddenly switch to another -- say, if companies with separate ASes merged, or if a natural disaster put one network out of commission and another AS adopted its traffic. On good days, routing paths can remain fairly static. But "when the internet has a bad hair day," Kent said, "the rate of (BGP path) updates goes up by a factor of 200 to 400."
Kapela said eavesdropping could be thwarted if ISPs aggressively filtered to allow only authorized peers to draw traffic from their routers, and only for specific IP prefixes. But filtering is labor intensive, and if just one ISP declines to participate, it "breaks it for the rest of us," he said.
"Providers can prevent our attack absolutely 100 percent," Kapela said. "They simply don't because it takes work, and to do sufficient filtering to prevent these kinds of attacks on a global scale is cost prohibitive."
Filtering also requires ISPs to disclose the address space for all their customers, which is not information they want to hand competitors.
Filtering isn't the only solution, though. Kent and others are devising processes to authenticate ownership of IP blocks, and validate the advertisements that ASes send to routers so they don't just send traffic to whoever requests it.
Under the scheme, the five regional internet address registries would issue signed certificates to ISPs attesting to their address space and AS numbers. The ASes would then sign an authorization to initiate routes for their address space, which would be stored with the certificates in a repository accessible to all ISPs. If an AS advertised a new route for an IP prefix, it would be easy to verify if it had the right to do so.
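A minimal sketch of that origin-validation idea (my simplification, using a plain dictionary where the real proposal uses signed certificates held by the regional registries):

```python
# Toy origin validation: prefix -> set of ASes authorized to originate it.
# Real systems would verify signed attestations rather than trust a dictionary.
import ipaddress

authorizations = {
    ipaddress.ip_network("203.0.113.0/24"): {64500},
}

def origin_valid(prefix, origin_as):
    prefix = ipaddress.ip_network(prefix)
    for authorized_prefix, allowed_ases in authorizations.items():
        if prefix.subnet_of(authorized_prefix):
            return origin_as in allowed_ases
    return False

print(origin_valid("203.0.113.0/24", 64500))   # True: the registered holder announces its block
print(origin_valid("203.0.113.0/25", 64666))   # False: a more-specific hijack from an unauthorized AS
```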
The solution would authenticate only the first hop in a route to prevent unintentional hijacks, like Pakistan Telecom's, but wouldn't stop an eavesdropper from hijacking the second or third hop.
For this, Kent and BBN colleagues developed Secure BGP (SBGP), which would require BGP routers to digitally sign with a private key any prefix advertisement they propagated. An ISP would give peer routers certificates authorizing them to route its traffic; each peer on a route would sign a route advertisement and forward it to the next authorized hop.
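Conceptually (and leaving the actual cryptography aside), verifying such a signed path amounts to checking that every AS in the chain was explicitly authorized by the one before it. A rough sketch of that chain check, with signature verification stubbed out and all numbers invented:

```python
# Conceptual sketch of path validation: each hop attests which AS may propagate
# the advertisement next. verify_signature() stands in for real public-key checks.
from dataclasses import dataclass

@dataclass
class Attestation:
    signer_as: int   # AS that signed this hop
    prefix: str      # prefix being advertised
    next_as: int     # AS authorized to propagate the advertisement onward

def verify_signature(att: Attestation) -> bool:
    return True      # placeholder for cryptographic verification

def path_valid(prefix, attestations, receiving_as):
    """Every hop must name the next AS in the chain (or the receiving AS at the end)."""
    for i, att in enumerate(attestations):
        if att.prefix != prefix or not verify_signature(att):
            return False
        expected_next = attestations[i + 1].signer_as if i + 1 < len(attestations) else receiving_as
        if att.next_as != expected_next:
            return False   # an AS inserted itself without authorization
    return True

chain = [Attestation(64500, "203.0.113.0/24", 64496),
         Attestation(64496, "203.0.113.0/24", 64497)]
print(path_valid("203.0.113.0/24", chain, receiving_as=64497))  # True
print(path_valid("203.0.113.0/24", chain, receiving_as=64666))  # False: 64666 was never authorized
```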
"That means that nobody could put themselves into the chain, into the path, unless they had been authorized to do so by the preceding AS router in the path," Kent said.
The drawback to this solution is that current routers lack the memory and processing power to generate and validate signatures. And router vendors have resisted upgrading them because their clients, ISPs, haven't demanded it, due to the cost and man hours involved in swapping out routers.
Douglas Maughan, cybersecurity research program manager for the DHS's Science and Technology Directorate, has helped fund research at BBN and elsewhere to resolve the BGP issue. But he's had little luck convincing ISPs and router vendors to take steps to secure BGP.
"We haven't seen the attacks, and so a lot of times people don't start working on things and trying to fix them until they get attacked," Maughan said. "(But) the YouTube (case) is the perfect example of an attack where somebody could have done much worse than what they did."
ISPs, he said, have been holding their breath, "hoping that people don’t discover (this) and exploit it."
"The only thing that can force them (to fix BGP) is if their customers ... start to demand security solutions," Maughan said.

Friday, August 22, 2008

HCI Forum topic DISS 720

http://www.cc.gatech.edu/fce/ecl/projects/dejaVu/mm/index.html

Memory Mirror

There are particular household items that people use for one specific task (e.g., taking a pill, feeding the cat) that is usually simple and quick to do. However, these tasks become difficult to remember when they are repeated often but not in a strict routine, so confusion arises between the repeated episodes: did we do this already today, or was that yesterday or the day before, or do we still need to do it today? A similar confusion arises among multiple caretakers: is it my turn to do this today, is it yours, or has it already been taken care of?

The memory mirror reflects a period of time (e.g., the past 24 hours). As we use an item, it is visually posted to the mirror (as shown in figure 1 of the project page) and recorded in a history log. If we have already used an item, an episode mirror reflects details of the previous usages. The memory mirror also warns of possibly lost items that have yet to be returned.

The memory mirror system uses RFID (radio frequency identification) technology, which is available today but still expensive. Each household item (e.g., medicine bottles, food containers) has an RFID tag on the bottom, and the designated storage area (e.g., medicine cabinet, key tray) has an RFID reader on top. Each item is photographed and entered into the system's inventory. With this setup, the memory mirror tracks the removal and return of each individually tagged object to and from the storage area.
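The bookkeeping behind such a system is simple to imagine. A minimal sketch (my own assumption of the design, not the project's code): each RFID read toggles an item between "in storage" and "taken out," a time-stamped log answers "did we already do this today?", and anything still out is flagged as possibly lost.

```python
# Toy model of the memory mirror's tracking logic (assumed design, invented item names).
from datetime import datetime, timedelta

usage_log = []      # (timestamp, tag_id) recorded each time an item is taken out
items_out = set()   # tags currently away from the storage area

def rfid_event(tag_id, now=None):
    """Called whenever the reader sees a tagged item leave or return."""
    now = now or datetime.now()
    if tag_id in items_out:
        items_out.discard(tag_id)            # item returned
    else:
        items_out.add(tag_id)                # item removed
        usage_log.append((now, tag_id))

def uses_in_last(tag_id, hours=24):
    """How many times an item was taken out within the mirror's reflection window."""
    cutoff = datetime.now() - timedelta(hours=hours)
    return sum(1 for ts, tag in usage_log if tag == tag_id and ts >= cutoff)

rfid_event("pill-bottle")   # medication taken out
rfid_event("pill-bottle")   # bottle returned to the cabinet
print(uses_in_last("pill-bottle"), "use(s) today; still missing:", items_out)
```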

HCI Forum topic DISS 720

http://www.surl.org/usabilitynews/101/pdf/Usability%20News%20101%20-%20Shrestha.pdf


This study examines eye movement patterns of users browsing or searching a 1-column and 2-column news article on a web page. The results show a higher number of fixations for information in the second column of an article than for the same information in the lower portion of a single column. In addition, the typical "F" pattern appeared in the left column of the 2-column layout, but not in the right column. Users also fixated more on other page elements, such as ads, when they were browsing than when they were searching.

HCI Forum topic DISS 720

Here is an interesting website on agents.

http://agents.umbc.edu/agentnews/1997/08/

TechWire has an article Virtual Humans To Populate The Internet which describes Matsushita's recent announcement of 3-D computer graphics software for creating animated virtual humans for use over the Internet. Matsushita will demonstrate the technology at SigGraph '97 and will make a free beta version of a VRML 2.0 browser and contents available for downloading later in the month. Matsushita has submitted the technology to the VRML Consortium for consideration as an industry standard for 3-D animation.

Saturday, August 09, 2008

HCI Forum Topic DISS 720

from www.pixelcharmer.com

Cognitive Models for Web Design
Information Foraging Theory Applied

May 2002

Information foraging theory seeks to explain information-seeking behavior in humans. Central to its thesis is that information foraging is an exaptation of food-foraging mechanisms; therefore, models of optimal foraging theory developed by anthropologists and ecologists in the study of food foraging will help us understand foraging behavior in consumers of information. These models allow us to investigate foraging behavior in relation to particular environmental conditions and the constraints of foraging for information in a dynamic ecology.

Information foraging theory gives those researching user interaction with Web sites a way to examine user goals, their decision making processes and adaptations to the information access system environment. Researchers can then make use of this knowledge in assessing system and interface design. Most importantly to those charged with developing a web site, information foraging theory can then inform design. I will demonstrate and give examples of ways web developers can use information foraging theory to cultivate more attractive paths to richer patches of information on a web site by knowing their visitors' information diets, allowing users to take advantage of the paths created by others, and providing representations of content with a strong information scent.
New Set of Tools

Users assess the appropriateness of following a particular path on the Web by considering a representation, usually a textual description or graphic, of the distal content. Furnas (1997) explained that a representational object held a “residue” of what lay behind it. Residue was recast and refined by Pirolli (1997) as information “scent” and defined in Card et al. (2001) as a user’s “(imperfect) perception of the value, cost, or access path of information sources obtained from proximal cues, such as WWW links.” In the initial work by Pirolli and Card (1995) on information foraging, they defined the profitability of an information source “as the value of information gained per unit cost of processing the source.” Cost is defined in terms of time spent, resources utilized, and opportunities lost when pursuing one particular strategy instead of others (Russell, 1993).

In order to invent a new set of tools for informing design of information systems, Pirolli and his colleagues went on to develop a computational cognitive model of information foraging based on ACT-R. The originator of ACT-R, John Anderson, used a network model of knowledge to develop his architecture. It solves the network model problem of defining associations among nodes by representing knowledge in a proposition. Therefore, the ideas in a proposition reveal their relationships to each other by their placement following linguistic rules within the proposition. When one node is activated in the network model, then a related node is activated as well and so on, spreading activation among related nodes. As with other network models, where to stop with this spreading, or “degree of fan” is problematic. However, ACT-R is very useful for modeling user interaction in a task environment. (Reisberg, 2001 pp.253-262)
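As a toy illustration of spreading activation (the network, weights, and threshold here are invented, not taken from ACT-R), activation flows from a source node to its associates, attenuating at each step, and a cutoff decides where the spread stops, which is exactly the degree-of-fan question:

```python
# Toy spreading-activation sketch over an invented associative network.
network = {
    "medical information": {"symptoms": 0.8, "treatments": 0.7},
    "symptoms":            {"fever": 0.6, "rash": 0.5},
    "treatments":          {"antibiotics": 0.6},
}

def spread(source, decay=0.5, threshold=0.1):
    """Propagate activation outward, attenuating by link weight and decay."""
    activation = {source: 1.0}
    frontier = [source]
    while frontier:
        node = frontier.pop()
        for neighbor, weight in network.get(node, {}).items():
            level = activation[node] * weight * decay
            if level > threshold and level > activation.get(neighbor, 0.0):
                activation[neighbor] = level
                frontier.append(neighbor)   # keep spreading from newly activated nodes
    return activation

print(spread("medical information"))
# Strongly associated nodes (symptoms, treatments) light up; weakly linked ones fall below threshold.
```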

Pirolli also discussed an overall framework for studying human-computer interaction from an ecological and cognitive perspective by reiterating the levels of analysis for understanding an information processing system. Unlike Marr (1982), he breaks up the first level, “what the device does and why,” so that his levels number four in total: first the adaptation level, then the knowledge level, followed by the cognitive level, and finally the biological level, the implementation level as termed by Marr (Pirolli, 1997).

Within this structure, Pirolli developed the “adaptive control of thought in information foraging (ACT-IF)” to model optimal foraging in a large collection of texts. In particular, using the Scatter/Gather browser interface developed at Xerox PARC, they were able to model users following information scent. Spreading activation could be measured starting from a task query to the relevant information. The Scatter/Gather browser would communicate the contents of the text collection by clustering the topics of each into discrete related groups represented by snippets of text. Following the rules set forth in ACT-IF, one or more clusters were selected to be scattered (reclustered) in the Scatter/Gather browser into topically related groups until the task was complete. When ACT-IF could make accurate judgments about distal information, thereby activating the nodes from information goal to that piece of distal information that completed the task, the proximal representation was considered to have strong information scent. (Pirolli, 1997)

ACT-IF allows (simulated) users with different constraints to be tested interacting with variations on a design. Using ACT-IF can allow a greater number of design variations to be tested under more conditions than in traditional user testing. Comparing the results from actual user tests with similar tasks performed by ACT-IF can test its accuracy. In fact, some comparisons were made, but they are few due to the laborious and time-consuming nature of hand-coding each of the results from videotaped user tests (Pirolli, 1997).

Assuming that real users will strive for the optimal foraging behavior seems at odds with the frequently observed problem-solving strategy known as “satisficing.” In fact, the process of making decisions based on aspiration level seems to provide a better description of activity observed in real-world user testing (Krug, 2000, p.24). However, Pirolli briefly points out that “satisficing can often be characterized as localized optimization (e.g., hill climbing) with resource bounds and imperfect information as included constraints.” (Pirolli, 1999 p.645) In addition, David Ward et al. examined the role of satisficing in food foraging theory and found it wasn't at odds with optimal foraging theory (Ward, 1992).

ACT-IF is a very useful tool for examining possible designs for a large web site composed of many individual texts. However, its efficacy with collections of images and non-text representations of distal information has not been considered.

Another tool that has implications for design allows us to analyze user paths from information in web server logs. Although there are many pieces of software for computing statistics from web server logs, none of them allows us to extrapolate user goals. Pirolli and his colleagues demonstrate a way to take surfing patterns and infer the associated information need of a given user. Users are then clustered together when similar needs are identified. Developers can then construct user types, or “user profiles” for a particular site. (Chi, 2001, Heer, 2000)

Inferring User Need by Information Scent (IUNIS) was the algorithm that allowed the development of a tool for building user profiles from surfing patterns. IUNIS identifies the documents that a user accessed during a browsing session and the order in which they were accessed. Applying the longest repeating subsequence (LRS) assists in extracting paths that are repeated by multiple users, and therefore more likely to be relevant to our task. Each of these repeated paths helps us to describe a user profile. A vector is computed for each page, and the distances between these vectors are calculated. Four modalities are then identified for each web page accessed in the path so we may cluster them:

1. each unique word in a page (however it is weighted as less significant if the word is found frequently in other pages on the site)
2. the directory location of that page as represented by forward slashes in its URL (page is given more weight if fewer other documents share the directory)
3. how many links from other pages on our site point to that page (weighted so that a link from a particular page is less significant if the same page points to several others)
4. all the links that go out from our page whether they only link to other pages on our site or not.

Once these modalities and vectors are identified for all of the pages within statistically significant paths, we have what is known as the CUT (Content, Usage, and Topology) data of a site. Before completing our calculation, we weight the final pages in a path, or in other words the pages most recently accessed, so as not to give too much importance to gateway pages or splash pages that everyone must click through. We now have a representation of our site as multi-modal vector paths. We can cluster our pages, and unlike prevalent web log analysis software that only analyzes one mode (number of hits to a page, number of links to a page, etc.), our multimodal representations make it possible for us to construct user profiles from our calculations. (Chi, 2001)
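To make the vector idea concrete, here is a small sketch (my own, not the published IUNIS code): each page gets a sparse feature vector combining the modalities above, and a similarity measure between vectors is what drives the clustering. Pages and features are invented.

```python
# Toy multi-modal page vectors and cosine similarity (illustrative data only).
from math import sqrt

pages = {
    "/products/cameras": {"word:camera": 2.0, "dir:/products": 1.5, "inlink:/index": 1.0},
    "/products/lenses":  {"word:lens": 2.0,   "dir:/products": 1.5, "inlink:/index": 1.0},
    "/about/history":    {"word:founded": 2.0, "dir:/about": 1.5},
}

def cosine(a, b):
    """Similarity between two sparse feature vectors."""
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm = lambda v: sqrt(sum(x * x for x in v.values()))
    return dot / (norm(a) * norm(b)) if a and b else 0.0

print(cosine(pages["/products/cameras"], pages["/products/lenses"]))   # higher: shared directory and inlinks
print(cosine(pages["/products/cameras"], pages["/about/history"]))     # 0.0: no shared features
```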

Web site developers design sites for specific user groups as identified by extensive marketing research and if feasible ethnographic study including contextual inquiry. Following the practice of these user-centered design methods, developers then construct hypothetical, archetypal users to build user profiles and inform design. However, by generating user profiles from analyzing the Web server logs of an existing site, future iterations of the site can better meet its users' needs without requiring its developers to conduct actual marketing surveys and contextual inquiries.

A third tool that can be investigated in regard to information foraging theory is collaborative filtering. Collaborative filtering allows users to forage for information in groups, much like a group of humans banding together to hunt for food when objects included in their diet are distributed widely and thinly in their environment. By ascribing a history of use to a digital object, a single user can benefit from the foraging of others. The interaction history of other foragers constitutes what Wexelblat (1999) describes as “footprints,” which “...allow users to leave traces in the virtual environment...” The interaction history of others, attached to an object, can come from passive sources, such as access logs, or active sources, such as online papers that allow users to leave commentary. There are several shopping sites currently using both of these methods. For example, Amazon.com extracts user information from its logs so a single item’s description can also include items that other users viewed or bought when viewing or buying the current item. Amazon.com also uses active interaction history by soliciting user opinions of each product and then attaching that interaction to the appropriate item. Potential consumers can then view and benefit from another user’s experience when making their purchasing decision.
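A minimal sketch of the passive-footprints variant (invented session data, and a deliberately naive co-occurrence count rather than any production recommender): items that appear together in other users' sessions are surfaced as "users who viewed this also viewed..."

```python
# Toy "also viewed" recommender built from co-occurrence in session logs.
from collections import Counter
from itertools import combinations

sessions = [
    {"field-guide", "binoculars", "trail-map"},
    {"field-guide", "binoculars"},
    {"trail-map", "compass"},
]

co_viewed = Counter()
for items in sessions:
    for a, b in combinations(sorted(items), 2):
        co_viewed[(a, b)] += 1
        co_viewed[(b, a)] += 1

def also_viewed(item, n=3):
    """Items most often seen in the same session as `item`."""
    ranked = Counter({b: count for (a, b), count in co_viewed.items() if a == item})
    return ranked.most_common(n)

print(also_viewed("field-guide"))   # [('binoculars', 2), ('trail-map', 1)]
```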

As Card and Pirolli hoped, information foraging theory has already provided Web developers with new tools. Among these tools are ways to spread activation by using labels with strong information scent so that paths are more attractive and lead to richer patches of information. By efficiently constructing user profiles developers can know their users' information diet and increase the profitability of items in their diets by decreasing the amount of energy expended when foraging for desirable items. Allowing users to take advantage of the paths created by others through collaborative filtering, as Wexelblat (1999) demonstrated, leads to greater user satisfaction and greatly reduces the cost associated with foraging.

Card, Stuart K., Peter Pirolli, Mija Van Der Wege, Julie B. Morrison, Robert W. Reeder, Pamela K. Schraedley, Jenea Boshart (2001). Information scent as a driver of Web behavior graphs. Proceedings of the Conference on Human factors in computing systems CHI '01 Association for Computing Machinery.

Chi, Ed H., Peter Pirolli, Kim Chen, James Pitkow (2001). Using Information Scent to Model User Information Needs and Actions on the Web. In Proc. of ACM CHI 2001 Conference on Human Factors in Computing Systems, pp. 490-497. ACM Press, April 2001. Seattle, WA.

Furnas, G. W. (1997). Effective view navigation. In Proceedings of the Human Factors in Computing Systems, CHI '97 (pp. 367-374). Atlanta, GA: Association for Computing Machinery.

Heer, Jeffrey, Ed H. Chi (2000). Identification of Web User Traffic Composition using Multi-Modal Clustering and Information Scent. In Proc. of the Workshop on Web Mining, SIAM Conference on Data Mining, April 2001, Chicago, IL. pp. 51-58.

Krug, Steve (2000). Don't make me think: a common sense approach to web usability. Macmillan USA.

Marr, D. (1982). Vision. San Francisco: W.H. Freeman.

Pirolli, Peter, Stuart Card (1995). Information Foraging in Information Access Environments. In Proceedings of the Human Factors in Computing Systems, CHI '95. Association for Computing Machinery.

Pirolli, P. (1997). Computational models of information scent-following in a very large browsable text collection. In Proceedings of the Conference on Human Factors in Computing Systems, CHI '97 (pp. 3-10). Atlanta, GA: Association for Computing Machinery.

Pirolli, Peter, Stuart Card (1999). Information Foraging. Psychological Review, Vol. 106, No. 4 (pp. 643-675).

Reisberg, Daniel (2001). Cognition: exploring the science of the mind. 2nd ed. W.W. Norton & Company, Inc.

Russell, Daniel M., Mark J. Stefik, Peter Pirolli, Stuart K. Card (1993). The cost structure of sensemaking. Proceedings of the Conference on Human Factors in Computing Systems. Association for Computing Machinery.

Ward, David, Jacob Blaustein (1992) The role of satisficing in foraging theory. Oikos 63:2 (pp. 312-317).

Wexelblat, Alan, Pattie Maes (1999). Footprints: History-Rich Tools for Information Foraging. In Proceedings of the Human Factors in Computing Systems, CHI '99. Association for Computing Machinery.

Thursday, August 07, 2008

HCI Forum Topic DISS 720

http://portal.acm.org/citation.cfm?doid=1378773.1378804

As adaptive agents become more complex and take increasing autonomy in their user's lives, it becomes more important for users to trust and understand these agents. Little work has been done, however, to study what factors influence the level of trust users are willing to place in these agents. Without trust in the actions and results produced by these agents, their use and adoption as trusted assistants and partners will be severely limited. We present the results of a study among test users of CALO, one such complex adaptive agent system, to investigate themes surrounding trust and understandability. We identify and discuss eight major themes that significantly impact user trust in complex systems. We further provide guidelines for the design of trustable adaptive agents. Based on our analysis of these results, we conclude that the availability of explanation capabilities in these agents can address the majority of trust concerns identified by users.

HCI Forum topic DISS 720

http://portal.acm.org/citation.cfm?doid=1357054.1357113

This paper describes the results of a study conducted to answer two questions: (1) Do children generalize their understanding of distinctions between conventional and moral violations in human-human interactions to human-agent interactions? and (2) Does the agent's ability to make claims to its own moral standing influence children's judgments? A two condition, between- and within-subjects study was conducted in which 60 eight and nine year-old children interacted with a personified agent and observed a researcher interacting with the same agent. A semi-structured interview was conducted to investigate the children's judgments and reasoning about the observed interactions as well as hypothetical human-human interactions. Results suggest that children do distinguish between conventional and moral violations in human-agent interactions and that the ability of the agent to express harm and make claims to its own rights significantly increases children's likelihood of identifying an act against the agent as a moral violation.