ICSI Research Review
Friday, October 11, 2013
1:45 - 5:00pm
Featured talks by ICSI research staff highlighting some of our latest results and new directions in computer science research. Talks will be given in the 6th floor lecture hall. Demos will be available in the 6th floor lobby and the demo room.
Agenda:
1:45 - 5:00 Demos
1:45 Introduction and Research Review
Prof. Nelson Morgan
Acting ICSI Director
2:00 "Research in the AI Group"
Prof. Jerome Feldman
Senior AI Researcher
2:40 "Blurring the Line: Human vs. Machine Performance when Annotating Consumer-Produced Videos for Geo-Location"
Dr. Gerald Friedland
Audio and Multimedia Research Director
3:20 Break
3:35 "Speech Research at ICSI"
Dr. Steve Wegmann
Speech Research Director
4:15 "Engaging With the Internet: Security & Performance"
Prof. Vern Paxson
Networking and Security Research Director
Abstracts:
2:00 "Research in the AI Group"
Prof. Jerome Feldman
Senior AI Researcher
The AI Group was one of the four initial ICSI projects, combining UC Berkeley cognitive linguistics research (Charles Fillmore, George Lakoff, Paul Kay) with biologically motivated computational techniques to build deep semantic models of human cognition and communication. Current efforts range from applied multi-lingual corpus extraction tasks to conceptual studies on the nature of meaning.
The two largest projects, FrameNet and MetaNet, have both made significant progress this year, and some of that progress will be reviewed, along with highlights from smaller projects.
2:40 "Blurring the Line: Human vs. Machine Performance when Annotating Consumer-Produced Videos for Geo-Location"
Dr. Gerald Friedland
Audio and Multimedia Research Director
Based on a human-subject study comprising over 11,000 experiments, this talk presents a human baseline for location estimation with different combinations of modalities (audio, audio/video, audio/video/text). It then compares the accuracy of state-of-the-art multimodal location estimation systems against this baseline. Although qualified humans still outperform current machine learning approaches at multimodal video location estimation, the difference is quite small: when all modalities are utilized, only 59 percent of the videos are located more accurately by a 2/3 majority of humans than by the algorithm. Our analysis suggests new directions and priorities for future work on improving location inference algorithms.
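As a rough illustration of the evaluation criterion, the following Python sketch (our own illustration, not the study's code; all names are invented) counts a video as located more accurately by humans when at least a 2/3 majority of the human guesses fall closer to the ground truth than the machine's estimate:

    from math import radians, sin, cos, asin, sqrt

    def haversine_km(lat1, lon1, lat2, lon2):
        """Great-circle distance between two (lat, lon) points, in km."""
        lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
        a = sin((lat2 - lat1) / 2) ** 2 \
            + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 2 * 6371 * asin(sqrt(a))

    def human_majority_wins(truth, human_guesses, machine_guess, majority=2/3):
        """True if at least `majority` of the human guesses beat the machine."""
        machine_err = haversine_km(*truth, *machine_guess)
        wins = sum(haversine_km(*truth, *g) < machine_err for g in human_guesses)
        return wins >= majority * len(human_guesses)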
3:35 "Speech Research at ICSI"
Dr. Steve Wegmann
Speech Research Director
This talk will give an overview of the ICSI Speech Group, its research interests, and the projects that it is currently engaged in. It will also describe in more detail recent results from Project OUCH (Outing Unfortunate Characteristics of HMMs) that give a deep, quantitative understanding of the failures of the acoustic model and front-end typically used for automatic speech recognition.
4:15 "Engaging With the Internet: Security & Performance"
Prof. Vern Paxson
Networking and Security Research Director
The Networking & Security Group pursues research in Internet security, Internet analysis, and Internet architecture. The first two of these have a heavy empirical emphasis, while the third emphasizes design considerations. This talk will sketch a number of these efforts at a high level to convey a sense of their nature, along with noteworthy recent results for some of them.
1:45 - 5:00 Demos
Multimodal Location Estimation
Jaeyoung Choi, Audio and Multimedia
This demo presents an approach to determining the geo-coordinates of the locations where Flickr videos were recorded, based on both textual metadata and visual cues. The underlying system has been tested on the MediaEval 2012 Placing Task evaluation data and is able to locate 14% of the videos to within an accuracy of 10m.
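To give a flavor of the textual-metadata side of such a system, here is a toy Python sketch (a simplification for illustration; the actual system is more sophisticated and also fuses visual cues): it estimates a video's location from the rarest, and hence typically most location-specific, tag it shares with geotagged training videos:

    from collections import defaultdict

    def build_tag_index(training_videos):
        """training_videos: iterable of (tags, (lat, lon)) pairs."""
        index = defaultdict(list)
        for tags, coords in training_videos:
            for tag in tags:
                index[tag].append(coords)
        return index

    def estimate_location(tags, index):
        # Prefer the rarest matching tag: rare tags (e.g., a landmark name)
        # tend to be more geographically specific than common ones.
        candidates = [t for t in tags if t in index]
        if not candidates:
            return None
        pts = index[min(candidates, key=lambda t: len(index[t]))]
        return (sum(p[0] for p in pts) / len(pts),
                sum(p[1] for p in pts) / len(pts))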
Pushing the Limits of Mechanical Turk: Qualifying the Crowd for Video Geo-Location
Luke Gottlieb, Audio and Multimedia
We will demonstrate the methods we have developed for finding Mechanical Turk participants to manually annotate the geo-location of random videos from the Web. We require high-quality annotations for this project, as we are attempting to establish a human baseline for future comparison to machine systems. This task differs from a typical Mechanical Turk task in that it is difficult for both humans and machines; typical Turk tasks are easy for humans but difficult or impossible for machines.
Related Paper: L. Gottlieb, J. Choi, P. Kelm, T. Sikora, and G. Friedland. Pushing the Limits of Mechanical Turk: Qualifying the Crowd for Video Geo-Location. Proceedings of the ACM Workshop on Crowdsourcing for Multimedia (CrowdMM 2012), held in conjunction with ACM Multimedia 2012, pp. 23-28, Nara, Japan, October 2012.
Ready or Not?
Bryan Morgan, Audio and Multimedia
When you post on the Internet, you're often sharing not only the content of your posts but also metadata, like the geolocation (GPS tags) and time stamps associated with those posts. By putting all this information together, strangers can infer quite a bit about you and your habits. This educational tool was created to help you visualize how part of your information footprint--the geolocation information and time stamps associated with posts on social media like Twitter, Instagram, and Facebook--could be used to find you in the physical world. The "Ready or Not?" app was developed as part of the Teaching Privacy project at the International Computer Science Institute/UC Berkeley. This NSF-sponsored project aims to empower K-12 students and college undergraduates to make informed choices about privacy by building a set of educational tools and hands-on exercises that help teachers demonstrate what happens to personal information on the Internet--and what the effects of sharing information can be.
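The kind of inference the app makes visible can be sketched in a few lines of Python (assumed logic for illustration, not the app's actual code): bucketing a user's geotagged posts by hour of the week quickly reveals where that person can typically be found, and when:

    from collections import Counter, defaultdict

    def location_profile(posts):
        """posts: iterable of (datetime, (lat, lon)) from a public feed.
        Maps each (weekday, hour) slot to the user's most frequent cell."""
        buckets = defaultdict(Counter)
        for ts, (lat, lon) in posts:
            cell = (round(lat, 3), round(lon, 3))  # roughly a 100 m grid cell
            buckets[(ts.weekday(), ts.hour)][cell] += 1
        return {slot: counts.most_common(1)[0]
                for slot, counts in buckets.items()}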
The SSL Tree of Trust
Johanna Amann, Networking and Security
In collaboration with ten large network sites, we have been collecting information about SSL connections since the beginning of 2012; the data set now covers 15 billion connections and includes 32 million unique SSL certificates extracted from the online activity of about 300,000 users in total. We use these data to provide a public notary service that people can query for certificates they encounter. Furthermore, to better understand the relationships between root and intermediate Certificate Authorities (CAs), we used our data set to create the "Tree of Trust," an interactive graph visualizing the global relationships between CAs. Our demonstration will show the Tree of Trust, highlight interesting parts of it, and provide information about the current state of the SSL ecosystem.
For more, see http://notary.icsi.berkeley.edu/#the-tree-of-trust.
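As an illustration, the notary can be queried over DNS by looking up a TXT record for a certificate's SHA1 hash (a minimal sketch using the third-party dnspython package; the exact record format may differ from the comment below):

    import dns.resolver

    def query_notary(cert_sha1_hex):
        name = '%s.notary.icsi.berkeley.edu' % cert_sha1_hex.lower()
        try:
            answers = dns.resolver.query(name, 'TXT')
        except dns.resolver.NXDOMAIN:
            return None  # certificate not in the notary database
        # A typical answer looks like:
        # 'version=1 first_seen=... last_seen=... times_seen=... validated=1'
        return [rdata.to_text() for rdata in answers]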
Introducing the Netalyzr Mobile App
Nicholas Weaver, Networking and Security
Netalyzr is a tool that analyzes the extent to which a user's Internet service provider interferes with its customers' traffic. This presentation will demonstrate a new app that analyzes mobile phones' connections. For more information and to try the tool on your computer, visit the Netalyzr Web site at http://netalyzr.icsi.berkeley.edu.
Real-Time Object Classification Using Deformable Part Models and Sparselets
Daniel Göhring, Vision
In mobile robotics, especially in household environments, a fast and robust object recognition framework is crucial. Our detection approach applies deformable part models for object classification. Processing time was significantly reduced by using sparselets and by implementing the detection code in a CUDA framework. The demo shows how up to 30 different household objects can be recognized in cluttered environments in real time.
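The sparselet idea can be sketched as follows (a simplified, single-channel Python/NumPy illustration with invented names, not the actual CUDA implementation): every part filter is approximated as a sparse linear combination of a small shared dictionary, so the expensive convolutions are computed once for the dictionary and reused across all object models:

    import numpy as np
    from scipy.signal import correlate2d

    K, s = 128, 6
    D = np.random.randn(K, s, s)   # shared dictionary of K small "sparselets"
    alpha = np.zeros((30, K))      # sparse coefficients for 30 object models,
                                   # learned offline via sparse coding (not shown)

    def all_filter_responses(feature_map, alpha, D):
        """Response maps for every filter, convolving with the shared
        dictionary only once -- the core sparselet speed-up."""
        atom_resp = np.stack([correlate2d(feature_map, D[k], mode='valid')
                              for k in range(len(D))])
        # responses[i] = sum_k alpha[i, k] * atom_resp[k]
        return np.tensordot(alpha, atom_resp, axes=([1], [0]))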
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
Yangqing Jia, Vision
We have proposed DeCAF, a general image feature that builds on the recent success of convolutional neural networks and learns a much more semantic representation of images than existing hand-crafted approaches. We have designed an online demo showing the performance of DeCAF on the ImageNet large-scale visual recognition dataset, and we are open-sourcing the codebase to give the research community a more accessible resource for deep learning and large-scale image recognition.
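The recipe can be summarized in a short Python sketch (illustrative only; forward_to_layer is a hypothetical stand-in for the released feature extractor): activations from an upper layer of an ImageNet-pretrained network serve as a fixed image feature, and only a simple linear classifier is trained for the new task:

    import numpy as np
    from sklearn.svm import LinearSVC

    def forward_to_layer(image, layer='fc6'):
        # Placeholder for the pretrained network's forward pass, which
        # would return e.g. the 4096-d fc6 activation vector for `image`.
        return np.random.randn(4096)

    def train_new_task(images, labels):
        X = np.vstack([forward_to_layer(img) for img in images])
        clf = LinearSVC()
        clf.fit(X, labels)  # CNN weights stay frozen; only the SVM is trained
        return clf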