Parts and Attributes
Third International Workshop on Parts and Attributes
In Conjunction with the European Conference on Computer Vision (ECCV 2014)
Date: September 12th, 2014
Venue: Zurich, Switzerland
Overview
The workshop will bring together researchers from the established field of part-based methods and from the field of attribute-based methods, which has recently gained popularity. Participants will learn from each other about recent developments and applications, for example in object recognition, scene classification and image retrieval, and they will have the opportunity to discuss similarities and differences, advantages and disadvantages of both approaches.
Organizers
Schedule
08:40 | Welcome

08:45 | Invited Talk: Thomas Mensink (University of Amsterdam) - "COSTA: Co-Occurrence Statistics for Zero-Shot Classification" [Slides]
In this talk I will introduce the first zero-shot classification method for multi-labelled image datasets. Our method, COSTA, exploits co-occurrences of visual concepts in images for knowledge transfer. These inter-dependencies arise naturally between concepts in multi-labelled datasets and are easy to obtain from existing annotations or web-search hit counts. We estimate a classifier for a new label as a weighted combination of related classes, using the co-occurrences to define the weights. We also propose a regression model, learned in a leave-one-out setting, that assigns a weight to each label in the training set and significantly improves performance. Finally, we show that our zero-shot classifiers can serve as priors for few-shot learning. Experiments on three multi-labelled datasets reveal that our zero-shot methods approach, and occasionally outperform, fully supervised SVMs. We conclude that co-occurrence statistics suffice for zero-shot classification.
09:20 | Invited Talk: Ali Farhadi (University of Washington) - "Attributes at Scale"
09:55 | Invited Talk: Raquel Urtasun (University of Toronto) - "Understanding Complex Scenes and People that Talk about Them"

10:30 | Coffee Break
11:00 | Invited Talk: Gregory Murphy (New York University) - "When Are Categories More Useful Than Attributes? A Perspective From Induction" [Slides]
Categories are useful because they allow us to infer attributes of an object that were not themselves observed during categorization. However, some attributes can be inferred directly from an object's perceptible properties. I discuss two sets of experiments that test whether people make such attribute-to-attribute inductions, and whether they rely more on these or on category-to-attribute inductions.
11:35 | Invited Talk: Adriana Kovashka (University of Pittsburgh) - "Interactive Image Search with Attributes" [Slides]
Search engines have come a long way, but searching for images is still primarily restricted to meta-information such as keywords rather than the images' visual content. We introduce a new form of interaction for image retrieval, where the user can give rich feedback to the system via semantic visual attributes. The proposed WhittleSearch approach allows users to narrow down the pool of relevant images by comparing the properties of the results to those of the desired target. Building on this idea, we develop a system-guided version of the method which actively engages the user in a 20-questions-like game where the answers are visual comparisons. This enables the system to obtain exactly the information it most needs. To ensure that the system interprets the user's attribute-based queries and feedback as intended, we further show how to efficiently adapt a generic attribute model to more closely align with the individual user's perception. Our work transforms the interaction between the image search system and its user from keywords and clicks into precise, natural language-based communication. We demonstrate the dramatic impact of this new search modality for effective retrieval on databases ranging from consumer products to human faces. This is an important step in making the output of vision systems more useful, by allowing users both to express their needs more precisely and to better interpret the system's predictions.

12:10 | Lunch Break
14:00 | Invited Talk: Niloy Mitra (University College London) - "Abstracting Collections of Objects and Scenes" [Slides]
3D data continues to grow in the form of collections of models, scenes, and scans, and of course as image collections. Such data, when appropriately abstracted and represented, can provide valuable priors for many geometry-processing tasks, including editing, synthesis, and form-finding. In this talk, I will discuss the algorithms we have developed over the last few years to co-analyze large 3D data collections and represent them as probability distributions over part-based abstractions. This approach focuses on the global semantic relations among and within the parts of a shape rather than on their local geometric details. Beyond analysis techniques, I will discuss applications in editing, modeling, and fabrication.
14:35 | Invited Talk: Peter Gehler (Max Planck Institute) - "Fields of Parts" [Slides]
Part-based models are ubiquitous in human pose estimation and object detection. In this talk I will present the Fields of Parts model, which offers a different viewpoint on the classical Pictorial Structures model. The Fields of Parts model can be understood as an unrolled mean-field inference machine. We train it with a maximum-margin estimator using mean-field backpropagation. I will establish the link between the Pictorial Structures model, the Fields of Parts model, and a multilayer neural network with convolutional kernels. I will argue that the model offers interesting new flexibility, as it paves the way to joint body pose estimation and segmentation.
15:10 | Poster Session (and Coffee Break) - Check the list of accepted posters
16:40 | Invited Talk: Shih-Fu Chang (Columbia University) - "Concept-Based Framework for Detecting High-Level Events in Video"
Attributes and parts are intuitive representations of real-world objects and have been shown to be effective in recent research on object recognition. An analogous framework has been used in the multimedia community, using "concepts" to describe high-level complex events such as "birthday party" or "changing a vehicle tire." Concepts involve objects, scenes, actions, activities, and other syntactic elements usually seen in video events. In this talk, I will address several fundamental issues encountered when developing a concept-based event framework: how to determine the basic concepts humans need when annotating video events; how to use Web mining to automatically discover a large concept pool for event representation; how to handle the weak-supervision problem when concept labels are assigned to long video clips without precise timing; and finally how the concept classifier pool can help retrieve novel events that have not been seen before (the zero-shot retrieval problem).
17:15 | Invited Talk: Serge Belongie (Cornell Tech) - "Visipedia Tool Ecosystem" [Slides]
To support scalable computer vision applications, we have built a suite of tools that allow for efficient collection and annotation of large image datasets. The tools are designed both to reduce data-management overhead and to foster collaborations between vision researchers and groups seeking the benefits of a computer vision application.

17:50 | Concluding Remarks
Important Dates
Submission deadline: July 25th, 2014, 11:59 pm EST (extended from June 30th)
Notification of acceptance: July 31st, 2014 (originally July 10th)
Camera-ready submission: July 17th, 2014
Workshop date: September 12th, 2014
Submissions
Submissions are four-page extended abstracts (excluding references) in ECCV 2014 format. Abstracts describing new, published (e.g. at the main ECCV 2014 conference), or ongoing work are welcome. There will be no proceedings.
The extended abstract should be submitted as a single PDF file via email to the workshop organizers.
Contributions from the following domains or closely-related areas are especially welcome:
Deformable and rigid part-based models
Generative and discriminative part-based models
Unsupervised discovery of parts
Context and hierarchy in part-based models
Part-sharing methods for visual recognition
Learning visual attributes across object classes
Attribute-based classification and search
Semantic attributes as object representations
Mid-level representations based on parts/attributes
Transfer learning / zero-shot learning
Fine-grained visual categorization based on parts and attributes
Innovative applications related to parts and attributes
Accepted submissions will be presented as posters at the workshop.
Reviewing of abstract submissions will be single-blind, i.e. submissions need not be anonymized.
Previous Iterations