Selected Publications

We propose a novel neural architecture for representing 3D surfaces by harnessing complementary explicit and implicit shape representations. We make these two representations synergistic by introducing novel consistency losses. Our hybrid architecture produces results superior to those of the two equivalent single-representation networks, yielding smoother explicit surfaces with more accurate normals, and a more accurate implicit occupancy function. Additionally, our surface reconstruction step can directly leverage the explicit atlas-based representation. This process is computationally efficient, and can be directly used by differentiable rasterizers, enabling training of our hybrid representation with image-based losses.
ECCV, 2020.

We propose a novel approach for generating unrestricted adversarial examples by manipulating fine-grained aspects of image generation. Unlike existing unrestricted attacks that typically hand-craft geometric transformations, we learn stylistic and stochastic modifications leveraging state-of-the-art generative models. This allows us to manipulate an image in a controlled, fine-grained manner without being bounded by a norm threshold. We demonstrate that our attacks can bypass certified defenses, yet our adversarial images look indistinguishable from natural images as verified by human evaluation. Adversarial training can be used as an effective defense without degrading performance of the model on clean images.
Under Review, 2020.

We aim to build image generation models that generalize to new classes from few examples. To this end, we first investigate the generalization properties of classic image generators, and leverage this insight to produce a robust, unsupervised few-shot image generation algorithm using a novel training procedure. The resulting interpolative autoencoders synthesize realistic images of novel objects from only a few reference images, and are competitive with VAEs trained directly on the full set of novel classes. Our procedure is simple and lightweight, generalizes broadly, and requires no category labels or other supervision during training.
Under Review, 2020.

We propose a learning-based method for generating new animations of a cartoon character given a few example images. We express pose changes as a deformation of a layered 2.5D template mesh, and devise a novel architecture that learns to predict mesh deformations matching the template to a target image. In addition to coarse poses, character appearance also varies due to shading, out-of-plane motions, and artistic effects. We capture these subtle changes by applying an image translation network to refine the mesh rendering. Our generative model can be used to synthesize in-between frames and to create data-driven deformation. Our template fitting procedure outperforms state-of-the-art generic techniques for detecting image correspondences.
WACV, 2019.

Differential privacy (DP) is a popular mechanism for training machine learning models with bounded leakage about the presence of specific points in the training data. The cost of differential privacy is a reduction in the model’s accuracy. We demonstrate that in neural networks trained using differentially private stochastic gradient descent (DP-SGD), this cost is not borne equally: accuracy of DP models drops much more for the underrepresented classes and subgroups. We demonstrate this effect for a variety of tasks and models, including sentiment analysis of text and image classification. We then explain why DP training mechanisms such as gradient clipping and noise addition have a disproportionate effect on the underrepresented and more complex subgroups.
NeurIPS, 2019.

We propose novel generative models for creating adversarial examples, slightly perturbed images resembling natural images but maliciously crafted to fool trained models. Our approach can produce image-agnostic and image-dependent perturbations for both targeted and non-targeted attacks. We also demonstrate that similar architectures can achieve impressive results in fooling both classification and semantic segmentation models, obviating the need for hand-crafting attack methods for each task. We improve the state-of-the-art performance in universal perturbations by leveraging generative models in lieu of current iterative methods. Our attacks are considerably faster than iterative and optimization-based methods at inference time. Moreover, we are the first to present effective targeted universal perturbations.
CVPR, 2018.

Estimating fundamental matrices is a classic problem in computer vision. Traditional methods rely heavily on the correctness of estimated key-point correspondences, which can be noisy and unreliable. As a result, it is difficult for these methods to handle image pairs with large occlusion or significantly different camera poses. In this paper, we propose novel neural network architectures to estimate fundamental matrices in an end-to-end manner without relying on point correspondences. New modules and layers are introduced in order to preserve mathematical properties of the fundamental matrix as a homogeneous rank-2 matrix with seven degrees of freedom. We analyze the performance of the proposed models on the KITTI dataset, and show that they achieve competitive performance with traditional methods without the need for extracting correspondences.
ECCV, 2018.

We propose a novel generative model which is trained to invert the hierarchical representations of a bottom-up discriminative network. Our model consists of a top-down stack of GANs, each trained to generate lower-level representations conditioned on higher-level representations. A representation discriminator is introduced at each feature hierarchy to encourage the representation manifold of the generator to align with that of the bottom-up discriminative network. Unlike the original GAN that uses a single noise vector to represent all the variations, our SGAN decomposes variations into multiple levels and gradually resolves uncertainties in the top-down generative process. Based on visual inspection, Inception scores, and a visual Turing test, we demonstrate that SGAN is able to generate images of much higher quality than GANs without stacking.
CVPR, 2017.

Several online real estate database companies provide automatic estimation of market values for houses using a proprietary formula. Although these estimates are often close to the actual sale prices, in some cases they are highly inaccurate. One of the key factors that affect the value of a house is its interior and exterior appearance, which is not considered in calculating these estimates. In this paper, we evaluate the impact of visual characteristics of a house on its market value. Using deep convolutional neural networks on a large dataset of photos of home interiors and exteriors, we develop a novel framework for automated value assessment using these photos in addition to other home characteristics. By applying our proposed method for price estimation to a new dataset of real estate photos and metadata, we show that it outperforms Zillow’s estimates.
Machine Vision and Applications, 2017.



Reviewer for CVPR, NeurIPS, ICCV, ECCV, AAAI, IEEE TPAMI, IEEE Transactions on Multimedia, ACM Transactions on Privacy and Security, IEEE Transactions on Industrial Electronics

Program Committee Member for AAAI and CVPR Workshop on Adversarial Machine Learning


  • Generating a digital image using a generative adversarial network, with Shuai Zheng, Hadi Kiapour and Robinson Piramuthu

  • Object animation using Generative Neural Networks (pending), with Vladimir Kim, Jun Saito and Eli Shechtman

  • Learning Hybrid Shape Representations (pending), with Vladimir Kim, Matthew Fisher and Noam Aigerman