Education

  • Ph.D., Computer Science, Brown University, 2015.
  • M.S., Computer Science, University of Colorado, Boulder, 2009.
  • M.S., Computer Science, University of Houston, 2006.
    • Thesis – A Machine Learning approach for automated geomorphic map generation.
  • B.E., Computer Engineering, University of Mumbai, 2004.

Professional Experience

  • Research Staff Member, IBM Research, Cambridge, MA. Feb 2016 - Present
    • Probabilistic Methods for Longitudinal Data: Researching models and methods for drawing inferences from sparse temporal data arising in healthcare settings.
    • Bayesian Deep Learning: Developing methods for learning Bayesian neural networks and combining them with probabilistic graphical models. Exploring methods for learning sparse and interpretable neural networks.
    • Mentorship: Advising Postdoctoral researchers and interns on research projects and directions.
  • Postdoctoral Research Scientist, Disney Research, Cambridge, MA. Oct 2014 - Jan 2016
    • Efficient Learning of Bayesian Neural Networks: Developed algorithms for scalable online learning of Bayesian neural networks.
    • Gesture Recognition: Explored methods for recognizing gestures from videos and acceleration signals extracted from wearable sensors.
    • User Behavior Analysis: Developed statistical models for analyzing guest patterns in Disney parks.
  • Research Assistant, Computer Science, Brown University, Providence, RI. Fall 2009 - 2014
    • Statistical Modeling: Designed novel probabilistic models for unsupervised discovery of topics from text collections, regions from images/videos and parts from 3D objects.
    • Bayesian Nonparametrics: Developed novel models that increase in complexity with increasing amounts of data for analyzing spatio-temporally correlated data.
    • Inference: Built effective, robust and reliable stochastic search and MCMC-based inference algorithms for latent variable models.
  • Research Intern, Microsoft Research (MSR), Cambridge, MA. Jun-Aug 2013
    • Large-Scale Density Modeling: Developed parallel Expectation Maximization algorithms for learning large-scale mixtures of factor analyzers from several million image patches. The learned density resulted in state-of-the-art image denoising performance.
    • Image recognition: Explored dictionary learning algorithms for learning efficient representations for downstream classification using SVMs and L2- and L1-regularized logistic regression models.
  • Research Intern, Disney Research, Pittsburgh, PA. Jun-Aug 2012
    • Video Segmentation: Developed hierarchical Bayesian nonparametric models and efficient Metropolis-Hastings (MCMC) samplers for discovering the number and extent of regions exhibiting coherent appearance and motion in video sequences.
  • Research Assistant, Computer Science, University of Colorado, Boulder, CO. Spring 2007-09
    • Document modeling: Utilized topic models for measuring the relevance of user-contributed documents to subject-themed digital libraries.
    • Autonomous robot navigation: Designed computer vision algorithms and features for obstacle avoidance and traversable path identification to help robots navigate in unstructured environments.
  • Research Intern, Bosch Research, Pittsburgh, PA. Jun-Aug 2008
    • Learning from class proportions: Developed algorithms for learning finite mixture models utilizing nontraditional sources of information such as prior knowledge of class proportions. These algorithms require only a fraction of the training data and are more noise-tolerant.
  • Summer Researcher, Lunar & Planetary Institute, Houston, TX. Jun-Jul 2007
    • Geomorphic mapping: Developed tools for categorizing planetary surfaces into geomorphic landforms (craters, ridges, etc.) from digital elevation data. The work involved developing an over-segmentation algorithm for segmenting Digital Elevation Maps (DEMs) and careful feature design for identifying landforms.
  • Summer Intern, National Radio Astronomy Observatory, Green Bank, WV. Jun-Aug 2005
    • Telescope analytics: Identified and captured leading indicators of radio telescope failures. Developed an extensible system for periodic logging of the relevant indicators in a MySQL database.

Publications

Preprints

  • Model Selection in Bayesian Neural Networks via Horseshoe Priors.
    Soumya Ghosh and Finale Doshi-Velez.
    arXiv 2017.

Refereed Conference Proceedings

  • Structured Variational Learning of Bayesian Neural Networks with Horseshoe Priors
    Soumya Ghosh, Jiayu Yao, and Finale Doshi-Velez.
    35th International Conference on Machine Learning (ICML), 2018.
  • Context-Sensitive Prediction of Facial Expressivity Using Multimodal Hierarchical Bayesian Neural Networks.
    Ajjen Joshi, Soumya Ghosh, Sarah Gunnery, Linda Tickle-Degnen, Stan Sclaroff, and Margrit Betke.
    13th IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2018.
  • Early Prediction of Diabetes Complications from Electronic Health Records: A Multi-task Survival Analysis Approach.
    Bin Liu, Ying Li, Zhaonan Sun, Soumya Ghosh, and Kenney Ng.
    32nd AAAI Conference On Artificial Intelligence (AAAI), 2018.
  • Personalizing Gesture Recognition Using Hierarchical Bayesian Neural Networks.
    Ajjen Joshi, Soumya Ghosh, Margrit Betke, Stan Sclaroff, Hanspeter Pfister.
    Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  • An Exploration of Latent Structure in Observational Huntington’s Disease Studies.
    Soumya Ghosh, Zhaonan Sun, Ying Li, Yu Cheng, Amrita Mohan, Cristina Sampaio, Jianying Hu.
    AMIA Joint Summits on Translational Science, CRI, 2017.
  • A Data-Driven Method for Generating Robust Symptom Onset Indicators in Huntington’s Disease Registry Data.
    Zhaonan Sun, Ying Li, Soumya Ghosh, Yu Cheng, Amrita Mohan, Cristina Sampaio, Jianying Hu.
    American Medical Informatics Association Annual Symposium (AMIA), 2017.
  • Clinical Trials.Gov: A Topical Analyses.
    Vibha Anand, Amos Cahan, Soumya Ghosh.
    AMIA Joint Summits on Translational Science, CRI, 2017.
  • Is there a Priority Shift in Mental Health Clinical Trials?
    Vibha Anand, Soumya Ghosh, Amit Anand.
    World Congress on Medical and Health Informatics (Medinfo), 2017.
  • Deep State Space Models for Computational Phenotyping.
    Soumya Ghosh, Yu Cheng, Zhaonan Sun.
    IEEE International Conference on Health Informatics (ICHI), 2016.
  • Assumed Density Filtering Based Methods for Scalable Learning of Bayesian Neural Networks.
    Soumya Ghosh, Francesco Delle Fave, Jonathan Yedidia.
    30th AAAI Conference On Artificial Intelligence (AAAI), 2016.
  • Nonparametric Clustering with Distance Dependent Hierarchies.
    Soumya Ghosh, Michalis Raptis, Leonid Sigal, Erik Sudderth.
    30th Conference on Uncertainty in Artificial Intelligence (UAI), 2014.
  • From Deformations to Parts: Motion-based Segmentation of 3D Objects.
    Soumya Ghosh, Erik B. Sudderth, Matthew Loper, Michael J. Black.
    Advances in Neural Information Processing Systems 25 (NIPS), 2012.
  • Nonparametric learning for layered segmentation of natural images.
    Soumya Ghosh, Erik B. Sudderth.
    IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2012.
  • Spatial distance dependent Chinese restaurant processes for image segmentation.
    Soumya Ghosh, Andrei B. Ungureanu, Erik B. Sudderth, David Blei.
    Advances in Neural Information Processing Systems 24 (NIPS), 2011.
  • A segmentation guided label propagation scheme for autonomous navigation.
    Soumya Ghosh, Jane Mulligan.
    IEEE International Conference on Robotics and Automation (ICRA), 2010.
  • A general framework for reconciling multiple weak segmentations of an image.
    Soumya Ghosh, Joseph Pfeiffer III, Jane Mulligan.
    Workshop on Applications of Computer Vision (WACV), 2009.
  • Topic Model Methods for Automatically Identifying Out-of-Scope Resources.
    Steven Bethard, Soumya Ghosh, James Martin, Tamara Sumner.
    Joint Conference on Digital Libraries (JCDL), 2009.
  • Using Weak Supervision in Learning Gaussian Mixture Models.
    Soumya Ghosh, Soundar Srinivasan, Burt Andrews.
    International Joint Conference on Neural Networks (IJCNN), 2009.
  • Machine learning for automatic mapping of planetary surfaces.
    Tomasz F. Stepinski, Soumya Ghosh, Ricardo Vilalta.
    Proceedings of the 19th National Conference on Innovative Applications of Artificial Intelligence (IAAI), 2007.
  • Automatic recognition of landforms on Mars using terrain segmentation and classification.
    Tomasz F. Stepinski, Soumya Ghosh, Ricardo Vilalta.
    International Conference on Discovery Science (DS), 2006.

Refereed Workshop Abstracts

  • Model Selection in Bayesian Neural Networks via Horseshoe Priors.
    Soumya Ghosh and Finale Doshi-Velez.
    NIPS Workshop on Bayesian Deep Learning, 2017.
  • Exploring Factors that are associated with missing Values in observational Huntington’s disease study data.
    Zhaonan Sun, Ying Li, Soumya Ghosh, Yu Cheng, Amrita Mohan, Cristina Sampaio, Jianying Hu.
    AMIA Joint Summits on Translational Science, 2017.
  • Hierarchical Bayesian Neural Networks for Personalized Classification.
    Ajjen Joshi, Soumya Ghosh, Margrit Betke, and Hanspeter Pfister.
    NIPS Workshop on Bayesian Deep Learning, 2016.
  • Approximate Bayesian Computation for Distance-Dependent Learning.
    Soumya Ghosh and Erik Sudderth.
    NIPS Workshop on Bayesian Nonparametrics: The Next Generation (NIPSW), 2015.

Journal Articles

  • Automatic Annotation of Planetary Surfaces With Geomorphic Labels.
    Soumya Ghosh, Tomasz F. Stepinski, Ricardo Vilalta.
    IEEE Transactions on Geoscience and Remote Sensing, 48, 175–185, 2010.

Invited Papers

  • Machine Learning Tools for Automatic Mapping of Martian Landforms.
    Tomasz F. Stepinski, Ricardo Vilalta and Soumya Ghosh.
    IEEE Intelligent Systems 22, 6, 100–106, Nov 2007.

Unrefereed Abstracts

  • Automatic Mapping of Martian Landforms Using Segmentation-based Classification.
    Soumya Ghosh, Tomasz F. Stepinski, and Ricardo Vilalta.
    Lunar and Planetary Science Conference (LPSC), 2007.

Teaching and Mentorship

  • Teaching Assistant: CSCI 1950-f, Introduction to Machine Learning, Spring 2011.
  • Guest Lecture: Introduction to Machine Learning tools for classification, MMSCI program, Harvard Medical School, Fall 2017.
  • Student (Intern) Supervision:
    • Ethan Evans (Ph.D. candidate at Dept. of Chemistry, MIT), Winter 2018.
    • Michael Colomb (UIUC), Summer 2017.
    • Giridhar Gopalan (Ph.D. candidate at Dept. of Statistics, Harvard University), Summer 2015.
    • Ajjen Joshi (Ph.D. candidate at Dept. of Computer Science, Boston University), Summer 2015.

Awards and Honors

  • 2016 ICHI Data Challenge Winner.
  • 2014 UAI travel grant.
  • 2011 NIPS travel grant.
  • 2007 AAAI travel grant.
  • 2009-10 Brown University graduate fellowship.
  • 2004-05 JN TATA endowment scholarship.

Talks

  • Deep Generative Models.
    • Fraenkel lab, MIT, February 2018.
  • Deep Learning — A cautionary tale.
    • North East Computational Health Summit, IBM Research, Yorktown Heights, 2017.
  • Introduction to classification.
    • Guest lecture at MMSCI, Harvard Medical School, 2017.
  • Learning and inference in distance dependent models.
    • IVC seminar series, Boston University, Nov 2015.
  • Bayesian nonparametric discovery of layers and parts from scenes and objects.
    • Philips Research, Briarcliff Manor, NY, Jun 2015.
    • IBM Research, Yorktown Heights, NY, Apr 2014.
    • HP, Palo Alto, CA, Apr 2014.
    • Bosch Research, Palo Alto, CA, May 2014.
    • Exxon Mobil, Corporate Strategic Research, Clinton, NJ, May 2014.
    • BBN Technologies, Cambridge, MA, May 2014.
    • Disney Research, Boston, MA, May 2014.
    • Schlumberger-Doll Research center, Cambridge, MA, May 2014.
  • Statistical models for spatially correlated data.
    • Microsoft Research, New England, June 2013.
  • Nonparametric learning for layered segmentation of natural images.
    • Disney Research, Pittsburgh, June 2012.

Service

  • Organizing Committee Member:
    • North East Computational Health Summit (NECHS) 2017, 2018.
  • Program Committee Member / Reviewer:
    • Advances in Neural Information Processing Systems (NIPS), 2017, 2016, 2015, 2014 and 2013.
    • International Conference on Machine Learning (ICML), 2018, 2017.
    • IEEE Computer Vision and Pattern Recognition (CVPR), 2018, 2017, 2016, 2015.
    • AAAI Conference on Artificial Intelligence (AAAI), 2018, 2017, 2016.
    • European Conference on Computer Vision (ECCV), 2016.
    • International Conference on Computer Vision (ICCV), 2015.
    • NIPS Workshop on Machine Learning for Health, 2017.
    • NIPS Workshop on Practical Bayesian Nonparametrics, 2016.
    • NIPS Workshop on Bayesian Nonparametrics: The Next Generation, 2015.
  • External Journal Reviewer:
    • Artificial Intelligence, 2016.
    • IET Computer Vision, May 2013.
    • Springer Journal of Signal, Image and Video Processing, May 2012.
  • PhD Admissions Committee: Brown University, Dec 2013.
  • Graduate Student Orientation Committee: Brown University, September 2011.