Standard neural network architectures struggle with interpretability, flexibility and generalisation. I believe this is in large part due to a lack of strong inductive biases in models and architectures, and I have recently been arguing (see my job talk at Monash) for building stronger biases into deep learning models.
Switching controller front-ends
For example, by embedding known controllers in a model, together with knowledge of a switching structure, we gain both better performance in settings that require hybrid control and greater interpretability.
M Burke, Y Hristov, S Ramamoorthy, Switching Density Networks for Hybrid System Identification, Conference on Robot Learning (CoRL) 2019. (arxiv link)
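The core idea can be sketched in a few lines: a gating function produces a probability over a small set of known controllers, and the executed action is the probability-weighted blend, with the gate probabilities doubling as an interpretable estimate of the active mode. This is a minimal illustration only; the controllers and the fixed linear gate below are stand-ins, not the learned density network of the paper.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Two hand-coded controllers for a 1-D setpoint task (illustrative stand-ins
# for the known controllers embedded in the model).
def proportional(err, gain=2.0):
    return gain * err

def bang_bang(err, mag=1.0):
    return mag * np.sign(err)

def switching_controller(err, gate_weights):
    # In the paper the gate is a learned density network; here the gating
    # logits are a fixed linear function of error features, purely for illustration.
    logits = gate_weights @ np.array([err, abs(err), 1.0])
    probs = softmax(logits)
    actions = np.array([proportional(err), bang_bang(err)])
    # Blended action plus interpretable mode probabilities.
    return probs @ actions, probs

action, probs = switching_controller(
    err=0.5,
    gate_weights=np.array([[1.0, 0.0, 0.0],
                           [0.0, 1.0, 0.0]]),
)
```

Inspecting `probs` at each timestep is what yields the interpretability: it tells you which underlying controller the system believes is active.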
Similarly, by autoencoding with light supervision, we can ground perception networks in symbolic concepts that align with natural language, improving interpretability while also enabling planning and symbolic reasoning.
Y Hristov, D Angelov, M Burke, A Lascarides, S Ramamoorthy, Disentangled Relational Representations for Explaining and Learning from Demonstration, Conference on Robot Learning (CoRL) 2019. (arxiv link)
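Once latent dimensions have been grounded in labelled concepts, symbolic predicates can be read off the latent vector directly, which is what makes downstream planning possible. A minimal sketch, with hypothetical concept names (not those used in the paper):

```python
# Hypothetical mapping from grounded latent dimensions to symbolic concepts.
LATENT_CONCEPTS = {0: "grasped", 1: "above_target", 2: "aligned"}

def latents_to_symbols(z, threshold=0.5):
    """Read off the set of active symbolic predicates by thresholding
    the labelled (supervised) dimensions of the latent vector z."""
    return {name for dim, name in LATENT_CONCEPTS.items() if z[dim] > threshold}

symbols = latents_to_symbols([0.9, 0.2, 0.7])  # → {"grasped", "aligned"}
```

A symbolic planner can then operate over these predicates while the perception network handles the raw observations.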
Video to physical parameters
The same idea allows for parameter estimation from video, and for the incorporation of physical dynamics into a model.
M Asenov, M Burke, D Angelov, T Davchev, K Subr, S Ramamoorthy, Vid2Param: Modelling of Dynamics Parameters from Video, Robotics and Automation Letters (RA-L) (arxiv link).
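To make the parameter-estimation idea concrete, here is a toy stand-in for the setting: recovering a physical parameter (gravity) from an observed trajectory. Vid2Param uses a recurrent network operating on video; the least-squares fit to tracked positions below only illustrates the underlying identification problem.

```python
import numpy as np

g_true = 9.81
t = np.linspace(0.0, 1.0, 50)
# Noiseless ballistic track: y(t) = y0 + v0*t - 0.5*g*t^2
y = 10.0 + 2.0 * t - 0.5 * g_true * t**2

# Fit y ≈ c2*t^2 + c1*t + c0, so the gravity estimate is g ≈ -2*c2.
coeffs = np.polyfit(t, y, deg=2)
g_est = -2.0 * coeffs[0]
```

The appeal of doing this from video end-to-end is that the estimated parameters are physically meaningful quantities, not opaque latent codes.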
The approaches above inject constraints through training, but for generalisation we may need even stronger priors built into models. Our work on physics-as-inverse-graphics does this by building differentiable physical equations directly into the model, and exhibits much stronger extrapolation performance as a result.
M Jaques, M Burke, T Hospedales, Physics-as-Inverse-Graphics: Unsupervised Physical Parameter Estimation from Video, International Conference on Learning Representations (ICLR 2020) (open review link)
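The reason built-in physics aids extrapolation can be seen in a small sketch: once the physical parameters are estimated (here an assumed spring constant k), rolling the known equations of motion forward remains valid far beyond any training horizon, unlike a generic sequence model. The dynamics and integrator below are illustrative, not the paper's.

```python
import numpy as np

def simulate_spring(k, x0=1.0, v0=0.0, dt=0.01, steps=1000):
    """Roll out a harmonic oscillator (ẍ = -k x) with semi-implicit Euler.
    With the equations built in, long-horizon rollouts stay physically
    plausible for any estimated k."""
    x, v = x0, v0
    xs = []
    for _ in range(steps):
        v -= k * x * dt
        x += v * dt
        xs.append(x)
    return np.array(xs)

# 10 s rollout: roughly three full oscillations for k = 4 (period ≈ π s).
traj = simulate_spring(k=4.0)
```

A purely learned dynamics model typically drifts or blows up on such rollouts; the physics prior keeps the trajectory bounded and oscillatory by construction.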