Recently, we have been exploring the use of time as a supervisory signal for learning from demonstration. As an example use case, we considered ultrasound scanning, where a technician must search for a scanning position and contact force that produce an optimal image. We propose a probabilistic temporal ranking (PTR) model that infers the quality of an image from demonstration image sequences. This reward model can then be used for automated robotic ultrasound scanning, as shown in the video below.
The project page has additional detail and explanation of PTR.
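To make the idea of time as a supervisory signal concrete, here is a minimal sketch of temporal ranking. It is not the PTR model itself: it assumes that frames later in a demonstration tend to be better than earlier ones, and it swaps in a plain linear reward with a logistic pairwise likelihood over synthetic two-dimensional features instead of ultrasound images. The function names `sample_temporal_pairs` and `fit_linear_reward` and all parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np


def sample_temporal_pairs(num_frames, num_pairs, rng):
    """Sample (earlier, later) frame index pairs from one demonstration."""
    i = rng.integers(0, num_frames, size=num_pairs)
    j = rng.integers(0, num_frames, size=num_pairs)
    keep = i != j  # drop pairs that compare a frame with itself
    i, j = i[keep], j[keep]
    return np.minimum(i, j), np.maximum(i, j)


def fit_linear_reward(features, earlier, later, lr=0.1, steps=500):
    """Fit r(x) = w . x so that later frames tend to score higher.

    Assumed likelihood (a simplification, not the paper's model):
    P(later frame is better) = sigmoid(r(later) - r(earlier)).
    """
    w = np.zeros(features.shape[1])
    diff = features[later] - features[earlier]
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(diff @ w)))
        # Gradient of the mean negative log-likelihood with respect to w.
        grad = -((1.0 - p)[:, None] * diff).mean(axis=0)
        w -= lr * grad
    return w


rng = np.random.default_rng(0)
num_frames = 50

# Synthetic demonstration: a hidden quality that rises noisily over time,
# plus one nuisance feature that is unrelated to quality.
quality = np.linspace(0.0, 1.0, num_frames) + 0.05 * rng.standard_normal(num_frames)
features = np.stack([quality, rng.standard_normal(num_frames)], axis=1)

earlier, later = sample_temporal_pairs(num_frames, 400, rng)
w = fit_linear_reward(features, earlier, later)
rewards = features @ w  # learned reward for every frame in the demonstration
```

The only supervision here is frame order; no frame is ever labelled with a quality score. A reward learned this way could in principle be queried on new observations to guide a search for better images, which is the role the reward model plays in the robotic scanning setting described above.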
Michael Burke, Katie Lu, Daniel Angelov, Artūras Straižys, Craig Innes, Kartic Subr, Subramanian Ramamoorthy. Learning rewards for robotic ultrasound scanning using probabilistic temporal ranking, 2020. https://arxiv.org/abs/2002.01240