Question
Researchers at Carnegie Mellon University and Olin College of Engineering have explored the use of contact microphones as an alternative to conventional tactile sensors for training machine learning models in robotic manipulation, hypothesizing that audio data could enhance performance. They pre-trained a self-supervised machine learning model on audio-visual representations from the AudioSet dataset, which includes over 2 million 10-second video clips. Utilizing audio-visual instance discrimination (AVID), the model learned to differentiate between various types of audio-visual data. The researchers concluded that contact microphones provide valuable audio-based information that may make large-scale audio-visual pretraining a worthwhile venture, as they observed that ______
Which choice most logically completes the text?
A. Despite the sacrifice in overall performance, the inclusion of audio data was able to return certain creative outputs that could not be attained with the more traditional visual-tactile sensors.
B. Especially when dealing with unfamiliar objects and locations, robotic manipulation became more efficient with the inclusion of audio data when it was processed separately rather than alongside visual data.
C. In a series of tests, robots using this model completed real-world manipulation tasks with up to 60 demonstrations per task, outperforming models that relied solely on visual data.
D. Data from conventional tactile sensors significantly improved performance on real-world manipulation tasks compared to data from only visual sensors that provided monochromatic differentiation for intensity.
The text states the hypothesis that audio data enhances robotic manipulation ML model performance, and the conclusion is that contact microphones make audio-visual pretraining worthwhile. Option C provides concrete, test-based evidence that the audio-inclusive model outperforms visual-only models, directly supporting the conclusion. The other options either contradict the performance hypothesis (A), introduce an unrelated processing detail (B), or focus on tactile/visual sensors instead of the audio-based model (D).
C. In a series of tests, robots using this model completed real-world manipulation tasks with up to 60 demonstrations per task, outperforming models that relied solely on visual data.