These images and descriptions are then processed through neural networks that learn to associate particular and deeply nuanced qualities of the image — shapes, colours, compositions — with certain words and phrases.
然后,这些图像和描述通过神经网络进行处理,神经网络学习将图像的特定和深层次的特征(形状、颜色、构图等)与特定的词语和短语相联系。