Cross-lingual Visual Pre-training for Multimodal Machine Translation
Updated Dec 28, 2021 - Python
Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval [ECCV 2020]
A modular repository for developing Image Captioning Approaches
The main goal is to show how precisely Faster R-CNN with ResNet-101 can find objects and their attributes in the Conceptual 12M dataset.