We provide the VALOR-1M and VALOR-32K datasets for download. Due to copyright restrictions, only the annotation files are available; they contain each video's YouTube ID, start time, end time, and audiovisual caption. You can download the videos from YouTube, or extract them directly by video ID if you have already downloaded AudioSet.
VALOR-32K's annotation file can be downloaded here.
VALOR-1M's annotation file will be released soon.
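As a rough illustration of how the annotation fields map to video clips, here is a minimal Python sketch. It assumes the annotation file is a JSON list of records with the keys `video_id`, `start`, `end`, and `caption`; these key names and the helper `load_clips` are hypothetical, so check them against the actual file before use. The sketch only builds the YouTube URL and clip range for each record and does not download anything.

```python
import json


def load_clips(annotation_text):
    """Turn annotation records into clip descriptors.

    Assumption: the input is a JSON list of objects with the hypothetical
    keys 'video_id', 'start', 'end', and 'caption'. The real annotation
    file may use different names or a different layout.
    """
    clips = []
    for record in json.loads(annotation_text):
        start = float(record["start"])
        end = float(record["end"])
        if end <= start:
            # Skip records whose time range is empty or inverted.
            continue
        clips.append({
            "url": "https://www.youtube.com/watch?v=" + record["video_id"],
            "start": start,
            "duration": end - start,
            "caption": record["caption"],
        })
    return clips


# Made-up sample record, for illustration only.
sample = json.dumps([
    {"video_id": "abc123XYZ_0", "start": 30, "end": 40,
     "caption": "placeholder caption"},
])

for clip in load_clips(sample):
    print(clip["url"], clip["start"], clip["duration"])
```

The resulting `url`, `start`, and `duration` values can then be passed to whichever download or trimming tool you prefer.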
VALOR's code and model are released on this GitHub page.
If you run into any problems with the VALOR model or data, you can open an issue on the GitHub page or contact us.