A Dataset of Text Prompts, Videos and Video Quality Metrics from Generative Text-to-Video AI Models

Chivileva, Iya; Lynch, Philip; Ward, Tomás; Smeaton, Alan F.

Chivileva, Iya, Lynch, Philip, Ward, Tomás ORCID: 0000-0002-6173-6607 and Smeaton, Alan F. ORCID: 0000-0003-1028-8389 (2024) A Dataset of Text Prompts, Videos and Video Quality Metrics from Generative Text-to-Video AI Models. Data in Brief, 54, p. 110514. ISSN 2352-3409

Abstract

Evaluating the quality of videos automatically generated by text-to-video (T2V) models is important if the models are to produce plausible outputs that convince a viewer of their authenticity. This paper presents a dataset of 201 text prompts used to automatically generate 1,005 videos with five recent T2V models, namely Tune-a-Video, VideoFusion, Text-To-Video Synthesis, Text2Video-Zero and Aphantasia. The prompts are divided into short, medium and longer lengths. We also include the results of some commonly used metrics for automatically evaluating the quality of the generated videos. These include each video’s naturalness, the text similarity between the original prompt and an automatically generated text caption for the video, and the inception score, which measures how realistic each generated video is. Each of the 1,005 generated videos was manually rated by 24 different annotators for alignment between the video and its original prompt, as well as for the perception and overall quality of the video. The data also includes the Mean Opinion Scores (MOS) for alignment between the generated videos and the original prompts. The dataset of T2V prompts, videos and assessments can be reused by those building or refining text-to-video generation models to compare the accuracy, quality and naturalness of their new models against existing ones.
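
As an illustration of how a Mean Opinion Score could be derived from per-annotator ratings, the following is a minimal sketch in Python. It is not the authors' code: the record layout (video identifier, annotator identifier, score), the identifiers and the score values are all hypothetical, and the rating scale used in the dataset is not stated in this record.

```python
from collections import defaultdict
from statistics import mean


def mean_opinion_scores(ratings):
    """Average the scores given to each video across its annotators.

    `ratings` is an iterable of (video_id, annotator_id, score) tuples.
    This layout is an assumption made for illustration only; it is not
    the file format of the published dataset.
    """
    scores_by_video = defaultdict(list)
    for video_id, _annotator_id, score in ratings:
        scores_by_video[video_id].append(score)
    # The MOS of a video is the arithmetic mean of its individual ratings.
    return {video_id: mean(scores) for video_id, scores in scores_by_video.items()}


# Hypothetical alignment ratings for two videos (values are made up).
example_ratings = [
    ("video_a", "annotator_1", 4),
    ("video_a", "annotator_2", 2),
    ("video_a", "annotator_3", 3),
    ("video_b", "annotator_1", 5),
    ("video_b", "annotator_2", 4),
]

mos = mean_opinion_scores(example_ratings)
for video_id, score in sorted(mos.items()):
    print(video_id, score)
```

The same grouping would apply to any of the rated dimensions (alignment, perception or overall quality), with one averaged value per video.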

Item Type:	Article (Published)
Refereed:	Yes
Uncontrolled Keywords:	Generative AI; Video annotation; Video naturalness; Video perception; Video alignment
Subjects:	Computer Science > Artificial intelligence; Computer Science > Image processing; Computer Science > Machine learning; Computer Science > Multimedia systems; Computer Science > Digital video
DCU Faculties and Centres:	UNSPECIFIED
Publisher:	Elsevier BV
Official URL:	http://www.journals.elsevier.com.dcu.idm.oclc.org/...
Copyright Information:	Authors
Funders:	Science Foundation Ireland
ID Code:	30022
Deposited On:	21 May 2024 09:01 by Alan Smeaton. Last Modified 11 Dec 2024 10:16

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Creative Commons: Attribution-Noncommercial 4.0
2MB
