OmniArt is an artistic dataset consisting of over 15 online artwork collections and user-generated art uploaded to the internet. Covering a period from 157 BCE (Before Common Era) to August 28th, 2017, it contains more than 500 types of artworks attributed to over 36,000 artists from around the world. Additional metadata includes, but is not limited to, style, school, Iconclass, color codes and palettes, materials, current location, real dimensions, and techniques. All images are saved in their native form as provided by the collection they originate from, and the accompanying metadata is preserved in a .json file with the same id. For every dataset entry there is a persistent metadata file that has been cleaned and parsed to speed up integration into a machine learning project. If an attribute is not known for a dataset entry, it is marked as unknown.
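As a rough illustration of how the image and metadata pairing described above might be consumed, here is a minimal Python sketch. It rests on several assumptions that are not confirmed here: that each metadata file is a flat JSON object, that an image and its metadata share a file stem in the same directory (for example 42.jpg next to 42.json), and that missing attributes carry the literal string "unknown". The function names load_entry and iter_entries and the sample field names are hypothetical.

```python
import json
from pathlib import Path

# Assumption: missing attributes are stored as the literal string "unknown".
UNKNOWN_MARKER = "unknown"


def load_entry(metadata_path):
    """Load one metadata file and map 'unknown' values to None.

    Assumes the file holds a flat JSON object (attribute name -> value).
    """
    with open(metadata_path, encoding="utf-8") as handle:
        record = json.load(handle)
    cleaned = {}
    for key, value in record.items():
        is_unknown = (
            isinstance(value, str)
            and value.strip().lower() == UNKNOWN_MARKER
        )
        cleaned[key] = None if is_unknown else value
    return cleaned


def iter_entries(root):
    """Yield (image_path, metadata) pairs for entries sharing the same id.

    Assumes the id is the file stem, so 42.json pairs with 42.jpg in the
    same directory. Metadata files without a matching image are skipped.
    """
    for metadata_path in sorted(Path(root).rglob("*.json")):
        image_path = metadata_path.with_suffix(".jpg")
        if image_path.exists():
            yield image_path, load_entry(metadata_path)
```

Mapping the marker to None lets downstream code tell missing attributes apart from real values with a plain `is None` check, which is convenient when building label vocabularies for training.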
The current skeleton dataset is 513 GB in size and consists of .jpg images and .json metadata files. Additional collection-specific metadata, extracted features, quick SQL immersion snippets and trained models are available in an additional 550 GB archive upon request. Given the size of the dataset, we generate the download links upon request.
Please note that this dataset is made available for academic research purposes only. All images were collected from collections available online, and the copyright belongs to the original owners. If any of the images belongs to you and you would like it removed, please let us know and we will remove it from our dataset immediately.
If you use this dataset in your research, please cite the paper in which it was introduced. The BibTeX entry is provided below.
@article{strezoski2017omniart,
  title={OmniArt: Multi-task Deep Learning for Artistic Data Analysis},
  author={Strezoski, Gjorgji and Worring, Marcel},
  journal={arXiv preprint arXiv:1708.00684},
  year={2017}
}