OmniArt is an artistic dataset drawn from more than 15 online artwork collections and from user-generated art uploaded to the internet. Covering a period from 157 BCE (Before Common Era) to August 28th, 2017, it contains more than 500 types of artworks attributed to over 36,000 artists around the world. Additional metadata includes, but is not limited to, style, school, Iconclass, color codes and palettes, materials, current location, real dimensions, and techniques. All images are saved in their native form from the collection they originate from, and the accompanying metadata is preserved in a .json file with the same id. For every dataset entry there is a persistent metadata file, cleaned and parsed so that integration into a machine learning project is faster. If an attribute is not known for a dataset entry, it is marked as unknown.
The current skeleton dataset is 513 GB in size and consists of .jpg images and .json metadata files. Additional collection-specific metadata, extracted features, quick SQL immersion snippets, and trained models are available in an additional 550 GB archive upon request. Given the size of the dataset, we generate the download links upon request.
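The pairing of each image with a same-id .json metadata file can be sketched as follows. This is a minimal illustration, not the authors' loader. It assumes a flat directory in which an image and its metadata share a file stem, a JSON object at the top level of each metadata file, and the literal string "unknown" as the marker for missing attributes. The function names and the example path are hypothetical.

```python
import json
from pathlib import Path

# Assumed literal marker; the description only says unknown attributes
# are "marked as unknown", not how they are encoded.
UNKNOWN = "unknown"


def load_entry(image_path):
    """Return (image_path, metadata) for one entry.

    The metadata file is assumed to sit next to the image and to share
    its stem, e.g. 42.jpg and 42.json. Values equal to the unknown
    marker are mapped to None so downstream code can treat them as missing.
    """
    image_path = Path(image_path)
    meta_path = image_path.with_suffix(".json")
    with open(meta_path, encoding="utf-8") as f:
        raw = json.load(f)  # assumed to be a JSON object (dict)
    meta = {key: (None if value == UNKNOWN else value) for key, value in raw.items()}
    return image_path, meta


def iter_entries(root):
    """Yield (image_path, metadata) for every .jpg that has a matching .json."""
    for image_path in sorted(Path(root).glob("*.jpg")):
        if image_path.with_suffix(".json").exists():
            yield load_entry(image_path)
```

Under these assumptions, `for path, meta in iter_entries("omniart/"): ...` would walk the skeleton dataset one entry at a time without loading any image into memory. Images lacking a metadata file are skipped rather than raising an error.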
Unfortunately, the OmniArt dataset is no longer available. If you'd like to use artwork images in your research, we recommend the following alternative datasets:
If you use this dataset in your research, please cite the paper in which it was introduced. The BibTeX entry is provided below.
@article{strezoski2017omniart,
  title={OmniArt: Multi-task Deep Learning for Artistic Data Analysis},
  author={Strezoski, Gjorgji and Worring, Marcel},
  journal={arXiv preprint arXiv:1708.00684},
  year={2017}
}