The Circus Oz Living Archive collection

Introduction


Built in collaboration with the world’s oldest contemporary circus company, the Circus Oz Living Archive combines performer interviews and performance videos with audience engagement to create an interactive, immersive environment for experiencing the magic of circus over and over again.

Data Processing


Data Extraction & Exploration

The CircusOz data were manually collected from three data sources and are provided in xlsx format. Because all the data are merged into one spreadsheet, the three distinct entities (person, event, and venue) have to be separated out and loaded into a SQLite database, which makes the data easier to understand.

For details on constructing the CircusOz database, please refer to the Jupyter notebook circusoz_DBConstruction.ipynb.
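
As a rough illustration of the separation step described above, the sketch below splits a few flat rows into person, event, and venue tables in an in-memory SQLite database. It is not the notebook's actual code: the rows, column names, table layout, and the `build_database` helper are all assumptions made for the example.

```python
import sqlite3

# Hypothetical flat rows standing in for the merged spreadsheet;
# the column names are illustrative, not the real CircusOz headers.
flat_rows = [
    {"person": "Performer A", "event": "Show 1", "venue": "Venue X"},
    {"person": "Performer B", "event": "Show 1", "venue": "Venue X"},
    {"person": "Performer A", "event": "Show 2", "venue": "Venue Y"},
]


def build_database(rows):
    """Split flat rows into person, event and venue tables in SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE venue  (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE event  (id INTEGER PRIMARY KEY, name TEXT UNIQUE,
                             venue_id INTEGER REFERENCES venue(id));
        CREATE TABLE person_event (
            person_id INTEGER REFERENCES person(id),
            event_id  INTEGER REFERENCES event(id),
            PRIMARY KEY (person_id, event_id)
        );
        """
    )
    for row in rows:
        # UNIQUE constraints plus INSERT OR IGNORE keep one row per entity.
        conn.execute("INSERT OR IGNORE INTO person (name) VALUES (?)", (row["person"],))
        conn.execute("INSERT OR IGNORE INTO venue (name) VALUES (?)", (row["venue"],))
        conn.execute(
            "INSERT OR IGNORE INTO event (name, venue_id) "
            "SELECT ?, id FROM venue WHERE name = ?",
            (row["event"], row["venue"]),
        )
        # Link table recording which person took part in which event.
        conn.execute(
            "INSERT OR IGNORE INTO person_event (person_id, event_id) "
            "SELECT p.id, e.id FROM person p, event e "
            "WHERE p.name = ? AND e.name = ?",
            (row["person"], row["event"]),
        )
    conn.commit()
    return conn


conn = build_database(flat_rows)
counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("person", "event", "venue", "person_event")
}
```

Here each entity name is stored once and the repeated values in the flat rows become foreign-key references, which is the general idea behind turning one merged sheet into a small relational schema.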

The following diagram shows the schema of CircusOz data:

Data Transformation & Loading

Because the CircusOz data were collected manually, some records inevitably contain typographical errors and duplicate attributes. The main goal of the transformation process is therefore to deduplicate the records while retaining the unique attributes from the three data sources.
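
A minimal sketch of this kind of deduplication is shown below, assuming records are matched on a normalised name and merged by keeping the first non-empty value of each attribute. The field names, the sample records, and the `normalise` and `deduplicate` helpers are hypothetical; the matching rules actually used for CircusOz are in the loading notebook and may differ. Exact matching after normalisation only catches differences in case, punctuation, and spacing, so genuine misspellings would need fuzzy matching or manual review.

```python
import re


def normalise(name):
    """Lower-case, strip punctuation and collapse whitespace for matching."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())


def deduplicate(records, key="name"):
    """Merge records sharing a normalised key, keeping every non-empty attribute."""
    merged = {}
    for record in records:
        target = merged.setdefault(normalise(record[key]), {})
        for field, value in record.items():
            # Keep the first non-empty value seen for each attribute.
            if value not in (None, "") and field not in target:
                target[field] = value
    return list(merged.values())


# Illustrative records only; not taken from the CircusOz data.
records = [
    {"name": "Performer A", "role": "acrobat", "source": "source_1"},
    {"name": "performer  a.", "role": "", "birth_year": 1970, "source": "source_2"},
    {"name": "Performer B", "role": "musician", "source": "source_3"},
]
unique = deduplicate(records)
```

In this toy input the first two records collapse into one, and the merged record keeps `role` from the first source and `birth_year` from the second.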

At the entity level, the CircusOz entities are projected onto ACDEA entities as follows:

| CircusOz Entity (Collection) | ACDEA Entity |
|---|---|
| person | person |
| - | work |
| event | event |
| - | recognition |
| venue | place |
| - | organisation |
| - | resource |
| foreign keys across person, venue and event | relationship |
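
To make the last row of the entity projection above more concrete, the sketch below shows one way foreign keys on an event row could be turned into explicit relationship records. The entity names come from the projection; the field names (`id`, `venue_id`, `person_ids`), the record shape, and the `to_relationships` helper are assumptions for illustration and do not reflect the actual ACDE schema.

```python
# Entity projection taken from the mapping above (CircusOz -> ACDEA).
ENTITY_MAP = {"person": "person", "event": "event", "venue": "place"}


def to_relationships(events):
    """Turn foreign keys on event rows into explicit relationship records."""
    relationships = []
    for event in events:
        # Hypothetical field names: venue_id and person_ids are illustrative.
        if event.get("venue_id") is not None:
            relationships.append({
                "subject": (ENTITY_MAP["event"], event["id"]),
                "object": (ENTITY_MAP["venue"], event["venue_id"]),
            })
        for person_id in event.get("person_ids", []):
            relationships.append({
                "subject": (ENTITY_MAP["person"], person_id),
                "object": (ENTITY_MAP["event"], event["id"]),
            })
    return relationships


events = [{"id": 1, "venue_id": 10, "person_ids": [100, 101]}]
relationships = to_relationships(events)
```

Each foreign key becomes one record linking two typed entities, so the links no longer depend on the column layout of the source tables.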


At the attribute level, the mapping details are given in the notes of the CircusOz data dictionary, which can be downloaded below.


For details on transforming and loading the CircusOz data into the ACDE database, please refer to the Jupyter notebook circusoz_Loading.ipynb.

Integration Data Report


The following chart, generated by the Jupyter notebook circusoz_IntegrationSummary.ipynb, shows the number of CircusOz records before and after integration.

Analytical examples


For examples of how to use the integrated CircusOz data for analysis, please refer to the Jupyter notebooks in the Data Analysis chapter of this book.

References