Updates and notes as we transition from V0.1 to V0.2
In this pub, I cover my work updating the Catalogue’s Graph Data Model from V0.1 to V0.2.
I am trying to work out how to limit the scope of the Event entity. I want Events to cover only social commentary, not citations and references, which I prefer to model as relations between Works. If you have ideas on this, please send me a message.
I am also considering whether to keep or remove the HAS_AUTHORED relationship. I will probably change HAS_AUTHORED to CONTRIBUTED_TO and add a role property (and possibly a taxonomy property) to describe the role and any taxonomy the role falls under (e.g., CRediT). I believe doing so will better show the ways people and organizations contribute to a work (e.g., editing, feedback, or proofreading), and will improve the data quality and usefulness of the contributions sub-graph.
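A minimal sketch of what a CONTRIBUTED_TO edge with role and taxonomy properties could look like, written in Python for illustration. The class name, the field names other than role and taxonomy, the use of distroidID as the key, and the sample IDs are all assumptions, not part of the model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContributedTo:
    """Hypothetical CONTRIBUTED_TO edge from a Person/Organization to a Work."""
    contributor_id: str              # assumed: distroidID of the Person or Organization
    work_id: str                     # assumed: distroidID of the Work
    role: str                        # e.g. "author", "editor", "proofreader"
    taxonomy: Optional[str] = None   # e.g. "CRediT" when the role comes from a taxonomy


def from_has_authored(contributor_id: str, work_id: str) -> ContributedTo:
    """Map a V0.1 HAS_AUTHORED edge onto the proposed shape (role = "author")."""
    return ContributedTo(contributor_id, work_id, role="author")


# Made-up sample data: one migrated authorship edge, one CRediT-style contribution.
edges = [
    from_has_authored("person-1", "work-1"),
    ContributedTo("person-2", "work-1", role="Validation", taxonomy="CRediT"),
]

# All contributions to one Work, whatever the role.
contributors = [e.contributor_id for e in edges if e.work_id == "work-1"]
```

Keeping taxonomy optional lets free-form roles (e.g., feedback) sit next to roles drawn from a controlled vocabulary.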
I may also add a type property to Organization, so I can distinguish between the types of organizations (e.g., businesses, teams, joint ventures), and create relationships between organizations and their sub-organizations (e.g., IS_DEPARTMENT_OF, IS_SUBSIDIARY_OF).
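As a rough sketch of the idea above, the Organization type could be a small enumeration and the sub-organization relationships plain (child, relationship, parent) triples. The enum members, the sample IDs, and the single-parent assumption in the `ancestors` helper are illustrative assumptions only.

```python
from dataclasses import dataclass
from enum import Enum


class OrganizationType(Enum):
    # Assumed value set, taken from the examples in the text.
    BUSINESS = "business"
    TEAM = "team"
    JOINT_VENTURE = "joint venture"


@dataclass(frozen=True)
class Organization:
    distroid_id: str
    name: str
    type: OrganizationType


# Made-up sample edges: (child, relationship, parent).
edges = [
    ("org-legal", "IS_DEPARTMENT_OF", "org-acme"),
    ("org-acme", "IS_SUBSIDIARY_OF", "org-holding"),
]


def ancestors(org_id, edges):
    """Follow sub-organization edges upward and return the parent ids in order.

    Assumes each organization has at most one parent; stops if a cycle is found.
    """
    parents = {child: parent for child, _, parent in edges}
    chain = []
    seen = {org_id}
    while org_id in parents:
        org_id = parents[org_id]
        if org_id in seen:
            break
        seen.add(org_id)
        chain.append(org_id)
    return chain
```

For the sample edges, `ancestors("org-legal", edges)` walks from the department to its parent business and then to the holding organization.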
I may replace the CO_AUTHORED_WITH relationship with COLLABORATED_WITH, with a type property that records the kind of collaboration (e.g., co-authorship).
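One possible way to populate COLLABORATED_WITH edges of type co-authorship is to derive them from authorship data, pairing everyone who authored the same Work. This is only a sketch under that assumption; the function name, the tuple layout, and the sample IDs are hypothetical.

```python
from itertools import combinations

# Made-up sample data: (person, work) authorship pairs.
authored = [
    ("person-1", "work-1"),
    ("person-2", "work-1"),
    ("person-3", "work-2"),
]


def co_authorships(authored):
    """Return one COLLABORATED_WITH edge per pair of people sharing a Work."""
    by_work = {}
    for person, work in authored:
        by_work.setdefault(work, set()).add(person)

    edges = set()
    for people in by_work.values():
        # Sort so each undirected pair is emitted once, in a stable order.
        for a, b in combinations(sorted(people), 2):
            edges.add((a, "COLLABORATED_WITH", b, "co-authorship"))
    return sorted(edges)
```

Other collaboration types would need their own source data; only co-authorship can be inferred this way.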
2024-08-31: I added WORKS_PUBLISHED_IN, UNCLASSIFIED_CONTRIBUTION_IN, and GUEST_APPERANCE_IN as relationships between a Person/Organization and Media Source, so that we can see how a Person/Organization has interacted with a Media Source beyond managing a Media Source (i.e., the MANAGES relationship).
Additionally, I am still deciding on the properties for relations and entities. I am choosing them based on which schema.org properties would make sense in the Graph Data Model.
So, I will probably add properties last, after confirming the entities and relationships I want to have for the graph data model.
That said, I have settled on the following basic properties for Works, Media Sources, Persons, and Organizations:
Works:
title
subtitle
description
workURL
associatedMedia
datePublished
thumbnailURL
distroidID
Media Sources:
title
description
mediaSourceURL
thumbnailURL
distroidID
I removed manager as a property because being a manager is a relation between a Person or Organization and Media Source.
Person:
name
personURL
description
givenName
familyName
thumbnailURL
distroidID
Organization:
name
description
organizationURL
thumbnailURL
distroidID
For Organization, I will probably need to refer to an external schema.
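The property lists above can be written down as a simple lookup and used to flag properties that are not in the model. This is a minimal sketch; the label spellings (e.g., "MediaSource"), the function name, and the sample record are assumptions.

```python
# Property names per entity, copied from the lists above.
PROPERTIES = {
    "Work": [
        "title", "subtitle", "description", "workURL",
        "associatedMedia", "datePublished", "thumbnailURL", "distroidID",
    ],
    "MediaSource": [
        "title", "description", "mediaSourceURL", "thumbnailURL", "distroidID",
    ],
    "Person": [
        "name", "personURL", "description", "givenName",
        "familyName", "thumbnailURL", "distroidID",
    ],
    "Organization": [
        "name", "description", "organizationURL", "thumbnailURL", "distroidID",
    ],
}


def unknown_properties(label, props):
    """Return the property names in `props` that the model does not list for `label`."""
    allowed = set(PROPERTIES[label])
    return sorted(set(props) - allowed)


# A Media Source record that still carries the removed `manager` property.
leftover = unknown_properties("MediaSource", {"title": "Example", "manager": "person-1"})
```

Here `leftover` contains only "manager", which is now expressed as the MANAGES relationship rather than as a property.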
I may make [word]URL an object (most likely a list) to hold multiple URLs related to the entity.
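If [word]URL does become a list, existing single-string values would need to be normalized. A small sketch of that normalization follows; the helper name and the example.org URLs are placeholders, not part of the model.

```python
def as_url_list(value):
    """Normalize a URL property to a list of unique URLs, keeping first-seen order."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    # dict.fromkeys drops duplicates while preserving insertion order.
    return list(dict.fromkeys(value))


# A Work whose workURL is still a single string, as in the current lists.
work = {"title": "Example Work", "workURL": "https://example.org/a"}
work["workURL"] = as_url_list(work["workURL"])
```

After this step every entity exposes its URLs in the same shape, whether it started with one URL or several.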
I will probably not include a small example graph for this transition as I did for V0.1.
Instead, I will work on updating the Catalogue’s Knowledge Graph (KG) data pipeline so that it populates the KG with Works collected from Media Sources and formats the collected data to comply with the Graph Data Model V0.2.
I am seeking feedback on this pub for any improvements to make, errors to correct, or other areas to explore.
Please leave your feedback here, on the Ledgerback discussion forum, or on Twitter.