Controlling Access to IPKGs

Granting the proper access to knowledge to the right roles

Dan McCreary
Dec 11, 2022 · 6 min read
Figure 1: A high-level model of how roles are linked, through permission assignments, to a graph or an individual vertex of a PKG. Image by the author.

This is the fourth blog in a series on integrating Personal Knowledge Graphs (PKGs) in large organizations. Integrated PKGs (IPKGs) are concerned with how siloed PKGs can be merged with shared knowledge graphs. Prior blogs focused on PKG enterprise integration concepts and the steps needed to justify large integration projects. This blog focuses on strategies for controlling access to PKGs.

Access control in a single person’s PKG is simple. You have full rights to everything in your PKG; by default, no one else can see anything. This strategy makes the security design of most PKGs simple. Your PKG can easily link to external public knowledge and will be internally consistent.

The audience for your PKG is just yourself. You may not care about formatting rules, spelling checks, or consistency with organizational standards. As you begin to share your knowledge with others, you may become more concerned with spelling, formatting, and consistency with other knowledge.

Before we dive into the security architecture, we need to define a few terms.

Basic Graph Security Terminology

Let’s start out with some basic terms and then progress to the more complex concepts.

  • Each individual person creates one or more PKGs. Users always have full access to their own personal knowledge graphs (PKGs).
  • PKGs consist of a set of named concepts represented as documents with distinct names and, optionally, a type. Each concept is called a vertex.
  • PKGs may not have two concepts with exactly the same name. If you enter a duplicate concept, it can be renamed or merged with the existing concept.
  • The entirety of all the personal knowledge graphs is called the enterprise graph.
  • Any part of the enterprise graph is called a subgraph.
  • The public graph is a portion of the enterprise graph that all users can read. All users can always link to concepts in the public graph.
  • Each user in an organization will be associated with one or more access roles in your organization.
  • Access roles have access types such as read, rename, move, update, and delete.
  • Users may also be granted access to subsections of the enterprise graph based on their role. This access is called role-based access control or RBAC.
  • Granting access to any resources for a role is called authorization.
  • Verifying that a user is who they say they are is called authentication. Most PKGs use either single-factor (password) or multi-factor authentication.
  • The default access is the permissions that are applied when a user creates a new vertex in a graph. The default access may be different based on the location of a vertex in a subgraph or business rules.
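To make these terms concrete, here is a minimal sketch of how they might fit together in code. Everything in it is an assumption made for illustration: the class name EnterpriseGraph, the "pkg:<user>" naming convention for personal subgraphs, and the PUBLIC constant are hypothetical and not taken from any PKG product.

```python
from dataclasses import dataclass, field
from enum import Enum


class Access(Enum):
    READ = "read"
    RENAME = "rename"
    MOVE = "move"
    UPDATE = "update"
    DELETE = "delete"


PUBLIC = "public"  # hypothetical name for the public graph


@dataclass
class EnterpriseGraph:
    vertex_subgraph: dict = field(default_factory=dict)  # vertex name -> subgraph
    grants: dict = field(default_factory=dict)           # (role, subgraph) -> {Access}
    user_roles: dict = field(default_factory=dict)       # user -> {role}

    def add_vertex(self, name, subgraph):
        # Concept names are unique; a duplicate must be renamed or merged.
        if name in self.vertex_subgraph:
            raise ValueError(f"duplicate concept: {name!r}")
        self.vertex_subgraph[name] = subgraph

    def grant(self, role, subgraph, *access):
        # Authorization: attach access types to a role, never to a user.
        self.grants.setdefault((role, subgraph), set()).update(access)

    def is_allowed(self, user, access, vertex):
        subgraph = self.vertex_subgraph[vertex]
        if subgraph == f"pkg:{user}":
            return True  # full rights in your own PKG (assumed naming scheme)
        if subgraph == PUBLIC and access is Access.READ:
            return True  # every user can read the public graph
        roles = self.user_roles.get(user, set())
        return any(access in self.grants.get((role, subgraph), set())
                   for role in roles)


graph = EnterpriseGraph()
graph.user_roles["alice"] = {"ontologist"}
graph.user_roles["bob"] = {"reader"}
graph.add_vertex("Customer", "finance")
graph.grant("ontologist", "finance", Access.READ, Access.RENAME, Access.UPDATE)
graph.grant("reader", "finance", Access.READ)
print(graph.is_allowed("bob", Access.RENAME, "Customer"))  # False
```

Note that the permission check never looks at the user directly once it leaves the personal and public cases. It only asks which roles the user holds, which is the core idea of RBAC.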

User Authentication

An enterprise will store a user’s login credentials and policy in a centralized database. Most graph products use a protocol called LDAP to verify the user’s credentials; upon validation, the directory returns the list of the user’s roles to the application that logs them in. Because authentication processes are common to most databases, we don’t spend much time on this topic here. Our focus is on granting a person access to the correct part of the enterprise graph.
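The shape of that handshake (check the credentials, then hand back roles) can be sketched without a real directory server. The Directory class below is a hypothetical in-memory stand-in, not an LDAP client; only the hashlib, hmac, and os calls are real standard-library functions.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    # Salted, slow hash so that stored credentials are not plain text.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class Directory:
    """Hypothetical stand-in for a central credential store."""

    def __init__(self):
        self._users = {}  # user -> (salt, password digest, roles)

    def add_user(self, user, password, roles):
        salt = os.urandom(16)
        self._users[user] = (salt, hash_password(password, salt), frozenset(roles))

    def login(self, user, password):
        """Return the user's roles on success, or None if authentication fails."""
        record = self._users.get(user)
        if record is None:
            return None
        salt, digest, roles = record
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(digest, hash_password(password, salt)):
            return None
        return roles


directory = Directory()
directory.add_user("alice", "s3cret", ["ontologist", "Team47"])
roles = directory.login("alice", "s3cret")  # the application keeps only the roles
```

The application that logs the user in never needs the password again. From this point on, every access decision is made against the returned roles.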

Creation of Shared Subgraphs

Users may create a separate graph in their personal space and then grant access rights to individuals or roles within the organization. For example, you might create a subgraph for a specific project or team that works together. You can then grant full read and write permissions to other team members. This feature is easy to implement and available in most PKG products today, such as Roam Research.

The challenge is, what if some of your knowledge is in your personal space, and you want to share a read-only copy with a group of peers? You want them to be able to add relationships to your knowledge base but keep your foundational graph the same.
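One way to picture this is a read-only snapshot of your graph with a separate overlay that holds the relationships your peers add. The sketch below assumes edges are stored as (source, label, target) triples; the SharedView class and its method names are hypothetical.

```python
class SharedView:
    """Read-only snapshot of an owner's edges plus an overlay for peer additions."""

    def __init__(self, base_edges):
        self._base = frozenset(base_edges)  # the foundational graph, frozen
        self._overlay = set()               # relationships added by peers

    def add_relationship(self, source, label, target):
        # Peers may extend the graph; additions never touch the base.
        self._overlay.add((source, label, target))

    def remove_relationship(self, edge):
        if edge in self._base:
            raise PermissionError("the foundational graph is read-only")
        self._overlay.discard(edge)

    def edges(self):
        # Readers see the base and the overlay as one graph.
        return self._base | self._overlay


my_edges = {("Invoice", "is_a", "Document")}
view = SharedView(my_edges)
view.add_relationship("Invoice", "used_by", "Team47")
```

Because the overlay is kept apart from the base, you can discard or review your peers' additions without ever having to repair your own graph.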

The Ontology Management Problem

Figure 2: The ontology management problem. Many users must be able to change low-level ontologies, while changes to the upper levels are restricted to those who understand the impact these changes have on consumers of the ontology.

This type of problem comes up frequently in linked ontology design. Ontologies have multiple levels, such as upper ontology, middle ontology, and lower ontology. The higher up we go, the more lower-level components our changes might impact.

Many small changes are frequently made to the lowest levels of an ontology because a mistake there has a limited scope. Adding a new term, renaming a term, or fixing a typo in a name are examples of small changes. Changes to higher levels may ripple through many business processes that depend on consistent structures.

To get around change-control concerns, enterprise-scale ontology management tools must allow you to make changes in a controlled environment and then run consistent regression tests that simulate the graph queries that downstream consumers will run. Significant changes in the upper levels of an ontology require an entirely new version of the ontology, which lets downstream consumers hold off updating their systems until they have modified their business logic to accommodate these changes.
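A toy version of such a regression test might look like the following. It assumes the ontology is a simple child-to-parent mapping and that the saved consumer queries are ancestor lookups; the function names and the (major, minor) version scheme are illustrative assumptions, not features of any specific tool.

```python
def ancestors(ontology, term):
    """Walk parent links from a term up to the root."""
    chain = []
    seen = {term}
    parent = ontology.get(term)
    while parent is not None and parent not in seen:  # guard against cycles
        chain.append(parent)
        seen.add(parent)
        parent = ontology.get(parent)
    return chain


def regression_failures(current, candidate, saved_queries):
    """Return the saved queries whose answers differ between the two versions."""
    failures = []
    for term in saved_queries:
        before = ancestors(current, term)
        after = ancestors(candidate, term)
        if before != after:
            failures.append((term, before, after))
    return failures


def next_version(version, failures):
    # Assumed policy: a change that breaks a consumer query forces a new major version.
    major, minor = version
    return (major + 1, 0) if failures else (major, minor + 1)


current = {"Invoice": "Document", "Document": "Artifact"}
low_level_change = dict(current, Receipt="Document")                   # adds a term
upper_level_change = {"Invoice": "Document", "Document": "Asset"}      # renames an upper term

print(regression_failures(current, low_level_change, ["Invoice"]))    # []
print(regression_failures(current, upper_level_change, ["Invoice"]))
```

Adding a leaf term leaves every saved query unchanged, while renaming an upper-level term changes the answer for every term beneath it. That difference is the ripple effect described above.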

We introduced this example to demonstrate that personal note-taking is far from formal ontology management. The same knowledge graph infrastructure can support both processes, but having robust role-based access control is essential for high-stakes knowledge graph management.

Leveraging Named Groups

A better way to handle access to a subgraph is to grant access to a named group such as “Team47”. Then, as individuals join or leave that team, we do not need to keep changing our subgraph access permissions.

Because changing the membership of a group changes who can access the data, an approval process is often associated with adding new members to the group.
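Here is a small sketch of that indirection, assuming a simple request-and-approve workflow. The NamedGroup class, the can_read helper, and the "project-notes" subgraph name are hypothetical.

```python
class NamedGroup:
    def __init__(self, name, approvers):
        self.name = name
        self.approvers = set(approvers)  # people allowed to admit new members
        self.members = set()
        self.pending = set()

    def request_membership(self, user):
        self.pending.add(user)

    def approve(self, approver, user):
        if approver not in self.approvers:
            raise PermissionError(f"{approver} cannot approve members of {self.name}")
        if user not in self.pending:
            raise KeyError(f"no pending request from {user}")
        self.pending.discard(user)
        self.members.add(user)

    def remove(self, user):
        self.members.discard(user)


def can_read(user, subgraph, groups, grants):
    # Grants name groups, not people, so they survive staffing changes.
    return any(user in groups[name].members for name in grants.get(subgraph, ()))


team = NamedGroup("Team47", approvers={"carol"})
groups = {"Team47": team}
grants = {"project-notes": {"Team47"}}  # set once, never edited per person

team.request_membership("dave")
team.approve("carol", "dave")
print(can_read("dave", "project-notes", groups, grants))  # True
```

When someone leaves the team, a single call to remove() revokes their access to every subgraph the group can reach, and the grants table itself stays untouched.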

Until now, the authorization problem has been similar to other database systems. What gets more complicated is when there are high-stakes vertices that need careful version control and testing.

Merging Challenges

A typical scenario in PKG to EKG workflows is when informal note-taking evolves into formal knowledge bases designed to be used by a larger audience. A series of “Merge Conflicts” will naturally arise when this happens. If you simply copy and paste your page content into a shared graph, the links that used to work within your PKG may stop working. Although the links were consistent in your original PKG, a shared graph may not have access to the private concepts in your PKG.

After a user does the copy/paste, the editing tools might underline the broken links and suggest related links. If no matches are found, the user then has two options:

  1. Remove the link
  2. Copy the link destination page from your own personal graph into the shared graph

The latter option results in another problem. What if the new page you copied also has a set of broken links?
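The cascade can be made visible before anything is copied. The sketch below assumes pages are plain text with double-bracket links such as [[Budget]], a convention common in note-taking tools; the function names are hypothetical.

```python
import re

LINK = re.compile(r"\[\[([^\[\]]+)\]\]")  # matches [[Page Name]]


def links(text):
    return LINK.findall(text)


def broken_links(page_text, shared_pages):
    """Links on a pasted page that the shared graph cannot resolve."""
    return [name for name in links(page_text) if name not in shared_pages]


def pages_to_copy(start, personal, shared):
    """Follow links transitively from `start`.

    Returns (needed, unresolved): the personal pages that would have to be
    copied so no link breaks, and the link targets that exist in neither graph.
    """
    needed, unresolved, seen = [], [], set()
    stack = [start]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        if name in shared and name != start:
            continue  # the link already resolves in the shared graph
        if name not in personal:
            unresolved.append(name)
            continue
        needed.append(name)
        stack.extend(links(personal[name]))
    return needed, unresolved


personal = {
    "Plan": "See [[Budget]] and [[Glossary]].",
    "Budget": "Based on [[Rates]].",
    "Rates": "No links here.",
}
shared = {"Glossary": "Shared definitions."}
print(broken_links(personal["Plan"], shared))  # ['Budget']
```

In this example, sharing the single page "Plan" drags "Budget" and then "Rates" along with it, which is why the second option can turn one copy into many.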

Moving knowledge from a private store to a shared subgraph is not trivial. It may be best to create the original document in the context of the shared group. Creating the new concept in a public subgraph avoids the broken links problem, but it adds a new challenge. What if your personal way of organizing knowledge diverges from how the group wants to organize knowledge?

Conclusion

The bottom line is that there are no easy answers to the private-to-public merge problem. Migration of knowledge between subgraphs with different access rules must always deal with the link consistency problem.

Role-based access control is a mature design pattern that scales well over a large enterprise. The key is to avoid assigning permissions to individual users. Tying users directly to any resource results in much higher maintenance costs as users move around the organization.
