Alexa: Create an outline for me on graph databases

Dan McCreary
2 min read · Apr 8, 2018
Example of a wiki-generated concept network for Transhumanism by Caleb Jones

Here is a good example of how to quickly create an outline for any topic in Wikipedia using simple graph analytics:

http://allthingsgraphed.com/2015/09/16/what-is-transhumanism-wikipedia/

I like the way the author (Caleb Jones from Disney) broke the process down into discrete steps:

  1. Point your HTTP client (a mini web crawler) at any Wikipedia page.
  2. Gather three levels of links, using a stop list to skip over reference pages. If your HTTP client supports XPath, the query is just $page//a[not(@href=$stoplist)]. Simple is good!
  3. Put the links into a graph and filter out the pages that have a low PageRank (a score driven largely by inbound links). PageRank might even be an out-of-the-box function in your graph library.
  4. Use a graph community detection algorithm or tool like Gephi to find “communities” of concepts.
  5. Tweak the clustering algorithm's settings until you get a reasonable number of communities (5–10 subtopics).
  6. Color-code the communities and add a label to each one (something that is still a mostly manual process today).
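
Steps 2 through 4 can be sketched with nothing but the Python standard library. This is a minimal illustration under stated assumptions, not Caleb Jones's implementation: the standard `html.parser` module stands in for an XPath-capable client, the stop list is matched by prefix rather than by exact value, a naive greedy modularity merge stands in for Gephi's community detection, and the function names, toy HTML, toy edges and the 0.05 cutoff are all made up for the example.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Step 2: collect <a href> targets, skipping stop-listed prefixes."""

    def __init__(self, stop_prefixes):
        super().__init__()
        self.stop_prefixes = tuple(stop_prefixes)
        self.links = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href and not href.startswith(self.stop_prefixes):
            self.links.append(href)


def extract_links(html, stop_prefixes):
    collector = LinkCollector(stop_prefixes)
    collector.feed(html)
    return collector.links


def pagerank(edges, damping=0.85, iterations=50):
    """Step 3: power-iteration PageRank over directed (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    targets_of = {n: [] for n in nodes}
    for src, dst in edges:
        targets_of[src].append(dst)
    rank = dict.fromkeys(nodes, 1.0 / len(nodes))
    for _ in range(iterations):
        # Rank sitting on pages with no outgoing links is shared by all pages.
        dangling = sum(rank[n] for n in nodes if not targets_of[n])
        base = (1 - damping + damping * dangling) / len(nodes)
        new_rank = dict.fromkeys(nodes, base)
        for src, targets in targets_of.items():
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


def greedy_communities(edges):
    """Step 4: merge communities for as long as modularity keeps improving."""
    pairs = {frozenset(e) for e in edges if e[0] != e[1]}  # undirected, deduplicated
    m = len(pairs)
    degree = Counter(n for pair in pairs for n in pair)
    communities = [{n} for n in sorted(degree)]
    while len(communities) > 1:
        best_gain, best_pair = 0.0, None
        for i, left in enumerate(communities):
            for j in range(i + 1, len(communities)):
                right = communities[j]
                between = sum(1 for pair in pairs
                              if len(pair & left) == 1 and len(pair & right) == 1)
                d_left = sum(degree[n] for n in left)
                d_right = sum(degree[n] for n in right)
                # Modularity change if these two communities were merged.
                gain = between / m - d_left * d_right / (2 * m * m)
                if gain > best_gain:
                    best_gain, best_pair = gain, (i, j)
        if best_pair is None:
            break  # no merge improves modularity any further
        i, j = best_pair
        merged = communities.pop(j)
        communities[i] |= merged
    return communities


# Toy data standing in for a real crawl (hypothetical, for illustration only).
html = ('<a href="/wiki/Graph_database">Graph database</a>'
        '<a href="#cite_note-1">[1]</a>'
        '<a href="/wiki/Special:Random">Random</a>')
links = extract_links(html, stop_prefixes=["#", "/wiki/Special:"])

edges = [("a", "b"), ("b", "c"), ("c", "a"), ("x", "y"), ("y", "z"),
         ("z", "x"), ("c", "x"), ("stub", "a")]
rank = pagerank(edges)
keep = {n for n, score in rank.items() if score >= 0.05}  # arbitrary cutoff
kept_edges = [(s, t) for s, t in edges if s in keep and t in keep]
communities = greedy_communities(kept_edges)
print(links, sorted(sorted(c) for c in communities))
```

On this toy input the two stop-listed links are dropped, the page with no inbound links ("stub") falls below the cutoff, and the two triangles joined by a single bridge come out as two separate communities. The pairwise merge loop is far too slow for a real three-level crawl; a graph library or Gephi would be the practical choice there.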

The list of “labeled graph communities” provides a first-level outline for your topic. You can repeat the steps above for each community to get a second-level outline. Not all of these steps are fully automated yet, but I think they could be streamlined by breaking them down into a series of REST services. This is an excellent example of how to quickly build a concept map that gives you a broad overview of a new topic and shows how that topic relates to other concepts. You can do this today on your own laptop or desktop, without a team of AI/Deep Learning/NLP researchers and a rack of GPUs. Let’s not make this more difficult than it is.
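
The "repeat the steps for each community" idea is essentially a recursion, which can be sketched as follows. Everything here is an assumption for illustration: `build_outline` and `find_subtopics` are hypothetical names, and the `toy_results` table is hand-written to stand in for the labeled communities that the crawl, rank and cluster steps would return for each topic.

```python
def build_outline(topic, find_subtopics, depth=2, level=0):
    """Return indented outline lines, re-running the analysis per subtopic.

    find_subtopics is any callable that maps a topic to a list of
    community labels (a stand-in for steps 1 through 6).
    """
    lines = ["  " * level + topic]
    if depth > 0:
        for subtopic in find_subtopics(topic):
            lines.extend(build_outline(subtopic, find_subtopics, depth - 1, level + 1))
    return lines


# Hand-written stand-in for real community labels (not pipeline output).
toy_results = {
    "Graph databases": ["Data models", "Query languages"],
    "Data models": ["Property graph", "RDF triple store"],
    "Query languages": ["Cypher", "SPARQL", "Gremlin"],
}

outline = build_outline("Graph databases", lambda topic: toy_results.get(topic, []))
print("\n".join(outline))
```

Passing the analysis in as a callable keeps the outline logic separate from how communities are found, which is also the shape a REST-based version would likely take: the callable would simply wrap a service call.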

My hope (prediction?) is that in a few years every database, search engine, word processor, smart speaker and ontology editor will have a “plug-in” that quickly suggests related concepts from these concept graphs. This should be just another variation of the MarkLogic suggest function.
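
To make the idea of a concept-graph "suggest" concrete, here is a small sketch of what such a plug-in might compute. It is not the MarkLogic API or any product's API: `suggest_related` is a hypothetical function, the scoring rule (rank a concept's neighbors by how many neighbors they share with it) is just one simple choice, and the edge list is hand-written toy data.

```python
from collections import defaultdict


def suggest_related(edges, concept, limit=3):
    """Suggest neighbors of a concept, most tightly connected first."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    own = neighbors.get(concept, set())
    # Score each neighbor by how many of the concept's neighbors it also touches.
    scored = [(len(own & neighbors[candidate]), candidate) for candidate in own]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # high score first, then by name
    return [candidate for _, candidate in scored[:limit]]


# Hand-written toy concept graph (illustrative only).
concept_edges = [
    ("graph database", "Cypher"),
    ("graph database", "property graph"),
    ("graph database", "Neo4j"),
    ("graph database", "RDF"),
    ("Neo4j", "Cypher"),
    ("Neo4j", "property graph"),
    ("RDF", "SPARQL"),
]

suggestions = suggest_related(concept_edges, "graph database")
print(suggestions)
```

An unknown concept simply yields an empty list, so a side-panel widget could call this on every keystroke without special-casing misses.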

One product I am using (Smartlogic’s Semaphore Ontology Editor) already has an API for a side panel widget for adding these “suggestions” [disclaimer: my wife works there]. These real-time suggestions can have a positive impact on the productivity of anyone building taxonomies and ontologies.

So if I am writing a white paper about graph databases (a real use case), my word processor (or my presentation tool) should be able to suggest an initial outline for me. This should be as easy as saying: “Alexa, write me a white paper outline on graph databases”.

