Making the most of your Connections

Chandrasekhar S
3 min read · Aug 24, 2020
Photo by Clarisse Croset on Unsplash

Dr. Warren Weaver (July 17, 1894 - November 24, 1978) was one of the pioneers of machine translation. In 1947, he first formulated the idea of using digital computers to translate documents from one natural human language to another. This was before most people had imagined what computers could do!

Dr. Weaver was the first to propose three stages of development in scientific thought: Simplicity, Disorganized Complexity, and Organized Complexity.

The 17th, 18th, and 19th centuries dealt with problems of simplicity: problems in which two variables are related to each other, say mass and acceleration. Scientists of that time made huge progress on these problems.

During the first half of the 20th century, it became evident that it is not just two or three elements that need to be connected; many variables in the universe are interdependent. Scientists and mathematicians tried to find connections between these variables, but the connections were disorganized and mostly random. These were the problems of disorganized complexity. In the current age of massive computing power, some of these problems can be reduced to simple ones.

The third category is Organized Complexity, where we deal with problems like Black Swans, pandemics, and market crashes (highly improbable and unpredictable events that have a massive impact), all of which involve many interconnected variables.

Knowledge representation is vital in solving problems of the second and third categories. Knowledge representation, in other words visualizing the complexity and the relationships between variables, has evolved over time.

When we look back on the history of knowledge representation, the earliest ways of expressing relationships were hierarchical: a top-down approach with God at the top. The Great Chain of Being is a hierarchical structure of all matter and life; the chain starts with God and progresses downward to angels, humans, animals, plants, and minerals.[1]

This idea of hierarchical representation transformed into a ‘Tree’ representation over time. Great examples of Knowledge Trees include the Tree of Science by Ramon Llull [2], in which each science is represented by a tree with roots, trunk, branches, leaves, and fruits.

Today, as we deal with the problems of Organized Complexity, knowledge representation needs to be more ‘Rhizomatic’ [3] than ‘Tree’-like: a network with many entry points and no single root, rather than a fixed hierarchy.

Humans are constantly trying to learn the unknown. How do we learn something new? We do it by connecting the new thing to something we have seen in the past or experienced in the same context. This led to the representation of knowledge as a Graph, which is simply the mathematical representation of things that are connected: the entities are called nodes, and the relationships between them are called links or edges.
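
To make the idea of nodes and edges concrete, here is a minimal sketch in plain Python. The names, the relationship labels, and the `build_graph` helper are all made up for illustration; a real project would more likely use a graph database or a graph library.

```python
from collections import defaultdict

# Hypothetical example data: (entity, relationship, entity) triples.
edges = [
    ("Alice", "KNOWS", "Bob"),
    ("Bob", "KNOWS", "Carol"),
    ("Alice", "WORKS_AT", "Acme"),
    ("Carol", "WORKS_AT", "Acme"),
]


def build_graph(triples):
    """Build an undirected adjacency map: node -> set of (relationship, neighbour)."""
    graph = defaultdict(set)
    for source, relation, target in triples:
        graph[source].add((relation, target))
        graph[target].add((relation, source))
    return graph


graph = build_graph(edges)

# Everything directly connected to Alice, regardless of relationship type.
neighbours_of_alice = sorted(target for _, target in graph["Alice"])
print(neighbours_of_alice)
```

Each entity appears once as a node, and everything we know about it is reachable by following its edges, which is the property the rest of this article relies on.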

How does the history of knowledge representation translate to the technology of modern times?

As machines start augmenting our decision-making process, we need machines that can make decisions with the context in mind. Providing machines with data in its connected context brings efficiency to the learning and thus to the predictions. Feeding machines knowledge in a connected context is where Graphs become relevant. Graphs give machine learning the contextual information it was missing to deal with ambiguity, such as relationships and adjacent knowledge.

Relationships are highly predictive, yet we tend to ignore this critical factor. In machine learning terms, we need to start looking at ‘Connected Features’, meaning features that represent relationships. When you are presented with data that is naturally networked, from a feature extraction perspective you can say “I need to run machine learning on the most connected nodes”, which is easier than working with tables and indexes. Graphs also help to create new features: you can find the most influential nodes in the network, score them, extract them, and feed them into the machine learning pipeline. Feature selection is another process where Graphs are powerful, for example by reducing the number of features used in the model.
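
As a rough illustration of ‘Connected Features’, the sketch below computes two of them for every node: the number of direct connections (degree) and an influence score. The edge list is invented, and the influence score is a simplified PageRank written by hand as one possible choice, not a method prescribed here; a graph library would normally provide a tested implementation.

```python
from collections import defaultdict

# Hypothetical undirected edge list, e.g. people who interact with each other.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]


def adjacency(edge_list):
    """Map each node to the set of nodes it is directly connected to."""
    adj = defaultdict(set)
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def degree_feature(adj):
    """Connected feature 1: number of direct connections per node."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def pagerank_feature(adj, damping=0.85, iterations=50):
    """Connected feature 2: influence score via power iteration (simplified PageRank).

    Assumes every node has at least one edge, which holds for any edge list.
    """
    n = len(adj)
    rank = {node: 1.0 / n for node in adj}
    for _ in range(iterations):
        new_rank = {}
        for node in adj:
            # Each neighbour shares its current score equally among its own neighbours.
            incoming = sum(rank[nb] / len(adj[nb]) for nb in adj[node])
            new_rank[node] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


adj = adjacency(edges)
degree = degree_feature(adj)
influence = pagerank_feature(adj)

# One feature row per node, ready to be joined to the rest of the model's inputs.
features = {
    node: {"degree": degree[node], "influence": influence[node]} for node in adj
}
most_connected = max(degree, key=degree.get)  # "a", which has three connections
```

The resulting `features` dictionary can be treated like any other set of columns, so the graph-derived scores sit next to the conventional features in the pipeline.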

Today we tend to take the data stored in tables and pump it into the machine learning pipeline. In the process, we eliminate or abstract away the most powerful element of the data: the relationships. Start making use of the relationships that are most influential and predictive.
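
The relationships are often already sitting in the table; they just are not visible row by row. The sketch below assumes a hypothetical transaction table (the column names and values are invented) and recovers one relationship from it, customers who share a merchant, then writes it back as an extra column.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical transaction table: one row per (customer, merchant) purchase.
rows = [
    {"customer": "c1", "merchant": "m1", "amount": 20.0},
    {"customer": "c2", "merchant": "m1", "amount": 35.0},
    {"customer": "c3", "merchant": "m1", "amount": 15.0},
    {"customer": "c3", "merchant": "m2", "amount": 50.0},
    {"customer": "c4", "merchant": "m2", "amount": 10.0},
]

# Group customers by the merchant they share; this is the relationship
# that a row-by-row view of the table hides.
customers_by_merchant = defaultdict(set)
for row in rows:
    customers_by_merchant[row["merchant"]].add(row["customer"])

# Link every pair of customers who used the same merchant.
linked = defaultdict(set)
for customers in customers_by_merchant.values():
    for first, second in combinations(sorted(customers), 2):
        linked[first].add(second)
        linked[second].add(first)

# Append the connected feature to a copy of each original row.
enriched = [
    dict(row, linked_customers=len(linked[row["customer"]])) for row in rows
]
```

Here customer c3 ends up linked to three other customers because c3 bridges both merchants, a fact that no single row of the original table states.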

The next time you start thinking about storing data, think beyond storage and security.

When everything is connected to everything else, for better or worse, everything matters. - Bruce Mau
