Data Schema of Relational Table Importing to Graph Database – Execution
In the earlier article titled Data Schema of Relational Table Importing to Graph Database – Dimensionality, we explored how to observe real-world data and transform it into a digitised format within a Relational Database. The greater the dimensions we account for, the higher the fidelity with which the Relational Database represents reality.
In this article, we transform the data stored in the Relational Database into the Graph Database.
Of course, there are many ways to input data directly into a Graph Database. However, there are still day-to-day scenarios in which we should first input data into a Relational Database and then transform it into a Graph Database:
Relational Databases are much more popular than Graph Databases. Most systems in most companies store their data in a Relational Database or in tabular format.
Although we can use Cypher to input data directly into a Graph Database, users face a long learning curve before they master this new language. Providing a familiar form-style input interface for their CRUD activities encourages users to engage with the system.
The next question you may ask is: why do we need a Graph Database instead of a Relational Database?
To answer this question, I will write in a trio of (1) Relational Database Problem Pattern, (2) Graph Database Solution and (3) Graph Database Data Schema, so that the three pieces of information stay interrelated, even though the content may not be mapped explicitly to the titles (1), (2) and (3).
Table Citizen
Citizen# | First Name | Last Name |
---|---|---|
1001 | Barbie | Stereotypical |
1002 | Barbie | Weird |
1003 | Kenneth | Carson |
While table Citizen is a typical relational data table that records the information (i.e. the 3 properties) of the 3 people (i.e. 3 records) perfectly well, can you imagine how to record the relationships among the records inside the same table?
For example, what if I want to record the facts:
Maybe you have thought of appending new columns is...of, Target and Date at the end of the table Citizen as below:
Citizen# | First Name | Last Name | is….of | Target | Date |
---|---|---|---|---|---|
1001 | Barbie | Stereotypical | Girl friend | Kenneth | Feb |
1002 | Barbie | Weird | Girl friend | Kenneth | April |
1003 | Kenneth | Carson | Boy friend | Stereotypical | Feb |
You will immediately realize that you cannot record both fact #1 and fact #2 at the same time. If you record #1 in Feb and modify it to #2 in March, you lose the historical record of their relationship.
On top of that, you also need to modify both record #1001 and record #1003 at the same time after they break up, because both records describe the same fact from different directions (i.e. Stereotypical is the girlfriend of Kenneth, and Kenneth is the boyfriend of Stereotypical). In a relational database this is called an update anomaly.
Besides, we cannot record both fact #3 and fact #4 at the same time because only one value can be recorded in the is...of column.
It seems that a normal table cannot handle facts describing relationships among different records inside the same table. This data pattern is known as a Recursive Relationship.
To remedy this shortcoming, a new table, classified as a Bridge Table, has to be created. Here the Bridge Table Love-Relationship records the recursive relationships among the records in the same table Citizen as below:
Love-Relationship# | Date | Subject | is….of | Object |
---|---|---|---|---|
2001 | Feb | Stereotypical Barbie | Girl friend | Kenneth |
2002 | Feb | Kenneth | Boy friend | Stereotypical Barbie |
2003 | March | Stereotypical Barbie | Ex-Girl Friend | Kenneth |
2004 | March | Kenneth | Ex-Boy friend | Stereotypical Barbie |
2005 | April | Weird Barbie | Girl friend | Kenneth |
2006 | April | Kenneth | Boy friend | Weird Barbie |
2007 | May | Stereotypical Barbie | Girl friend | Kenneth |
2008 | May | Kenneth | Boy friend | Stereotypical Barbie |
2009 | May | Stereotypical Barbie | Rival | Weird Barbie |
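The Bridge Table above can be tried out with a minimal Python sketch using the standard `sqlite3` module. The table and column names (`love_relationship`, `month`, `subject`, `is_of`, `object`) and the helper `history` are illustrative assumptions, not part of the original schema; the rows are copied from the table above.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE love_relationship (
        relationship_id INTEGER PRIMARY KEY,
        month   TEXT NOT NULL,
        subject TEXT NOT NULL,
        is_of   TEXT NOT NULL,
        object  TEXT NOT NULL
    )
""")

# Rows copied from the Love-Relationship Bridge Table.
conn.executemany("INSERT INTO love_relationship VALUES (?, ?, ?, ?, ?)", [
    (2001, "Feb",   "Stereotypical Barbie", "Girl friend",    "Kenneth"),
    (2002, "Feb",   "Kenneth",              "Boy friend",     "Stereotypical Barbie"),
    (2003, "March", "Stereotypical Barbie", "Ex-Girl Friend", "Kenneth"),
    (2004, "March", "Kenneth",              "Ex-Boy friend",  "Stereotypical Barbie"),
    (2005, "April", "Weird Barbie",         "Girl friend",    "Kenneth"),
    (2006, "April", "Kenneth",              "Boy friend",     "Weird Barbie"),
    (2007, "May",   "Stereotypical Barbie", "Girl friend",    "Kenneth"),
    (2008, "May",   "Kenneth",              "Boy friend",     "Stereotypical Barbie"),
    (2009, "May",   "Stereotypical Barbie", "Rival",          "Weird Barbie"),
])


def history(subject):
    """Return every relationship recorded for one subject, oldest first."""
    return conn.execute(
        "SELECT month, is_of, object FROM love_relationship "
        "WHERE subject = ? ORDER BY relationship_id",
        (subject,),
    ).fetchall()


for month, is_of, obj in history("Stereotypical Barbie"):
    print(f"{month}: {is_of} of {obj}")
```

Because each fact is its own row, the February and March facts coexist and the history is no longer overwritten. The price is that every name now appears many times in the Bridge Table.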
In Figure 1, which is built with a Graph Database, you can clearly and easily address all the facts #1, #2, #3 and #4 mentioned previously. All fidelity is preserved in the Graph Database.
You may notice that some of the relationships inside the Graph Database look duplicated or redundant. For example, there is no need to record both directions of the Rival relationship between the 2 Barbies. This is what we call Direction in a Graph Database. It is not our focus in this article, so we leave the discussion to a later section.
Below is the Graph Database Data Schema in Cypher, which creates the graph shown in Figure 1:
// Figure 1 - Create nodes with Label: Citizen
// firstName and lastName follow the columns of the Citizen table
CREATE (:Citizen {firstName: "Barbie", lastName: "Stereotypical"});
CREATE (:Citizen {firstName: "Barbie", lastName: "Weird"});
CREATE (:Citizen {firstName: "Kenneth", lastName: "Carson"});
// Relationships for February
MATCH (stereotypical:Citizen {firstName: "Barbie", lastName: "Stereotypical"}),
(kenneth:Citizen {firstName: "Kenneth", lastName: "Carson"})
CREATE (stereotypical)-[:Girlfriend_of {month: "Feb"}]->(kenneth),
(kenneth)-[:Boyfriend_of {month: "Feb"}]->(stereotypical);
// Relationships for March
MATCH (stereotypical:Citizen {firstName: "Barbie", lastName: "Stereotypical"}),
(kenneth:Citizen {firstName: "Kenneth", lastName: "Carson"})
CREATE (stereotypical)-[:Ex_Girlfriend_of {month: "March"}]->(kenneth),
(kenneth)-[:Ex_Boyfriend_of {month: "March"}]->(stereotypical);
// Relationships for April
MATCH (weird:Citizen {firstName: "Barbie", lastName: "Weird"}),
(kenneth:Citizen {firstName: "Kenneth", lastName: "Carson"})
CREATE (weird)-[:Girlfriend_of {month: "April"}]->(kenneth),
(kenneth)-[:Boyfriend_of {month: "April"}]->(weird);
// Relationships for May
MATCH (stereotypical:Citizen {firstName: "Barbie", lastName: "Stereotypical"}),
(weird:Citizen {firstName: "Barbie", lastName: "Weird"}),
(kenneth:Citizen {firstName: "Kenneth", lastName: "Carson"})
CREATE (stereotypical)-[:Girlfriend_of {month: "May"}]->(kenneth),
(kenneth)-[:Boyfriend_of {month: "May"}]->(stereotypical),
(stereotypical)-[:Rival_of {month: "May"}]->(weird);
The comparison of the dimensionality of the data schema between the Relational Database and the Graph Database is as below:
Dimension | Relational Database Data Schema | Graph Database Data Schema |
---|---|---|
1-D | Attribute (Column) | Node Properties |
2-D | Records (Row) | Node |
3-D | Table | Label |
4-D | Bridge Table | Type (i.e. Edge) |
5-D | Attribute in Bridge Table | Type Properties |
The Graph Database caters well for the recursive relationships between different records inside the same Table in a Relational Database.
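To make the mapping in the table concrete, here is a minimal Python sketch that turns one relational row into a Cypher `CREATE` statement and one Bridge Table row into a relationship. The helper names (`cypher_literal`, `node_statement`, `edge_statement`) and the `citizenId` property are hypothetical. Quoting values with `json.dumps` is a simplifying assumption that is only adequate for plain values like the ones here; a production import would more likely pass query parameters through a driver.

```python
import json


def cypher_literal(value):
    """Render a simple Python value (str or int) as a Cypher literal."""
    return json.dumps(value)


def node_statement(label, row):
    """Table -> Label, Row -> Node, Column -> Node Property."""
    props = ", ".join(f"{key}: {cypher_literal(val)}" for key, val in row.items())
    return f"CREATE (:{label} {{{props}}});"


def edge_statement(label, key, subject, rel_type, obj, props):
    """Bridge Table row -> Type (edge); its attributes -> Type Properties."""
    prop_text = ", ".join(f"{k}: {cypher_literal(v)}" for k, v in props.items())
    return (
        f"MATCH (a:{label} {{{key}: {cypher_literal(subject)}}}), "
        f"(b:{label} {{{key}: {cypher_literal(obj)}}}) "
        f"CREATE (a)-[:{rel_type} {{{prop_text}}}]->(b);"
    )


# One Citizen record and one Love-Relationship record from the tables above.
citizen_row = {"citizenId": 1001, "firstName": "Barbie", "lastName": "Stereotypical"}
print(node_statement("Citizen", citizen_row))
print(edge_statement("Citizen", "citizenId", 1001, "Girlfriend_of", 1003, {"month": "Feb"}))
```

Matching the two end nodes on a key such as `citizenId` rather than on a name is a design choice of this sketch: it keeps each person as a single Node even when two citizens share a first name.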
In fact, when we created the new Bridge Table (the Love-Relationship Table in this case), the names Stereotypical Barbie, Weird Barbie and Kenneth Carson each showed up more than once inside the Bridge Table, and they also duplicate the records inside the Citizen Table (e.g. you can find Stereotypical Barbie in both the Love-Relationship Table and the Citizen Table).
This data duplication makes the description of reality lose fidelity: while the database records Stereotypical Barbie (and every other person) more than once, in reality there is only one Stereotypical Barbie. There is a discrepancy between the records (i.e. the Model) and reality.
The Graph Database, on the contrary, records Stereotypical Barbie once, which precisely describes the fact that there is one and only one Stereotypical Barbie in reality.
Recall from the previous section that a Recursive Relationship (or self-referential relationship) is a relationship between two (or more) records inside the same relational Table.
If a Recursive Relationship describes the vertical dimension of a relationship (i.e. whenever you add a new record to a table, the table extends vertically), a Functional Dependency, on the contrary, describes the horizontal dimension (i.e. whenever you add a new column, or attribute, to a Table, the Table extends horizontally).
A Functional Dependency means that a specific column (attribute) in a table depends on another column (attribute) in the same Table.
Let’s illustrate the concept of Functional Dependency with the example table Citizen below:
Citizen# | First Name | Last Name | Gender |
---|---|---|---|
1001 | Barbie | Stereotypical | F |
1002 | Barbie | Weird | F |
1003 | Kenneth | Carson | M |
With common sense, we can infer from the attribute First Name that Barbie should be female, while Kenneth should be male. We can say that the attribute Gender is dependent on the attribute First Name (regardless of the Last Name, of course).
This kind of dependency is called a Functional Dependency.
Thanks to the evolution of SQL support, two widely used Relational Databases, MySQL and MariaDB, enforce the SQL keyword CHECK from versions 8.0.16 and 10.2.1 respectively. By using CHECK, we can apply the functional dependency by adding a CONSTRAINT to the SQL statement as in Figure 2 below:
-- Figure 2 - Create Relational Table and associated Constraints
CREATE TABLE Citizen (
-- Citizen# is written as Citizen_ID because # starts a comment in MySQL
Citizen_ID INT PRIMARY KEY,
First_Name VARCHAR(50),
Last_Name VARCHAR(50),
Gender CHAR(1),
-- Each rule reads: if First_Name is X, then Gender must be Y
CONSTRAINT chk_gender_kenneth CHECK (First_Name <> 'Kenneth' OR Gender = 'M'),
CONSTRAINT chk_gender_barbie CHECK (First_Name <> 'Barbie' OR Gender = 'F')
);
Support for the SQL keyword CHECK is a big step forward in the relational database world and makes our coding life much easier, until you realize that you have to hard-code the constraints (i.e. the rules) into the SQL.
What if there are 10,000 known first names in the world and I want to turn them all into constraints?
Obviously it is extremely hard, if not impossible, for any programmer to hard-code that many constraints into the SQL statement, not to mention that 10,000 additional CONSTRAINT clauses would have to be evaluated on every insert and update, which is likely to drag down write performance.
Moreover, whenever an end-user of the system discovers a new first name and wants to add it to the CONSTRAINT fleet, there is no way for them to insert the new CONSTRAINT, as you cannot expect an end-user to write the SQL statement themselves. The extensibility of the system is poor.
To address the extensibility problem, how about creating an additional lookup Table FirstNameGenderRule to store all the rules as below:
Rules# | First Name | Gender |
---|---|---|
3001 | Barbie | F |
3002 | Baribe | F |
3003 | Kenneth | M |
With this design, every time before a new record is inserted into the Table Citizen, a lookup against the Table FirstNameGenderRule is triggered to validate the Gender value of the record. Whenever a new First Name is found, the end-user can add a new rule by appending a record to the FirstNameGenderRule Table via the user Form.
While this method makes perfect sense, serves the functional dependency and solves the extensibility problem, the FirstNameGenderRule Table is similar in nature to the Bridge Table mentioned previously in this article: data redundancy happens again, because the values of both First Name and Gender are stored twice, once in Table Citizen and once in FirstNameGenderRule.
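The lookup-table approach can be sketched with Python's standard `sqlite3` module, using a trigger as the "constraint lookup". This is only one possible implementation and swaps SQLite in for MySQL or MariaDB to keep the example self-contained. The names `first_name_gender_rule`, `citizen_gender_check`, `try_insert` and `add_rule` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE first_name_gender_rule (
    rule_id    INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL UNIQUE,
    gender     TEXT NOT NULL
);
CREATE TABLE citizen (
    citizen_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    gender     TEXT NOT NULL
);
-- Reject a new citizen whose gender contradicts an existing rule.
CREATE TRIGGER citizen_gender_check
BEFORE INSERT ON citizen
BEGIN
    SELECT RAISE(ABORT, 'gender does not match the rule for this first name')
    WHERE EXISTS (
        SELECT 1 FROM first_name_gender_rule AS r
        WHERE r.first_name = NEW.first_name AND r.gender <> NEW.gender
    );
END;
""")


def add_rule(first_name, gender):
    """What the end-user's Form would do: append one rule, no SQL knowledge needed."""
    conn.execute(
        "INSERT INTO first_name_gender_rule (first_name, gender) VALUES (?, ?)",
        (first_name, gender),
    )


def try_insert(row):
    """Insert a citizen; return False if the trigger rejects the row."""
    try:
        conn.execute("INSERT INTO citizen VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.DatabaseError:
        return False


add_rule("Barbie", "F")
add_rule("Kenneth", "M")

accepted = try_insert((1001, "Barbie", "Stereotypical", "F"))  # consistent with the rule
rejected = try_insert((1003, "Kenneth", "Carson", "F"))        # contradicts the rule

# A newly discovered first name becomes a rule at runtime, without touching the schema.
add_rule("Ken", "M")
```

The rules live in data rather than in the schema, so extending them is an ordinary insert. As noted above, the First Name and Gender values are still stored in both tables.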
Meanwhile, in the Graph Database, in order to serve the functional dependency and to solve the extensibility problem of the system at the same time, we come up with the Cypher solution in Figure 3 below:
// Figure 3 - Create Rule Node
CREATE (:Rule {firstName: 'Kenneth', gender: 'Male'}),
(:Rule {firstName: 'Barbie', gender: 'Female'}),
(:Rule {firstName: 'Sam', gender: 'Non-Binary'});
// Figure 3 - Validation: return only the Citizens whose gender violates a rule
// (assumes each Citizen node also carries citizenId and gender properties)
MATCH (c:Citizen)
MATCH (r:Rule {firstName: c.firstName})
WHERE c.gender <> r.gender
RETURN c.citizenId AS ViolatingCitizenID, c.firstName AS FirstName, c.gender AS CitizenGender, r.gender AS ExpectedGender;
Based on the 3 Nodes (i.e. 3 records) under the Citizen Label that we already created in Figure 1, simply create one Node for each constraint rule (3 rules in Figure 3). You can of course put these Nodes under a Label such as FNGenderRule to categorize the rule Nodes.
In the future, whenever a new First Name is found, we can simply add a new Node under the FNGenderRule Label, and that is it!
Once you run the query with the WHERE clause in Figure 3, all the invalid entries are listed (and they can be corrected automatically with the SET keyword if you want, but I will skip this part).
Unfortunately, when we look closely at the newly created FNGenderRule Nodes, we realize that the properties First Name and Gender still exist and duplicate the columns in Citizen. We cannot fix the data redundancy in the Graph Database either.
Maybe you think crazily enough, like me, to externalize every single property into its own Node. You can indeed write the Cypher to do so, as in Figure 4 below:
While at first sight it sounds as if we have served the functional dependency and extensibility requirements without introducing data redundancy, in fact we have just opened another can of worms.
The question we should ask is: how do we tell from Figure 4 that {Kenneth, Carson, Male} are in fact the properties of one Citizen?
Unfortunately, there is no way for us to link up the Nodes Kenneth, Carson and Male. Even if you could, you might spend much more effort than the benefit you gain from solving the data redundancy problem.
You might therefore keep optimizing your data model by using the Citizen# as the Label of each leaf Node, so that you can filter out a specific Citizen based on the Citizen# inside the Label, as in Figure 5 below:
While this is technically and theoretically feasible, before you model your data in this way, think about what happens if there are 1,000,000 citizens, and what happens when you need to update or delete the value of a record. What if Kenneth changed his name from Kenneth to Ken? You would first need to add a new Node Ken and then delete the Node Kenneth.
This operation is just not worth it compared with the benefit gained from removing the data redundancy.
The Table is the fundamental component of a Relational Database. It stores data in tabular format, using Column and Row to coordinate a specific Value (i.e. the Cell). Forms and Tables (i.e. tabular formats) are everywhere in your daily life.
A data schema in a Relational Database is a blueprint or structure that defines how data is organized, stored, and related within the database. It describes the database’s logical design, covering which Tables, Columns, Indexes, Constraints, Relationships and Data Types to create in a Relational Database.
A graph database is a type of database that uses graph structures to represent and store data. Instead of organizing data into rows and columns like relational databases, graph databases focus on relationships and connections between data points.
While a Graph Database can act as an Enterprise Knowledge Base that shortens the learning curve of both staff and clients, importing the data into the Graph Database is a challenge.
As data in tabular format is dominant in the world, it is inevitable that we need to import data from a Relational Database into a Graph Database. There are a few ways to import the data into the graph database:
As graph databases are not common among the general public, both methods require some expertise to get the job done. Besides, the methods above are suited to batch importing. If you want to import the data piecemeal, they are not handy.
It is necessary to have a normal input Form (just like the Forms you see every day) which has zero learning curve, so that the user can import the data manually into the graph database.
The term “Data Type” has two different meanings. It can refer to the description of the data format, e.g. a column in a Form that only accepts an integer, a decimal, text or an autonumber.
Alternatively, Data Type is used to classify data as Business Data, Metadata and Model Data. For details about the differences between the three, it is strongly recommended that you read the article bGraph Architecture – Model Data beforehand.
The focus of this article is the latter.
Before we dive into the problem patterns and their paired solutions for importing data from a relational table into a graph database, the concept of interpreting data in terms of dimensions is crucial for understanding what problem we are solving and why we solve it the way we do.
By definition, in physics, a Dimension is a space that can be measured and extended. For example:
Dimension (x-D) | Example of Unit | Example in Reality |
---|---|---|
0-D | N/A | a point (or a Spot) |
1-D | cm | a Line |
2-D | cm² | a Plane (or an Area) |
3-D | cm³ | a Volume |
4-D | Hour | Time |
5-D | ?? | ?? |
A practical purpose of the concept “Dimension” is to coordinate something.
For example, can you tell me where the letter “A” is in the Line below?
______A___
While the “A” is definitely not in the middle, it is not at the starting or ending point of the line either. You can say it is skewed to the right, but you cannot tell exactly where.
How about now?
123456A89
Now you can confidently say that the “A” is located at position “7” of this 9-unit-long Line.
The Line is regarded as “1 dimension” because you can measure the line (1-9) in width (and only width), but not in height, depth or time.
Dimension | Co-ordinate of Data Point Example |
---|---|
0-D | N/A |
1-D | {1} |
2-D | {1,2} |
3-D | {1,2,8} |
4-D | {1,2,8,10} |
So it is quite easy to understand that the number of values (or you can say the “digits”) in a coordinate is exactly the same as the degree of dimension. Whenever you want to coordinate an additional dimension, simply add an additional value (i.e. the digit) to the array.
In mathematics, we call this extensible array of numbers (e.g. {1,2,8,10}) a Vector, a concept from the branch of mathematics called Linear Algebra.
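The idea that a coordinate needs one value per dimension can be illustrated with a short Python sketch. The helper `locate` and the sample `grid` are assumptions made for illustration; the 1-D line is the one used earlier in this section.

```python
def locate(space, coordinate):
    """Follow one index per dimension until a single value is reached."""
    value = space
    for index in coordinate:
        value = value[index]
    return value


# 1-D: the Line from the example; one index is enough to pin down "A".
line = "123456A89"
position = line.index("A") + 1      # 1-based position on the Line

# 2-D: a small Plane made of two Lines; two indices are needed.
grid = ["1234", "56A8"]

found_on_line = locate(line, (6,))      # 0-based index 6 is position 7
found_on_grid = locate(grid, (1, 2))    # row 1, column 2

print(position, found_on_line, found_on_grid)
```

The length of the coordinate tuple equals the number of dimensions being coordinated, which is the property the Vector notation captures.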
While as human beings we cannot imagine a material object in a 5-D physical world, there is no limit on how many dimensions you can add in the mathematical world. In fact, you can add 100, 1,000 or 10,000 values to the Vector if needed, as long as you can think of the use case (e.g. to coordinate something) and have sufficient computing power for the vector calculations and operations.
That is the reason why, even though Graph Theory was coined almost 300 years ago and computers have existed for almost 90 years, we did not hear of Graph Databases until the last 1 or 2 decades (maybe because you were not even born!): computational power and infrastructure were the limitation.
As mentioned before, the main purpose of a Vector is to coordinate a data point. In fact, coordinating something is the initial step of “Searching”.
Whether we can successfully coordinate something mostly depends on whether the coordinate refers to one and only one outcome (i.e. a singleton).
Allow me to start with an example based on the 2023 Barbie film.
Imagine a country, Barbie Land, in which Barbie is the one and only citizen. To record Barbie as the citizen of this country (most likely Barbie will be the one who carries out this recording task!), Barbie can simply write the record below:
Barbie |
That is it!
As there is one and only one citizen in this country, every time anyone talks about “Barbie”, the word “Barbie” uniquely identifies the material substance of the person referred to. In other words, it is good enough to “coordinate” the one and only citizen “Barbie” in Barbie Land.
You can regard this “Cell” as 0-Dimension because there is one and only one value, and there is no direction in which you can extend it horizontally or vertically. (Remember the definition of Dimension?)
0-Dimension: In terms of Data Structure, one and only one Cell which stores the Value.
When someone outside Barbie Land describes Barbie, he/she may say that Barbie is 29 cm in height and a citizen of Barbie Land. If height = 29 cm and citizenship = Barbie Land, then X = Barbie. What is the “X”?
Obviously, X = Name.
In order to help others describe the material substance object (i.e. Barbie herself), we can add an additional Cell on top of the Cell Barbie as below.
Name |
---|
Barbie |
By applying the definition of Dimension, you can see that an additional Cell, Name, has been added to the Cell (i.e. 0-Dimension), turning it into a vertical line (i.e. a Column). Therefore, the data structure has been transformed from 0-Dimension to 1-Dimension.
From now on, you can call this data structure a Column.
1-Dimension: In terms of Data Structure, when a “Column Name” is placed on top of the 0-Dimension Cell, it can be regarded as 1-Dimension.
One day, when Barbie experienced her own imperfection, she realized that she was in fact only a stereotype. She therefore gave herself a Last Name as below:
First Name | Last Name |
---|---|
Barbie | Stereotypical |
When the column Last Name is added, the word “Name” alone is not enough to differentiate between the two columns. To uniquely identify them, “First” and “Last” are added in front of “Name”.
You may spot the pattern: whenever you want to uniquely identify an object, simply attach some attributes (e.g. Last Name) to it, so that the Name refers to one and only one instance. (This is the definition of the word “Definition”!)
Back to our database use case: now that an additional Column is attached to Barbie, you can see the data structure is in fact extended from a Column (1-Dimension) to a Plane (2-Dimension). If you remember the vector example, you can write it as the vector {Barbie, Stereotypical}.
From now on, you can call this Plane a Table.
2-Dimension: In terms of data structure, whenever there are 2 columns with a Column Name and 1 Record, it can be regarded as 2-Dimension.
As time goes by, the population of Barbie Land increased by 100%, from 1 person to 2 people, and the newcomer is also named Barbie.
First Name | Last Name |
---|---|
Barbie | Stereotypical |
Barbie | Weird |
Applying the same logic of the definition of Dimension, although an additional record, i.e. Barbie Weird, is appended to the list, the data structure is still 2-Dimension, as the new record does not extend in any new direction (it extends vertically, a direction that already existed). That means no matter how many records you add, the Table is still 2-Dimension.
Although logically anyone can tell the 2 Barbies apart by their First Name and Last Name, in reality the 2 Barbies do not stamp their names on their foreheads, so there is no way to teach a person who has never seen them before to differentiate between the two.
Having discussed this problem, and in order to identify themselves uniquely by sight, the 2 Barbies agreed to describe themselves with a Hair Style. In this case, an observable characteristic is needed to enrich the data structure, bringing the information inside the data structure closer to reality, as below:
First Name | Last Name | Hair Style |
---|---|---|
Barbie | Stereotypical | Floating |
Barbie | Weird | Quirky |
Applying the same logic of the definition of Dimension, although an additional column, i.e. Hair Style, is appended to the Table, the data structure is still 2-Dimension, as the new Column does not extend in any new direction (it extends horizontally, a direction that already existed). That means no matter how many Columns you add, the Table is still 2-Dimension.
It seems the 2-Dimension Table can describe the reality of Barbie Land well, until both Barbies want to create a sunglasses list to help them manage all the eyewear in their wardrobe.
After hours of effort, they created the sunglasses list as below:
Sunglasses# | Sunglasses Style | Sunglasses Frame Color |
---|---|---|
S01 | Cat-Eye Frame | Orange |
S02 | Aviator | Deep Blue |
S03 | Sporty | Silver |
As a new data Table is created, another question is raised: how do we uniquely identify the 2 Tables during communication?
Well, as you can imagine, simply giving a name to each of the 2 Tables solves the problem.
Therefore, Stereotypical Barbie named the 2 Tables Citizen and Sunglasses respectively.
Just as in real life, the 2 Barbies have just opened a can of worms. While Citizen and Sunglasses are each 2-Dimension objects (i.e. Tables), listing the 2 Tables side by side, according to the definition of Dimension, creates another new Dimension (the list of Tables is measurable, and you can extend it with as many objects, i.e. Tables, as you want in the same direction).
And therefore, a 3-Dimension data structure is born! Now you can call this 3-Dimension data structure a Database.
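The progression from Cell to Column to Table to Database can be mirrored with plain Python containers. This is only an illustrative sketch: the variable names and the helper `coordinate` are assumptions, while the values come from the Citizen and Sunglasses tables above.

```python
# 0-D: a single Cell holding one value.
cell = "Barbie"

# 1-D: a Column, i.e. a Column Name placed on top of Cells.
column = {"name": "Name", "cells": ["Barbie"]}

# 2-D: a Table, i.e. records (rows) crossed with named columns.
citizen = [
    {"First Name": "Barbie", "Last Name": "Stereotypical", "Hair Style": "Floating"},
    {"First Name": "Barbie", "Last Name": "Weird", "Hair Style": "Quirky"},
]
sunglasses = [
    {"Sunglasses#": "S01", "Sunglasses Style": "Cat-Eye Frame", "Sunglasses Frame Color": "Orange"},
    {"Sunglasses#": "S02", "Sunglasses Style": "Aviator", "Sunglasses Frame Color": "Deep Blue"},
    {"Sunglasses#": "S03", "Sunglasses Style": "Sporty", "Sunglasses Frame Color": "Silver"},
]

# 3-D: a Database, i.e. a list of named Tables.
database = {"Citizen": citizen, "Sunglasses": sunglasses}


def coordinate(db, table_name, row_index, column_name):
    """Three coordinates (Table, Row, Column) pin down exactly one Cell."""
    return db[table_name][row_index][column_name]


print(coordinate(database, "Citizen", 1, "Hair Style"))
print(coordinate(database, "Sunglasses", 1, "Sunglasses Style"))
```

Adding more rows or more columns changes only the sizes of the existing containers, while adding a Table name introduces a third coordinate, which matches the argument above.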
It’s time for us to take a deep breath before we dive into the 4-Dimension data world. Limited by the structure of the human brain, any dimension higher than 3 is hard to visualise. Therefore we have to make sure we are well acquainted with the 0-3 Dimension concepts.
The next question is, how do we prove that our 0-3 Dimension data structure theory is correct and practical?
Let us go back to the basics of why we have to build the data structure. This is the fundamental objective of an information system. The logic is as below:
To streamline the deduction steps above, you can conclude that the purpose of the Information System (i.e. the Database) is to let its user make decisions simply by observing the world, leaving all the processing and analysing steps to the Information System.
Hence, whether or not a database is competent depends on whether the stored data reflects reality with fidelity (i.e. is it descriptive enough) and uniquely identifies the underlying objects found in the real world (i.e. co-ordinating), while leaving the inferential duty to other systems (e.g. an AI facial recognition system).
We can test the fidelity of the database by asking the simple questions below:
Q1: How many citizens are there in Barbie Land?
A1: Two.
By counting the number of records (i.e. Rows) in the Table Citizen, we can easily figure out that there are 2 records.
The 1-Dimension data structure (i.e. any of the Columns inside the Table) performs the task of describing the objects in reality perfectly well, and hence it can be regarded as a competent database.
Q2: Please describe the citizen in Barbie Land whose last name starts with “S”.
A2: Stereotypical Barbie is the citizen in Barbie Land who has a floating hair style.
By filtering the Last Name column in the Table Citizen, you can easily describe the object (i.e. the record) by finding its First Name, Last Name and Hair Style.
This 2-Dimension data structure (i.e. Table) performs the task of describing and co-ordinating the object in reality perfectly well, and hence it can be regarded as a competent database.
Q3: What object types can be found in Barbie Land, and how many instances are there of each type?
A3: 2 Citizens and 3 Sunglasses can be found in Barbie Land.
By enumerating all the Tables inside the Database, we can easily answer question Q3.
This 3-Dimension data structure (i.e. Database) performs the task of describing and co-ordinating the objects in reality perfectly well, and hence it can be regarded as a competent database.
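The three fidelity questions can be run against a throwaway database as a quick check. The sketch below uses Python's standard `sqlite3` module; the snake_case table and column names are assumptions, and `sqlite_master` is SQLite's own catalogue of tables, so the Q3 query would look different on another database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE citizen (first_name TEXT, last_name TEXT, hair_style TEXT);
CREATE TABLE sunglasses (sunglasses_id TEXT, style TEXT, frame_color TEXT);
""")
conn.executemany("INSERT INTO citizen VALUES (?, ?, ?)", [
    ("Barbie", "Stereotypical", "Floating"),
    ("Barbie", "Weird", "Quirky"),
])
conn.executemany("INSERT INTO sunglasses VALUES (?, ?, ?)", [
    ("S01", "Cat-Eye Frame", "Orange"),
    ("S02", "Aviator", "Deep Blue"),
    ("S03", "Sporty", "Silver"),
])

# Q1: count the records (rows) of one Table.
q1 = conn.execute("SELECT COUNT(*) FROM citizen").fetchone()[0]

# Q2: filter one Column, then read the other Columns of the matching record.
q2 = conn.execute(
    "SELECT first_name, last_name, hair_style FROM citizen WHERE last_name LIKE 'S%'"
).fetchall()

# Q3: enumerate the Tables of the Database, then count the instances in each.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]
q3 = {
    name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
    for name in table_names
}

print(q1, q2, q3)
```

Each question touches one more level of the structure: Q1 stays inside one Table, Q2 crosses rows with columns, and Q3 has to step outside the Tables and ask the Database which Tables exist.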
You can see that up to now, the existing 1 to 3-Dimension data structures answer the 3 questions above perfectly well, until we start asking another type of question: the 4-Dimension question.
Let’s start the 4-Dimension section with a question, and you will realize the limitation of the 3-Dimension data structure.
Q4: What are the sunglasses-wearing habits of the Citizens on weekdays?
While we already have the Tables Citizen and Sunglasses to record each individual citizen and each pair of sunglasses, these 2 Tables do not describe the Relationship between the two.
If we try to record the mix-and-match wearing observations in an existing Table, no matter which one (Citizen or Sunglasses), it will look like this:
Weekday | Sunglasses# | Sunglasses Style | Sunglasses Frame Color | Citizen |
---|---|---|---|---|
Monday | S01 | Cat-Eye Frame | Orange | Stereotypical |
Monday | S03 | Sporty | Silver | Weird |
Tuesday | S02 | Aviator | Deep Blue | Weird |
Tuesday | S03 | Sporty | Silver | Stereotypical |
The Table above sacrifices the fidelity that allowed us to answer Q1 (i.e. How many citizens are there in Barbie Land?), as some of the citizens and sunglasses are duplicated across the records, so we can no longer simply count the records to answer the question (i.e. there are 4 records in total, but in reality there are only 3 sunglasses and 2 citizens).
In order to solve the problem, instead of appending additional columns to any of the existing Tables, it is better to create 2 additional tables. The first is the Weekday Table:
Weekday |
---|
Monday |
Tuesday |
Wednesday |
Thursday |
Friday |
The second is the Mix and Match Table that we just created:
Weekday | Sunglasses# | Sunglasses Style | Sunglasses Frame Color | Citizen |
---|---|---|---|---|
Monday | S01 | Cat-Eye Frame | Orange | Stereotypical |
Monday | S03 | Sporty | Silver | Weird |
Tuesday | S02 | Aviator | Deep Blue | Weird |
Tuesday | S03 | Sporty | Silver | Stereotypical |
Now there are 4 Tables in total inside the Database:
Citizen
Sunglasses
Weekday
Mix and Match
However, when you observe carefully, you will realize that the term Mix and Match does not appear in any record and does not exist in reality. Mix and Match is a concept (i.e. a Relationship) rather than a material substance which can be observed.
If you still have no idea what I am talking about, let’s recall the table Citizen that we created in the 2-Dimension section:
First Name | Last Name | Hair Style |
---|---|---|
Barbie | Stereotypical | Floating |
Barbie | Weird | Quirky |
Do you notice that you cannot find any material substance of “Citizen” inside the Citizen Table? It is because the term Citizen is a Class, while the 2 Barbies in the records are the Instances. There is no material substance for the term Citizen.
The same phenomenon happens in the other Tables: you can only find Instances, not the Class, inside the records of any table.
If you think this logic makes sense and apply it to the table Mix and Match, you may now realize that Mix and Match is a Class rather than an Instance, and so it cannot be found inside the records of the Table Mix and Match.
As we just created a new Class Mix and Match to consolidate the 3-Dimension Tables Citizen, Sunglasses and Weekday, another new Dimension, 4-Dimension, has been created.
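One way to see how the Mix and Match Table relates the other three Tables is the Python `sqlite3` sketch below. It is a variant under stated assumptions, not the table shown above: here the Mix and Match Table stores only references (the weekday, the Sunglasses# and the citizen's Last Name used as a hypothetical key), and the descriptive columns are looked up with a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE citizen (last_name TEXT PRIMARY KEY, first_name TEXT, hair_style TEXT);
CREATE TABLE sunglasses (sunglasses_id TEXT PRIMARY KEY, style TEXT, frame_color TEXT);
CREATE TABLE weekday (weekday TEXT PRIMARY KEY);
-- The Relationship: each row points at one instance in each of the 3 Tables.
CREATE TABLE mix_and_match (
    weekday       TEXT REFERENCES weekday(weekday),
    sunglasses_id TEXT REFERENCES sunglasses(sunglasses_id),
    citizen       TEXT REFERENCES citizen(last_name)
);
""")
conn.executemany("INSERT INTO citizen VALUES (?, ?, ?)", [
    ("Stereotypical", "Barbie", "Floating"),
    ("Weird", "Barbie", "Quirky"),
])
conn.executemany("INSERT INTO sunglasses VALUES (?, ?, ?)", [
    ("S01", "Cat-Eye Frame", "Orange"),
    ("S02", "Aviator", "Deep Blue"),
    ("S03", "Sporty", "Silver"),
])
conn.executemany("INSERT INTO weekday VALUES (?)", [
    ("Monday",), ("Tuesday",), ("Wednesday",), ("Thursday",), ("Friday",),
])
conn.executemany("INSERT INTO mix_and_match VALUES (?, ?, ?)", [
    ("Monday", "S01", "Stereotypical"),
    ("Monday", "S03", "Weird"),
    ("Tuesday", "S02", "Weird"),
    ("Tuesday", "S03", "Stereotypical"),
])

# Q4: the wearing habit is read from the Relationship, joined back to the instances.
habit = conn.execute("""
    SELECT m.weekday, c.last_name, s.style
    FROM mix_and_match AS m
    JOIN citizen AS c ON c.last_name = m.citizen
    JOIN sunglasses AS s ON s.sunglasses_id = m.sunglasses_id
    ORDER BY m.rowid
""").fetchall()

# Q1 keeps its fidelity: counting citizens still yields 2, not 4.
citizen_count = conn.execute("SELECT COUNT(*) FROM citizen").fetchone()[0]
observation_count = conn.execute("SELECT COUNT(*) FROM mix_and_match").fetchone()[0]

print(habit, citizen_count, observation_count)
```

With this layout the 4 observations live in the Relationship, while the counts of citizens and sunglasses remain 2 and 3.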
If we visualize all the relationships of all concepts mentioned throughout the paragraph via the Entity Relationship Diagram, it will become the diagram in below:
So, it is possible that i can build infinity number of Dimensions of the data structure inside the database? Yes, in theory you can. Whenever you cosolidate all the instance in the same Dimenion and form a list, you created a new Dimension.
As long as you understand how we use the dimensionality in the data structure to describe the real world, we can stop the example in 4-Dimension.
In this section we have introduced how we observe and describe reality and fit it into a Relational Database in a dimensional way, as well as how we create a new dimension by gathering all the instances in a lower dimension to form a new class in the upper dimension.
In the next article, we are going to address the problems that the Relational Database suffers from, and how we fit the Relational Database into the Graph Database to compensate for those problems.
bGraph is a SaaS developed in-house by Diamond Digital Marketing Group which can be categorised as a GraphRAG web application serving as an Enterprise Knowledge Graph.
To better understand what GraphRAG exactly is, it is imperative for us to start with a real world problem pattern.
The definition of profit is simply Sales Revenue minus Cost. While a cost item such as Labour Cost (e.g. Salary) is good enough to meet the legal duty of a financial report, it does not reflect how the 40 Hours x 4 Weeks of working time of a staff member is distributed among the different activities of his/her daily operation. Instead of presenting the cost in monetary terms, I would like to convert it into Time.
In almost any industry or business model, we can categorize the types of time cost as below:
In a service-oriented business, Production Time Cost simply refers to the time a staff member spends on rendering a service to the client. For example, a hair dresser spends 30 minutes on providing a hair styling service to the client. These 30 minutes are categorized as Production Time Cost.
In a SKU-oriented business, Production Time Cost covers any labour time incurred between planning and delivery of the product to the client. For example, if you sell a Clock online, not only does the Product Manager spend time on designing and manufacturing the clock, the Customer Service Officer also needs to spend time on answering the enquiries from wholesale or end-user clients.
Communication Time Cost is indispensable in business world. We can easily find the Communication Time Cost in scenarios below:
Searching Time Cost can be derived from following scenarios:
Error Time Cost can be derived from following scenarios:
You can imagine that among all these Time Costs, only a very small portion is observable and measurable. A Time Cost which is not observable and measurable can never be cut or minimized.
There are 3 directions for handling the Time Cost:
Directly and brutally cut the item from which the cost is derived. For example, streamline the workflow from 10 steps to 9 steps.
After years of hands-on experience (this is a black box, so don’t ask me how and why I know! This is human intelligence from before artificial intelligence dominates this world), you will realize the application can take 3 steps (or directions, to be precise) to handle all the time costs mentioned above:
By observing and modeling the world, you can address the relevant factors , steps, components, concepts that are related to your business.
For example, when you are running an e-shop, you will realize different kinds of transactional emails or reports which will reflect the reality. This procedure is called Modeling.
Modeling of an eshop Purchase Cycle:
While the concept is easy to understand, it is extremely difficult to execute as you have to decouple each procedure, workflow, and concept into an executable encapsulated module which you can reuse or execute systematically.
On top of that, it is a challenge for a Business Analyst or System Analyst to observe from the reality to refine the related components which comprehensively describes the model of the business. We called this comprehensive scenario Sample Space.
For example, while everyone understands the concept “Client”, a Business Analyst has to decouple the concept “Client” based on the following attributes in order to make it executable and closer to reality:
While the comprehensive option value lists of some attributes can easily be enumerated, most of the time the option value lists of most attributes cannot be enumerated at the point in time at which the system is built. Closer to reality, these option value lists, or even the attributes themselves, “grow” organically over time instead of being identified at the very beginning.
For example, even the option values of the attribute Gender, which were classified as Male and Female in the old days, now have an additional option value, Transgender.
Also, what to observe in reality, and whether you consider a component relevant to your business, highly depends on the level of knowledge of the Business Analyst. For example, at 4 years old you regarded water as water. But at 14 years old you should have realized that water can in fact be further decoupled into 2 H (Hydrogen) and 1 O (Oxygen).
While we will not dive into the problem patterns that we suffered during the enumeration process, enumeration by itself is the very beginning of the GraphRAG based Enterprise Knowledge Graph web application.
In a technical stack, we normally have the following technical components to execute the Enumeration process:
Indexing is the procedure to facilitate all the things or concepts in the Sample Space to be stored and searchable.
For example, giving a Sales Order an Order Number (e.g. SO20323) is a common and easy way to “Index” a Sales Activity (conceptualized via the document Sales Order).
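The Sales Order case can be sketched in a few lines. This is only an illustration of indexing by a unique key: the records, the field names and the `find_order` helper are hypothetical, and only the order number SO20323 comes from the text.

```python
# Hypothetical sales orders; only the order number SO20323 comes from the article.
sales_orders = [
    {"order_no": "SO20323", "client": "Tony", "item": "Watch"},
    {"order_no": "SO20324", "client": "Joan", "item": "Shoe"},
]

# Build the index once: order number -> order record.
order_index = {order["order_no"]: order for order in sales_orders}


def find_order(order_no):
    """Return the order for a given number, or None if it was never indexed."""
    return order_index.get(order_no)
```

Once the index exists, a lookup by order number no longer needs to scan every record, which is the point of giving each Sales Order a number in the first place.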
However, not everything can be indexed as simply as a Sales Order.
While I am not going to dive into the problem patterns we suffered in the indexing procedure, we can describe at a high level some of the indexing procedures for a solution application:
Mapping is simply finding the relationship between 2 concepts. The challenge is that you need to work out which relationships are relevant to link up among tens of thousands of combinations. For example, when a customer service officer asks the client to provide the Client ID#, the client has forgotten his/her Client ID# and simply provides a mobile phone number for the customer service officer to look up the Client ID#.
In the above example, Client ID# and Client Mobile Phone Number are easy to map because the Client Mobile Phone Number is most likely stored in the Client Table itself. However, this ease does not apply to everything.
For example, how can you figure out that the Facebook Username “BillGates” in fact refers to the same person as the Instagram Username “ThisisBillGates”, when the two use different wording to refer to the same object (person)?
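One naive way to approach the username question is plain string similarity. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the function names and the 0.7 threshold are assumptions chosen for illustration, not part of any bGraph stack, and string similarity alone is weak evidence that two accounts belong to the same person.

```python
from difflib import SequenceMatcher


def username_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two usernames, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def probably_same_person(a: str, b: str, threshold: float = 0.7) -> bool:
    """Naive guess: treat the usernames as one person when they are similar enough."""
    return username_similarity(a, b) >= threshold
```

With this threshold, “BillGates” and “ThisisBillGates” are flagged as a likely match, while an unrelated username is not. In practice such a score would only be one signal among several (email, phone number, profile links) before two Nodes are mapped together.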
As usual, while we do not dive into details, we describe the technical stack normally applied to do the mapping:
Once all the concepts are enumerated and indexed, and the relationships among the concepts (we call each one a “Node”) are well defined and connected, we can start the Searching step.
In fact, regardless of industry, job nature, role, task , business model, anything, as long as you ask a question, you are performing a “Search” activity.
To execute a “Search” is to “Find a Needle (an instance) in Haystack (a pool of instances)”
For example, when the customer service officer receives an enquiry from the client asking “When will my Sales Order be delivered?”, he then carries out the steps below:
While the previous example happens in a customer service scenario, the “Search” pattern also happens in the production team.
In fact, search theory deserves a whole book to elaborate. We will skip the theory and directly highlight the technical stack that we are going to use to carry out the Search Function:
In conclusion, the GraphRAG SaaS Enterprise Knowledge Graph Web Application is a solution backed by Enumeration, Indexing and Search functions which can help any individual or organisation save time on Production, Searching, Error Handling and Communication.
In order to visualize the power of the GraphRAG Enterprise Knowledge Graph, allow me to demonstrate with a real-world day to day example in digital marketing world.
While a Customer Service Chatbot is for sure one of the powerful ways of saving time cost, I want to put the focus on another, more important point which can be brought by the GraphRAG solution. Therefore I will keep the Customer Service level Chatbot description minimal.
Besides, I will also skip all the description of automation at the programmatic (i.e. not AI) level. For example, Email auto forwarding with hard coded conditional logic based on the Email Title via an Email API whenever a new Email is received.
An Individual Client, John, owns a WordPress website created by us (DDM Group). He receives Spam Contact Us Form Emails daily, which annoys him. As we (DDM Group) are the administrator of John’s website, he complained to us.
Although John received the emails because a Spam bot keeps submitting the Contact Us Form on his website, he does not realize this, and his complaint email is as below:
Hi DDM,
My email keeps receiving rubbish Email daily. Please help to fix.
Most of the time, the client or end-user, like John, can only use everyday human language, instead of technical jargon, to describe the problem they face.
Most importantly, the event which triggers the action (e.g. writing a complaint email) normally comes from a Symptom which drives his emotion (e.g. Fear, Annoyance, Despair). Most of the time this Symptom is not the cause of the problem but a consequence of it.
For easy communication, we name this “Trigger” the Symptom.
Symptom = Rubbish Email
Party Involved = Client
By asking John to submit the Rubbish Email to the Google Drive, the Technical Support found that the Email is in fact triggered by the Contact Us Form Submission on the existing Website.
Therefore, the Technical Support classified it as a “Spam Form Submission” Problem Pattern, which had already been reported by different clients many times and is therefore named and indexed as the “Spam Form Submission” Problem Pattern in our system.
Problem Pattern = Spam Form Submission
Party Involved = Technical Support
As stated before, Spam Form Submission is a well-known Problem Pattern. As an experienced website administrator, DDM Group has already indexed and mapped different kinds of Solutions for different scenarios with the same Problem Pattern.
The following factors affect the choice of Solution:
In order to identify the Client Contract Amount, Web Server and SMTP Sending Server specific to John’s case, the Technical Support and the Customer Service Officer have to access the CRM, as well as the Website Development Production Database, to look them up.
After the lookup, we figured out that John is a VIP Client using the GMAIL API and Cloudflare.
The Technical Support, based on his years of experience and know-how in the cyber security knowledge domain, realized that the main reason the Contact Us Form was being spammed is that some kind of Spam Form Bot constantly crawled John’s website and detected that the website uses a popular open-source plugin with an exposed vulnerability, which lets the Spam Form Bot fill in the form on John’s website automatically. In order to stop the spam, the best way the Technical Support can think of is to prevent the Spam Form Bot from even reaching John’s website. Hence the Solution of Server Level WAF (Firewall) installation is chosen, as the Cloudflare Proxy Server supports a WAF.
Environment = VIP Client , GMAIL API, Cloudflare
Party Involved = CRM Manager + Technical Support
Solution = Server Level Firewall (WAF)
Party Involved = Technical Support
Once the Solution is confirmed, the case is passed to the Account Manager (i.e. Salesperson) to follow up and explain to the client.
As the Technical Support is not very familiar with the SKU Names in the SKU Library, he suggested that the Account Manager visit the SKU Library in DDM Group and search for the term “Cloudflare WAF”.
The SKU Library came back with the following SKU#:
SKU Name | SKU# |
---|---|
Cloudflare WAF – Standard | 5232323 |
Cloudflare WAF – Premium | 5232345 |
Wordfence (WordPress Firewall) | 8475623 |
Because the SKU name by itself cannot facilitate the decision on which SKU to choose to solve the problem, the Technical Team further dives into the SKU Features of the 2 Cloudflare related SKUs and realizes that only Cloudflare WAF – Premium (#5232345) supports the Legitimate Bot whitelisting feature.
SKU = Cloudflare WAF – Premium
SKU Feature = Legitimate Bot Whitelisting
Party Involved = Account Manager
As a seasoned and proactive Account Manager, he realized that this is a good opportunity to upsell another SKU to John, as the Spam Form Submission has alerted John to cyber security concerns.
The Account Manager googled the topic and found that Login Attempt Attack (i.e. a Problem Pattern) is another common vulnerability suffered by lots of eshops like the one John is running.
The Account Manager, based on his experience, believed that Fear is a good sales trigger for creating purchase intention. So he explained the potential risk of a login attempt attack by malicious bots and suggested that John install a 2FA plugin, which can effectively protect against unauthorized logins.
As the Account Manager realized that John had no idea what a Login Attempt Attack is, he visualized the problem pattern by showing the visit report, which logged thousands of visits to the login page of John’s eshop within an hour.
John felt worried, took the advice and purchased the SKU of 2FA Plugin Installation for WordPress, so the Account Manager successfully upsold a SKU related to John’s case.
Target Audience Property = Eshop owner
Sales Trigger = Fear
SKU = 2FA Plugin Installation for WordPress
Problem Pattern = Login Attempt Attack
Symptom = Thousands of visits to the Login Page in an hour
Party Involved = Account Manager + Client
Once John signs the Sales Contract, the Sales Contract with the involved SKUs is passed to the Production Manager.
While the Sales Contract enumerated the SKU name , it does not limit which plugin to use in order to deliver the SKU.
Having checked the Plugin Library regarding the error and bug reports for each plugin, the Production Manager decided to use the plugin WordFence for the 2FA related SKU and Cloudflare for the Server WAF related SKU.
SKU = Cloudflare WAF – Premium
Plugin = Cloudflare
SKU = 2FA Plugin Installation for WordPress
Plugin = WordFence
When you put everything together, you may realize in fact you are doing the following steps:
In the real business world, there are thousands of factors (i.e. columns) that can be addressed, and each factor may have thousands of option values involved (e.g. Sales Order Records), forming a practically infinite number of nodes and edges of a Graph, which can only be comprehensively memorized and handled by machine.
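To make the idea of nodes and edges concrete, here is a minimal sketch that stores part of John’s case as a small graph and searches it. The node names come from the case above; the relationship type names (`SYMPTOM_OF`, `SOLVED_BY`, `SOLD_AS`, `DELIVERED_WITH`) and the `find_path` helper are hypothetical, and a plain Python dictionary stands in for a real Graph Database.

```python
from collections import deque

# (source node, relationship type, target node); relationship names are hypothetical.
edges = [
    ("Rubbish Email", "SYMPTOM_OF", "Spam Form Submission"),
    ("Spam Form Submission", "SOLVED_BY", "Server Level Firewall (WAF)"),
    ("Server Level Firewall (WAF)", "SOLD_AS", "Cloudflare WAF – Premium"),
    ("Cloudflare WAF – Premium", "DELIVERED_WITH", "Cloudflare"),
]

# Adjacency list: node -> list of (relationship type, neighbour).
graph = {}
for source, rel_type, target in edges:
    graph.setdefault(source, []).append((rel_type, target))


def find_path(start, goal):
    """Breadth-first search returning the list of nodes from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _rel_type, neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None
```

Calling `find_path("Rubbish Email", "Cloudflare")` walks from the Symptom the client described to the Plugin the Production Manager installs, which is the chain the different parties had to reconstruct by hand in the story.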
I hope you can understand the problem pattern involved in the real world and realise that the learning activity of a human being is in fact based on enumerating , indexing , mapping and searching.
By applying the GraphRAG Enterprise Knowledge Graph SaaS Web App (i.e. bGraph), it can automate and speed up the learning of a human being based on following open-source technical stack:
The objective of this article is to provide a blueprint which demonstrates and enumerates all the technical stacks used to build the bGraph.
Although using a Graph Database is a perfect tool to illustrate this kind of blueprint, ironically, we cannot use the Graph Database to demonstrate how to build a Graph Database because the Graph Database is not yet built.
bGraph is a DDM terminology , which is assigned as the name of the Enterprise Knowledge Graph (EKG) built on top of Graph Database. You can regard the bGraph as a Knowledge Management System in DDM Group which consolidates all types of data , including Business Data, Meta Data and Model Data, into one place forming a supreme intelligence to answer any questions raised by either Clients or Staffs.
Architecture of bGraph refers to all the technical stacks used to build the bGraph, as well as specific tools that we adapted for building the bGraph. You can regard it as a blueprint of the bGraph
While there are many components which can be found in the bGraph Architecture, this article is focused on the component of Model Data. The best way to understand Model Data is to compare the Model Data with Meta Data and Business Data.
In the Database world, no matter what business it is running , Data can be classified as 3 categories :
The data which reflects the business activities. For example, if in an eshop a Watch is sold to the Client named “Tony” at the price of USD$42, then “Tony” and “USD$42” are regarded as Business Data.
In an Excel File, you can regard the Column Names as the Model Data, while each record under a column is Business Data. For example, suppose you have a Product Price List in an Excel File as below:
Product Type | Product Price (USD) |
---|---|
Watch | 42 |
Shoe | 30 |
The Column Names Product Type and Product Price (USD) are regarded as Model Data, while the records [Watch, 42] and [Shoe, 30] are regarded as Business Data.
It also means “Data of Data”: the function of Meta Data is to describe the Model Data. With the same example, as an eshop webmaster, before you can sell the product in the eshop, you should have input the price of the product Watch in the Price Field in the backend of the Eshop. Instead of a Text String (i.e. US Dollar Forty-Two), you will expect the Price to be filled in Number format (i.e. 42). In this case, Number (instead of Text String) is the Meta Data which describes the Data Type of the Price field.
In a relational table (e.g. a Sheet in an Excel File), you can regard the Column Name (or a Field Name in a Form) itself as the Model Data, while each record under the same column is Business Data. For example, in Figure 1 – Product Price List in the previous paragraph, the Column Names Product Type and Product Price (USD) are regarded as Model Data, while the records (i.e. the values of the cells) [Watch, 42] and [Shoe, 30] are the Business Data.
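The three categories can be placed side by side using the Product Price List. This is a minimal sketch: the variable names and the `validate` helper are hypothetical, and Python types stand in for the Data Type that the Meta Data describes.

```python
# Model Data: the column names that give the table its shape.
model_data = ["Product Type", "Product Price (USD)"]

# Meta Data: describes each column, here only its expected data type.
meta_data = {"Product Type": str, "Product Price (USD)": int}

# Business Data: the records themselves.
business_data = [["Watch", 42], ["Shoe", 30]]


def validate(record):
    """Check one Business Data record against the Meta Data of each column."""
    return len(record) == len(model_data) and all(
        isinstance(value, meta_data[column])
        for column, value in zip(model_data, record)
    )
```

Both records in the list pass `validate`, whereas a record such as `["Watch", "Forty-Two"]` fails because the price is a Text String rather than a Number.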
It is imperative for us to differentiate the 3 categories of data due to the fact that different types of data are intertwined in our communication during the bGraph development cycle.
Model Data can narrow the discrepancy between the Reality and Model in following aspects:
It is very common to find both a CRM and an Accounting System in companies of any scale, which means that if you want to insert a new record of the First Name and Last Name of a Client, you most likely have to record it twice, in both the CRM and the Accounting System.
While in reality the Client exists only once, in the Model it appears twice, in both the CRM and the Accounting System, even though the 2 records in different systems in fact refer to the same Client. This means the Model Data, First Name and Last Name of the Client, is duplicated.
This discrepancy between the Reality and the Model lessens the fidelity of the Model.
Model Data is here to eliminate the discrepancy caused by duplication.
In a traditional system development cycle, the Reality is observed once at a particular point in time (most likely in a brainstorming sales meeting), and this observation is transformed by the System Analyst into a Model, most likely presented as an Entity-Relationship Diagram.
However, sooner or later the System Analyst realizes that this is not enough. As the business environment in reality is ever changing, observing the Reality, as well as Modeling the observation, become streaming tasks instead of batch tasks, meaning the observation and modeling should be done continuously and with agility, instead of only once at the very beginning of the system development. We call this concept DevOps.
Let’s illustrate the example by the Table in below:
Time Period | System Name | Properties (i.e. Field) |
---|---|---|
Year 1 | CRM (built in house) | Client.FirstName Client.LastName Client.Birthday (DDMMYYYY) |
Year 2 | Accounting System (3rd party Saas) | Client.GivenName Client.FamilyName |
Year 3 | Eshop (built in house) | Client.FirstName Client.FamilyName Client.Birthday (YYYYMMDD) |
In the infancy stage (i.e. Year 1) of the startup company which you are working for, it makes sense to prioritise building a CRM system over an Accounting System, in order to generate leads and sales revenue before complying with legal bookkeeping and auditing requirements. In a CRM system, your colleague Anna, as a system analyst, can easily observe from reality that the properties First Name and Last Name should be attached to a Client. The system analyst (Anna) therefore put First Name and Last Name in our Model as below:
First Name |
Last Name |
It works perfectly until Year 2 when, after quite a lot of sales orders were made in Year 1, it becomes inevitable for the business to have an Accounting System to cater for both the bookkeeping and invoicing tasks.
As the System Analyst, Anna, who built the CRM system in Year 1, had already quit, instead of reinventing the wheel, your boss in Year 2 decided to subscribe to a canned SaaS Accounting System, which has comprehensive functions catering for all the bookkeeping and invoicing needs of the company.
Everything works fine until a fresh grad junior Sales Executive, Ann, is instructed by you to find the Client with ID# 302392 from the CRM in the historical sales invoice report in the Accounting System. Ann looked up ID# 302392 in the CRM, and the system showed the First Name and Last Name of the client as Joan Lee. Ann then tried to put the First Name and Last Name Joan Lee into the Accounting System to generate the sales order report.
Unfortunately, after 60 minutes of effort, Ann failed to find the fields First Name and Last Name to filter the sales orders in the Accounting System, so she requested help from her supervisor, which is you.
After you listened to the question raised by Ann, you are astonished that she did not even realize First Name is a synonym of Given Name and Last Name is a synonym of Family Name. (Please refer to Figure 2 – All System Development Timeline in previous paragraph)
Although you are frustrated, you still gently explain the truth to Ann, which takes you another 15 minutes.
Therefore, all in all the company spent 75 minutes simply on communication and education, and this will not be the last time such costs are incurred, because any fresh grad employee the company hires in the future is very likely to run into the same misconception.
Therefore, a centralized library which explains the relationships between any properties of any systems throughout the company is in demand.
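A very small version of such a library can be sketched as a lookup from each system’s field name to a shared concept. The field names come from the system timeline table above; the canonical concept names and the `equivalent_field` helper are hypothetical.

```python
# (system, field) -> canonical concept. The canonical names are hypothetical.
FIELD_LIBRARY = {
    ("CRM", "FirstName"): "client.first_name",
    ("CRM", "LastName"): "client.last_name",
    ("Accounting", "GivenName"): "client.first_name",
    ("Accounting", "FamilyName"): "client.last_name",
}


def equivalent_field(system, field, target_system):
    """Find the field in target_system that means the same as system.field."""
    concept = FIELD_LIBRARY.get((system, field))
    if concept is None:
        return None
    for (other_system, other_field), other_concept in FIELD_LIBRARY.items():
        if other_system == target_system and other_concept == concept:
            return other_field
    return None
```

With this in place, Ann’s 60-minute search becomes a single call: `equivalent_field("CRM", "FirstName", "Accounting")` returns `"GivenName"`.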
The story did not end there. With great difficulty the company survived into Year 3 and would like to expand the business by running an Eshop online for overseas markets.
Your company hired another System Analyst, Joanna, to build the eshop, and she completed the project at lightning speed. After 100 new client registrations in the Eshop, when you want to import these 100 client registrations from the Eshop into the existing CRM, you finally realize that the field Birthday of the CRM is in the format (i.e. the Meta Data) DDMMYYYY, while in the Eshop the format is YYYYMMDD.
Because you realized that all the Date related fields in the Eshop are in YYYYMMDD, which contradicts the DDMMYYYY format usually used in your country, you have no choice but to request the newly hired System Analyst Joanna to spend another 1 month (i.e. 22 Working Days!) turning all the Date related Fields in the Eshop from YYYYMMDD to DDMMYYYY format.
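The format mismatch itself is mechanical to translate once it is known. As an illustration only (the story above has Joanna change the Eshop fields instead), a single Birthday value could be converted with the standard `datetime` module; the function name is hypothetical.

```python
from datetime import datetime


def eshop_to_crm_birthday(value: str) -> str:
    """Convert a YYYYMMDD string (Eshop) to DDMMYYYY (CRM).

    Raises ValueError if the input is not a valid YYYYMMDD date.
    """
    return datetime.strptime(value, "%Y%m%d").strftime("%d%m%Y")
```

The hard part is not this conversion but knowing that it is needed, which is exactly the information a Meta Data record for the Birthday field would carry.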
By studying the example above, we realize that a centralized library (i.e. the repository) which stores all the Model Data (and its associated Meta Data) would help the System Analyst avoid the mistakes mentioned above, by checking all the existing properties (i.e. the Model Data) of the existing systems before starting to build any new system.
The following steps and roles are involved during the system development cycle:
Time | Procedure | Role |
---|---|---|
Month 1 | Reality Observation | Salesperson End User Business Analyst Business Owner |
Month 2 | Modeling | Business Analyst System Analyst |
Month 3 | System Building Execution | System Analyst |
Consider the following scenario and timeline:
Reality Observation: … Salutation field in the Client Form.
Modeling: …
System Building Execution: … Salutation Field into them.
If you are detail-minded enough, you may realize that, as laymen without any system analysis training background, the End User, Salesperson and Business Owner cannot technically and precisely express their comments in Entity-Relationship Diagram friendly syntax to communicate with the Business Analyst.
Imagine all these 4 different parties involved in the communication chain communicating in different languages and wordings: how much the fidelity of the reality deteriorates, and how much time is wasted on the redundant communication edges (i.e. A to B, B to C, and C to D).
This means that if the comment from the End User does not arrive at the same time as the System Analyst is coding the system update, then this comment (i.e. the comment on adding a new Salutation Field) should be recorded somewhere and be easily found by the System Analyst in their system update request job queue.
The Model Data can act as a communication protocol during the whole system development cycle.
In a traditional way of building a CRM system, the Object Client would probably be described in the Entity-Relationship Diagram as below:
Column Name |
---|
First Name |
Last Name |
Salutation |
Email Address |
When time goes by, another eDM System is introduced into the company with another Client Object inside the system as below:
Column Name |
---|
Given Name |
Family Name |
Salutation |
Gender |
Because there are 2 separate systems, you cannot link up the 2 Client Tables of the 2 systems in 1 Entity-Relationship Diagram. In fact, you have to draw 2 Entity-Relationship Diagrams, one per system.
This practice means we may never realize that these 2 Client Objects in 2 separate systems actually refer to the same concept (i.e. Client) in reality.
Besides, even if you find that the field Gender is valid and useful information in eDM, the system analyst may not notice that this Gender field should also be added to the Client Table in the CRM system.
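Once the two Client definitions sit in one place, spotting such a gap becomes a plain set comparison. The sketch below uses the column names from the two tables above; the variable names are hypothetical, and the synonym pairs are assumed from the First Name / Given Name and Last Name / Family Name equivalences used earlier in the article.

```python
# Properties of the Client object in each system, taken from the two tables above.
crm_client = {"First Name", "Last Name", "Salutation", "Email Address"}
edm_client = {"Given Name", "Family Name", "Salutation", "Gender"}

# Known synonyms (eDM name -> CRM name), so renamed fields are not reported as missing.
synonyms = {"Given Name": "First Name", "Family Name": "Last Name"}

# Translate eDM names into CRM vocabulary, then compare.
edm_in_crm_terms = {synonyms.get(name, name) for name in edm_client}
missing_in_crm = edm_in_crm_terms - crm_client
missing_in_edm = crm_client - edm_in_crm_terms
```

The comparison reports Gender as missing in the CRM and Email Address as missing in eDM. This is a deliberately naive stand-in for the kind of inference a graph-based tool could make over many systems at once.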
On the contrary, if we demonstrate the Model Data via a Graph created by a Graph Database, we can enjoy the following benefits:
The Client objects from the CRM and eDM systems can be consolidated and presented in 1 Node. This consolidation can easily be figured out from the Node Labels (i.e. the wording eDM and CRM outside the Node circle).
We can tell that Given Name in eDM is in fact identical to First Name in CRM by reading the Relationship Type (i.e. IDENTICAL_TO).
If Gender is useful information which should be defined (and has been defined) in eDM, it makes sense to infer that this Gender property should also be defined in CRM. By using Graph Data Science AI tools, this kind of insight (we call it Label Prediction) can easily be achieved. While the AI algorithm is out of the scope of this article, we will discuss it somewhere in the future.
Although modeling data with a Graph Database provides greater fidelity, the Graph Database in itself is not good for data input.
While in our daily life most of the Input Form and Report are linked to underlying Tables in a Relational Database, it is hard for us to build the Input Form and Report directly on top of a Graph Database. Although it is technically feasible to do so, due to the compatibility with other existing systems, as well as the human user behavior in both inputting and consuming data, the technical stacks will strike a balance between user experience and Model Fidelity.
In this sense, we decide to keep the Relational Database as an “Abstract Layer” between the Frontend Application (during Input) by both human / machine (i.e. API) users and the Graph Database.
The toughest trade-off of this method is that we need to periodically synchronise the data between the Relational Database and the Graph Database, either mutually (2-way) or 1-way, although this is still manageable.
Create a Table named “bNode” in the Relational Database to store (as records in a Table) all the Nodes.
This Node Table can also serve as a LookUp Data List, just as in a traditional Relational Database. For example, as the Option List of Gender {Male | Female | Transgender | Unisex} will always be the same no matter which system or knowledge domain it is under, there is no need to define the Option List of the property Gender for each system (or each instance of a system type). This Option List of the property Gender can be looked up by different systems via their Gender Field. This has nothing to do with the Graph Database; the Relational Database already serves the purpose perfectly.
Create a Table named bRelationship in the Relational Database to store (as records in a Table) all the relationships among the Nodes.
Example Record in a Relationship Table:
Source Node | Relationship Type | Target Node |
---|---|---|
Client | HAS_ONE | First Name |
Client | HAS_MANY | Email Address |
First Name | IDENTICAL_TO | Given Name |
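The two tables and the example records can be sketched with Python’s built-in `sqlite3` module. Only the table names bNode and bRelationship and the three example records come from the article; the column names, the `node_id` and `to_cypher` helpers, and the choice of SQLite are assumptions for illustration. The last step shows one possible shape of the 1-way synchronisation: turning each relationship row into a Cypher `MERGE` statement that a Graph Database driver could then execute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bNode (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE bRelationship (
        source_id INTEGER NOT NULL REFERENCES bNode(id),
        rel_type  TEXT    NOT NULL,
        target_id INTEGER NOT NULL REFERENCES bNode(id)
    );
""")

# The example records from the Relationship Table above.
records = [
    ("Client", "HAS_ONE", "First Name"),
    ("Client", "HAS_MANY", "Email Address"),
    ("First Name", "IDENTICAL_TO", "Given Name"),
]


def node_id(name):
    """Insert the node if it is new and return its id."""
    conn.execute("INSERT OR IGNORE INTO bNode (name) VALUES (?)", (name,))
    return conn.execute("SELECT id FROM bNode WHERE name = ?", (name,)).fetchone()[0]


for source, rel_type, target in records:
    conn.execute(
        "INSERT INTO bRelationship VALUES (?, ?, ?)",
        (node_id(source), rel_type, node_id(target)),
    )


def to_cypher(source, rel_type, target):
    """Build one Cypher MERGE statement and its parameters for a relationship row."""
    # The relationship type is interpolated into the query text, so validate it first.
    if not rel_type.replace("_", "").isalnum():
        raise ValueError(f"unsafe relationship type: {rel_type!r}")
    query = (
        "MERGE (a:bNode {name: $source}) "
        "MERGE (b:bNode {name: $target}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )
    return query, {"source": source, "target": target}


# Read the relationships back with node names resolved, in insertion order.
rows = conn.execute("""
    SELECT s.name, r.rel_type, t.name
    FROM bRelationship AS r
    JOIN bNode AS s ON s.id = r.source_id
    JOIN bNode AS t ON t.id = r.target_id
    ORDER BY r.rowid
""").fetchall()
statements = [to_cypher(*row) for row in rows]
```

Using `MERGE` rather than `CREATE` keeps the synchronisation repeatable: running the same statements again should not duplicate Nodes or Relationships on the graph side.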
While in the article Build a Business Process Management System – Stage of System Building we defined Modeling as the 1st stage of building a system, in the article Build a Business Process Management System – BFs-WAITER Pivot Table we further named the content or directions that should be included in the Modeling stage as BFs-WAITER.
No matter how comprehensively the model reflects the intricacies of the real world, we need a tool to effectively transform the model into an executable system with a human interface, namely bGraph in Diamond Digital Marketing Group. Before we dive into the functionality of bGraph, in order to sharpen its effectiveness, it is good practice to enumerate the problem patterns that we encountered when using traditional tools.
In the very beginning of a Modeling Stage, the Business Analyst (or Consultant, whatever you name it) will conduct an interview to the stakeholders of the target company in order to collect the information relating to the target business process. Any kind of documentation collection, verbal description, or even on front line field observation is carried out by the Business Analyst to be familiar with the target business process.
After the Business Analyst finished the interview, he/she should spend time on organising the data into information and pass it to the System Analyst (and his/her programmer team) and bring the BPM System Building stage to Stage 2 (Standardization).
However, different target businesses, different Business Analysts or different clients, will always use different wording or language to describe the same concept. For example, while the client will refer to the product they are selling as Product, Business Analyst will name the Product as SKU. Another example is that the wording Last Name is a synonym of Surname, which can be used interchangeably.
On the contrary, the Business Analyst and the Client may use the same wording to refer to different business concepts. A typical example is the term “Client”. In a manufacturing industrial chain, no matter if you are the Manufacturer, Distributor or Retailer, you will always call your downstream “Client”. During a BPM System Interview, a Business Analyst needs to pay extra attention to figure out who (Manufacturer, Distributor or Retailer) the term “Client” refers to. As professional Business Analysts, we name them the Brand, the Merchant, the Retailer and the End User respectively in order to uniquely identify them.
This polymorphism in communication not only occurs between the Client and Business Analyst, but also the Business Analyst and the Programmers. The more different wordings are used , the more resistance will be derived during the communication.
Therefore, a communication protocol that synchronises the wording is necessary.
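One way to picture such a protocol is a shared glossary that maps every wording heard in an interview onto one canonical term. The Python sketch below is only an illustration: the `GLOSSARY` contents, the `CLIENT_OF` chain and the function names are hypothetical, chosen to mirror the Product/SKU, Last Name/Surname and “Client” examples above, and do not describe how bGraph itself works.

```python
# Hypothetical glossary: canonical term -> wordings heard in interviews.
GLOSSARY = {
    "Product": {"Product", "SKU"},
    "Last Name": {"Last Name", "Surname"},
}

# Assumed reading of the supply chain: each party calls the next one "Client".
CLIENT_OF = {"Brand": "Merchant", "Merchant": "Retailer", "Retailer": "End User"}


def build_lookup(glossary):
    """Index every wording (case-insensitive) by its canonical term."""
    lookup = {}
    for canonical, wordings in glossary.items():
        for wording in wordings:
            key = wording.strip().lower()
            # Refuse a wording that is already claimed by another canonical term.
            if lookup.get(key, canonical) != canonical:
                raise ValueError(f"'{wording}' maps to two canonical terms")
            lookup[key] = canonical
    return lookup


LOOKUP = build_lookup(GLOSSARY)


def canonical_term(wording):
    """Return the agreed term for a wording, or None if it is not registered yet."""
    return LOOKUP.get(wording.strip().lower())


def resolve_client(speaker):
    """Resolve the ambiguous word "Client" using the party who said it."""
    return CLIENT_OF[speaker]


print(canonical_term("SKU"))       # Product
print(canonical_term("surname"))   # Last Name
print(resolve_client("Retailer"))  # End User
```

A wording that returns `None` is a signal that the Business Analyst should register it before it reaches the Programmers.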
No matter which industry, country or business model the client is in, a CRM system will always share some common properties and features.
For example, the client will expect a standard CRM to have a Contact module which has at least First Name and Last Name as properties of the object Contact.
As a Business Analyst in a BPM System Interview, you do not want to waste your own and your client’s time going through the common properties a CRM System should have, because those properties have probably been gone through many times in previous similar projects.
On top of that, a CRM must have a Country field for users to fill in the nationality of the client. As a Business Analyst, you do not want to go through the comprehensive list of countries again and again in different projects.
In this sense, it will be a great time saver if we can have a CRM System Building Template which comprises all the common properties of a standard CRM.
Even if you (as a Business Analyst) are public-spirited enough to have already encapsulated a comprehensive Country list as an array for use in the next project, how can another Business Analyst, or even your future self, remember or realize that the Country List has already been created?
Even worse, the concept Country can and will occur not only in a CRM, but also in almost any kind of system, such as a Project Management System, an Eshop or a Booking system. What will trigger the programmer who is going to build a Booking system to look into a previously built CRM system for the Country List? If he/she does not realize that a Country List already exists in some other project blueprint, he/she will probably spend time building it again, which duplicates the cost of development.
If the next Business Analyst does not realize that you have already done this before, he will not search for the Country List. There is always a gap between searching for the solution and the solution itself.
Although CRM systems have many properties in common, they have different properties too. For example, while a trading company may expect a Contact to be defined as a Company or Organisation, which should have a Company Name field, a beauty salon may expect every Contact to be an individual, who should have First Name and Last Name fields.
It is necessary for us (Diamond Digital Marketing Group) to have a system which stores all the commonalities and differences involved in building different systems for different clients.
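As a rough illustration of storing the commonalities once and the differences per client, the Python sketch below keeps a shared field list plus client-specific additions. The dictionary names, the two client types and the shortened three-country list are assumptions made for illustration only.

```python
# Reusable value lists, registered once under the concept name so that a CRM,
# an Eshop or a Booking system can all find the same list.
SHARED_LISTS = {
    "Country": ["Canada", "Japan", "United Kingdom"],  # shortened sample, not the full list
}

# Fields every Contact module shares, whatever the client.
COMMON_CONTACT_FIELDS = ["Country"]

# Fields that differ by client type (taken from the example above).
CLIENT_SPECIFIC_FIELDS = {
    "Trading Company": ["Company Name"],
    "Beauty Salon": ["First Name", "Last Name"],
}


def contact_fields(client_type):
    """Blueprint of the Contact module: specific fields first, then the common ones."""
    return CLIENT_SPECIFIC_FIELDS[client_type] + COMMON_CONTACT_FIELDS


def options_for(field_name):
    """Return the shared option list for a field, or None for free-text fields."""
    return SHARED_LISTS.get(field_name)


print(contact_fields("Trading Company"))  # ['Company Name', 'Country']
print(contact_fields("Beauty Salon"))     # ['First Name', 'Last Name', 'Country']
print(options_for("Country"))
```

Because the Country list lives under the concept name rather than inside one project, the next system that needs a Country field finds it by looking up the same key.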
To continue the CRM example: as a Business Analyst, even if you listened carefully to your client and clearly defined the common and different fields of the target CRM system after the 1st interview, it is very unlikely that you hit a home run and gathered 100% of the expected features and properties in that 1st interview. Moreover, system building is a lengthy project that often lasts for months or even years, so the business environment will probably change during the building period, which in turn affects the features and properties of the target CRM system.
Imagine a scenario as below:
On Day 1, the Business Analyst suggested that the fields First Name and Last Name be included in the Contact Module of the target CRM system. On the very next day, Day 2, the programmers kicked off the coding work and created a Table in the Database, as well as the First Name and Last Name Fields in the user interface.
However, on Day 3 a new Marketing Manager came on board on the client side. She considered the fields Maiden Name and Middle Name to be common sense and wanted them added to the Contact Module as well. She passed this request to our Business Analyst, who in turn passed it to the programmers on Day 5 by directly appending 2 new columns, Maiden Name and Middle Name, to the CRM system building blueprint spreadsheet.
This behavior confuses the programmers because, if you have paid attention to our story, they had already completed the coding work on Day 2. How could they know that 2 new columns have been appended to the blueprint spreadsheet which they have just turned into code?
You may suggest that the Business Analyst notify the programmers after every adjustment to the blueprint spreadsheet. But because the specification is in fact in a streaming state, which can and will change from time to time, it is impossible for the programmers to build the system on an ever-changing blueprint. Do you expect the programmers to check the blueprint spreadsheet for modifications every hour?
In this sense, a streaming oriented system blueprint is necessary for the communication between the Business Analyst and the Programmers, instead of a traditional system building blueprint which only reflects a single point in time.
This streaming oriented communication mechanism satisfies not only the need for modification during development, in line with the DevOps concept, but also the need for change after the system is brought into production, because the system is a living organism that responds to the ever-changing business environment. The traditional Batch (or Versioning) oriented approach cannot satisfy these needs.
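To make the contrast with a snapshot spreadsheet concrete, the Python sketch below models the blueprint as an append-only stream of change events that each reader consumes from its own cursor. It is a minimal sketch under stated assumptions: the class names, the `record`/`since` methods and the cursor idea are hypothetical and are not taken from bGraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldChange:
    seq: int          # position in the stream, starting at 1
    day: int          # project day on which the change was recorded
    module: str
    action: str       # "add" or "remove"
    field_name: str


class BlueprintStream:
    """Append-only log of blueprint changes (a hypothetical design)."""

    def __init__(self):
        self._events = []

    def record(self, day, module, action, field_name):
        if action not in ("add", "remove"):
            raise ValueError(f"unknown action: {action}")
        event = FieldChange(len(self._events) + 1, day, module, action, field_name)
        self._events.append(event)
        return event

    def since(self, seq):
        """Changes a reader has not seen yet, given the last seq it processed."""
        return [e for e in self._events if e.seq > seq]

    def current_fields(self, module):
        """Replay the stream to get the fields of a module as of now."""
        fields = []
        for e in self._events:
            if e.module != module:
                continue
            if e.action == "add" and e.field_name not in fields:
                fields.append(e.field_name)
            elif e.action == "remove" and e.field_name in fields:
                fields.remove(e.field_name)
        return fields


stream = BlueprintStream()
stream.record(1, "Contact", "add", "First Name")
stream.record(1, "Contact", "add", "Last Name")
programmer_cursor = 2  # the programmers coded everything up to here on Day 2

stream.record(5, "Contact", "add", "Maiden Name")
stream.record(5, "Contact", "add", "Middle Name")

pending = [e.field_name for e in stream.since(programmer_cursor)]
print(pending)                           # ['Maiden Name', 'Middle Name']
print(stream.current_fields("Contact"))
```

With a cursor per reader, the programmers do not need to re-read the whole blueprint every hour; they only ask for what has changed since the last event they implemented.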
As an experienced Business Analyst, you know that no matter how often you ask your client to submit a new field or feature request via a submission form, the client will probably not follow your instruction and will simply send the request to you via email or even WhatsApp.
After you receive the request from the client, instead of simply forwarding it to the programmer, it is your duty as a responsible and professional Business Analyst to validate whether the new request is valid (most of the time it is not).
For example, if the Client complains that the field Sex is missing from the Form in the user interface of the CRM’s Contact module, you should first go to the project blueprint to check whether the field Sex is included in the blueprint. If the Sex field can be found in the blueprint but not in the Form in the user interface, you should contact the programmer to fix it. But in the real world, most of the time the check reveals that the field Sex is in fact named Gender in the Form in the user interface of the Contact module.
This kind of back-and-forth checking and unproductive communication is the main cause of time being eroded from production work.
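The checking procedure described above can be sketched as a small triage function. This is an illustrative Python sketch only: the synonym table, the three outcome messages and the function names are assumptions, not an existing bGraph feature.

```python
# Hypothetical synonym table: wording -> canonical field name.
FIELD_SYNONYMS = {"sex": "Gender", "gender": "Gender"}


def canonical(name):
    """Map a field wording onto its canonical name (unknown wordings pass through)."""
    cleaned = name.strip()
    return FIELD_SYNONYMS.get(cleaned.lower(), cleaned)


def triage(requested, blueprint_fields, form_fields):
    """Classify a 'field is missing' complaint before it reaches the programmer."""
    wanted = canonical(requested)
    in_blueprint = any(canonical(f) == wanted for f in blueprint_fields)
    form_label = next((f for f in form_fields if canonical(f) == wanted), None)

    if form_label is not None:
        return f"invalid: already in the form as '{form_label}'"
    if in_blueprint:
        return "valid: in the blueprint but missing from the form, ask the programmer to fix it"
    return "new request: not in the blueprint, handle as a change request"


blueprint = ["First Name", "Last Name", "Gender"]
form = ["First Name", "Last Name", "Gender"]

print(triage("Sex", blueprint, form))
print(triage("Sex", blueprint, ["First Name", "Last Name"]))
print(triage("Middle Name", blueprint, form))
```

The first call reproduces the Sex/Gender case: the complaint is answered without involving the programmer at all.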
Imagine you are handling 10 BPM system building projects at once. How can you quickly open a system (if there is one!) on your mobile device to check whether a complaint from a client is valid? If the complaint is valid, how can you quickly send an instruction to the programmer to fix the bug when you are not seated in front of a desktop but travelling to the next client meeting?
If the complaint is valid and sits on the critical path of the project, so that an unfixed bug will cascade to the next node of the critical path and eventually cause an irreversible catastrophe, you cannot afford to wait until after the meeting to notify the programmer.
A powerful streaming BPM building system is necessary to cater for all these mobility needs in communication.
A Neural Network Model, also known as an artificial neural network (ANN), is a type of machine learning model inspired by the structure and function of the human brain.
When this model is applied in the Marketing domain, it becomes the Marketing Neural Networking Model.
Instead of diving into the intricacy of the mathematical formulas and operations, we will put the spotlight on the semantic logic behind the calculation.
In a nutshell, a Marketing Consultant mainly provides Marketing Strategy, and a Marketing Strategy is simply a series of decisions on how to choose among alternatives. For example, if you want to sell a Tattoo Printer to teenagers, will you use Facebook or Instagram to promote your product?
Choosing between “Facebook” and “Instagram” (i.e. 2 alternatives) is a Marketing Strategy decision. In reality, of course, it always takes more than 1 factor (or attribute) to make a decision, and more than 1 decision to formulate a strategy. You can think of it as a dynamic decision chain in which the outcome of 1 decision affects not only the outcome, but also the option values (i.e. all the alternatives), of the next decision.
The purpose of the Marketing Neural Networking Model is to learn how to make decisions in a scientific way.
Only after we turn decision making into a scientific process can we automate it via A.I. by applying the Marketing Neural Networking Model, which in turn creates an A.I. Marketing Consultant.
Although the intricacy of the Neural Networking Model is a bit scary, breaking it into pieces and demonstrating it with a story will help you comprehend the concept more efficiently. Bear in mind that this is a deliberately simplified example; in reality the scale will be 1000 times larger.
Before starting the story, allow us to provide you with the legend of the Figure (Marketing Neural Networking Model) above:
Rectangle ( ▭ ) : The Attribute (or Property, or Layer) of the Object, where the Object is the Marketing Neural Network Model.
Circle (○) : Nodes (i.e. any Business Concepts)
Solid Line ( ⎯⎯ ) : Positive Edges, which represent a directional relationship between 2 Nodes
Dotted Line (···) : Negative Edges, which represent NO directional relationship between 2 Nodes
Imagine you are the CEO of a conglomerate which runs a Fashion Retail Store as well as a Diamond Wholesaler business. Your shareholders require you to incrementally increase the ROI of the conglomerate by 10X, which is quite an impossible mission. In order to achieve this goal, you start by enumerating all the “Concepts” (i.e. the Nodes) in your mind which relate to the business, as below:
In reality, identifying, enumerating and filtering all the Concepts (i.e. the Nodes) relating to the business is an almost impossible task for human beings. The more knowledge Nodes a marketer acquires and manipulates, the more professional he/she is.
Back to our story: immediately after enumerating all the Nodes which you think are related to your business, you noticed some patterns within these Nodes:
Having played around with the interface of Google Merchant Center for a day, you realized that Google Merchant Center is mainly designed for listing products at retail price in the storefront of the Google Shopping tab. It is therefore better suited to retail than to wholesale business, because it has no field for entering tiered pricing or bulk discounts in the storefront. In this sense, you noticed that which Digital Assets (Attribute) you use depends on the Business Model (Attribute 1). Therefore you deduce your own business rule (which is called business intelligence in the business world) as below:
Business Rule 1 : Digital Assets are dependent on the Business Model
By applying Business Rule 1 in your business, you decide to adopt Google Merchant Center in your Fashion Retail Store (Edge 2) and NOT to adopt it in your Diamond Wholesaler business (Edge 5).
Having 10 years of experience using a LinkedIn Business Page, you understand that the users who are responsive on LinkedIn are mainly seeking business opportunities (i.e. B2B) rather than retail purchases (i.e. B2C). Despite this “insight”, you still occasionally scroll past Feeds on LinkedIn which sell to retail customers. As you cannot be 100% sure about your insight, you classify it as a Correlation Coefficient (denoted “r”) relationship: the Correlation Coefficient of the responsiveness between LinkedIn Business Page and Retail Business is low (e.g. r=0.3), while it is high (e.g. r=0.9) between LinkedIn Business Page and Wholesale Business.
At this stage, you can skip the mathematical operation behind the Correlation Coefficient. What you need to know is simply that the higher the value of the Correlation Coefficient (r), the stronger the positive association, and the closer this model treats the relationship to a (Positive) Causal Relationship.
Now, based on the Correlation Coefficient drawn from your empirical study, you deduce another Business Rule as below:
Business Rule 2 : The responsiveness of the LinkedIn Business Page is high for Wholesale Business and low for Retail Business.
By applying Business Rule 2 in your business, you decide to adopt the LinkedIn Business Page in your Diamond Wholesaler business (Edge 6) and NOT to adopt it in your Fashion Retail Store (Edge 3).
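Business Rule 2 can be read as a threshold on the Correlation Coefficient. The Python sketch below shows that reading under clearly labelled assumptions: the r values are the illustrative figures from the story, and the cut-off of 0.5 is an assumed value, since the article does not specify where “low” ends and “high” begins.

```python
# Correlation values from the story (illustrative figures, not measured data).
RESPONSIVENESS_R = {
    ("LinkedIn Business Page", "Fashion Retail Store"): 0.3,  # Edge 3
    ("LinkedIn Business Page", "Diamond Wholesaler"): 0.9,    # Edge 6
}

ADOPT_THRESHOLD = 0.5  # assumed cut-off; the article does not specify one


def edge_type(asset, business, threshold=ADOPT_THRESHOLD):
    """Return 'positive' (solid line) or 'negative' (dotted line) for an edge."""
    r = RESPONSIVENESS_R[(asset, business)]
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return "positive" if r >= threshold else "negative"


for (asset, business), r in RESPONSIVENESS_R.items():
    print(f"{business} > {asset}: r={r} -> {edge_type(asset, business)} edge")
```

Changing the threshold changes which Edges are drawn as solid lines, which is why the cut-off itself is a business decision rather than a mathematical fact.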
By continuing to deduce Business Rules based on your experience or other statistics, you figured out the following Business Rules for the Edges:
Involved Edges | Decision Path | Business Rule |
---|---|---|
Edge #1 and #7 | Fashion Retail Store > Website > Ads | Fashion Retail Store needs the Website as the landing page for its Ads. |
Edge #1 and #8 | Fashion Retail Store > Website > Payment Gateway | Fashion Retail Store needs a Payment Gateway installed in the Website to receive payment from the Client. |
Edge #1 and #9 | Fashion Retail Store > Website > Feed | Fashion Retail Store needs to put the Feed in the Website to publish content marketing articles. |
Edge #1 and #10 | Fashion Retail Store > Website > Enquiry Form | Fashion Retail Store needs to put the Enquiry Form in the Website to answer questions from the Client. |
Edge #2 and #11 | Fashion Retail Store > Google Merchant Center > Ads | Fashion Retail Store needs Google Merchant Center to showcase its products in Google Ads campaigns. |
Edge #2 and #12 | Fashion Retail Store > Google Merchant Center > Payment Gateway | Google Merchant Center does not support Payment Gateway. |
Edge #2 and #13 | Fashion Retail Store > Google Merchant Center > Feed | Fashion Retail Store needs to turn the Product Pages of the website into Google Merchant Center’s Feed. |
Edge #2 and #14 | Fashion Retail Store > Google Merchant Center > Enquiry Form | Google Merchant Center does not support the Enquiry Form function. |
Edge #3 and #15 | Fashion Retail Store > LinkedIn Business Page > Ads | Ads placed on LinkedIn Business Page are not appropriate for Fashion Retail Store. |
Edge #3 and #16 | Fashion Retail Store > LinkedIn Business Page > Payment Gateway | LinkedIn Business Page does not support Payment Gateway. |
Edge #3 and #17 | Fashion Retail Store > LinkedIn Business Page > Feed | The audience of LinkedIn Business Page does not expect a retail Feed from Fashion Retail Store to show up in their LinkedIn personal accounts. |
Edge #3 and #18 | Fashion Retail Store > LinkedIn Business Page > Enquiry Form | There is no Enquiry Form function in LinkedIn Business Page. |
Edge #4 and #7 | Diamond Wholesaler > Website > Ads | Diamond Wholesaler needs the Website as the landing page for its Ads. |
Edge #4 and #8 | Diamond Wholesaler > Website > Payment Gateway | Diamond Wholesaler does not expect the Client to place orders on the Website directly. Therefore a Payment Gateway is not needed. |
Edge #4 and #9 | Diamond Wholesaler > Website > Feed | Diamond Wholesaler needs to put the Feed in the Website to publish content marketing articles. |
Edge #4 and #10 | Diamond Wholesaler > Website > Enquiry Form | Diamond Wholesaler definitely needs an Enquiry Form in the Website, as the Client will ask for product info and transactional info before placing an order. |
Edge #5 and #11 | Diamond Wholesaler > Google Merchant Center > Ads | Diamond Wholesaler may not need to place Ads via Google Merchant Center campaigns because Google Merchant Center does not support tiered pricing or quantity pricing. |
Edge #5 and #12 | Diamond Wholesaler > Google Merchant Center > Payment Gateway | Google Merchant Center does not support Payment Gateway. |
Edge #5 and #13 | Diamond Wholesaler > Google Merchant Center > Feed | Diamond Wholesaler may not need to sync the Product Feed from its website to Google Merchant Center because Google Merchant Center does not support tiered pricing or quantity pricing. |
Edge #5 and #14 | Diamond Wholesaler > Google Merchant Center > Enquiry Form | There is no Enquiry Form function in Google Merchant Center. |
Edge #6 and #15 | Diamond Wholesaler > LinkedIn Business Page > Ads | It is appropriate for Diamond Wholesaler to place Ads on LinkedIn Business Page to reach management-level Decision Makers or Merchandisers through Job Title Ads segmentation. |
Edge #6 and #16 | Diamond Wholesaler > LinkedIn Business Page > Payment Gateway | LinkedIn Business Page does not support Payment Gateway. |
Edge #6 and #17 | Diamond Wholesaler > LinkedIn Business Page > Feed | It is appropriate for Diamond Wholesaler to publish its Feed on LinkedIn Business Page to reach management-level Decision Makers or Merchandisers. |
Edge #6 and #18 | Diamond Wholesaler > LinkedIn Business Page > Enquiry Form | There is no Enquiry Form function in LinkedIn Business Page. |
The reason we need to enumerate all the possible decision combinations is that Strategy means “decision”: to formulate a Marketing Strategy, covering all possible decisions comprehensively is as important as figuring out the appropriate answer to any single decision.
The only way to enumerate 100% of the decision combinations is to enumerate all the Attributes and all the Option Values of each Attribute, and multiply them together into a Cartesian Product. As a result, no decision combination is missing from the Model (i.e. we figure out exactly ALL the possibilities within the Model, no more and no less), provided that no relevant Attribute is missing from the Marketing Neural Networking Model. We will discuss that “bug” in an upcoming chapter.
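The Cartesian Product of the story’s Attributes can be sketched in a few lines of Python with the standard library’s `itertools.product`. The option values are read from the table of Business Rules; the attribute label "Function" for the last layer is an assumed name, since the text does not name that layer.

```python
from itertools import product

# Attributes and their option values, as read from the table of Business Rules.
# The attribute name "Function" is an assumed label for the last layer.
ATTRIBUTES = {
    "Business Model": ["Fashion Retail Store", "Diamond Wholesaler"],
    "Digital Assets": ["Website", "Google Merchant Center", "LinkedIn Business Page"],
    "Function": ["Ads", "Payment Gateway", "Feed", "Enquiry Form"],
}

# Cartesian Product: one tuple per decision combination, none missing, none repeated.
combinations = list(product(*ATTRIBUTES.values()))

print(len(combinations))             # 24 = 2 x 3 x 4
print(" > ".join(combinations[0]))   # Fashion Retail Store > Website > Ads
```

Adding one more Attribute with, say, 5 Option Values would multiply the total to 120, which is how the number of combinations grows so quickly in reality.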
Remember that in the old days (or even today, without A.I.) you learned digital marketing strategies by listening to the advice a senior digital marketing consultant gave to clients. Every time you took part in a client meeting, you were impressed by how deep his ocean of knowledge was; it seemed he could go on sharing it forever. You jotted down every single piece of know-how in a notebook and dreamed that you might become him some day, once you had acquired ALL his knowledge, although you never knew the “exact quantity” of that “ALL”.
Even if, by some miracle, you learned “all” the knowledge and became another iconic senior digital marketing consultant, the next generation will encounter the same problem as you did: he/she needs to take notes and learn piece by piece, starting from a blank sheet of paper.
This friction makes knowledge transmission extremely slow, just as it has been throughout the past 7,000 years of human history.
Bear in mind that the example in this section describes only 24 decision combinations, an extremely tiny portion of reality, which probably has tens of millions of decision combinations, far beyond what a mortal can process within a lifespan.
To record all the Knowledge Nodes and the relationships amongst them in a systematic way, the Neural Networking Model is a perfect candidate: it provides a paradigm that turns reality into a conceptualised mathematical model which can be worked on not only by human beings but also by computers, whose computing power can speed up the pace of learning by decades and make processing ALL decision combinations a mission possible.
Human learning is a complex and ongoing process which describes the interaction between human beings and the environment surrounding them, and how they interpret data and formulate a model to project the world. While it deserves a whole book to explain, in this article we only extract the part related to Data Structure.
First of all, Data has nothing to do with computers or digital technology. Data existed long before the invention of the computer or any digital device.
Allow me to explain Data with an example. One day 5,000 years ago in Mesopotamia, a Sumerian named Adamen brought a sheep to the market for sale. After standing in the street for almost 6 hours, he finally found a rich man who was willing to buy his sheep for 50 Shekels. He was happy and thought that if he could sell all the sheep he possessed, which was 10 sheep, he could achieve financial freedom. So he left the market and thought about how to execute his plan.
As soon as he arrived home, he realised how hard it would be to bring 10 sheep from his home to the market. He wondered whether, instead of bringing the whole sheep to the market, he could bring only a part of each. So he cut one nail from each sheep and brought these 10 nails to the market to make people believe that he possessed 10 sheep.
In this story, the nail of the sheep acts as Data which denotes the underlying material object: the sheep.
You may wonder why he didn’t simply take a piece of paper and write the word “sheep” on it. Please bear in mind that paper and written words had not been invented at that time.
Of course, as time went by and writing and paper were invented, people like Adamen could simply write down the word “Sheep” on paper to denote the underlying material object “Sheep”. Either way, the function of Data, which is to point a word (or symbol, or glyph, or character, or sound, or pronunciation, you name it) to an underlying material object, is always the same.
That’s the beginning of the story of Data.
Data structure is a core concept behind running a database. A data structure is a specialised format for organising, processing, retrieving, and storing data. It defines how data is arranged in a computer so that it can be accessed and updated efficiently. There are mainly 2 types of Data Structures:
In plain English, you can regard a Relational Data Structure as a 2-dimensional table which uses both Column and Row to coordinate a Value (i.e. what we call a “Cell” in MS Excel or Google Spreadsheet). It mainly focuses on the relationship between the attributes (i.e. the Columns, such as Name and Born in) and the object they are attached to (i.e. the Table Ancient Celebrities).
Example of a Relational Data Structure (i.e. a Table)
Ancient Celebrities # | Name | Born in | Job Title |
---|---|---|---|
201 | Plato | B.C 429 | Philosopher |
202 | Aristotle | B.C 384 | Philosopher & Mathematician |
203 | Alexander the Great | B.C 356 | King of Macedonia |
In plain English, you can regard a Non-relational Data Structure as a tree (or hierarchical) list which uses Nodes and Edges to coordinate the Value. Unlike a Relational Data Structure, which focuses on the relationship between an attribute and the object it is attached to, a Non-relational Data Structure focuses on the relationship (i.e. the Edge) between one Object (i.e. a Node) and another Object (i.e. another Node).
Example of a Non-relational Data Structure (i.e. a Tree List)
- Plato (Node 1)
  - Aristotle (Node 2)
    - Alexander the Great (Node 3)
Here there are 3 Nodes in the Tree List. Although it is tempting to think that there are only 2 relationships (Edges) between the 3 Nodes, in fact there are 4 relationships (Edges) among them:
There are 4 Edges instead of 2 because the direction of the relationship (Edge) matters.
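The difference between the two structures, and the count of 4 directed Edges, can be sketched in Python as follows. The Edge labels `TEACHER_OF` and `STUDENT_OF` and the helper names are assumptions for illustration, based on the teacher and student relationships among the three people; only the Name and Job Title columns of the Table are reproduced here.

```python
# Relational view: a Value is located by (row, column).
TABLE = {
    201: {"Name": "Plato", "Job Title": "Philosopher"},
    202: {"Name": "Aristotle", "Job Title": "Philosopher & Mathematician"},
    203: {"Name": "Alexander the Great", "Job Title": "King of Macedonia"},
}

# Graph view: a relationship is a directed Edge (source, label, target).
# The labels are assumptions; each teacher/student pair yields two Edges.
EDGES = [
    ("Plato", "TEACHER_OF", "Aristotle"),
    ("Aristotle", "STUDENT_OF", "Plato"),
    ("Aristotle", "TEACHER_OF", "Alexander the Great"),
    ("Alexander the Great", "STUDENT_OF", "Aristotle"),
]


def cell(row_id, column):
    """Relational lookup: coordinate a Value by Row and Column."""
    return TABLE[row_id][column]


def outgoing(node):
    """Graph lookup: every Edge that starts at the given Node."""
    return [(label, target) for source, label, target in EDGES if source == node]


print(cell(202, "Job Title"))
print(len(EDGES))              # 4 directed Edges between 3 Nodes
print(outgoing("Aristotle"))
```

Note that the Table answers “what is Aristotle’s Job Title?” while the Edge list answers “how is Aristotle related to the other Nodes?”, and the second question has a different answer depending on which Node you start from.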
Let’s start this topic with a question from your friend:
Hey, who is Aristotle?
To answer this question, you may reply to him in English as below:
Aristotle was an ancient philosopher and mathematician, born in B.C 384, who was the student of Plato as well as the teacher of Alexander the Great.
The answer above is exactly what we would say in everyday English, and the sentence is informative enough for anyone to get a brief understanding of who Aristotle is. Nevertheless, even if you are very good at English, you will probably spend more time reading through the sentence word by word than reading the Table and the Tree List.
In fact, what you do to comprehend the sentence is identify the attributes of Aristotle (e.g. Born in, Job Title), as well as his hierarchical relationships (i.e. Edges) with Plato (Node 1) and Alexander the Great (Node 3).
With the content presented in Table and Tree List format, and with only a few hours of practice, anyone can comprehend an article much faster than by reading it in plain English.
However, the story of human learning does not end here. Back to our example: your friend listened carefully to your reply and understood that Aristotle was the student of Plato, but, to your surprise, he did not know the meaning of “B.C.” and asked you what it is.
“B.C.” is an abbreviation of “Before Christ”. It is a dating system used to denote any year before the birth of Christ. The opposite of “B.C.” is “A.D.”, which stands for “Anno Domini”, a Latin phrase meaning “In the Year of Our Lord”. The year 2024 is A.D. 2024; we normally omit “A.D.” because it is the default.
After your reply, your friend has learned something new about the dating systems B.C. and A.D. You can turn the plain English into Table and Tree List format, as we did before, as below:
Abbreviation | Full Form | Language | Years Represented |
---|---|---|---|
B.C | Before Christ | English | Years before the birth of Christ |
A.D | Anno Domini | Latin | Years from the birth of Christ onwards |
- Dating System
  - B.C.
  - A.D.
In fact, every single concept (I call it a Knowledge Node) always has its own attributes as well as relationships (i.e. Edges) with other Nodes.
Imagine your friend is a 5-year-old boy who knows very little about what you said (and even about this world!), and he asks you about almost every single word in your sentence, like this:
If you turn all these 11 concepts (i.e. Knowledge Nodes) into Table and Tree List format, you can imagine the Data Structure will resemble the image below:
This is a typical Adaptive Search pattern, in which someone needs to “search for what he wants to search for”. It in turn forms a Knowledge Graph, and a smart person like you will quickly realise that you can (or need to) add an infinite number of Nodes and Edges to the Graph in order to learn something. The more Nodes you add to the diagram, the more attributes are derived, and each attribute of a Node can become a new Node.
And that’s exactly how the data structure behaves during human learning.
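The way one question leads to another can be sketched as a breadth-first walk over “needs to know first” Edges. The Python sketch below is illustrative only: the `NEEDS` mapping is a hypothetical, much smaller stand-in for the real chain of questions, and the function name is an assumption.

```python
from collections import deque

# Hypothetical "needs to know first" Edges for the Aristotle example.
NEEDS = {
    "Aristotle": ["Philosopher", "B.C."],
    "B.C.": ["Dating System"],
    "Dating System": ["A.D."],
}


def learning_levels(start, needs):
    """Breadth-first walk: how many levels deep each Knowledge Node sits below start."""
    levels = {start: 0}
    queue = deque([start])
    while queue:
        concept = queue.popleft()
        for nxt in needs.get(concept, []):
            if nxt not in levels:  # visit each Node only once
                levels[nxt] = levels[concept] + 1
                queue.append(nxt)
    return levels


levels = learning_levels("Aristotle", NEEDS)
for concept, depth in levels.items():
    print("  " * depth + concept)
```

Every new entry added to `NEEDS` can introduce further Nodes of its own, which is why the walk has no natural stopping point unless the learner decides on one.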
Remember the previous example, in which you explained to your friend who Aristotle is. In order to understand who Aristotle is, he needed to acquire the foundation knowledge, which made him dive into 4 levels of Nodes as below:
You can now sense the challenge a human being faces when learning a new concept: he will soon get lost in the maze, because he has no idea how many levels he should dive into in order to comprehensively understand a single concept in a topic (i.e. a Knowledge Domain). And the Knowledge in your brain will finally be distributed in this way:
Nevertheless, don’t be upset by the truth: we should find (and have already found) a “Map” to navigate this knowledge maze.
Finally, let’s go back to Aristotle again and end this topic with a quotation attributed to him, which describes the problem suffered during human learning:
The More You Know , The More You Realize You Don’t Know