PG formats¶
PG format¶
- A PG file consists of lines that describe nodes and edges
- Each line describes one node or one edge
example.pg
# NODES
101 :person name:Alice country:"United States"
102 :person :student name:Bob country:Japan
# EDGES
101 -- 102 :same_school :same_class since:2012
101 -> 102 :likes since:2015
Nodes¶
<node_id> :<label1> :<label2> ... <key1>:<value1> <key2>:<value2> ...
All elements are separated by spaces or tabs
Node IDs have to be unique
- If multiple lines share the same node ID, the later ones are ignored
Each line can contain arbitrarily many labels
Each line can contain arbitrarily many properties
Each property can have multiple values.
The following example has multiple values for the name property:
101 :person name:Alice name:Ally country:"United States"
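The node rules above can be sketched in Python as follows. This is a minimal illustration, not a reference parser: the function names (split_elements, unquote, split_key_value, parse_node_line, load_nodes) are hypothetical, all values are kept as strings, and every input line is assumed to be a non-empty node line.

```python
def split_elements(line):
    """Split a PG line on spaces/tabs, keeping double-quoted parts intact."""
    tokens, buf, in_quotes, i = [], [], False, 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and in_quotes and i + 1 < len(line):
            buf.append(line[i:i + 2])  # keep the escape; unquote() resolves it
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
        if ch in " \t" and not in_quotes:
            if buf:
                tokens.append("".join(buf))
                buf = []
        else:
            buf.append(ch)
        i += 1
    if buf:
        tokens.append("".join(buf))
    return tokens


def unquote(text):
    """Strip surrounding double quotes and resolve \\" escapes."""
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1].replace('\\"', '"')
    return text


def split_key_value(token):
    """Split at the first colon that is not inside double quotes."""
    in_quotes, i = False, 0
    while i < len(token):
        ch = token[i]
        if ch == "\\" and in_quotes:
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == ":" and not in_quotes:
            return unquote(token[:i]), unquote(token[i + 1:])
        i += 1
    raise ValueError(f"not a key:value element: {token!r}")


def parse_node_line(line):
    """Return (node_id, labels, properties); each property maps to a list."""
    node_id, *rest = split_elements(line)
    labels, properties = [], {}
    for token in rest:
        if token.startswith(":"):
            labels.append(unquote(token[1:]))
        else:
            key, value = split_key_value(token)
            properties.setdefault(key, []).append(value)
    return unquote(node_id), labels, properties


def load_nodes(lines):
    """Keep the first line seen for each node ID; later ones are ignored."""
    nodes = {}
    for line in lines:
        node_id, labels, properties = parse_node_line(line)
        nodes.setdefault(node_id, (labels, properties))
    return nodes
```

Storing every property as a list keeps single-valued and multi-valued properties uniform, which also matches the shape used by JSON-PG.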
Edges¶
<src_node_id> [->|--] <dst_node_id> :<label1> :<label2> ... <key1>:<value1> <key2>:<value2> ...
- Basically, edge lines have the same format as node lines
- However, the first three columns contain source node ID, direction, and destination node ID
- An edge can be directed (->) or undirected (--)
- The combinations of node IDs do NOT have to be unique. (= multiple edges are allowed)
- An edge line is ignored if it refers to a node ID that is not defined
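As a rough sketch of the edge rules, the Python below reads only the first three columns of an edge line and drops edges that refer to unknown nodes. The names parse_edge_head and load_edges are hypothetical, node IDs are assumed to be unquoted (no spaces), and the labels and properties are left unparsed in a "rest" field.

```python
def parse_edge_head(line):
    """Read source ID, direction, and destination ID from an edge line."""
    parts = line.split(None, 3)  # split on runs of spaces/tabs, at most 3 times
    if len(parts) < 3 or parts[1] not in ("->", "--"):
        raise ValueError(f"not an edge line: {line!r}")
    src, direction, dst = parts[:3]
    rest = parts[3] if len(parts) > 3 else ""
    return {"from": src, "to": dst, "undirected": direction == "--", "rest": rest}


def load_edges(lines, known_node_ids):
    """Keep edges whose endpoints are both defined; repeated pairs are kept."""
    edges = []
    for line in lines:
        edge = parse_edge_head(line)
        if edge["from"] in known_node_ids and edge["to"] in known_node_ids:
            edges.append(edge)
    return edges
```

Because edges are collected in a list rather than keyed by their endpoints, two lines with the same pair of node IDs yield two separate edges.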
Data type¶
PG format allows the following data types:
- Integer: Written as a sequence of digits
  - For example, 1, 009 and 301
- Double-precision floating-point number (double): Written as a sequence of digits with exactly one period
  - For example, 1.0, 2.321 and 001.002
- String: Anything else
  - Should be double quoted if it contains a space, tab, or colon (:)
  - To escape double quotes, use \"
  - For example, Alice, x2, "2.00", "United States" and "\"Quoted String\""
Each element can have one of the following data types:
- Node ID: integer or string
- Label: string
- Property key: string
- Property value: integer, double, or string
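One way to read the data type rules is as a small classifier over the raw text of a value. The sketch below is an interpretation, not a normative definition: the name classify is hypothetical, and it assumes that a double needs at least one digit on each side of the period and that signed values such as -1 fall through to strings, since the rules above mention digits only.

```python
import re

INTEGER = re.compile(r"[0-9]+")
DOUBLE = re.compile(r"[0-9]+\.[0-9]+")  # assumption: digits on both sides


def classify(raw):
    """Return (type_name, python_value) for one raw PG value."""
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        # Quoted values are always strings, even if they look numeric.
        return "string", raw[1:-1].replace('\\"', '"')
    if INTEGER.fullmatch(raw):
        return "integer", int(raw)
    if DOUBLE.fullmatch(raw):
        return "double", float(raw)
    return "string", raw


for raw in ["009", "001.002", "x2", '"2.00"', '"United States"']:
    print(raw, "->", classify(raw))
```

Checking for quotes first is what lets "2.00" stay a string while 2.00 becomes a double.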
JSON-PG format¶
The JSON-PG format is easier for web clients to process, while the PG (flat file) format above is convenient for users and file systems.
JSON-PG follows the rules of standard JSON and of the PG format, with these specifics:
- Nodes and edges are listed under the nodes and edges elements, respectively
- Edge direction is defined with the boolean element undirected. By default it is false (= directed)
- Labels are listed under the labels element
- Properties (= key-value pairs) are listed under the properties element
example.json
{
"nodes":[
{"id":101, "labels":["person"], "properties":{"name":["Alice"], "country":["United States"]}}
, {"id":102, "labels":["person", "student"], "properties":{"name":["Bob"], "country":["Japan"]}}
],
"edges":[
{"from":101, "to":102, "undirected":true, "labels":["same_school", "same_class"], "properties":{"since":[2012]}}
, {"from":101, "to":102, "labels":["likes"], "properties":{"since":[2015]}}
]
}
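Since JSON-PG is ordinary JSON, it can be read and written with a standard JSON library. The following Python sketch rebuilds the example above as a dictionary, round-trips it through the json module, and applies the default for a missing undirected element. The helper is_undirected is a hypothetical name used only for illustration.

```python
import json

doc = {
    "nodes": [
        {"id": 101, "labels": ["person"],
         "properties": {"name": ["Alice"], "country": ["United States"]}},
        {"id": 102, "labels": ["person", "student"],
         "properties": {"name": ["Bob"], "country": ["Japan"]}},
    ],
    "edges": [
        {"from": 101, "to": 102, "undirected": True,
         "labels": ["same_school", "same_class"],
         "properties": {"since": [2012]}},
        {"from": 101, "to": 102, "labels": ["likes"],
         "properties": {"since": [2015]}},
    ],
}


def is_undirected(edge):
    """A missing "undirected" element means false, i.e. a directed edge."""
    return edge.get("undirected", False)


text = json.dumps(doc)     # serialize to a JSON-PG string
loaded = json.loads(text)  # parse it back into Python objects
arrows = ["--" if is_undirected(e) else "->" for e in loaded["edges"]]
print(arrows)
```

Note that every property value is a list, even when it holds a single value, which mirrors the multi-valued properties of the PG format.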
Comparing the formats¶
PG
# NODES
101 :person name:Alice country:"United States"
102 :person :student name:Bob country:Japan
# EDGES
101 -- 102 :same_school :same_class since:2012
101 -> 102 :likes since:2015
JSON-PG
{
"nodes":[
{"id":101, "labels":["person"], "properties":{"name":["Alice"], "country":["United States"]}}
, {"id":102, "labels":["person", "student"], "properties":{"name":["Bob"], "country":["Japan"]}}
],
"edges":[
{"from":101, "to":102, "undirected":true, "labels":["same_school", "same_class"], "properties":{"since":[2012]}}
, {"from":101, "to":102, "labels":["likes"], "properties":{"since":[2015]}}
]
}
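The correspondence between the two formats can be sketched as a small converter from JSON-PG objects to PG lines. This is an illustrative sketch under assumptions: the names fmt, node_line and edge_line are hypothetical, elements are joined with a single space, strings that look numeric are quoted so that they stay strings, and floats are written with Python's str(), which may not match the PG double syntax for very large or very small values.

```python
import re

NEEDS_QUOTES = re.compile(r'[ \t:"]')
LOOKS_NUMERIC = re.compile(r"[0-9]+(\.[0-9]+)?")


def fmt(value):
    """Render one ID, label, key, or value as PG text."""
    if isinstance(value, str):
        if value == "" or NEEDS_QUOTES.search(value) or LOOKS_NUMERIC.fullmatch(value):
            return '"' + value.replace('"', '\\"') + '"'
        return value
    return str(value)  # integers and floats


def labels_and_properties(item):
    parts = [":" + fmt(label) for label in item.get("labels", [])]
    for key, values in item.get("properties", {}).items():
        parts += [fmt(key) + ":" + fmt(v) for v in values]
    return parts


def node_line(node):
    return " ".join([fmt(node["id"])] + labels_and_properties(node))


def edge_line(edge):
    arrow = "--" if edge.get("undirected", False) else "->"
    head = [fmt(edge["from"]), arrow, fmt(edge["to"])]
    return " ".join(head + labels_and_properties(edge))


node = {"id": 101, "labels": ["person"],
        "properties": {"name": ["Alice"], "country": ["United States"]}}
edge = {"from": 101, "to": 102, "undirected": True,
        "labels": ["same_school", "same_class"], "properties": {"since": [2012]}}
print(node_line(node))
print(edge_line(edge))
```

Applied to the first node and the first edge of the JSON-PG example, this reproduces the corresponding lines of the PG example.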