Merged
2 changes: 2 additions & 0 deletions docs/source/api-reference/colors.rst
@@ -0,0 +1,2 @@
.. autoenum:: neo4j_viz.colors.ColorSpace
:members:
49 changes: 26 additions & 23 deletions docs/source/customizing.rst
@@ -23,24 +23,26 @@ If you have not yet created a ``VisualizationGraph`` object, please refer to one
Coloring nodes
--------------

Nodes can be colored directly by providing them with a color property, upon creation.
Nodes can be colored directly by providing them with a color field, upon creation.
This can for example be done by passing a color as a string to the ``color`` parameter of the
:doc:`Node <./api-reference/node>` object.

Alternatively, you can color nodes based on a property (field) of the nodes after a ``VisualizationGraph`` object has been
Alternatively, you can color nodes based on a field or property of the nodes after a ``VisualizationGraph`` object has been
created.


The ``color_nodes`` method
~~~~~~~~~~~~~~~~~~~~~~~~~~

By calling the :meth:`neo4j_viz.VisualizationGraph.color_nodes` method, you can color nodes based on a
node property (field).
It's possible to color the nodes based on a discrete or continuous property.
In the discrete case, a new color from the ``colors`` provided is assigned to each unique value of the node property.
In the continuous case, the ``colors`` should be a list of colors representing a range that are used to create a gradient of colors based on the values of the node property.
node field or property (members of the `Node.properties` map).

By default the Neo4j color palette that works for both light and dark mode will be used.
It's possible to color the nodes based on a discrete or continuous color space (see :doc:`ColorSpace <./api-reference/colors>`).
In the discrete case, a new color from the `colors` provided is assigned to each unique value of the node field/property.
In the continuous case, the `colors` should be a list of colors representing a range that are used to
create a gradient of colors based on the values of the node field/property.

By default, the Neo4j color palette, which works for both light and dark mode, will be used.
If you want to use a different color palette, you can pass a dictionary or iterable of colors as the ``colors``
parameter.
A color value can for example be either strings like "blue", or hexadecimal color codes like "#FF0000", or even a tuple of RGB values like (255, 0, 255).
@@ -49,20 +51,20 @@ If some nodes already have a ``color`` set, you can choose whether or not to ove
parameter.


By discrete node property (field)
*********************************
By discrete color space
***********************

To not use the default colors, we can provide a list of custom colors based on the discrete node property (field) "caption" to the ``color_nodes`` method:
To not use the default colors, we can provide a list of custom colors based on the discrete node field "caption" to the ``color_nodes`` method:

.. code-block:: python

from neo4j_viz.colors import PropertyType
from neo4j_viz.colors import ColorSpace

# VG is a VisualizationGraph object
VG.color_nodes(
"caption",
field="caption",
colors=["red", "#7fffd4", (255, 255, 255, 0.5), "hsl(270, 60%, 70%)"],
property_type=PropertyType.DISCRETE
color_space=ColorSpace.DISCRETE
)

The full set of allowed values for colors are listed `here <https://docs.pydantic.dev/2.0/usage/types/extra_types/color_types/>`_.
@@ -75,18 +77,18 @@ this snippet:
from palettable.wesanderson import Moonrise1_5

# VG is a VisualizationGraph object
VG.color_nodes("caption", Moonrise1_5.colors) # PropertyType.DISCRETE is default
VG.color_nodes(field="caption", colors=Moonrise1_5.colors) # ColorSpace.DISCRETE is default

In this case, all nodes with the same caption will get the same color.
In these cases, all nodes with the same caption will get the same color.

If there are fewer colors that unique values for the node ``property`` provided, the colors will be reused in a cycle.
To avoid that, you could use another palette or extend one with additional colors. Please refer to the
If there are fewer colors than unique values for the node ``field`` or ``property`` provided, the colors will be reused in a cycle.
To avoid that, you could use a larger palette or extend one with additional colors. Please refer to the
:doc:`Visualizing Neo4j Graph Data Science (GDS) Graphs tutorial <./tutorials/gds-example>` for an example on how
to do the latter.
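
The cycling behaviour described above can be illustrated with a small standalone sketch. This is not the ``neo4j-viz`` implementation; ``assign_discrete_colors`` is a hypothetical helper written only to show how a short palette wraps around when there are more unique values than colors.

```python
from itertools import cycle


def assign_discrete_colors(values, colors):
    """Map each unique value to a color, reusing the colors in a cycle.

    Hypothetical helper for illustration only; not part of neo4j-viz.
    """
    palette = cycle(colors)
    mapping = {}
    for value in values:
        # Only the first occurrence of a value draws a new color.
        if value not in mapping:
            mapping[value] = next(palette)
    return mapping


captions = ["Person", "Movie", "Person", "Genre", "Studio"]
mapping = assign_discrete_colors(captions, ["red", "green", "blue"])
print(mapping)
```

With four unique captions and three colors, the fourth caption wraps around to the first color again, which is why a larger palette avoids ambiguous coloring.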


By continuous node property (field)
***********************************
By continuous color space
*************************

To not use the default colors, we can provide a list of custom colors representing a range to the ``color_nodes`` method:

@@ -96,9 +98,9 @@ To not use the default colors, we can provide a list of custom colors representi

# VG is a VisualizationGraph object
VG.color_nodes(
"centrality_score",
property="centrality_score",
colors=[(255, 0, 0), (191, 64, 0), (128, 128, 0), (64, 191, 0), (0, 255, 0)],  # From red to green
property_type=PropertyType.CONTINUOUS
color_space=ColorSpace.CONTINUOUS
)

In this case, the nodes will be colored based on the value of the "centrality_score" property, with the lowest values being colored red and the highest values being colored green.
@@ -110,7 +112,7 @@ Since we only provided five colors in the range, the granularity of the gradient
Sizing nodes
------------

Nodes can be given a size directly by providing them with a size property, upon creation.
Nodes can be given a size directly by providing them with a size field, upon creation.
This can for example be done by passing a size as an integer to the ``size`` parameter of the
:doc:`Node <./api-reference/node>` object.

@@ -178,7 +180,7 @@ In the following example, we pin the node with ID 1337 and unpin the node with I
Direct modification of nodes and relationships
----------------------------------------------

Nodes and relationships can also be modified directly by accessing the ``nodes`` and ``relationships`` attributes of an
Nodes and relationships can also be modified directly by accessing the ``nodes`` and ``relationships`` fields of an
existing ``VisualizationGraph`` object.
These attributes are lists of all the :doc:`Nodes <./api-reference/node>` and
:doc:`Relationships <./api-reference/relationship>` in the graph, respectively.
@@ -189,6 +191,7 @@ Each node and relationship has attributes that can be accessed and modified dire

# VG is a VisualizationGraph object
VG.nodes[0].size = 10
VG.nodes[0].properties["height"] = 170
VG.relationships[4].caption = "BUYS"

Any changes made to the nodes and relationships will be reflected in the next rendering of the graph.
32 changes: 20 additions & 12 deletions docs/source/integration.rst
@@ -3,8 +3,9 @@ Integration with other libraries

In addition to creating graphs from scratch, with ``neo4j-viz`` as is shown in the
:doc:`Getting started section <./getting-started>`, you can also import data directly from external sources.
In this section we will cover how to import data from `Pandas DataFrames <https://pandas.pydata.org/>`_ and
`Neo4j Graph Data Science <https://neo4j.com/docs/graph-data-science/current/>`_.
In this section we will cover how to import data from `Pandas DataFrames <https://pandas.pydata.org/>`_,
`Neo4j Graph Data Science <https://neo4j.com/docs/graph-data-science/current/>`_ and
`Neo4j Database <https://neo4j.com/docs/python-manual/current/>`_.


.. contents:: On this page:
@@ -31,12 +32,18 @@ The ``from_dfs`` method takes two mandatory positional parameters:

* A Pandas ``DataFrame``, or iterable (eg. list) of DataFrames representing the nodes of the graph.
The rows of the DataFrame(s) should represent the individual nodes, and the columns should represent the node
IDs and properties. The columns map directly to fields of :doc:`Node <./api-reference/node>`, and as such
should follow the same naming conventions.
IDs and attributes.
If a column shares the name with a field of :doc:`Node <./api-reference/node>`, the values it contains will be set
on corresponding nodes under that field name.
Otherwise, the column name will be a key in each node's `properties` dictionary, that maps to the node's corresponding
value in the column.
* A Pandas ``DataFrame``, or iterable (eg. list) of DataFrames representing the relationships of the graph.
The rows of the DataFrame(s) should represent the individual relationships, and the columns should represent the
relationship IDs and properties. The columns map directly to fields of
:doc:`Relationship <./api-reference/relationship>`, and as such should follow the same naming conventions.
relationship IDs and attributes.
If a column shares the name with a field of :doc:`Relationship <./api-reference/relationship>`, the values it contains
will be set on corresponding relationships under that field name.
Otherwise, the column name will be a key in each relationship's `properties` dictionary, that maps to the relationship's
corresponding value in the column.
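
The column routing rule in the two bullets above can be sketched without any library dependency. The names below are assumptions made for illustration: ``NODE_FIELDS`` lists only a few field names mentioned in these docs, and ``row_to_node_kwargs`` is a hypothetical helper, not the function ``from_dfs`` actually uses.

```python
# Assumed for illustration: a small subset of Node field names.
NODE_FIELDS = {"id", "caption", "size", "color"}


def row_to_node_kwargs(row, known_fields=NODE_FIELDS):
    """Split one row (column name -> value) into top-level fields and properties.

    Hypothetical helper; it only mirrors the rule described in the docs.
    """
    kwargs = {"properties": {}}
    for column, value in row.items():
        if column in known_fields:
            # The column shares its name with a field: set it directly.
            kwargs[column] = value
        else:
            # Any other column becomes a key in the properties dictionary.
            kwargs["properties"][column] = value
    return kwargs


row = {"id": 1, "caption": "Person", "height": 170}
node_kwargs = row_to_node_kwargs(row)
print(node_kwargs)
```

Here ``id`` and ``caption`` end up as top-level entries, while ``height`` lands under ``properties`` because it does not match any of the assumed field names.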

``from_dfs`` also takes an optional property, ``node_radius_min_max``, that can be used (and is used by default) to
scale the node sizes for the visualization.
@@ -97,11 +104,12 @@ The ``from_gds`` method takes two mandatory positional parameters:
* A ``Graph`` representing the projection that one wants to import.

We can also provide an optional ``size_property`` parameter, which should refer to a node property of the projection,
and will be used to determine the size of the nodes in the visualization.
and will be used to determine the sizes of the nodes in the visualization.

The ``additional_node_properties`` parameter is also optional, and should be a list of additional node properties of the
projection that you want to include in the visualization.
For example, these properties could be used to color the nodes, or give captions to them in the visualization.
For example, these properties could be used to color the nodes, give captions to them in the visualization, or simply be
included in the nodes' `Node.properties` maps without directly impacting the visualization.

The last optional property, ``node_radius_min_max``, can be used (and is used by default) to scale the node sizes for
the visualization.
@@ -143,7 +151,7 @@ We use the "pagerank" property to determine the size of the nodes, and the "comp

# Color the nodes by the `componentId` property, so that the nodes are
# colored by the connected component they belong to
VG.color_nodes("componentId")
VG.color_nodes(property="componentId")


Please see the :doc:`Visualizing Neo4j Graph Data Science (GDS) Graphs tutorial <./tutorials/gds-example>` for a
@@ -167,10 +175,10 @@ The ``from_neo4j`` method takes one mandatory positional parameters:

* A ``result`` representing the query result either in form of `neo4j.graph.Graph` or `neo4j.Result`.

The ``node_caption`` parameter is also optional, and indicates the value to use for the caption of each node in the visualization.
The ``node_caption`` parameter is also optional, and indicates the node property to use for the caption of each node in the visualization.

We can also provide an optional ``size_property`` parameter, which should refer to a node property of the projection,
and will be used to determine the size of the nodes in the visualization.
We can also provide an optional ``size_property`` parameter, which should refer to a node property,
and will be used to determine the sizes of the nodes in the visualization.

The last optional property, ``node_radius_min_max``, can be used (and is used by default) to scale the node sizes for
the visualization.
106 changes: 43 additions & 63 deletions examples/gds-example.ipynb

Large diffs are not rendered by default.

30 changes: 15 additions & 15 deletions examples/neo4j-example.ipynb

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions examples/snowpark-example.ipynb
@@ -206,7 +206,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "2322065c",
"id": "887f41b7a243d439",
"metadata": {},
"outputs": [],
"source": [
@@ -215,7 +215,7 @@
"VG = from_dfs(products_df, parents_df)\n",
"\n",
"# Using the default Neo4j color scheme\n",
"VG.color_nodes(\"CATEGORY\")"
"VG.color_nodes(property=\"CATEGORY\")"
]
},
{
2 changes: 1 addition & 1 deletion examples/streamlit-example.py
@@ -25,7 +25,7 @@ def create_visualization_graph() -> VisualizationGraph:
nodes_df.drop(columns="features", inplace=True)

VG = from_dfs(nodes_df, rels_df)
VG.color_nodes("subject")
VG.color_nodes(property="subject")

return VG

17 changes: 16 additions & 1 deletion python-wrapper/src/neo4j_viz/colors.py
@@ -2,14 +2,29 @@
from enum import Enum
from typing import Any, Union

import enum_tools
from pydantic_extra_types.color import ColorType

ColorsType = Union[dict[Any, ColorType], Iterable[ColorType]]


class PropertyType(Enum):
@enum_tools.documentation.document_enum
class ColorSpace(Enum):
Collaborator

I think I would prefer ValueSpace over ColorSpace.

Could add a docstring to clarify what we mean here.

Collaborator Author

I prefer ColorSpace as it also describes how one should interpret the colors provided to color_nodes. And a field/property can have float values, but you still might want to color it with a discrete space

Collaborator Author

But yes, we should add doc strings

Collaborator Author

Done :)

"""
Describes the type of color space used by a color palette.
"""

DISCRETE = "discrete"
"""
This category describes a color palette that is a collection of different colors that are not necessarily related to
each other. Discrete color spaces are suitable for categorical data, where each unique category is represented by a
different color.
"""
CONTINUOUS = "continuous"
"""
This category describes a color palette that is a range/gradient of colors between two or more colors. Continuous
color spaces are suitable for continuous data (typically floats), where values can change smoothly.
"""


# Comes from https://neo4j.design/40a8cff71/p/5639c0-color/t/page-5639c0-79109681-33
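
To make the ``CONTINUOUS`` docstring concrete, here is one possible reading of "a range/gradient of colors" as a standalone sketch: linear interpolation between RGB stops. This is an assumption for illustration, not the code in this PR; ``gradient_color`` is a hypothetical helper, and the library may bucket values differently.

```python
def gradient_color(value, lo, hi, stops):
    """Linearly interpolate an RGB color for value in [lo, hi] across the stops.

    Hypothetical helper for illustration; requires at least two RGB stops.
    """
    if hi == lo:
        return stops[0]
    # Normalize the value to [0, 1], clamping anything outside the range.
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    # Locate the pair of neighbouring stops that the value falls between.
    position = t * (len(stops) - 1)
    index = min(int(position), len(stops) - 2)
    frac = position - index
    start, end = stops[index], stops[index + 1]
    return tuple(round(s + (e - s) * frac) for s, e in zip(start, end))


red_to_green = [(255, 0, 0), (0, 255, 0)]
lowest = gradient_color(0.0, 0.0, 10.0, red_to_green)
highest = gradient_color(10.0, 0.0, 10.0, red_to_green)
print(lowest, highest)
```

The lowest value maps to the first stop and the highest to the last, matching the "from red to green" behaviour described in the docs; a ``DISCRETE`` palette would instead assign one color per unique value with no interpolation.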
13 changes: 11 additions & 2 deletions python-wrapper/src/neo4j_viz/gds.py
@@ -6,7 +6,7 @@
import pandas as pd
from graphdatascience import Graph, GraphDataScience

from .pandas import from_dfs
from .pandas import _from_dfs
from .visualization_graph import VisualizationGraph


@@ -35,6 +35,11 @@ def from_gds(
"""
Create a VisualizationGraph from a GraphDataScience object and a Graph object.

All `additional_node_properties` will be included in the visualization graph.
If the properties are named as the fields of the `Node` class, they will be included as top level fields of the
created `Node` objects. Otherwise, they will be included in the `properties` dictionary.
Additionally, a new "labels" node property will be added, containing the node labels of the node.

Parameters
----------
gds : GraphDataScience
@@ -75,9 +80,13 @@ def from_gds(

node_props_df = pd.concat(node_dfs.values(), ignore_index=True, axis=0).drop_duplicates()
if size_property is not None:
if "size" in actual_node_properties and size_property != "size":
node_props_df.rename(columns={"size": "__size"}, inplace=True)
node_props_df.rename(columns={size_property: "size"}, inplace=True)

for lbl, df in node_dfs.items():
if "labels" in actual_node_properties:
df.rename(columns={"labels": "__labels"}, inplace=True)
df["labels"] = lbl

node_lbls_df = pd.concat([df[["id", "labels"]] for df in node_dfs.values()], ignore_index=True, axis=0)
@@ -88,4 +97,4 @@ def from_gds(
rel_df = _rel_df(gds, G)
rel_df.rename(columns={"sourceNodeId": "source", "targetNodeId": "target"}, inplace=True)

return from_dfs(node_df, rel_df, node_radius_min_max=node_radius_min_max)
return _from_dfs(node_df, rel_df, node_radius_min_max=node_radius_min_max, rename_properties={"__size": "size"})
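
The name-collision handling in this hunk can be sketched on a plain dictionary instead of a DataFrame. ``promote_size_property`` is a hypothetical helper that mirrors the two ``rename`` calls above: an existing "size" entry is parked under "__size" before the chosen size property takes over the "size" key. The ``rename_properties={"__size": "size"}`` argument presumably restores the parked value as a regular property later; that step is not shown here.

```python
def promote_size_property(record, size_property):
    """Return a copy of record where size_property is stored under "size".

    Hypothetical helper for illustration; an existing "size" entry is parked
    under "__size" so that it is not overwritten.
    """
    renamed = dict(record)
    if "size" in renamed and size_property != "size":
        renamed["__size"] = renamed.pop("size")
    renamed["size"] = renamed.pop(size_property)
    return renamed


record = {"id": 0, "size": 3, "pagerank": 0.5}
promoted = promote_size_property(record, "pagerank")
print(promoted)
```

The original record is left untouched, and when ``size_property`` is already "size" nothing is parked.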