How to set the schema for JanusGraph for bulk loading using Python
I’m trying to bulk load data into JanusGraph 0.2 backed by HBase, using Python’s gremlinpython library. For bulk loading I set storage.batch-loading to true, so the schema now has to be defined for the graph.
I found the documentation on defining a schema for the graph (https://docs.janusgraph.org/0.2.0/schema.html and https://docs.janusgraph.org/0.2.0/advanced-schema.html).
It suggests code along these lines:
mgmt = graph.openManagement()
follow = mgmt.makeEdgeLabel('follow').multiplicity(MULTI).make()
mother = mgmt.makeEdgeLabel('mother').multiplicity(MANY2ONE).make()
mgmt.commit()
I use Python’s gremlinpython library to connect to the graph. This is what I’m doing:
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.traversal import T
from gremlin_python.process.traversal import Order
from gremlin_python.process.traversal import Cardinality
from gremlin_python.process.traversal import Column
from gremlin_python.process.traversal import Direction
from gremlin_python.process.traversal import Operator
from gremlin_python.process.traversal import P
from gremlin_python.process.traversal import Pop
from gremlin_python.process.traversal import Scope
from gremlin_python.process.traversal import Barrier
from config import graph_url, graph_name
graph = Graph()
drc = DriverRemoteConnection(graph_url, graph_name)
g = graph.traversal().withRemote(drc)
# I successfully get g here, I check it by :
# g.V().count().next()
Now my question is: where should I set up the schema? I tried calling mgmt = graph.openManagement() after the commented-out lines, but it didn’t work.
Update
It works on the gremlin console as follows:
gremlin> :remote connect tinkerpop.server conf/remote.yaml
==>Configured localhost/127.0.0.1:8182
gremlin>
gremlin> :> mgmt = graph.openManagement()
==>org.janusgraph.graphdb.database.management.ManagementSystem@625dfab4
But any further commands fail:
:> follow = mgmt.makeEdgeLabel('follow').multiplicity(MULTI).make()
No such property: mgmt for class: Script10
Solution
The gremlinpython driver is a Gremlin Language Variant (GLV), which lets you write Gremlin traversals natively in Python. The JanusGraph schema API is specific to the JanusGraph database, whereas the gremlinpython GLV is a generic TinkerPop driver, so it has no construct for calling database-specific APIs.
As you mentioned, you can declare your schema through the Gremlin console. Another option is to use a string-based Gremlin driver, such as gremlinclient or gremlinpy, and send your schema as a string query to the server.
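A minimal sketch of the string-based approach follows. It is not taken from the JanusGraph documentation. The assumptions: your installed gremlinpython ships a script client at gremlin_python.driver.client.Client (newer releases do), the server binds the variable graph as it does in the console session above, and the server imports the multiplicity constants MULTI and MANY2ONE. The helper names build_schema_script and submit_schema are made up for illustration. The whole schema goes into one script because each request is evaluated on its own unless a session is used, which is why mgmt was no longer defined in the second console command.

```python
def build_schema_script(edge_labels):
    """Build one Groovy script that creates the given edge labels.

    edge_labels is a list of (label_name, multiplicity) pairs, where
    multiplicity is the name of a JanusGraph constant such as 'MULTI'.
    Assumes the server binds 'graph' and imports those constants.
    """
    lines = ["mgmt = graph.openManagement()"]
    for name, multiplicity in edge_labels:
        lines.append(
            "mgmt.makeEdgeLabel('%s').multiplicity(%s).make()" % (name, multiplicity)
        )
    # Ending with commit() keeps the request's return value simple (null),
    # rather than a schema object the serializer might not handle.
    lines.append("mgmt.commit()")
    return "\n".join(lines)


def submit_schema(url, script):
    """Send the script to Gremlin Server as a single request.

    Imported lazily so build_schema_script works without the driver.
    Assumes gremlin_python.driver.client.Client is available.
    """
    from gremlin_python.driver import client

    gremlin_client = client.Client(url, "g")
    try:
        # submit() returns a result set; all() gives a future of all results.
        return gremlin_client.submit(script).all().result()
    finally:
        gremlin_client.close()
```

To use it, you could call submit_schema(graph_url, build_schema_script([("follow", "MULTI"), ("mother", "MANY2ONE")])) once before starting the bulk load, and then keep using g for the traversals as before.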