Python – How to set the schema for JanusGraph for bulk loading using Python

I'm trying to bulk load data into JanusGraph 0.2 on HBase, using Python's gremlinpython library. For bulk loading I set storage.batch-loading to true, so the schema now has to be defined explicitly for the graph.

I found the documentation on setting the schema for a graph (https://docs.janusgraph.org/0.2.0/schema.html and https://docs.janusgraph.org/0.2.0/advanced-schema.html).

It suggests a basic pattern like this:

mgmt = graph.openManagement()
follow = mgmt.makeEdgeLabel('follow').multiplicity(MULTI).make()
mother = mgmt.makeEdgeLabel('mother').multiplicity(MANY2ONE).make()
mgmt.commit()

I use Python's gremlinpython library to connect to the graph. This is what I'm doing:

from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.traversal import T
from gremlin_python.process.traversal import Order
from gremlin_python.process.traversal import Cardinality
from gremlin_python.process.traversal import Column
from gremlin_python.process.traversal import Direction
from gremlin_python.process.traversal import Operator
from gremlin_python.process.traversal import P
from gremlin_python.process.traversal import Pop
from gremlin_python.process.traversal import Scope
from gremlin_python.process.traversal import Barrier

from config import graph_url, graph_name

graph = Graph()
drc = DriverRemoteConnection(graph_url, graph_name)

g = graph.traversal().withRemote(drc)

# I successfully get g here, I check it by :
# g.V().count().next()

Now my question is: where should I set up the schema? I tried calling mgmt = graph.openManagement() after the commented-out line, but it didn't work.


Update

It works on the gremlin console as follows:

gremlin> :remote connect tinkerpop.server conf/remote.yaml
==>Configured localhost/127.0.0.1:8182
gremlin> 
gremlin> :> mgmt = graph.openManagement()
==>org.janusgraph.graphdb.database.management.ManagementSystem@625dfab4 

But any further commands fail:

:> follow = mgmt.makeEdgeLabel('follow').multiplicity(MULTI).make()
No such property: mgmt for class: Script10

Solution

The gremlinpython driver is a Gremlin Language Variant (GLV), which lets you write Gremlin traversals natively in Python. The JanusGraph schema definition API is specific to JanusGraph, whereas the gremlinpython GLV is a generic TinkerPop driver, so its Graph class has no construct for calling database-specific APIs such as openManagement().

As you mentioned, you can declare your schema through the Gremlin Console. Another option is to use a string-based Gremlin driver, such as gremlinclient or gremlinpy, and send your schema to the server as a string query.
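
As a rough illustration of the string-based approach, the sketch below builds the schema script as a single Groovy string and submits it in one request. Sending everything together matters because sessionless requests to Gremlin Server do not keep variables between requests, which is most likely why mgmt was unknown in the second console command above. The sketch uses the script client that ships with gremlinpython (gremlin_python.driver.client.Client) rather than gremlinclient or gremlinpy. The helper names build_schema_script and submit_schema, the example URL, and the assumption that the server binds the graph to the name graph are illustrative and not taken from the question.

```python
# Minimal sketch: define a JanusGraph schema from Python by sending Groovy
# to Gremlin Server as one string. Helper names and the URL are hypothetical.

def build_schema_script(edge_labels):
    """Build one Groovy script that creates the given edge labels.

    edge_labels maps a label name to the name of a JanusGraph Multiplicity
    constant, e.g. {"follow": "MULTI", "mother": "MANY2ONE"}.
    """
    # Assumes the server exposes the JanusGraph instance as `graph`.
    lines = ["mgmt = graph.openManagement()"]
    for name, multiplicity in edge_labels.items():
        lines.append(
            "mgmt.makeEdgeLabel('%s')"
            ".multiplicity(org.janusgraph.core.Multiplicity.%s).make()"
            % (name, multiplicity)
        )
    lines.append("mgmt.commit()")
    # A single script means `mgmt` lives for the whole request.
    return "\n".join(lines)


def submit_schema(url, script):
    """Send the script to Gremlin Server and wait for it to finish."""
    # Imported here so the script builder works without gremlinpython installed.
    from gremlin_python.driver import client

    gremlin_client = client.Client(url, "g")
    try:
        return gremlin_client.submit(script).all().result()
    finally:
        gremlin_client.close()


if __name__ == "__main__":
    schema = build_schema_script({"follow": "MULTI", "mother": "MANY2ONE"})
    print(schema)
    # Uncomment with a reachable server; the URL is only an example.
    # submit_schema("ws://localhost:8182/gremlin", schema)
```

In the Gremlin Console, a comparable effect can be had by connecting with a session (:remote connect tinkerpop.server conf/remote.yaml session), so that variables such as mgmt survive between :> commands.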
