Python – Use scikit-learn in Julia via PyCall


I’m trying to use scikit-learn in Julia via PyCall.

First, I tried reading iris data into a Julia data structure.

Here is the code in Python:

from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()

X = iris.data
y = iris.target

The PyCall documentation says that a Python method is called in Julia like this:

my_dna[:find]("ACT")

as opposed to:

my_dna.find("ACT")

in Python.

My attempt to import iris data in Julia is:

using PyCall
@pyimport sklearn.datasets as datasets
@pyimport sklearn.naive_bayes as NB

iris = datasets.load_iris()

X = ...?
Y = ...?

The call iris = datasets.load_iris() seems to work, since iris is of type Dict{Any,Any}.

I’m not sure if this is correct. I tried iris = datasets[:load_iris] but this resulted in:

ERROR: LoadError: MethodError: no method matching getindex(::Module, ::Symbol)

Further, how do I read iris.data and iris.target into X and Y?

Solution

As you said, Julia will tell you what type iris is:

julia v0.5> @pyimport sklearn.datasets as datasets

julia v0.5> @pyimport sklearn.naive_bayes as NB

julia v0.5> iris = datasets.load_iris()
Dict{Any,Any} with 5 entries:
  "feature_names" => Any["sepal length (cm)","sepal width (cm)","petal length (...
  "target_names"  => PyObject array(['setosa', 'versicolor', 'virginica'], ...
  "data"          => [5.1 3.5 1.4 0.2; 4.9 3.0 1.4 0.2; ... ; 6.2 3.4 5.4 2.3; 5....
  "target"        => [0,0,0,0,0,0,0,0,0,0  ...  2,2,2,2,2,2,2,2,2,2]
  "DESCR"         => "Iris Plants Database\n====================\n\nNotes\n----...

It also tells you what the keys of the dictionary are.
So now you just need to use Julia’s syntax for accessing the values of a dictionary (output truncated):

julia v0.5> X = iris["data"]
150×4 Array{Float64,2}:
 5.1  3.5  1.4  0.2
 4.9  3.0  1.4  0.2
 4.7  3.2  1.3  0.2

julia v0.5> Y = iris["target"]
150-element Array{Int64,1}:
 0
 0

Note that I didn’t know the answer to this question beforehand. I just let Julia guide me in what to do.
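As for why iris arrives as a Dict rather than an object with .data and .target fields: on the Python side, load_iris() returns a scikit-learn Bunch, which, as far as I know, is a dict subclass that also exposes its keys as attributes. PyCall converts Python dicts to Julia Dicts, so only key access survives. Below is a minimal Python sketch of that behaviour. MiniBunch is a hypothetical stand-in written for illustration, not scikit-learn's actual code, and the two data rows are copied from the truncated output above.

```python
class MiniBunch(dict):
    """Hypothetical stand-in for scikit-learn's Bunch: a dict whose keys
    can also be read as attributes."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


# Two rows copied from the output shown above, not the full data set.
iris = MiniBunch(
    data=[[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2]],
    target=[0, 0],
)

# Attribute access, as used by the Python code in the question ...
X_attr = iris.data
# ... and key access, which is what remains after conversion to a Dict.
X_key = iris["data"]

# A plain dict, like the converted Julia Dict, supports key access only.
plain = dict(iris)
```

Here iris.data and iris["data"] refer to the same object, while plain has no data attribute at all. This is why iris["data"] is the form to use on the Julia side.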

Finally, as @ChrisRackauckas suggests, there is already a Julia package that wraps scikit-learn: https://github.com/cstjean/ScikitLearn.jl
