Using Spark and Cassandra together from Python
I have a lot of data stored in Cassandra and I want to use Spark to process it from Python.
How do I connect Spark to Cassandra from Python?
I’ve seen people use sc.cassandraTable, but it doesn’t work for me. It also doesn’t make sense to pull all the data out of Cassandra at once and then feed it to Spark.
Any suggestions?
Solution
Have you tried the examples in the documentation?
The Spark Cassandra Connector Python documentation reads a table as a DataFrame like this:
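from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already defined in the pyspark shell
# Load the Cassandra table test.kv as a DataFrame and print its first rows
spark.read\
    .format("org.apache.spark.sql.cassandra")\
    .options(table="kv", keyspace="test")\
    .load().show()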
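On the worry about fetching everything at once: a DataFrame read like the one above is lazy, so nothing is pulled from Cassandra until an action such as show() runs. As far as I know, the connector can also push some filters down to Cassandra (typically ones on primary-key columns) instead of filtering in Spark. Here is a minimal sketch of that pattern. It assumes you already have a SparkSession with the connector on its classpath. The helper names cassandra_options and load_filtered are made up for illustration, and the column name key is an assumption about the kv table.

```python
def cassandra_options(keyspace, table):
    """Build the reader options used by the connector's DataFrame source."""
    return {"keyspace": keyspace, "table": table}


def load_filtered(spark, keyspace, table, condition):
    """Return a lazy DataFrame over a Cassandra table, restricted by `condition`.

    `spark` is an existing SparkSession (assumed to have the Spark Cassandra
    Connector available). No rows are read here; data is only fetched when an
    action such as .show() or .count() is called on the result.
    """
    df = (
        spark.read
        .format("org.apache.spark.sql.cassandra")
        .options(**cassandra_options(keyspace, table))
        .load()
    )
    # The filter is part of the query plan; whether it is pushed down to
    # Cassandra depends on the table schema and the connector.
    return df.filter(condition)


# Hypothetical usage inside a pyspark session (column name `key` is assumed):
# subset = load_filtered(spark, "test", "kv", "key = 'some-key'")
# subset.show()
```

If you want to check what actually gets pushed down, calling explain() on the resulting DataFrame prints the physical plan, so you can see which filters reach the Cassandra scan.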