I'd say the best practice is to run Spark and Hadoop on the same cluster. In fact, Spark can run as a YARN application (if you do spark-submit with --master yarn --deploy-mode client).

Why? It boils down to data locality, a fundamental concept in Hadoop and in data systems in general. The idea is that the data you want to process is so big that, rather than moving the data to the program, you move the program to the node the data resides on. So, in the case of Spark, if you run it on a different cluster, all the data has to be shipped from one cluster to the other over the network. It's more efficient to have computation and data on the same nodes.

As for versions, having two Hadoop clusters with different versions can be a pain. I'd recommend two separate Spark installations, one per cluster, each built against the appropriate Hadoop version.
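A minimal spark-submit invocation for YARN client mode could look like the sketch below; the main class, JAR name, and resource sizes are placeholders you'd adapt to your own application and cluster:

```shell
# Submit a Spark application to an existing YARN cluster in client mode.
# HADOOP_CONF_DIR must point at the cluster's Hadoop configuration so
# spark-submit can locate the ResourceManager and the HDFS NameNode.
# com.example.MyApp and my-app.jar are placeholders for illustration.
export HADOOP_CONF_DIR=/etc/hadoop/conf

spark-submit \
  --master yarn \
  --deploy-mode client \
  --class com.example.MyApp \
  --num-executors 4 \
  --executor-memory 2g \
  my-app.jar
```

Because the executors run on the same nodes that host the HDFS blocks, YARN can schedule tasks close to their input data, which is exactly the locality benefit described above.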
After spending some time on this issue, I managed to find the solution, and I hope it helps you fix yours. It had nothing to do with the query: there was another SparkClient process running. Once I stopped it and executed the query again, it worked fine.
Comparing Cassandra's CQL vs Spark/Shark queries vs Hive/Hadoop (DSE version)