Currently, I have Kafka transferring client GEO data to Spark Streaming (DataFrame), which calculates the nearest...
I am getting a Can't pickle local object '<lambda>.<locals>.<lambda>' error while...
I have a dataframe of the following format: name merged key1 (internalKey1,...
The RDD.sparkContext has a setJobGroup method: ...
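A minimal sketch of how setJobGroup is typically used, assuming an existing SparkContext named sc (as in spark-shell); the group id and description are placeholders:
    // Tag every job submitted from this thread so they can be cancelled as a unit.
    sc.setJobGroup("my-group", "demo jobs", interruptOnCancel = true)
    sc.parallelize(1 to 1000).map(_ * 2).count()
    // All jobs in the group can later be cancelled, e.g. from another thread:
    // sc.cancelJobGroup("my-group")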
On HDFS, I have my directories like...
I'm working on a Spark SQL project and I have a Dataset that contains a series of...
How does Spark load data from HDFS in a cluster, and how are the blocks converted into an RDD? Let's say I have 3...
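As a rough sketch of the block-to-partition mapping (the HDFS path is hypothetical and sc is an existing SparkContext): Spark usually creates about one partition per HDFS block (128 MB by default), and each partition is processed by one task.
    // Hypothetical path; the resulting partition count roughly follows the number of HDFS blocks.
    val lines = sc.textFile("hdfs:///user/demo/input")
    println(s"number of partitions = ${lines.getNumPartitions}")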
I have a transaction dataset which I'm preparing with val df =...
I have an RDD[(Int, BreezeDenseMatrix[Double])], and what I want is to sum each row and multiply it...
My Spark Scala code looks like: val input = sc.newAPIHadoopRDD(jconf, classOf[CqlInputFormat],...
I am working on implementing the incremental process on Hive table A; table A is already created...
I am using spark-cassandra-connector_2.11-2.0.0.jar to connect to Cassandra (version 2.1.9)...
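For context, a minimal read through that connector usually looks like the sketch below; the keyspace and table names are placeholders, spark is an existing SparkSession, and spark.cassandra.connection.host must point at the cluster:
    val cassandraDf = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "demo_ks", "table" -> "demo_table"))
      .load()
    cassandraDf.show(5)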
I am trying to read a table on Postgres and insert the dataframe into a Hive table on HDFS in the...
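A hedged sketch of that Postgres-to-Hive flow; the JDBC URL, credentials, and table names are placeholders:
    // Read the source table from Postgres over JDBC.
    val pgDf = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://db-host:5432/mydb")
      .option("dbtable", "public.source_table")
      .option("user", "db_user")
      .option("password", "db_password")
      .option("driver", "org.postgresql.Driver")
      .load()
    // saveAsTable into Hive requires a SparkSession built with enableHiveSupport().
    pgDf.write.mode("overwrite").saveAsTable("target_db.target_table")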
I am using a nested data structure (array) to store multivalued attributes for a Spark table. I am...
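For context, querying such multivalued array columns typically goes through explode; a sketch with hypothetical table and column names, assuming an existing SparkSession spark:
    import org.apache.spark.sql.functions.{col, explode}
    // Hypothetical table 'customer' with an array<string> column named 'attributes'.
    val flattened = spark.table("customer")
      .select(col("id"), explode(col("attributes")).as("attribute"))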
I created a Spark Structured Streaming application using Spring Boot. The bootRun task works fine, but...
I have the following dataframe schema: root |-- firstname: string (nullable = true) |--...
We are consuming financial quotes from different exchanges and we want to have a possibility to...
I need to filter only the text that starts with > in a column. I know there are functions...
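One common way to express that filter, as a sketch; df and the column name text are placeholders:
    import org.apache.spark.sql.functions.col
    // Keep only the rows whose 'text' column starts with ">".
    val quoted = df.filter(col("text").startsWith(">"))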
I am using Spark 2.2 and I am trying to read a dataset from a TSV file like the following in...
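In Spark 2.x the usual pattern for TSV is the CSV reader with a tab separator; a sketch with a placeholder path, assuming an existing SparkSession spark:
    val tsv = spark.read
      .option("sep", "\t")            // tab-separated values
      .option("header", "true")       // drop if the file has no header row
      .option("inferSchema", "true")
      .csv("hdfs:///path/to/file.tsv")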
I have an aggregation as below: I use Structured Streaming to get all product ids and their...
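A minimal sketch of such a streaming aggregation; the Kafka servers, topic, and the way productId is parsed from the message value are placeholders:
    val productCounts = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "products")
      .load()
      .selectExpr("CAST(value AS STRING) AS productId")
      .groupBy("productId")
      .count()
    val query = productCounts.writeStream
      .outputMode("complete")          // emit the full aggregated counts on every trigger
      .format("console")
      .start()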
I have a list of blobs (wasbs URLs) in a Structured Streaming DataFrame and want to read all the...
What is the max size of spark.broadcast(var) where var is a numpy array? I saw this...
I am trying to connect to a Vertica DB with Spark v2.3.1 and Scala 2.11.8 using JDBC. On the Vertica...
Spark has a useful API for accumulating data in a thread-safe way...
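For reference, the built-in accumulator API looks like this sketch, assuming an existing SparkContext sc:
    // Long accumulator registered with the SparkContext; task-side updates are merged on the driver.
    val negativeCount = sc.longAccumulator("negativeCount")
    sc.parallelize(Seq(1, -2, 3, -4)).foreach { x =>
      if (x < 0) negativeCount.add(1)
    }
    println(s"negative values seen = ${negativeCount.value}")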
I want to select a few columns from a DF. Between the columns I need to add different spaces as end...
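One hedged sketch of producing fixed-width output by padding and concatenating columns; df, the column names, and the widths are assumptions for illustration:
    import org.apache.spark.sql.functions.{col, concat, rpad}
    // Pad each column to a fixed width, then join them into one output column.
    val fixedWidth = df.select(
      concat(
        rpad(col("name"), 20, " "),
        rpad(col("city"), 15, " "),
        col("amount").cast("string")
      ).as("line")
    )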
I have a Keras deep learning model, and I now have to process a large dataset with it and calculate...
Yesterday (practically the whole day) I tried to figure out an elegant way to represent a model...
Is there any machine learning algorithm that can generate Spark code depending on the input? I have...
My Spark Kafka direct stream job, when started, will only consume messages that were produced before...
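Which offsets a fresh consumer group starts from is governed by auto.offset.reset; a sketch of one common direct-stream setup with the kafka-0-10 integration, where ssc is an existing StreamingContext and the broker, group id, and topic are placeholders:
    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "demo-group",
      // "earliest" replays existing messages for a new group; "latest" starts from new ones only.
      "auto.offset.reset" -> "latest"
    )
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("demo-topic"), kafkaParams))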