This should help you fix your issue. Your understanding of micro-batch vs. stream processing is correct. You are also right that all three systems use the standard Java consumer provided by Kafka to pull data for processing in an infinite loop. The main difference is that Spark needs to schedule a new job for each micro-batch it processes, and this scheduling overhead is quite high. As a result, Spark cannot handle very low batch intervals like 100ms or 50ms efficiently, and throughput drops for such small batches.
Exception when processing data during Kafka stream process
This may help you: you have to specify the correct Serdes for the to() operation, too. Otherwise, it uses the default Serdes from the StreamsConfig, which is ByteArraySerde, and a String cannot be cast to byte[]. You need to do:
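A minimal sketch of what that looks like, assuming String keys and values and hypothetical topic names "input-topic" and "output-topic" (adapt both to your topology):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> stream = builder.stream("input-topic");

// Pass explicit Serdes to to(); without Produced.with(...), the sink falls
// back to the default Serdes from StreamsConfig (ByteArraySerde here), and
// writing String records then fails with a ClassCastException.
stream.to("output-topic", Produced.with(Serdes.String(), Serdes.String()));
```

The same pattern applies to other sink and stateful operations: whenever the key or value type differs from the configured default Serdes, pass them explicitly.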
This should help you fix your problem. Here are two solutions. They are more or less equivalent in their underlying behavior, but you might find one or the other easier to understand, maintain, or test. As for your question: no, there is no way to loop back (re-queue) the unconsumed events without pushing them back to Kinesis, but simply holding on to them until they are needed should be fine.
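The "hold on to them" idea can be sketched with a plain in-memory buffer. This is only an illustration with hypothetical names (EventHolder, accept, take), not Kinesis client code: fetched records that are not consumed yet simply stay in a local deque, and you drain the deque before fetching more from the stream.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch: buffer unconsumed events locally instead of re-queueing to Kinesis.
public class EventHolder {
    private final Deque<String> pending = new ArrayDeque<>();

    // Append a freshly fetched batch to the back of the buffer.
    public void accept(List<String> batch) {
        pending.addAll(batch);
    }

    // Consume up to n events in arrival order; anything not taken stays buffered
    // for the next processing cycle.
    public List<String> take(int n) {
        List<String> out = new ArrayList<>();
        while (out.size() < n && !pending.isEmpty()) {
            out.add(pending.pollFirst());
        }
        return out;
    }

    // Number of events still waiting to be consumed.
    public int buffered() {
        return pending.size();
    }
}
```

One caveat worth noting: an in-memory buffer is lost if the process crashes, so only hold events whose loss you can tolerate, or checkpoint your Kinesis shard position conservatively so the unbuffered events would be re-read on restart.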