S3, Cassandra or Outer Space? Dumping Time Series Data using Spark
The vast majority of our processed data is time series data, and once you start working with distributed systems, you start tackling many scale and performance problems: How do you handle missing data? Should you handle both serving and backend processing together, or separate them out? How do you get the best performance for the money? In this talk we will tell the tale of all the transformations we’ve made to our data model at Windward, cover some of the problems we’ve handled, and review multiple data persistence layers such as S3, MongoDB, Apache Cassandra, and MySQL. And I’ll try my best NOT to answer the question “Which one of them is the best?”
Co-Founder and VP R&D @ Panorays, Demi has over 10 years of experience building systems ranging from near-real-time applications to Big Data distributed systems. Co-Founder of the “Big Things” Big Data community and Google Developer Group Cloud. A software development groupie, interested in tackling cutting-edge technologies.