S3, Cassandra or Outer Space? Dumping Time Series Data using Spark
The vast majority of our processed data is time series data, and once you start working with distributed systems, you run into many scale and performance problems: How do you handle missing data? Should serving and backend processing be handled together or separated out? What gives the best performance for the money? In this talk we will tell the tale of all the transformations we've made to our data model at Windward, cover some of the problems we've handled, and review multiple data persistence layers such as S3, MongoDB, Apache Cassandra, and MySQL. And I'll try my best NOT to answer the question "Which one of them is the best?"
Co-Founder and VP R&D @ Panorays, Google Developer Expert (GDE), software engineer, entrepreneur, and international tech speaker. Demi has over 10 years of experience building systems for both near-real-time applications and Big Data distributed systems. Co-Founder of the "Big Things" Big Data community and Google Developer Group Cloud. Big Data expert, but interested in all kinds of technologies, from front-end to back-end, whatever moves data around.