Google Launches Cloud Dataproc, A Managed Spark And Hadoop Big Data ServiceGoogle is adding another product in its range of big data services on the Google Cloud Platform today. The new Google Cloud Dataproc service, which is now in beta, sits between managing the Spark data processing engine or Hadoop framework directly on virtual machines and a fully managed service like Cloud Dataflow, which lets you orchestrate your data pipelines on Google’s platform.
Dataproc users will be able to spin up a Hadoop cluster in under 90 seconds — significantly faster than other services — and Google will only charge 1 cent per virtual CPU/hour in the cluster. That’s on top of the usual cost of running virtual machines and data storage, but as DeMichillie noted, you can add Google’s cheaper preemptible instances to your cluster to save a bit on compute costs. Billing is per-minute, with a 10-minute minimum.
“In this space, there is no one-size that fits for all,” DeMichillie said. “We think this is going to be a really important addition to our overall portfolio.”