Missed the Databricks Data+AI Summit 2022? Here's a quick recap of some important announcements from the event, by our data engineers, Anish Ninan and Zula Jariwala.
Transcript
Hi, my name is Anish Ninan and I'm a Data Engineer here at Valorem Reply. I was really excited to attend the Databricks conference in San Francisco. It was a great experience: I met a lot of great people, learned a lot of new things, and I think the data space is going to be exciting for the next few years. One of the big announcements was Delta [Lake] 2.0 and it being open source, so Databricks continues to open-source key parts of the technology and make them available to the wider community. This release really improves deletes and updates of single rows, which is very important.
Next was Spark Connect. This makes development from IDEs much easier. Now we can submit [Apache] Spark jobs through VS Code and other programs, lightweight applications, etc.
The next was Delta Sharing and Cleanrooms. Cleanrooms is part of Delta Sharing. It provides a way of tightly controlling access to your shared data sets. It also includes things like external query approval, which means you have total visibility of the data and aggregations that people are trying to view.
The next one was Databricks SQL Serverless. Right now, when you run a query for the first time, it takes around 5 minutes to start up; this decreases startup time to around 10 seconds. The other thing is that it improves cost performance by around 40% too.
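To put those startup figures in perspective, here is a minimal Python sketch of the arithmetic. It only uses the approximate numbers quoted in the talk (around 5 minutes before, around 10 seconds after); the constant and function names are hypothetical, and nothing here is measured against a real Databricks SQL warehouse.

```python
# Approximate figures quoted in the talk, not measurements.
CLASSIC_STARTUP_SECONDS = 5 * 60   # "around 5 minutes"
SERVERLESS_STARTUP_SECONDS = 10    # "around 10 seconds"


def startup_speedup(before_s: float, after_s: float) -> float:
    """Return how many times shorter the new startup wait is."""
    return before_s / after_s


def startup_reduction_pct(before_s: float, after_s: float) -> float:
    """Return the percentage of the startup wait that is removed."""
    return (before_s - after_s) / before_s * 100


speedup = startup_speedup(CLASSIC_STARTUP_SECONDS, SERVERLESS_STARTUP_SECONDS)
reduction = startup_reduction_pct(CLASSIC_STARTUP_SECONDS, SERVERLESS_STARTUP_SECONDS)
print(f"{speedup:.0f}x faster startup, {reduction:.1f}% less waiting")
```

With the quoted figures, that works out to a startup wait roughly 30 times shorter for the first query.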
So, those are the key announcements that they made.
Transcript
I'm Zula Jariwala and I work as a Data Engineer at Valorem Reply. I attended the Databricks conference recently; it was a really amazing experience and they made some big announcements. One of the big announcements at the Databricks Data + AI Summit was for Unity Catalog. Unity Catalog is going to be generally available in a couple of weeks. It's not available yet, but it's going to be released very soon. With Unity Catalog, you can unify the governance for all your data and AI assets, including dashboards, load boards and machine learning models, with a common governance model across the lakehouse and across clouds. That will be useful for performance and security.
Another one, I would say, is the Databricks Marketplace. With the Databricks Marketplace, users [can] publish their apps or their code for other users, so it can be reused. It will be published to the marketplace and a general audience can use each other's code. It will be like a collaborative platform that people can use through the Databricks Marketplace.
They also announced MLflow 2.0, which is coming up. With MLflow 2.0, similar to a CI/CD pipeline, you can create an MLflow pipeline for production work, which will make users' production work much easier.
Valorem Reply’s Databricks certified professionals work closely with Databricks Unified Analytics experts, architects, engineers and sales resources to customize data roadmaps and strategies specific to our client’s environment and challenges. Our partnership with Databricks also allows us to participate in the Customer Advisory Board to help mold the future of this exciting technology and its ability to meet growing data needs of some of the world’s top organizations. Contact us to learn more about how our training, advisory and deployment solutions can help you create a fully managed, scalable, and secure cloud infrastructure that reduces operational complexity and total cost of data ownership for your organization.