Starburst Brings Dataframes Into Trino Platform

Starburst customers who prefer to manipulate data with dataframes rather than regular SQL will be happy with a pair of announcements made today. They include the introduction of PyStarburst, which provides a PySpark-like syntax for transforming data residing in Starburst’s hosted Galaxy environment, as well as support for Ibis, a portable dataframe library developed…

In Search of Data Model Repeatability

Everybody wants to be data-driven; that much is clear. But that desire doesn’t necessarily translate into real business results, especially in competitive industries like ecommerce. Data quality has long been a thorn in the side of would-be data champions. The need to cleanse and normalize dirty and inconsistent data often consumes the lion’s share of the…

TikTok Parent Open Sources Real-Time Data Warehouse

You might not yet be a major TikTok influencer, but you can still analyze data like TikTok’s parent company, ByteDance, which recently released its real-time data warehouse as open source. ByConity, as the warehouse is called, is an elastically scalable, column-oriented relational database that’s based on ClickHouse, the scalable, open-source database that the…
