Metabase Presto




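  • The previous post covered how to install the Presto engine and integrate Kudu data (Presto installation with Kudu integration). Now the data needs to be displayed so that other researchers can analyze it. A handy tool for this is Metabase, which can integrate with many kinds of databases. It is also very convenient to use; here we use d.
  • For open-source BI, I ultimately chose Metabase. If you also need an open-source BI tool, I hope these modest observations are helpful. If you have any questions, feel free to leave a comment. Finally, here is the Zen of Metabase, which I personally think is also a principle developers should strive for.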

Metabase is an open-source data visualization and business intelligence tool that helps you use data to answer questions about business processes and make educated decisions. Traditionally, BI tools can cost a fortune: businesses either pay for expensive enterprise software licensing, employ a cadre of data scientists and data engineers, or both. Today, the value of analytics continues to grow, but the introduction of open-source software such as Metabase has lowered the barrier to entry significantly.


This page describes a way to connect to Treasure Data Presto from the Metabase Mac OS X application. First install Metabase (see Metabase installation). To set up the Presto connection, navigate to the Databases page in the Admin menu and click Add database.

This guide will explain how to install Metabase locally using Docker, probably the easiest and fastest way to install it.

1. Check whether Docker is installed on your local machine. If you see something like Client: Version: 19.03.8, Docker is installed and running. If not, you can follow this guide.

2. Now let's pull the Metabase Docker image.

3. Then run the image.


4. If you use VirtualBox/Ubuntu (or another Linux) as your main working environment, you will need to set up port forwarding for port 3000.

5. That is it! Now you can access Metabase in your browser by opening localhost:3000. You will be asked to complete a few steps to create an account and finish the initial setup. At this step you can skip connecting to a database if you don't have one available; Metabase provides a sample database so you can start using the tool right away. Metabase supports the vast majority of modern SQL and NoSQL databases. Here is the complete list of currently supported databases.

  • BigQuery
  • Druid
  • Google Analytics
  • H2
  • MongoDB
  • MySQL/MariaDB
  • PostgreSQL
  • Presto
  • Amazon Redshift
  • Snowflake
  • Spark SQL
  • SQLite
  • SQL Server
  • Vertica
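
The Docker steps above (check Docker, pull the image, run it) can be sketched as a small Python helper. This is a minimal sketch, not the commands from the original guide: the image name metabase/metabase, the container name metabase, and the helper names docker_commands and run_all are assumptions made for illustration. By default it only prints the commands (dry run) instead of executing them.

```python
import shlex
import subprocess

IMAGE = "metabase/metabase"  # assumption: the Metabase image name on Docker Hub
PORT = 3000                  # the port this guide opens in the browser


def docker_commands(image=IMAGE, port=PORT, name="metabase"):
    """Return the commands for steps 1-3 as argument lists."""
    return [
        ["docker", "version"],      # step 1: is Docker installed and running?
        ["docker", "pull", image],  # step 2: pull the image
        # step 3: run detached and publish the port on the host
        ["docker", "run", "-d", "-p", f"{port}:{port}", "--name", name, image],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(docker_commands())
```

Calling run_all(docker_commands(), dry_run=False) would actually execute the commands, which requires Docker to be installed on the machine.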

Metabase reminds me of Tableau but with a simpler and more intuitive interface. It allows you to use SQL to slice and dice your data and visualize your query right away. Here is a dashboard I was able to create after playing with the tool for no more than 30 minutes. I really like the intuitiveness of the tool and the smooth transition from SQL query to visualization, which can empower users to extract valuable insights from the data.


The SQL interface looks very simple and clean. It allows you to browse the schema of each table without running any queries, as most RDBMSs require.

Data visualization dashboards (aka BI tools) are an essential piece of every successful data analytics project, whether it uses big data technologies or a traditional data warehousing approach. This space used to be populated primarily by paid BI (business intelligence) tools like Tableau and MicroStrategy, but lately a lot of open-source alternatives have been emerging, the most notable being Redash, Superset and Metabase. (Another notable tool is Kibana, but its backend support is limited to Elasticsearch, so it is not a general-purpose BI tool.)

It is still early days for these open source dashboards, but they provide a very attractive proposition for internal dashboards already. Here is a quick comparison of Superset vs Redash vs Metabase.

Please note that a lot of startups have already been using these 3 dashboards successfully :)

We evaluate these 3 open-source BI tools (dashboards) on 4 broad features: 1) data backend support, 2) authentication / authorization support, 3) support for scheduled email reports and alerts, 4) extension support.


Superset vs Redash vs Metabase - Data Backend Support

All three tools now support all major database backends used for data analytics workloads, e.g. Amazon Redshift, Postgres, MySQL, SQL Server, MongoDB and Oracle. Only a few currently support big data processing backends like Presto, Hive, SparkSQL, Google BigQuery and Elasticsearch, but all three should soon support these popular backends.

Data Backend       Redash   Superset   Metabase
MySQL              Yes      Yes        Yes
PostgreSQL         Yes      Yes        Yes
Oracle             Yes      Yes        Yes
SQL Server         Yes      Yes        Yes
MongoDB            Yes      Yes        Yes
Amazon Redshift    Yes      Yes        Yes
Cassandra          Yes      Yes        ?
Presto             Yes      Yes        ?
Hive               Yes      Yes        ?
Impala             Yes      Yes        ?
SparkSQL           ?        Yes        ?
Google BigQuery    Yes      Yes        Yes
Graphite           Yes      ?          ?
Elasticsearch      Yes      ?          ?
Vertica            Yes      Yes        Yes
Druid              No       Yes        Yes
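
When shortlisting a tool, the table above can be treated as a lookup: list the backends you need and keep the tools marked "Yes" for all of them. The sketch below encodes the table as given here (a "?" counts as not confirmed); the names SUPPORT, TOOLS and tools_supporting are made up for this illustration.

```python
TOOLS = ("Redash", "Superset", "Metabase")

# Support matrix copied from the table above; column order follows TOOLS.
SUPPORT = {
    "MySQL":           ("Yes", "Yes", "Yes"),
    "PostgreSQL":      ("Yes", "Yes", "Yes"),
    "Oracle":          ("Yes", "Yes", "Yes"),
    "SQL Server":      ("Yes", "Yes", "Yes"),
    "MongoDB":         ("Yes", "Yes", "Yes"),
    "Amazon Redshift": ("Yes", "Yes", "Yes"),
    "Cassandra":       ("Yes", "Yes", "?"),
    "Presto":          ("Yes", "Yes", "?"),
    "Hive":            ("Yes", "Yes", "?"),
    "Impala":          ("Yes", "Yes", "?"),
    "SparkSQL":        ("?", "Yes", "?"),
    "Google BigQuery": ("Yes", "Yes", "Yes"),
    "Graphite":        ("Yes", "?", "?"),
    "Elasticsearch":   ("Yes", "?", "?"),
    "Vertica":         ("Yes", "Yes", "Yes"),
    "Druid":           ("No", "Yes", "Yes"),
}


def tools_supporting(backends):
    """Return the tools marked "Yes" for every backend in `backends`."""
    return [
        tool
        for i, tool in enumerate(TOOLS)
        if all(SUPPORT[backend][i] == "Yes" for backend in backends)
    ]


if __name__ == "__main__":
    print(tools_supporting(["Presto", "Hive"]))
    print(tools_supporting(["Druid", "MySQL"]))
```

Treating "?" as unsupported is a conservative choice for this sketch; an unknown entry may well be supported in practice.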

Superset vs Redash vs Metabase - Authentication support

Currently Superset supports a much richer set of authentication backends than Redash and Metabase, which only support Google OAuth for authentication and single sign-on.

So if you need to integrate with your in-house LDAP or database-based authentication backend, currently the only solution is Superset.

Authentication Backend   Redash   Superset   Metabase
Google OAuth             Yes      Yes        Yes
OAuth                    No       Yes        No
OpenID                   No       Yes        No
LDAP                     No       Yes        No
Database                 No       Yes        No

Superset vs Redash vs Metabase - Authorization / Access control support

All three tools support a decent permission (authorization) model that gives groups of users access to particular data and queries. This allows an organization to restrict data access based on different user roles.

Please note that in all three, data access granularity is primarily at the database table level and can't go beyond that. This is typically sufficient for most practical use cases. If it is not, existing data needs to be split between tables to ensure different access levels.

Both Redash and Metabase support the concept of users and groups and allow one to control what level of database and SQL access those groups should have. A user can be a member of multiple groups.
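
The users-and-groups model with table-level granularity described above can be illustrated with a small sketch. This is a generic illustration of the idea, not the actual data model of Redash or Metabase; the classes Group and User, the function can_query, and the example table names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    tables: set = field(default_factory=set)  # tables this group may query


@dataclass
class User:
    name: str
    groups: list = field(default_factory=list)  # a user can be in several groups


def can_query(user, table):
    """Access is granted if any of the user's groups includes the table."""
    return any(table in group.tables for group in user.groups)


if __name__ == "__main__":
    analysts = Group("analysts", {"sales.orders"})
    finance = Group("finance", {"sales.orders", "finance.payroll"})
    alice = User("alice", [analysts])
    bob = User("bob", [analysts, finance])
    print(can_query(alice, "finance.payroll"))
    print(can_query(bob, "finance.payroll"))
```

Because the check stops at the table, restricting part of a table means moving that data into a separate table, which is the splitting described above.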


Superset supports the concept of Admins and Gamma users. Gamma users can be assigned multiple roles, each controlling access to particular data and queries. Roles can be made quite intricate, specifying who can access which individual features and which datasets.


Superset vs Redash vs Metabase - Support for Scheduled Emails and Alerts

Scheduled emails with summary reports and alerts are another very useful feature of data dashboards.

Alerts                      Redash   Superset   Metabase
Summary Email (Scheduled)   No       No         Yes
Alerts Support              Yes      No         No
Slack Integration           Yes      No         Yes

Currently only Redash supports alerts based on a certain parameter crossing a particular threshold.
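
To make "a parameter crossing a threshold" concrete, here is a minimal sketch of the general technique. It is not Redash's implementation: the state names "ok" and "triggered", the rule format, and the functions evaluate and check_alert are assumptions for illustration. A notification is sent only when the state changes to triggered, so a value that stays above the threshold does not produce repeated messages.

```python
import operator

# Comparison operators an alert rule may use (hypothetical rule format).
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def evaluate(value, op, threshold):
    """Return "triggered" if the condition holds, otherwise "ok"."""
    return "triggered" if OPERATORS[op](value, threshold) else "ok"


def check_alert(previous_state, value, op, threshold):
    """Return (new_state, notify); notify only on a change into "triggered"."""
    state = evaluate(value, op, threshold)
    notify = state == "triggered" and previous_state != "triggered"
    return state, notify


if __name__ == "__main__":
    state = "ok"
    for value in (90, 120, 130, 80):
        state, notify = check_alert(state, value, ">", 100)
        print(value, state, notify)
```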

Superset vs Redash vs Metabase - Extending platform

Since these tools are open source, one can easily extend them if needed.


Tool       Tech
Redash     Python
Superset   Python
Metabase   Clojure

Redash and Superset are developed in Python, while Metabase is developed in Clojure. If you have talent in a particular technology in-house, this can also be a plus point when deciding on the right tool for your organization.

Summary

This article is still in progress and we will update it as these tools make progress. (+ add comparison on more features)

All three tools provide a decent dashboard experience, and we are extremely thankful to their developers for creating much-needed open-source BI visualization tools.




