Web

TL;DR - ES2020: Nullish Coalescing

Nullish coalescing (??) lets you fall back to a default only when a value is nullish (null or undefined), instead of whenever it is falsy.
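To make the difference between nullish and falsy concrete, here is a minimal sketch. The `Settings` shape, the default values, and the function names are hypothetical and only serve the illustration.

```typescript
// Hypothetical settings shape, used only for illustration.
interface Settings {
  retries?: number | null;
  label?: string | null;
}

interface ResolvedSettings {
  retries: number;
  label: string;
}

// `??` falls back only when the left-hand side is null or undefined,
// so legitimate values such as 0 and "" are kept.
function resolveWithNullish(s: Settings): ResolvedSettings {
  return { retries: s.retries ?? 3, label: s.label ?? "untitled" };
}

// `||` falls back on any falsy value, so 0 and "" are silently replaced.
function resolveWithOr(s: Settings): ResolvedSettings {
  return { retries: s.retries || 3, label: s.label || "untitled" };
}

const explicit: Settings = { retries: 0, label: "" };

console.log(resolveWithNullish(explicit)); // keeps retries 0 and the empty label
console.log(resolveWithOr(explicit)); // replaces both with the defaults
console.log(resolveWithNullish({ retries: null })); // null and missing values get the defaults
```

Reaching for `??` is mostly useful when 0, an empty string, or `false` are valid inputs that should not trigger the default.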

Web

TL;DR - ES2020: Optional chaining

Long chains of property accesses in JavaScript can be error-prone, as any of them might evaluate to null or undefined (also known as "nullish" values). Some other languages offer an elegant solution to this problem with a feature called "optional chaining".
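As a rough illustration of the problem and of the `?.` operator, consider the sketch below. The `User` shape and the helper names are hypothetical, chosen only for this example.

```typescript
// Hypothetical user shape, used only for illustration.
interface User {
  name: string;
  address?: { city?: string } | null;
  greet?: () => string;
}

// Without optional chaining, every link in the chain needs its own guard.
function cityOfVerbose(user: User | null | undefined): string | undefined {
  if (user && user.address && user.address.city) {
    return user.address.city;
  }
  return undefined;
}

// With `?.`, evaluation stops and yields undefined as soon as a link is nullish.
function cityOf(user: User | null | undefined): string | undefined {
  return user?.address?.city;
}

// `?.()` calls the method only if it exists.
function greetingOf(user: User | null | undefined): string | undefined {
  return user?.greet?.();
}

const complete: User = {
  name: "An",
  address: { city: "Hanoi" },
  greet: () => "hello",
};

console.log(cityOf(complete)); // the city from the nested address
console.log(cityOf({ name: "Binh", address: null })); // undefined, no TypeError
console.log(greetingOf({ name: "Chi" })); // undefined, greet is missing
```

Note that the two versions differ for an empty-string city: the guarded version treats it as missing, while `?.` only short-circuits on null and undefined.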

Data Engineer

Scheduling Python script in Airflow

To schedule a Python script or Python function in Airflow, we use `PythonOperator`.

Spark History Server on Kubernetes

The problem with running Spark on Kubernetes is that the logs go away once the job completes. Spark has a tool called the Spark History Server that provides a UI for your past Spark jobs. In this post, I will show you how to use the Spark History Server on Kubernetes.

3 ways to run Spark on Kubernetes

Spark can run on clusters managed by Kubernetes. This feature makes use of the native Kubernetes scheduler that has been added to Spark.

Data Engineer

Airflow DAG Serialization

To make the Airflow webserver stateless, Airflow >=1.10.7 supports DAG Serialization and DB Persistence.

Data Engineer

Data Studio: Connecting BigQuery and Google Sheets to help with hefty data analysis

Normally, with BigQuery as a data source for Data Studio, users of the dashboard can end up generating a lot of queries on your behalf, and that means you can end up with a huge BigQuery bill. Refreshing the data also takes a long time whenever you change something in development mode. How can you solve this problem with Google Sheets, for free?

TL;DR - When to use Random Forest instead of a Neural Network

Random Forest and Neural Networks are different techniques, but both can be applied in some of the same domains. So when should you use one instead of the other?
