We set a goal to calculate Wins Above Replacement(WAR) players for each player in CPBL, which has never been done before. WAR is widely used in MLB as an index for salary negotiation for free agents. It includes not only the offensive ability of a player, but also the fielding position, fielding ability, and base running ability. Through the project, we hope to show the true value of each player, especially players with good fielding ability, and in turn cause the clubs and the league to appreciate the value of fielding and base running.
Complexity of the data may provide challenges for making valid causal inference about a scientific hypothesis (smoking vs. lung cancer mortality), particularly in the presence of unknown confounding (underlying lung function). Mendelian randomization (MR) addresses the issue of unknown confounding by using genetic information as an instrumental variable (IV) to estimate the causal effect of an exposure of interest on an outcome. Despite the popularity of IV analyses in fully observable outcomes, methodology is limited for time-to-event survival outcomes with censoring, a common data structure in biomedical sciences. We propose an IV analysis method in the survival context, estimating causal effects on a transformed survival time and the survival probabilities using semiparametric transformation models. We construct unbiased estimating equations to circumvent the difficulty in deriving joint likelihood of the exposure and the outcome, due to the unknown covariance by confounding. Asymptotic properties of the proposed estimators are established. We apply our methods to conduct an MR study for lung cancer survival, which suggests a harmful prognostic effect by smoking intensity (p=0.0067) that would have been missed by the conventional methods.
許多人都了解資料視覺化的優點，例如可將資料抽象化、讓資訊降維等等，讓數據以更直覺的方式呈現在使用者面前，提升閱讀者的吸收意願。然而，我們依然可以舉出許多例子，即使是視覺化後的數據，依然不容易讓人讀懂其價值，也難以勾起人們理解的意願。 『使用者經驗（UX）』設計的最終目標是『Don’t Make Me Think』，也就是打造一段順暢的閱讀過程，讓使用者能夠更輕鬆理解背後的寶貴價值，而這之中除了透過視覺設計、互動設計將結果優化之外，還包括了使用者研究、需求釐清、文案優化等等技巧，如果能夠將使用者經驗設計相關方法結合資料視覺化技巧，能夠再一次降低數據解讀的門檻，讓更多人了解其價值意涵，也讓更多人被數據所感動。
I believe the value of data science could reach a incredible level if we eliminate clear boundaries between different roles in a data science team. Every analyst is able to code and modeling, while every statistician is able to manipulate data and visualize data. Besides staying passionate about learning, another good way to is solving problems in work or on your own side projects. In this section, I would like to talk about the concept of a unicorn in data science, the skill sets, and a few practical problems.