Here I share my…
- previous works and projects
- programming tips (R, Python, SAS, and SQL)
- notes from learning statistics and machine learning
- thoughts about data science
[Resume] [GitHub repos]
Projects
ETL and EDA
- Horseshoe Crab: An Exploratory Data Analysis
- Movie Database API Query
- Summarize Student and World Development Data (Python)
- SQL Queries on MLB Database (Python)
- Northwind Salesmen Database: Sale Data Analysis (Python)
- NFL: Pandas on Spark (Python)
- Spark Data Streaming (Python)
Statistical Inference
- Parallel Computing: Monte Carlo Study for t-test
- Chi Square Test for Homogeneity: Likelihood Ratio Test vs. Monte Carlo Simulation
- Monte Carlo Simulation Study for Estimators and CI Performance
- Estimate Variances of Model Parameters Using Perturbed SSE Curve Fitting (PSCF) Method
- NFL: MapReduce (Python)
Machine Learning
- Multiple Linear Regression (MLR) and Logistic Models
- KNN and Tree Based Ensemble Models
- Bayesian MCMC Sampling for a Logistic Regression Model
- Automating Machine Learning Reports
- Diamond: Classification Models (Python)
- Online Shoppers Purchasing Intention: Logistic Regression, Decision Tree and Random Forests (Python)
- Bike Sales: EDA and Predictions using Grid Search vs. Gradient Descent Algorithm (Python)
Shiny Apps
Posts
Interactive Dashboards
A Productive Journey
Machine Learning with R
Automation on Reporting
Coolest Things in R
Some Thoughts About Using API
Programming Background
Build a Blog Site with GitHub Pages
Replicate vs. Repeat
Data Science: To be or not to be