I am a data scientist with a background in chemical engineering. I have five years of experience as a data analyst/production engineer for a chemical manufacturing facility. I am about to complete a Master’s degree in Data Analytics at USF.
On the technical side, I am confident pulling data from disparate sources into one convenient, easy-to-access location. I clean, analyze, and visualize large datasets using Python (numpy, pandas, scikit-learn), R, SQL, D3, Spark, SAS, Tableau, and Microsoft Office products, and I work with distributed systems on AWS. I have a strong understanding of many machine learning models and of statistical analysis. I am a quick learner and adapt readily to new technologies and programming languages.
Personally, I am easy-going, work well in teams, and stay motivated until I have a fantastic end product. I enjoy teaching people without technical training how data can help them make decisions. I am fluent in English and Korean. My favorite projects so far have been data visualizations, maps, and recommendation systems. Please enjoy browsing through some of the projects I have completed thus far, and feel free to contact me with any questions.
The beta distribution closely models the distribution of values expected in a specialty items auction. Ad agencies and other bidding platforms can use this distribution to estimate and maximize expected payout.
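As a rough illustration of the idea, the sketch below estimates a seller's expected payout by simulation when bidder valuations follow a beta distribution. Everything specific here is an assumption made for the example rather than a detail of the project: the second-price sealed-bid format with a reserve price, the shape parameters (2, 5), the number of bidders, and the function names expected_revenue and best_reserve.

```python
import random


def expected_revenue(reserve, n_bidders=3, alpha=2.0, beta=5.0,
                     n_auctions=5000, seed=0):
    """Monte Carlo estimate of seller revenue per auction.

    Assumes a second-price sealed-bid auction with a reserve price and
    bidder valuations drawn independently from Beta(alpha, beta) on [0, 1].
    Requires n_bidders >= 2. All defaults are hypothetical.
    """
    rng = random.Random(seed)  # fixed seed so every reserve sees the same draws
    total = 0.0
    for _ in range(n_auctions):
        values = sorted(rng.betavariate(alpha, beta) for _ in range(n_bidders))
        highest, second = values[-1], values[-2]
        if highest >= reserve:
            # Winner pays the larger of the second-highest bid and the reserve.
            total += max(second, reserve)
        # Otherwise the item goes unsold and the seller earns nothing.
    return total / n_auctions


def best_reserve(grid_size=21, **kwargs):
    """Pick the reserve on an evenly spaced grid over [0, 1] with the highest estimate."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda r: expected_revenue(r, **kwargs))


if __name__ == "__main__":
    r = best_reserve()
    print(f"reserve={r:.2f}, estimated revenue={expected_revenue(r):.3f}")
```

Reusing the same seed for every candidate reserve means each one is evaluated on identical simulated bids, so differences between estimates reflect the reserve rather than sampling noise.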
This is a case study on determining the optimum strike price for dairy products. There are several mercantile exchanges with different measurement units, as well as varying state regulations. The model uses this information to forecast strike prices, which affect monthly production rates.
Here is an interactive map showing the growth of bike parking locations over the past 15 years.
The dataset comes from the SF Data Portal: data.sfgov.org