I’m a data scientist in the Twin Cities. I got into data science through chemical engineering, so, you know, the usual way. My research wasn’t very conducive to data science, so I kind of made my own path. I co-founded Penn Data Science Group (PDSG), applied as many good coding and data analysis practices as I could to my research, and ran a couple of tutorials/workshops to share useful tools (like Jupyter, Pandas, and Git) at PDSG events and research lab meetings. I also did a few side projects to pick up skills outside of research. Here are some of the projects I had the most fun with:
- PBCluster: This is the only one that actually had to do with my research. I put together a Python package that I don’t think anyone but me will use, but I had fun doing it and learning how to put together a tested, object-oriented, documented, pip-installable package that helped at least 1 person (me) with their research.
- Collaborative Filtering Methods Comparison: A detailed blog post comparing several collaborative filtering models for movie recommendation with the MovieLens dataset.
- Citadel Data Open Championship: My team’s report from the final round of a national datathon analyzing education data.
- Interactive Baby Name Popularity Map: An interactive map made with D3.js that lets you explore baby name popularity by time and location.
- NFL Fantasy Draft Dashboard: A dashboard made with Plotly Dash just for fun to try to help with a 2018 NFL fantasy draft. It turned out to be useless, but it was fun putting it together.
So many people helped me in my career development and job search process leading up to landing an exciting data science job, so I’m always happy to (try to) pay it forward with advice, encouragement, or connections.