P2: Investigating Lahman's Baseball Database

This project investigates a curated dataset sourced from Lahman's Baseball Database to study how player-performance metrics relate to match-winning contributions and player salaries.

About

In this project, Lahman's Baseball Database is analyzed with Python libraries such as NumPy, Pandas, and Matplotlib. The findings are reported in a Jupyter notebook.

The activities implemented in this project are:

  1. Choose a dataset from the provided list.

  2. Go through the dataset and brainstorm questions that can be answered using it.

  3. Use NumPy, Pandas, and Matplotlib to answer these questions.

  4. Summarize the findings.
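The steps above can be sketched on a very small scale as follows. This is an illustrative sketch only, not the notebook's actual analysis: the column names (`playerID`, `yearID`, `AB`, `H`, `salary`) follow Lahman's Batting and Salaries tables, but the rows are invented toy values and the question posed is just one example of what the dataset could answer.

```python
import pandas as pd

# Toy rows with invented values. The column names mirror Lahman's
# Batting and Salaries tables; the real files live under data/.
batting = pd.DataFrame({
    "playerID": ["a01", "b02", "c03", "d04"],
    "yearID": [2010, 2010, 2010, 2010],
    "AB": [500, 400, 450, 300],   # at bats
    "H": [150, 100, 135, 60],     # hits
})
salaries = pd.DataFrame({
    "playerID": ["a01", "b02", "c03", "d04"],
    "yearID": [2010, 2010, 2010, 2010],
    "salary": [5_000_000, 1_200_000, 3_000_000, 500_000],
})

# Example question: do players with a higher batting average earn more?
batting["avg"] = batting["H"] / batting["AB"]

# Join performance and salary on the shared player/year keys.
merged = batting.merge(salaries, on=["playerID", "yearID"], how="inner")

# Summarize the relationship with a Pearson correlation coefficient.
correlation = merged["avg"].corr(merged["salary"])
print(merged[["playerID", "avg", "salary"]])
print(f"correlation between average and salary: {correlation:.2f}")
```

With Matplotlib installed, the same relationship could be inspected visually with `merged.plot.scatter(x="avg", y="salary")`.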

Learning Outcome

The project helped me understand the steps involved in a typical data analysis process – mainly learning to pose questions that can be answered with a given dataset and then answering those questions.

On the technical front, I learned to use vectorized operations in NumPy and Pandas to speed up data analysis code, became familiar with Pandas' Series and DataFrame objects, and used Matplotlib to produce plots.
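As a rough illustration of what a vectorized operation looks like compared with an explicit loop, consider the sketch below. The arrays hold invented numbers and the player labels are hypothetical; it is not taken from the project notebook.

```python
import numpy as np
import pandas as pd

# Invented hit and at-bat counts for four hypothetical players.
hits = np.array([150, 100, 135, 60])
at_bats = np.array([500, 400, 450, 300])

# Loop version: one Python-level division per player.
loop_avg = [h / ab for h, ab in zip(hits, at_bats)]

# Vectorized version: a single element-wise division over whole arrays.
vec_avg = hits / at_bats

# A Series attaches labels to the values, so results can be filtered
# with a boolean mask instead of an explicit loop.
averages = pd.Series(vec_avg, index=["a01", "b02", "c03", "d04"], name="avg")
above_mean = averages[averages > averages.mean()]
print(above_mean)
```

Both versions produce the same numbers; the vectorized form pushes the loop into compiled NumPy code, which is where the speed-up on large tables comes from.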

Files

  • data – Directory containing data.

  • Investigating_Lahman_Baseball_Database.ipynb – Main project file.

  • Investigating_Lahman_Baseball_Database.html – HTML export of the project notebook.

Requirements

This project requires Python 3 with NumPy, Pandas, Matplotlib & Seaborn.

It is recommended to use Anaconda, a pre-packaged Python distribution that contains all of the necessary libraries and software for this project.
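One way to confirm that the required libraries are available before opening the notebook is a small check like the one below. This is a convenience sketch, not part of the project; the helper name `missing_packages` is hypothetical.

```python
from importlib.util import find_spec

# Import names of the libraries listed in the requirements above.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```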

License

Modified MIT License © Pranav Suri