P2: Investigating Lahman's Baseball Database

This project investigates a curated dataset sourced from Lahman's Baseball Database to study how player-performance metrics relate to match-winning contributions and player salaries.

About

In this project, Lahman's Baseball Database is analyzed with the help of Python libraries like NumPy, Pandas & Matplotlib. The findings are reported in a Jupyter notebook.

The activities implemented in this project are:

  1. Choose a dataset from the provided list.

  2. Go through the dataset and brainstorm questions that it can answer.

  3. Use NumPy, Pandas, and Matplotlib to answer these questions.

  4. Summarize the findings.
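As a rough illustration of step 3, the sketch below joins a batting table to a salary table and checks how a performance metric correlates with salary. It is not taken from the notebook: the rows are invented toy values, and the column names (`playerID`, `yearID`, `AB`, `H`, `salary`) are assumed to follow the Lahman table layout.

```python
import pandas as pd

# Toy stand-ins for a batting table and a salary table.
# All values are invented for illustration, not read from the data directory.
batting = pd.DataFrame({
    "playerID": ["a01", "b02", "c03", "d04"],
    "yearID": [2010, 2010, 2010, 2010],
    "AB": [400, 500, 450, 300],   # at bats
    "H": [100, 150, 135, 60],     # hits
})
salaries = pd.DataFrame({
    "playerID": ["a01", "b02", "c03", "d04"],
    "yearID": [2010, 2010, 2010, 2010],
    "salary": [1_000_000, 3_000_000, 2_500_000, 500_000],
})

# Join the two tables on player and season.
merged = batting.merge(salaries, on=["playerID", "yearID"], how="inner")

# Derive a performance metric (batting average) with a vectorized division.
merged["batting_avg"] = merged["H"] / merged["AB"]

# Pearson correlation between the metric and salary.
correlation = merged["batting_avg"].corr(merged["salary"])

print(merged[["playerID", "batting_avg", "salary"]])
print(f"Correlation between batting average and salary: {correlation:.2f}")
```

With real data, the same pattern would start from `pd.read_csv` on the files in the data directory, and a correlation alone would not establish that performance drives salary.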

Learning Outcome

The project helped me understand the steps involved in a typical data analysis process – mainly learning to pose questions that can be answered with a given dataset and then answering those questions.

On the technical front, I learned to use vectorized operations in NumPy and Pandas to speed up data analysis code, became familiar with Pandas' Series and DataFrame objects, and used Matplotlib to produce plots showing the findings.
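To make the idea of vectorized operations concrete, here is a minimal sketch comparing an element-by-element loop with the equivalent vectorized expression on Pandas Series. The numbers are made up for illustration and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Invented example values: hits and at bats for four players.
hits = pd.Series([100, 150, 135, 60])
at_bats = pd.Series([400, 500, 450, 300])

# Loop version: one Python-level division per element.
loop_avg = [h / ab for h, ab in zip(hits, at_bats)]

# Vectorized version: a single expression applied to the whole Series.
vector_avg = hits / at_bats

# Both approaches give the same numbers.
same_result = np.allclose(loop_avg, vector_avg)
print("Same result:", same_result)

# Boolean masks are vectorized too: keep players batting above .250.
above = vector_avg[vector_avg > 0.25]
print(above)
```

On large tables the vectorized form is usually much faster because the per-element work happens in compiled code rather than in the Python interpreter.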

Files

  • data – Directory containing data.

  • Investigating_Lahman_Baseball_Database.ipynb – Main project file.

  • Investigating_Lahman_Baseball_Database.html – HTML export of the project notebook.

Requirements

This project requires Python 3 with NumPy, Pandas, Matplotlib & Seaborn.

It is recommended to use Anaconda, a pre-packaged Python distribution that contains all of the necessary libraries and software for this project.
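Before opening the notebook, a quick check like the one below can confirm that the required libraries are importable. This is only a convenience sketch; the helper name `missing_packages` is hypothetical and not part of the project.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Import names of the libraries listed in the requirements above.
required = ["numpy", "pandas", "matplotlib", "seaborn"]
missing = missing_packages(required)

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```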

License

Modified MIT License © Pranav Suri