Python Basics for Data Science

This repository contains notebooks covering Python basics for data science.

Below are a few examples of the tools available for data science tasks.

Programming Tools for Data Science

No Code Environments

H2O.ai, IBM Watson, DataRobot, ...

Spreadsheets, BI Tools

Microsoft Excel, Power BI, Google Sheets, Tableau, ...

Programming Languages

Weka, SAS, R, Python, MATLAB, Mathematica, ...

High Performance Stacks

Hadoop, Spark, ...

Why Python

  1. Python is good for both prototyping and production. It bridges the gap between data scientists and IT engineers: it is easy to learn, yet has the features of a full-fledged programming language for implementing solutions.

  2. Python is beginner-friendly. It is often called "executable pseudocode" because of its simple syntax and high readability.

  3. Python is increasingly the default choice for data science. It is supported by open-source libraries that have become standard in the field.

  4. Python has strong adoption beyond data science. It is widely used as a scripting language, for web deployment, and for programming IoT devices.
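
To illustrate the "executable pseudocode" point in item 2, here is a minimal sketch that uses only the standard library. The temperature values and the names `temperatures`, `average`, and `warm_days` are made up for illustration and do not come from the notebooks in this repository.

```python
# Hypothetical sample data: daily temperatures in degrees Celsius.
temperatures = [21.5, 19.0, 23.5, 18.0, 25.0]


def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


mean_temperature = average(temperatures)

# Keep only the days that were warmer than the average.
warm_days = [t for t in temperatures if t > mean_temperature]

print(f"Mean temperature: {mean_temperature:.1f}")
print(f"Warmer than average: {warm_days}")
```

Each line reads close to how the steps would be described in plain English, which is the property the nickname refers to.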

IPython and Jupyter

IPython provides a rich toolkit to help you make the most of using Python interactively. Its main components are:

  • A powerful interactive Python shell
  • A Jupyter kernel to work with Python code in Jupyter notebooks and other interactive frontends.

Project Jupyter exists to develop open-source software, open standards, and services for interactive computing across dozens of programming languages. Its projects include the Jupyter Notebook, JupyterLab, and JupyterHub. For details, see https://jupyter.org/index.html

The Jupyter Notebook

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.

JupyterLab 1.0: Jupyter’s Next-Generation Notebook Interface

JupyterLab is a web-based interactive development environment for Jupyter notebooks, code, and data.

JupyterHub

A multi-user version of the notebook designed for companies, classrooms and research labs.

Popular Python Libraries

NumPy

The fundamental package for scientific computing with Python.
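
As a rough sketch of what this looks like in practice, the snippet below applies arithmetic and a comparison to a whole array at once, without an explicit loop. It assumes NumPy is installed; the height values and variable names are invented for illustration.

```python
import numpy as np

# Hypothetical sample data: heights in centimetres.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])

# Arithmetic applies element-wise to the whole array.
heights_m = heights_cm / 100

# Aggregations such as mean() are available as array methods.
mean_height_m = heights_m.mean()

# Comparisons also work element-wise and produce a boolean array.
tall_mask = heights_m > mean_height_m

print(heights_m)
print(tall_mask)
```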

Pandas

Python Data Analysis Library.
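
A minimal sketch of a typical pandas task, grouping rows of a small table and summing a column, might look like the following. It assumes pandas is installed; the `sales` table, its column names, and its values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: units sold per city.
sales = pd.DataFrame(
    {
        "city": ["Pune", "Delhi", "Pune", "Delhi"],
        "units": [10, 20, 30, 40],
    }
)

# Group the rows by city and sum the units within each group.
units_by_city = sales.groupby("city")["units"].sum()

print(units_by_city)
```

The result is a Series indexed by city, so a single total can be looked up with `units_by_city["Pune"]`.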

SciPy

Python-based ecosystem of open-source software for mathematics, science, and engineering.

Scikit-learn

Machine learning in Python.

Matplotlib

Comprehensive library for creating static, animated, and interactive visualizations in Python.

Seaborn

Data visualization library based on matplotlib.