Getting Started with Python| Development environment

The goal of this Python course is to demonstrate the basics of Python and how they can be applied.

Table of contents

  1. Python Course-Part 01-Getting Started with Python| Development environment
  2. Python Course-Part 02-Getting Started with Python| Basics
  3. Python Course-Part 03-Data Types | Data Structures
  4. Python Course-Part 04-Control Structures
  5. Python Course-Part 05-Functions
  6. Python Course-Part 06-Database Access

1. What is Python?

Python is an interpreted, object-oriented, high-level programming language created by Guido van Rossum, who began work on it in 1989 and first released it in 1991.

2. Why learn Python?

Many commercial tools provide functions for solving data science problems. One drawback of these tools is that the analysis options are limited to the functions they ship with. Data science projects require more than standard routines: each project has its own requirements and characteristics. To extract as much knowledge as possible from a given dataset, custom functions must be implemented. With commercial tools, this is either very expensive or impossible.

Data scientists with basic programming skills can implement custom solutions for individual problems. However, not all programming languages are equally well suited to efficient data analysis.

The following properties of Python are important for data science:

  • Python is a very popular programming language because it is easy to learn and intuitive to apply
  • Efficient data structures are available for fast filtering, sorting and storage of large datasets
  • Comprehensive libraries are available for Python, covering data science disciplines such as machine learning, deep learning, visualization, business intelligence, statistical modelling and natural language processing
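
As a small illustration of filtering, sorting and storage, here is a minimal sketch that uses only Python's built-in lists and dictionaries. The sample records and the threshold are made up for this example, and dedicated libraries may offer more suitable structures for genuinely large datasets.

```python
# Hypothetical sample records; names and values are invented for illustration.
measurements = [
    {"sensor": "a", "value": 3.2},
    {"sensor": "b", "value": 7.9},
    {"sensor": "c", "value": 5.1},
    {"sensor": "d", "value": 1.4},
]

# Filtering: keep only the records whose value exceeds a threshold.
high_values = [m for m in measurements if m["value"] > 3.0]

# Sorting: order the filtered records by value, largest first.
ranked = sorted(high_values, key=lambda m: m["value"], reverse=True)

# Storage: index the values by sensor name for fast lookup.
by_sensor = {m["sensor"]: m["value"] for m in measurements}

print([m["sensor"] for m in ranked])  # ['b', 'c', 'a']
print(by_sensor["c"])                 # 5.1
```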

Other characteristics of Python are:

  • it is an object-oriented programming language that also supports other paradigms, such as functional and procedural programming
  • it is cross-platform: programs run on Windows, macOS and Linux
  • it is supported by a large community
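
To make the point about programming paradigms concrete, the following sketch computes the same total in a procedural, a functional and an object-oriented style. The `Basket` class and the price list are hypothetical names chosen for this illustration only.

```python
from functools import reduce

prices = [4.0, 10.0, 6.0]

# Procedural style: an explicit loop that updates a running total.
total_procedural = 0.0
for price in prices:
    total_procedural += price

# Functional style: fold the values together without mutating any state.
total_functional = reduce(lambda acc, price: acc + price, prices, 0.0)


# Object-oriented style: data and behaviour live together in a class.
class Basket:
    def __init__(self, prices):
        self.prices = list(prices)

    def total(self):
        return sum(self.prices)


total_object_oriented = Basket(prices).total()

print(total_procedural, total_functional, total_object_oriented)
```

All three variants produce the same result. Which style reads best depends on the task.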

Python enables programs to be written more compactly. Programs written in Python are typically much shorter than equivalent Java or C++ programs.

The main reasons for this are:

  • no variable or argument declarations are necessary
  • the high-level data types allow you to express complex operations in a single statement
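
A short sketch of both points: the names below are bound to values without any type declarations, and a single statement splits a text into words, counts them and picks the most frequent one. The sample sentence is made up for this example.

```python
from collections import Counter

# No declarations: the name is simply bound to a value.
text = "the quick brown fox jumps over the lazy dog the end"

# One statement: split into words, count them, take the most common entry.
word, count = Counter(text.split()).most_common(1)[0]

print(word, count)  # the 3
```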

Another advantage is that solving a task with fewer lines of code tends to make the code more robust and less error-prone.

3. Installation and Distributions

3.1 Anaconda

Anaconda is a leading open data science platform for R and Python. The distribution includes nearly 100 packages. Additional packages can easily be installed and updated with conda, Anaconda's built-in environment, dependency and package manager.

Once Anaconda has been downloaded and installed, the included Anaconda Navigator can be used to start programming. From the home view of Anaconda Navigator, one can launch or install development environments such as Jupyter Notebook.

3.2 PyCharm

PyCharm is an integrated development environment (IDE) built specifically for the Python language. It is developed by the Czech company JetBrains. It provides code analysis, a graphical debugger, an integrated unit tester and integration with version control systems (VCSes), and it supports web development with Django. PyCharm is cross-platform, with Windows, macOS and Linux versions. The Community Edition is released under the Apache License, and there is also a Professional Edition with extra features, released under a proprietary license.

3.3 Jupyter Notebooks

Jupyter Notebooks are less suitable for developing complex Python projects. However, they are well suited to experiments and data analysis tasks. For me, Jupyter Notebooks are the best platform for efficient data analysis, integrated visualisation and documentation tasks.

Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, explanatory text for the source code, images and much more.

4. Anaconda — My first Jupyter Notebook

1. Open Anaconda Navigator and switch to the virtual environment "base (root)" (you can also create a new environment)

2. Launch Jupyter Notebook

3. Go to your home directory and create a new Python notebook in your directory. Rename the notebook (File -> Rename)

4. First we write some text in a Markdown cell

Jupyter Notebook has many shortcuts and two keyboard input modes. Edit mode allows you to type code or text into a cell and is indicated by a green cell border. Command mode binds the keyboard to notebook-level commands and is indicated by a grey cell border with a blue left margin (Help -> Keyboard Shortcuts).

Formatting is done with Markdown (see the introduction by John Gruber).

5. Write “Hello world” into a code cell and execute it

Click the “run” button or use a keyboard shortcut: Shift+Enter runs the selected cell and moves to the next one, while Ctrl+Enter runs the selected cell and keeps it selected.
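
A minimal sketch of what the code cell could contain. The variable name `message` is just an illustration. Printing the string directly works equally well.

```python
# Contents of a single code cell; running it prints the text below the cell.
message = "Hello world"
print(message)
```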

5. Jupyter Notebook Extensions

Jupyter Notebook extensions are add-ons that extend the basic functionality of the notebook environment.

5.1 Install with pip

5.2 Install with conda
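
The install commands themselves do not appear in this text. As a hedged sketch, assuming the extensions meant here are the community `jupyter_contrib_nbextensions` package (an assumption, since the package is not named), the pip and conda routes could be scripted from Python as follows. The helper `run` is hypothetical and only prints each command unless it is called with `dry_run=False`.

```python
import shlex
import subprocess
import sys

# Assumed package name; the text does not say which extension package is meant.
PACKAGE = "jupyter_contrib_nbextensions"

# Route 1: install with pip, then copy the extension files into Jupyter.
PIP_COMMANDS = [
    [sys.executable, "-m", "pip", "install", PACKAGE],
    ["jupyter", "contrib", "nbextension", "install", "--user"],
]

# Route 2: install from the conda-forge channel with conda.
CONDA_COMMANDS = [
    ["conda", "install", "-c", "conda-forge", PACKAGE],
]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, check=True)


# Dry run: shows the commands without changing the environment.
run(PIP_COMMANDS)
run(CONDA_COMMANDS)
```

Only one of the two routes is needed. The dry-run default makes it possible to review the commands before anything is installed.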

Start a Jupyter Notebook and navigate to the new Nbextensions tab.

Enable the extensions you want and enjoy the benefits. If you don't see the tab, you can open a new notebook and click Edit -> nbextensions config.

The source code is available in my GitHub repository.

Next: Python Course-Part 02-Getting Started with Python| Basics
