In researching some aspects of data science, I ran across the article "Python Displacing R As The Programming Language For Data Science". Instead of using a domain-specific language like R, people increasingly prefer a general-purpose language for the same work, so I wanted to get familiar with the basics of Python.
Installing Python:
We need to install Python, as well as an editor or IDE for coding. JetBrains (the creators of ReSharper) has a Python IDE, PyCharm, and makes a community edition available.
I went with the latest version of Python (3.4.1), although version 2 is still available.
Download Python 3.4.1
IDE – JetBrains – PyCharm
I’m a big fan of using Chocolatey for installs, so the commands for the two needed components are:
cinst PyCharm-community
cinst python
Getting Started:
Python is an interpreted, dynamically typed and strongly typed language. It is case sensitive and everything is an object.
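To make those terms concrete, here is a small sketch (the variable names are my own, purely for illustration). It shows a name being rebound to a different type (dynamic typing), a mixed str/int operation being rejected rather than silently converted (strong typing), and two names that differ only by case.

```python
# Dynamic typing: a name can be rebound to a value of a different type.
value = 42
first_type = type(value).__name__
value = "forty-two"
second_type = type(value).__name__

# Strong typing: mixing str and int is not silently coerced.
try:
    result = "Week " + 1
except TypeError as error:
    result = "TypeError: " + str(error)

# Everything is an object, including integers and built-in functions.
int_is_object = isinstance(42, object)
func_is_object = isinstance(len, object)

# Case sensitivity: score and Score are two different names.
score = 10
Score = 20

print(first_type, second_type)
print(result)
print(int_is_object, func_is_object)
print(score, Score)
```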
Comments are preceded with a #.
To get help on an object: help(object).
dir(object) will list an object's attributes, including all of its methods.
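As a quick illustration of comments, help() and dir(), here is a minimal sketch using a plain list (the list and variable names are just placeholders I picked):

```python
# A comment starts with '#' and runs to the end of the line.
scores = [1, 2, 3]

# dir() returns a list of attribute names, methods included.
names = dir(scores)

# Names starting with an underscore are special or private; filter them out.
public_names = [name for name in names if not name.startswith("_")]
print(public_names)

# help() prints the documentation for an object, here the list's append method.
help(scores.append)
```

In an interactive session, help() may open a pager instead of printing directly.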
On opening the IDE, we need to select a Python interpreter: browse to the installed python.exe.
Code Examples:
I wanted to run through some examples of working with data. For the first, I read data from a file (using the results from the 2013 Atlanta Falcons season) to calculate the average scores.
The data file is at 2013 Falcons Results
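from statistics import mean
import csv

filePath = "E:/2013FalconsResults.txt"
falconsScores = []
opponentsScores = []

# The file is tab-delimited with a header row, so DictReader keys each row by column name.
# newline="" is what the csv module recommends when opening files.
with open(filePath, "r", newline="") as f:
    reader = csv.DictReader(f, delimiter="\t")
    help(reader)  # optional: prints the reader's documentation
    for row in reader:
        print(row["WeekNumber"] + " : " + row["Opponent"])
        # Values are read as strings, so convert the scores to int
        falconsScores.append(int(row["FalconsScore"]))
        opponentsScores.append(int(row["OpponentScore"]))

print("Falcons average score: " + str(mean(falconsScores)))
print("Opponents average score: " + str(mean(opponentsScores)))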
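If the data file isn't handy, the same DictReader pattern can be tried against an in-memory string. This is only a sketch: the average_scores helper is my own name, and the two rows below are made-up placeholders, not the actual 2013 results. The column names match the ones used in the file example above.

```python
import csv
import io
from statistics import mean


def average_scores(lines):
    """Return (Falcons average, opponents average) from tab-delimited rows."""
    reader = csv.DictReader(lines, delimiter="\t")
    falcons, opponents = [], []
    for row in reader:
        falcons.append(int(row["FalconsScore"]))
        opponents.append(int(row["OpponentScore"]))
    return mean(falcons), mean(opponents)


# Placeholder rows for illustration only; these are NOT the real 2013 results.
sample = io.StringIO(
    "WeekNumber\tOpponent\tFalconsScore\tOpponentScore\n"
    "1\tTeam A\t10\t20\n"
    "2\tTeam B\t20\t30\n"
)

falcons_avg, opponents_avg = average_scores(sample)
print("Falcons average score: " + str(falcons_avg))
print("Opponents average score: " + str(opponents_avg))
```

Because the helper accepts any iterable of lines, an open file object can be passed to it in the same way.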
For the second example, we'll connect to a database to get the same Falcons results and display them. The data file is available at 2013 Falcons Results; it can be imported into a 'FalconsResults2013' table in your database.
import pyodbc

connectionString = 'DRIVER={SQL Server};SERVER=ServerName;DATABASE=DatabaseName;Trusted_Connection=yes;'
db = pyodbc.connect(connectionString)
cursor = db.cursor()
cursor.execute('select WeekNumber, GameDate, Opponent, Result, FalconsScore, OpponentScore, HomeGame from FalconsResults2013')
rs = cursor.fetchall()
for row in rs:
    # Columns by position: 0 = WeekNumber, 2 = Opponent, 4 = FalconsScore, 5 = OpponentScore
    print('Week #' + str(row[0]) + ': Falcons ' + str(row[4]) + ' ' + row[2] + ' ' + str(row[5]))
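For trying out the cursor pattern without a SQL Server instance, here is a rough sketch that swaps in Python's built-in sqlite3 module in place of pyodbc. It is a stand-in, not the setup used above: the table is created in memory with only four of the columns, and the inserted rows are made-up placeholders rather than real results. It also shows passing a value through a ? parameter marker instead of building the SQL string by hand.

```python
import sqlite3

# In-memory database as a stand-in for the SQL Server connection.
db = sqlite3.connect(":memory:")
cursor = db.cursor()
cursor.execute(
    "create table FalconsResults2013 "
    "(WeekNumber integer, Opponent text, FalconsScore integer, OpponentScore integer)"
)

# Placeholder rows for illustration only; NOT the real 2013 results.
cursor.executemany(
    "insert into FalconsResults2013 values (?, ?, ?, ?)",
    [(1, "Team A", 10, 20), (2, "Team B", 20, 30)],
)

# The ? marker lets the driver supply the value safely.
cursor.execute(
    "select WeekNumber, Opponent, FalconsScore, OpponentScore "
    "from FalconsResults2013 where FalconsScore >= ? order by WeekNumber",
    (15,),
)

lines = []
for week, opponent, falcons_score, opponent_score in cursor.fetchall():
    lines.append(
        "Week #" + str(week) + ": Falcons " + str(falcons_score)
        + " " + opponent + " " + str(opponent_score)
    )

db.close()
print("\n".join(lines))
```

Unpacking each row into named variables reads a little more clearly than indexing by position, and it works because each row comes back as a tuple.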
Documentation:
Python Documentation:
https://www.python.org/doc/
Learn Python The Hard Way:
http://learnpythonthehardway.org/book/
Python in 10 minutes:
http://www.stavros.io/tutorials/python/