Python data science cheatsheet

External Cheatsheets:

Pandas cheatsheet

Numpy

NumPy for MATLAB users
NumPy matrices are displayed as a list of row vectors.
- An identity matrix is therefore shown as [[1, 0], [0, 1]]. The first axis indexes the rows, the second indexes the columns.
- axis=0: operate down the rows (along each column); gives one value per column, like a row vector
- axis=1: operate across the columns (along each row); gives one value per row, like a column vector (though it is still just a 1-D array)
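
A minimal sketch of the axis convention described above, using a small made-up 2x3 array (the variable names are illustrative only):

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])  # shape (2, 3): 2 rows, 3 columns

col_sums = m.sum(axis=0)  # collapses the rows: one value per column
row_sums = m.sum(axis=1)  # collapses the columns: one value per row

print(col_sums)  # [5 7 9]
print(row_sums)  # [ 6 15]
```

Both results are 1-D arrays; pass `keepdims=True` if the reduced axis should be kept as a length-1 dimension.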

Initializing a matrix

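# initializing a matrix
import numpy as np
a = np.zeros(shape=(3, 2))
a = np.empty(shape=(3, 2))  # uninitialized memory: contents are arbitrary, not random
a = np.full(shape=(3, 2), fill_value=np.nan)  # a positional argument cannot follow shape=...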

Boolean array or matrix

m, n = 3, 2  # example dimensions
a = np.full((m, n), True)
b = np.full((m, n), False)

string array with numpy¹

my_array = np.empty([1, 2], dtype="S10")
# S10 -> fixed-width byte strings of up to 10 characters (use "U10" for unicode str)
# Change the number after S to preallocate a different string length

np.newaxis²
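
As a rough illustration of what `np.newaxis` does (the example vector is made up), indexing with it inserts a length-1 axis, which is handy for broadcasting:

```python
import numpy as np

v = np.arange(3)          # shape (3,)
row = v[np.newaxis, :]    # shape (1, 3)
col = v[:, np.newaxis]    # shape (3, 1)

# Broadcasting a (3, 1) column against a (1, 3) row gives a (3, 3) grid
outer = col * row
```
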
NumPy array to list
```
arr.tolist()  # arr is any np.ndarray; nested arrays become nested lists
```

max and max index

max_val = np.max(matrix)
max_ind = np.argmax(matrix)
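
Note that `np.argmax` without an `axis` returns an index into the flattened array. A small sketch, with a made-up matrix, of turning that into a (row, column) pair:

```python
import numpy as np

matrix = np.array([[1, 9, 3],
                   [4, 5, 6]])

flat_ind = np.argmax(matrix)                         # index into the flattened array
row, col = np.unravel_index(flat_ind, matrix.shape)  # convert to (row, column)

col_argmax = np.argmax(matrix, axis=0)  # row index of the max within each column
```

Here `flat_ind` is 1, which unravels to row 0, column 1.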

Pandas

Data import

import os
import pandas as pd
# root_folder and filename are placeholders for your own path
data = pd.read_csv(os.path.join(root_folder, filename), sep='\t', skiprows=1, header=None, names=['col1', 'col2'])

# reset the index and drop the old one
data.reset_index(drop=True, inplace=True)

Iteration³

# iterating through rows (NOT RECOMMENDED)
for i, row in df.iterrows():
	print(row)
# iterating through index (probably not recommended)
for i in df.index:
	print(i)
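
Since row-by-row iteration is discouraged above, here is a hedged sketch of two common alternatives, using a made-up DataFrame: vectorized column arithmetic, and `itertuples()` for cases where a Python loop really is needed.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Vectorized column arithmetic: usually much faster than looping over rows
df["total"] = df["a"] + df["b"]

# When a loop is unavoidable, itertuples() yields lightweight namedtuples
pairs = [(row.Index, row.total) for row in df.itertuples()]
```
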

Nested dictionary to dataframe (from_dict)

d = {1: {'a': 0, 'b': 1}, 2: {'a': 1, 'b': 10}}  # outer keys become the row index
df = pd.DataFrame.from_dict(d, orient='index')

Adding an extra named row (index label) and sorting columns by that row

df.loc["mean"] = df.mean()
df = df.sort_values("mean", axis=1, ascending=False)
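
A small worked example of the two lines above, on a made-up DataFrame, to show which way the sort goes:

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 3], "y": [10, 30], "z": [4, 6]})

df.loc["mean"] = df.mean()                            # appends a row labelled "mean"
df = df.sort_values("mean", axis=1, ascending=False)  # reorders columns by that row

column_order = list(df.columns)  # columns now run from largest mean to smallest
```

The column means are x=2, y=20, z=5, so the columns end up ordered y, z, x. Calling `df.mean()` again afterwards would include the "mean" row itself, so compute any further statistics before adding it.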

Scipy

Interpolation
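
This section is empty in the notes, so the following is only a minimal sketch of one common approach, not the author's method: cubic-spline interpolation with `scipy.interpolate.CubicSpline`, compared against linear interpolation with `np.interp`. The sampled sine curve is made-up example data.

```python
import numpy as np
from scipy.interpolate import CubicSpline

x = np.linspace(0.0, 2.0 * np.pi, 10)  # coarse sample points
y = np.sin(x)

spline = CubicSpline(x, y)             # piecewise-cubic interpolant through the samples
x_new = np.linspace(0.0, 2.0 * np.pi, 50)
y_cubic = spline(x_new)
y_linear = np.interp(x_new, x, y)      # plain linear interpolation, for comparison
```
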

Integration

Simpson’s rule: https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.simpson.html
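
A minimal sketch of calling `scipy.integrate.simpson` on sampled data; the integrand (sin on [0, pi], whose exact integral is 2) is chosen here purely for illustration:

```python
import numpy as np
from scipy.integrate import simpson

x = np.linspace(0.0, np.pi, 101)  # 101 samples, i.e. an even number of intervals
y = np.sin(x)

area = simpson(y, x=x)            # approximates the integral of sin from 0 to pi
```

Passing `x` by keyword keeps the call compatible across SciPy versions.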

JAX

PyMC

Intro to PyMC

Setup

conda install numpy pandas scipy scikit-learn matplotlib seaborn dill
# optional
conda install ipykernel ipywidgets # for vscode + jupyter
