Modules and Packages

Jupyter notebooks work well for interactive exploration, but they are a poor fit for defining a large library of functions that we want to reuse.

You have probably noticed that we can import functions from numpy using the import keyword:

import numpy
import numpy as np

You can do this with your own code by writing your own Python modules and packages.

Modules

A Python module is simply a .py file which contains definitions we would like to use. For instance, we can import the mymod module defined in mymod.py.

import mymod
mymod.plus1(1)
2
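The contents of mymod.py are not reproduced in this section. A minimal sketch that is consistent with the calls used here might look like the following; the function body and the attribute names a and b are assumptions, not the actual file.

```python
# mymod.py (hypothetical contents, inferred from how the module is used here)

def plus1(x):
    """Return x incremented by one."""
    return x + 1


class myclass:
    """Small container holding two values (attribute names are assumed)."""

    def __init__(self, a, b):
        self.a = a
        self.b = b
```

Saving a file like this next to the notebook is enough for import mymod to find it.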

By default, the contents of the module will be imported into a namespace with the same name as the file (dropping the .py extension). We can rename the namespace using the as keyword:

import mymod as mm
mm.plus1(1)
2

We can also import specific functions or classes from a module directly into the current namespace using the from keyword:

from mymod import plus1, myclass
print(myclass(1,2))
print(plus1(1))
<mymod.myclass object at 0x7f2c606dfb20>
2

To import every public name from a module (by default, every name that does not begin with an underscore), use *:

from mymod import *
plus1(1)
2
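A module can restrict what a star import brings in by defining a list named __all__. The sketch below writes a small throwaway module to a temporary directory and star-imports it, so that it runs on its own; the module name starmod_demo and its functions are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source of a hypothetical module that exports only plus1 via __all__.
source = '''
__all__ = ["plus1"]

def plus1(x):
    return x + 1

def plus2(x):
    return x + 2
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "starmod_demo.py").write_text(source)
    sys.path.insert(0, tmp)          # make the temporary directory importable
    importlib.invalidate_caches()
    try:
        namespace = {}
        # Run the star import inside a fresh namespace so we can inspect it.
        exec("from starmod_demo import *", namespace)
    finally:
        sys.path.remove(tmp)

# Ignore the __builtins__ entry that exec adds to the namespace.
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)  # ['plus1']
```

plus2 is still reachable through an explicit import; __all__ only affects the star form.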

You can see what is available in the mymod namespace using dir:

dir(mymod)
['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'myclass',
 'plus1']
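Most of the names in this listing are double-underscore attributes that Python attaches to every module. To see only the names a module defines for its users, the list can be filtered. The sketch below uses the standard library's math module as a stand-in so that it runs without mymod.py being present.

```python
import math

# Keep only names that do not start with an underscore.
public_names = [name for name in dir(math) if not name.startswith("_")]
print(public_names[:5])
```

The same list comprehension applied to mymod would leave just the entries defined in the file.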

Modules vs. Scripts

What is the difference between a module and a script? Both are plain text files with a *.py extension.

Conceptually, a module contains functionality (e.g. functions, classes) that we would like to use and reuse without needing to redefine it in our interpreter every time. Typically, we use modules with an import statement. A script contains code that does something. This might include defining classes and functions, but usually also includes some sort of task, such as analyzing data in a *.csv file, which it wouldn’t make sense to do in an import statement.

There isn’t really a clear-cut line between modules and scripts: you might run a *.py file that was written as a module as a script, or import a *.py file that was written as a script.

If you want to have a file that can be used as both a module and a script, you can put the “scripting” part of the file in a block that begins with

if __name__ == '__main__':
    # scripting code here

An example can be found in mymod.py.
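As an illustration of the pattern, a file that works both ways might be laid out as in the sketch below. The main function and what it prints are assumptions made for this example, not the contents of the actual mymod.py.

```python
def plus1(x):
    """Return x incremented by one."""
    return x + 1


def main():
    """Scripting part: only meant to run when the file is executed directly."""
    print(plus1(1))


if __name__ == "__main__":
    # Skipped on import, so importing the file has no side effects.
    main()
```

Putting the scripting code in a function keeps the guarded block short and lets other code call main explicitly if needed.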

__name__ is a module-level variable that Python sets to the string '__main__' when the file is executed directly, for example from the command line:

python mymod.py

If the file is imported as a module, __name__ is instead set to the module name (the file name without the .py extension), regardless of any alias given with as:

mymod.__name__
'mymod'
import mymod as mm
mm.__name__
'mymod'
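Both cases can be observed from a single session. The sketch below writes a one-line throwaway file (the name namedemo.py is made up), runs it the way python namedemo.py would using the standard library's runpy module, and then imports it normally.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# The file just records whatever __name__ is when it executes.
source = "seen_name = __name__\n"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp, "namedemo.py")
    path.write_text(source)

    # Execute the file as a script, as `python namedemo.py` would.
    as_script = runpy.run_path(str(path), run_name="__main__")["seen_name"]

    # Import the same file as a module.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        as_module = importlib.import_module("namedemo").seen_name
    finally:
        sys.path.remove(tmp)

print(as_script)  # __main__
print(as_module)  # namedemo
```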

Packages

Python packages are contained in a directory with an __init__.py file. These directories may contain multiple *.py files, as well as nested directories with their own __init__.py files (sub-packages). These directories might also contain other sorts of files, such as documentation files or shared object libraries.

You can use the __init__.py file to automate import of modules inside a package, or modify namespaces.

An example can be found in the mypack directory.

import mypack
dir(mypack)
['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 'functions',
 'hello',
 'more',
 'mypack']

The following function is imported from mypack/mypack.py in __init__.py:

mypack.hello()
hello from mypack
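The files inside mypack are not shown here. The sketch below builds a small stand-in package in a temporary directory to show how an __init__.py can populate the package namespace. The package name demopack, the module name core, and all file contents are assumptions made for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout: relative path -> file contents.
files = {
    # __init__.py runs on `import demopack` and fills the package namespace.
    "demopack/__init__.py": "from .core import hello\nfrom . import functions\n",
    "demopack/core.py": "def hello():\n    return 'hello from demopack'\n",
    "demopack/functions.py": (
        "def cube(x):\n    return x ** 3\n\n"
        "def square(x):\n    return x ** 2\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    for rel_path, text in files.items():
        target = Path(tmp, rel_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        demopack = importlib.import_module("demopack")
    finally:
        sys.path.remove(tmp)

public = [name for name in dir(demopack) if not name.startswith("_")]
print(public)                      # ['core', 'functions', 'hello']
print(demopack.hello())            # hello from demopack
print(demopack.functions.cube(3))  # 27
```

Note that core shows up in the namespace even though __init__.py only asked for hello: importing anything from a submodule also binds the submodule itself as an attribute of the package.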

The function cube is defined in mypack/functions.py:

mypack.functions.cube(3)
27

We can import a submodule from a package into a new namespace:

import mypack.functions as fun1
fun1.__name__
'mypack.functions'

The function square is also defined in mypack/functions.py:

fun1.square(2)
4

We can also have subpackages in packages (subdirectories of our directory).

The function plus2 is defined in mypack/subpackage/functions.py:

import mypack.subpackage.functions as fun2
fun2.plus2(3)
5
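The directory structure behind a dotted import like this can be sketched with a throwaway package. Each level of the dotted name corresponds to a directory containing an __init__.py, and the last component is the *.py file. The names outerpack and subpackage and the file contents below are assumptions for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical nested layout: both directories carry an (empty) __init__.py.
files = {
    "outerpack/__init__.py": "",
    "outerpack/subpackage/__init__.py": "",
    "outerpack/subpackage/functions.py": "def plus2(x):\n    return x + 2\n",
}

with tempfile.TemporaryDirectory() as tmp:
    for rel_path, text in files.items():
        target = Path(tmp, rel_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Equivalent to: import outerpack.subpackage.functions as fun2
        fun2 = importlib.import_module("outerpack.subpackage.functions")
    finally:
        sys.path.remove(tmp)

print(fun2.__name__)  # outerpack.subpackage.functions
print(fun2.plus2(3))  # 5
```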

Conventions

By convention, Python packages and modules should have short, all-lowercase names. See the naming conventions in PEP 8, the Python style guide, for reference.