sithom package

Submodules

sithom.curve module

Polynomial curve fitting.

sithom.curve.fit(x_npa, y_npa, reg_type='lin')

Fit a polynomial curve, with an estimate of the uncertainty.

Parameters:
  • x_npa (Sequence[Union[float, int]]) – The x values to fit.

  • y_npa (Sequence[Union[float, int]]) – The y values to fit.

  • reg_type (str, optional) – Which regression to do. Defaults to “lin”.

Returns:

Parameters with uncertainty, and a function to put input data into.

Return type:

Tuple[unp.uarray, Callable]

Example of usage::
>>> import numpy as np
>>> from sithom.curve import fit
>>> param, func = fit(np.array([0, 1, 2]), np.array([1, 4, 7]))
>>> assert np.isclose(param[0].n, 3.0)
>>> assert np.isclose(param[1].n, 1.00)
sithom.curve.plot(x_values, y_values, reg_type='lin', x_label='x label', y_label='y label', ext=0.05, fig_path=None, ax_format='both')

Plot the polynomial.

Parameters:
  • x_values (Sequence[Union[float, int]]) – The x values to fit.

  • y_values (Sequence[Union[float, int]]) – The y values to fit.

  • reg_type (str, optional) – Which regression to do. Defaults to “lin”.

  • x_label (str, optional) – X label for plot. e.g. r"$\Delta \bar{T}_s$ over tropical pacific (pac) region [$\Delta$ K]"

  • y_label (str) – Y label for plot. e.g. r"$\Delta \bar{T}_s$ over nino3.4 region [$\Delta$ K]"

  • ext (float) – how far in percentage terms to extend beyond data.

  • fig_path (Optional[str], optional) – Path to store the figure in. Defaults to None.

  • ax_format (Literal["both", "x", "y"], optional) – which axes to format in scientific notation. Defaults to “both”.

Returns:

Parameters with uncertainty, and a function to put data into.

Return type:

Tuple[unp.uarray, Callable]

Example::
>>> from sithom.curve import plot
>>> param, func = plot(
...                    [-0.1, 0.5, 1.0, 1.5, 2.3, 2.9, 3.5],
...                    [-0.7, 0.1, 0.3, 1.1, 1.5, 2.3, 2.2]
...                   )
>>> "({:.3f}".format(param[0].n) + ", {:.3f})".format(param[0].s)
'(0.842, 0.078)'
>>> "({:.3f}".format(param[1].n) + ", {:.3f})".format(param[1].s)
'(-0.424, 0.161)'
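
The parameter values and uncertainties in the example above can be checked by hand. The following is a minimal sketch of an unweighted straight-line least-squares fit using only the standard library. It assumes that reg_type='lin' corresponds to ordinary least squares, with standard errors taken from the residual variance on n - 2 degrees of freedom; that is an assumption about sithom's internals, not something stated in this documentation. The name linear_fit_sketch is hypothetical.

```python
import math


def linear_fit_sketch(x, y):
    """Unweighted straight-line fit y = slope * x + intercept.

    Returns ((slope, slope_err), (intercept, intercept_err)).
    Hypothetical helper for illustration; not part of sithom.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    s2 = ss_res / (n - 2)
    slope_err = math.sqrt(s2 / s_xx)
    intercept_err = math.sqrt(s2 * sum(xi ** 2 for xi in x) / (n * s_xx))
    return (slope, slope_err), (intercept, intercept_err)


# Same data as the plot example above.
x = [-0.1, 0.5, 1.0, 1.5, 2.3, 2.9, 3.5]
y = [-0.7, 0.1, 0.3, 1.1, 1.5, 2.3, 2.2]
(slope, slope_err), (intercept, intercept_err) = linear_fit_sketch(x, y)
print(f"({slope:.3f}, {slope_err:.3f})")
print(f"({intercept:.3f}, {intercept_err:.3f})")
```

Under these assumptions, param[0] plays the role of the slope and param[1] the intercept, and the sketch agrees with the documented values to three decimal places.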

sithom.misc module

Miscellaneous Utilities Module.

sithom.misc.calculate_byte_size_recursively(obj, seen=None)

Recursively calculate size of objects in memory in bytes.

From: https://github.com/bosswissam/pysize. Meant as a helper function for get_byte_size.

Parameters:
  • obj (object) – The python object to get the size of

  • seen (set, optional) – This variable is needed for the recursive function evaluations, to ensure each object only gets counted once. Leave it at “None” to get the full byte size of an object. Defaults to None.

Returns:

The size of the object in bytes.

Return type:

int
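
The role of the seen argument is easiest to see in code. Below is a minimal sketch of recursive sizing built on sys.getsizeof, written in the spirit of the approach described above; it only descends into dicts and the common built-in containers, and the name byte_size_sketch is hypothetical. It is not the sithom implementation.

```python
import sys


def byte_size_sketch(obj, seen=None):
    """Sum sys.getsizeof over an object and its contents, counting each object once."""
    if seen is None:
        seen = set()  # fresh set on the outermost call
    obj_id = id(obj)
    if obj_id in seen:
        return 0  # already counted via another reference
    seen.add(obj_id)
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(
            byte_size_sketch(k, seen) + byte_size_sketch(v, seen)
            for k, v in obj.items()
        )
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(byte_size_sketch(item, seen) for item in obj)
    return size


shared = "x" * 1000
once = byte_size_sketch([shared])
twice = byte_size_sketch([shared, shared])  # second reference adds no string bytes
```

Because the same set is threaded through every recursive call, the string referenced twice in the second list is only counted the first time it is met.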

sithom.misc.get_byte_size(obj)

Return human readable size of a python object in bytes.

Parameters:

obj (object) – The python object to analyse

Returns:

Human readable string with the size of the object

Return type:

str

Examples of using for numpy array::
>>> import numpy as np
>>> from sithom.misc import get_byte_size
>>> assert isinstance(get_byte_size(np.zeros(int(10e4))), str)
sithom.misc.human_readable_size(num, suffix='B')

Convert a number of bytes into human readable format.

This function is meant as a helper function for get_byte_size.

Parameters:
  • num (int) – The number of bytes to convert

  • suffix (str, optional) – The suffix to use for bytes. Defaults to ‘B’.

Returns:

A human readable version of the number of bytes.

Return type:

str

Example of using human readable sizes::
>>> from sithom.misc import human_readable_size
>>> assert human_readable_size(int(10e5)) == "977 KB"
>>> assert human_readable_size(int(10e13)) == "91 TB"
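
The two values in the example are consistent with repeatedly dividing by 1024 and rounding to the nearest whole number. A minimal sketch under that assumption follows; the unit list and the name human_readable_size_sketch are illustrative choices, not taken from sithom.

```python
def human_readable_size_sketch(num, suffix="B"):
    """Convert a byte count to a short string, stepping up a unit per factor of 1024."""
    for unit in ["", "K", "M", "G", "T", "P", "E", "Z"]:
        if abs(num) < 1024.0:
            return f"{num:.0f} {unit}{suffix}"
        num /= 1024.0
    return f"{num:.0f} Y{suffix}"


small = human_readable_size_sketch(int(10e5))   # 1,000,000 bytes
large = human_readable_size_sketch(int(10e13))  # 100,000,000,000,000 bytes
```

Here 10e5 / 1024 is about 976.6 and 10e13 / 1024**4 is about 90.9, which round to the documented "977 KB" and "91 TB".
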
sithom.misc.in_notebook()

Check if in jupyter notebook.

Taken from this answer: https://stackoverflow.com/a/22424821

Returns:

whether in jupyter notebook.

Return type:

bool

Example of triggering from python terminal::
>>> from sithom.misc import in_notebook
>>> in_notebook()
False

sithom.place module

Place objects.

class sithom.place.BoundingBox(lon, lat, desc='No Description Given.')

Bases: object

BoundingBox class to deal with the varying output requirements often needed to describe the same geographical box to different APIs.

Example::
>>> from sithom.place import BoundingBox
>>> bbox = BoundingBox([-30, 30], [10, 30], desc="")
>>> bbox.cartopy() # [lon-, lon+, lat-, lat+]
[-30, 30, 10, 30]
>>> bbox.ecmwf() # [lat+, lon-, lat-, lon+]
[30, -30, 10, 30]
ax_label(ax)

Apply BoundingBox as labels to your graph.

Parameters:

ax (matplotlib.axes.Axes) – Axes to label.

Return type:

None

ax_lim(ax)

Apply BoundingBox as ax limit to your graph.

Parameters:

ax (matplotlib.axes.Axes) – Axes to limit.

Return type:

None

cartopy()

Cartopy style bounding box.

Returns:

[lon-, lon+, lat-, lat+] # [degE, degE, degN, degN]

Return type:

List[float]

ecmwf()

ECMWF style bounding box.

Returns:

[lat+, lon-, lat-, lon+] # [degN, degE, degN, degE]

Return type:

List[float]

indices_inside(lons, lats)

Get indices of points inside the bounding box.

Currently only works for 1D arrays.

Parameters:
  • lons (np.ndarray) – Longitudes of points to check.

  • lats (np.ndarray) – Latitudes of points to check.

Returns:

Indices of points inside the bounding box.

Return type:

np.ndarray

pad(buffer=1)

Pad the BoundingBox by some number of degrees.

Parameters:

buffer (float, optional) – How many degrees East and North to go out from the existing bounding box. Defaults to 1.

Returns:

A bounding box that is padded by the buffer.

Return type:

BoundingBox
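
The orderings documented for this class can be summarised in a short, dependency-free sketch. The class below is hypothetical (BoundingBoxSketch is not part of sithom). It assumes that pad moves every edge outwards by buffer and that points on the edge count as inside, and it returns a plain list of indices rather than an np.ndarray.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BoundingBoxSketch:
    lon: List[float]  # [lon-, lon+] in degrees East
    lat: List[float]  # [lat-, lat+] in degrees North

    def cartopy(self):
        """[lon-, lon+, lat-, lat+]"""
        return [self.lon[0], self.lon[1], self.lat[0], self.lat[1]]

    def ecmwf(self):
        """[lat+, lon-, lat-, lon+]"""
        return [self.lat[1], self.lon[0], self.lat[0], self.lon[1]]

    def pad(self, buffer=1):
        """Assumed behaviour: move every edge outwards by `buffer` degrees."""
        return BoundingBoxSketch(
            [self.lon[0] - buffer, self.lon[1] + buffer],
            [self.lat[0] - buffer, self.lat[1] + buffer],
        )

    def indices_inside(self, lons, lats):
        """Indices of 1D points inside the box (edges assumed inclusive)."""
        return [
            i
            for i, (lon, lat) in enumerate(zip(lons, lats))
            if self.lon[0] <= lon <= self.lon[1]
            and self.lat[0] <= lat <= self.lat[1]
        ]


bbox = BoundingBoxSketch([-30, 30], [10, 30])
padded = bbox.pad(1)
inside = bbox.indices_inside([0, 40, -30], [20, 20, 50])
```

With the same inputs as the class example, cartopy() and ecmwf() reproduce the documented orderings.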

class sithom.place.Point(lon, lat, desc='No Description Given.')

Bases: object

bbox(buffer=3)

Get BoundingBox by padding around the Point by the buffer of some number of degrees.

Size of the square is 4 * buffer**2.

Parameters:

buffer (float, optional) – How many degrees East and North to go out from loc. Defaults to 3.

Returns:

A bounding box like [-91.0715, 28.9511, -89.0715, 30.9511].

Return type:

BoundingBox

Example::
>>> from sithom.place import Point
>>> point = Point(20, 30)
>>> bbox = point.bbox(2)
>>> bbox.cartopy() # [lon-, lon+, lat-, lat+]
[18, 22, 28, 32]
>>> bbox.ecmwf() # [lat+, lon-, lat-, lon+]
[32, 18, 28, 22]
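
The arithmetic behind the padded square can be written out directly. This is a minimal sketch with a hypothetical function name; it returns the box in cartopy order together with its area in square degrees.

```python
def point_bbox_sketch(lon, lat, buffer=3):
    """Box of half-width `buffer` degrees around (lon, lat), plus its area."""
    west, east = lon - buffer, lon + buffer
    south, north = lat - buffer, lat + buffer
    area = (east - west) * (north - south)  # (2 * buffer) ** 2 == 4 * buffer ** 2
    return [west, east, south, north], area


box, area = point_bbox_sketch(20, 30, buffer=2)
```

For the point (20, 30) with a buffer of 2 this gives the same box as the example above, with an area of 16 square degrees.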

sithom.plot module

Plotting Utilities Module.

Contains generic plotting functions that are used to achieve consistent and easy to produce plots across the project.

Example

Usage with simple plots:

from sithom.plot import (
    plot_defaults,
    label_subplots,
    get_dim,
    set_dim,
    PALETTE,
    STD_CLR_LIST,
    CAM_BLUE,
    BRICK_RED,
    OX_BLUE,
)

plot_defaults(use_tex=True)

# ---- example set of graphs ---

import numpy as np
import matplotlib.pyplot as plt

fig, axs = plt.subplots(2, 2)

x = np.linspace(0, np.pi, num=100)
axs[0, 0].plot(x, np.sin(x), color=STD_CLR_LIST[0])
axs[0, 1].plot(x, np.cos(x), color=STD_CLR_LIST[1])
axs[1, 0].plot(x, np.sinc(x), color=STD_CLR_LIST[2])
axs[1, 1].plot(x, np.abs(x), color=STD_CLR_LIST[3])

# set size
set_dim(fig, fraction_of_line_width=1, ratio=(5 ** 0.5 - 1) / 2)

# label subplots
label_subplots(axs, start_from=0, fontsize=10)
sithom.plot.axis_formatter()

Returns axis formatter for scientific notation.

Returns:

An object to pass in to a matplotlib operation.

Return type:

matplotlib.ticker.ScalarFormatter

Examples

Using with xarray:

>>> import xarray as xr
>>> from sithom.plot import axis_formatter
>>> da = xr.tutorial.open_dataset("air_temperature").air
>>> quadmesh = da.isel(time=0).plot(cbar_kwargs={"format": axis_formatter()})
sithom.plot.cmap(variable_name)

Get cmap from a variable name string.

Ideally colormaps for variables should be consistent throughout the project, and changed in this function. The colormaps are set to be green where there are NaN values, as this has a high contrast with the colormaps used, and should ordinarily represent land, unless something has gone wrong.

Parameters:

variable_name (str) – name of variable to give colormap.

Returns:

sensible colormap

Return type:

matplotlib.colors.LinearSegmentedColormap

Example

Usage example for sea surface temperature:

>>> from sithom.plot import cmap
>>> cmap_t = cmap("sst")
>>> cmap_t = cmap("u")
>>> cmap_t = cmap("ranom")
sithom.plot.feature_grid(ds, fig_var, units, names, vlim, super_titles, figsize=(12, 6), label_size=12, supertitle_pos=(0.4, 1.3), xy=None)

Feature grid plot.

Parameters:
  • ds (xr.Dataset) – Input dataset with single timeslice of data on lon/lat grid.

  • fig_var (List[List[str]]) – Figure variable names.

  • units (List[List[str]]) – Units of variables.

  • names (List[List[str]]) – Names of variables to plot.

  • vlim (List[List[Tuple[float, float, str]]]) – Colorbar limits, and colorbar cmap.

  • super_titles (List[str]) – The titles for each column.

  • figsize (Tuple[float, float], optional) – Defaults to (12, 6).

  • label_size (int, optional) – Defaults to 12.

  • supertitle_pos (Tuple[float, float], optional) – Relative position for titles. Defaults to (0.4, 1.3).

  • xy (Optional[Tuple[Tuple[str, str, str], Tuple[str, str, str]]], optional) – coord name, display name, unit. Defaults to None.

Returns:

The figure and axes.

Return type:

Tuple[matplotlib.figure.Figure, np.ndarray]

sithom.plot.get_dim(width=398.3386, fraction_of_line_width=1, ratio=0.6180339887498949)

Return figure width, height in inches to avoid scaling in latex.

Default width is sithom.constants.REPORT_WIDTH. Default ratio is golden ratio, with figure occupying full page width.

Parameters:
  • width (float, optional) – Textwidth of the report to make fontsizes match. Defaults to sithom.constants.REPORT_WIDTH.

  • fraction_of_line_width (float, optional) – Fraction of the document width which you wish the figure to occupy. Defaults to 1.

  • ratio (float, optional) – Fraction of figure width that the figure height should be. Defaults to (5 ** 0.5 - 1)/2.

Returns:

Dimensions of figure in inches

Return type:

fig_dim (tuple)

Example

Here is an example of using this function:

>>> from sithom.plot import get_dim
>>> dim_tuple = get_dim(fraction_of_line_width=1, ratio=(5 ** 0.5 - 1) / 2)
>>> print("({:.2f},".format(dim_tuple[0]), "{:.2f})".format(dim_tuple[1]))
(5.51, 3.41)
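
The numbers in this example can be reproduced with a few lines of arithmetic. The sketch below assumes that width is given in TeX points (72.27 points per inch) and that the returned tuple is ordered (width, height); both assumptions are inferred from the documented output rather than stated here. The name get_dim_sketch is hypothetical.

```python
def get_dim_sketch(width=398.3386, fraction_of_line_width=1, ratio=(5 ** 0.5 - 1) / 2):
    """Figure (width, height) in inches for a text width given in TeX points."""
    points_per_inch = 72.27  # TeX points per inch (assumed unit of `width`)
    fig_width_in = width * fraction_of_line_width / points_per_inch
    fig_height_in = fig_width_in * ratio
    return fig_width_in, fig_height_in


fig_width, fig_height = get_dim_sketch()
print("({:.2f}, {:.2f})".format(fig_width, fig_height))
```

Halving fraction_of_line_width halves both dimensions, which keeps font sizes consistent when two figures sit side by side.
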
sithom.plot.label_subplots(axs, labels=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'], start_from=0, fontsize=10, x_pos=0.02, y_pos=0.95, override=None)

Adds e.g. (a), (b), (c) at the top left of each subplot panel.

Labelling order achieved through ravelling the input list or np.array.

Parameters:
  • axs (Sequence[matplotlib.axes.Axes]) – list or np.array of matplotlib.axes.Axes.

  • labels (Sequence[str]) – A sequence of labels for the subplots.

  • start_from (int, optional) – skips first start_from labels. Defaults to 0.

  • fontsize (int, optional) – Font size for labels. Defaults to 10.

  • x_pos (float, optional) – Relative x position of labels. Defaults to 0.02.

  • y_pos (float, optional) – Relative y position of labels. Defaults to 0.95.

  • override (Optional[Literal["inside", "outside", "default"]], optional) – Choose a preset x_pos, y_pos option to override choices. “Outside” is good for busy colormaps. Defaults to None.

Return type:

None

Returns:

void; alters the matplotlib.axes.Axes objects

Example

Here is an example of using this function:

>>> import matplotlib.pyplot as plt
>>> from sithom.plot import label_subplots
>>> fig, axs = plt.subplots(2, 2)
>>> label_subplots(axs, start_from=0, fontsize=10)
>>> fig, axs = plt.subplots(2, 2)
>>> label_subplots(axs, start_from=4, fontsize=10)
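
The effect of ravelling and of start_from can be illustrated without matplotlib. The sketch below only computes which label each panel of a grid would receive, assuming row-major order (the default when ravelling a NumPy array); panel_labels_sketch is a hypothetical helper, not part of sithom.

```python
from string import ascii_lowercase


def panel_labels_sketch(n_rows, n_cols, start_from=0, labels=ascii_lowercase):
    """Row-major grid of '(a)'-style labels, skipping the first `start_from` labels."""
    flat = [f"({labels[start_from + i]})" for i in range(n_rows * n_cols)]
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


first_figure = panel_labels_sketch(2, 2, start_from=0)
second_figure = panel_labels_sketch(2, 2, start_from=4)
```

Under this assumption the second call in the example above continues the lettering from (e), so two 2 x 2 figures can share one sequence of panel labels.
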
sithom.plot.lim(npa, percentile=5, balance=False)

Return colorbar limits.

Parameters:
  • npa (np.ndarray) – A numpy ndarray with values in, including nans.

  • percentile (float, optional) – Ignoring nans, use this percentile and its complement as the limits. Defaults to 5 (the 5th and 95th percentiles).

  • balance (bool, optional) – Whether to balance limits around zero.

Returns:

(vmin, vmax)

Return type:

Tuple[float, float]

Example with a Gaussian distribution::
>>> import numpy as np
>>> from sithom.plot import lim
>>> samples = np.random.normal(size=(100, 100, 100, 10))
>>> vmin, vmax = lim(samples)
>>> print("({:.1f},".format(vmin), "{:.1f})".format(vmax))
(-1.6, 1.6)
>>> vmin, vmax = lim(samples, balance=True)
>>> print("({:.1f},".format(vmin), "{:.1f})".format(vmax))
(-1.6, 1.6)
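
For a standard normal distribution the 5th percentile is about -1.645, which is where the (-1.6, 1.6) above comes from. The following is a minimal pure-Python sketch of the idea. It assumes linear interpolation between sorted values, and that balance=True makes the limits symmetric about zero using the larger magnitude; both are assumptions about the behaviour, and the names are hypothetical.

```python
import math


def percentile_sketch(sorted_vals, q):
    """q-th percentile of pre-sorted values, with linear interpolation."""
    rank = q * (len(sorted_vals) - 1) / 100
    lo = math.floor(rank)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (rank - lo) * (sorted_vals[hi] - sorted_vals[lo])


def lim_sketch(values, percentile=5, balance=False):
    """Colorbar limits from the given percentile and its complement, ignoring NaNs."""
    finite = sorted(v for v in values if not math.isnan(v))
    vmin = percentile_sketch(finite, percentile)
    vmax = percentile_sketch(finite, 100 - percentile)
    if balance:
        # Assumed meaning of "balance": symmetric limits around zero.
        vmax = max(abs(vmin), abs(vmax))
        vmin = -vmax
    return vmin, vmax


data = [float(v) for v in range(101)] + [float("nan")]
plain = lim_sketch(data)
balanced = lim_sketch(data, balance=True)
```

Clipping at percentiles rather than at the minimum and maximum keeps a few outliers from washing out the colour scale.
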
sithom.plot.pairplot(df)

Improved seaborn pairplot from:

https://stackoverflow.com/a/50835066

Parameters:

df (pd.DataFrame) – A data frame.

Return type:

None

sithom.plot.plot_defaults(use_tex=None, dpi=None)

Apply plotting style to produce nice looking figures.

Call this at the start of a script which uses matplotlib. Can enable matplotlib LaTeX backend if it is available.

Uses serif font to fit into latex report.

Parameters:
  • use_tex (bool, optional) – Whether or not to use latex matplotlib backend. Defaults to False.

  • dpi (int, optional) – Which dpi to set for the figures. Defaults to 600 dpi (high quality) in terminal or 150 dpi for notebooks. Larger dpi may be needed for presentations.

Return type:

None

Examples

Basic setting for the plotting defaults:

>>> from sithom.plot import plot_defaults
>>> plot_defaults()
sithom.plot.set_dim(fig, width=398.3386, fraction_of_line_width=1, ratio=0.6180339887498949)

Set aesthetic figure dimensions to avoid scaling in latex.

Default width is sithom.constants.REPORT_WIDTH. Default ratio is golden ratio, with figure occupying full page width.

Parameters:
  • fig (matplotlib.figure.Figure) – Figure object to resize.

  • width (float) – Textwidth of the report to make fontsizes match. Defaults to sithom.constants.REPORT_WIDTH.

  • fraction_of_line_width (float, optional) – Fraction of the document width which you wish the figure to occupy. Defaults to 1.

  • ratio (float, optional) – Fraction of figure width that the figure height should be. Defaults to (5 ** 0.5 - 1)/2.

Return type:

None

Returns:

void; alters current figure to have the desired dimensions

Example

Here is an example of using this function:

>>> import matplotlib.pyplot as plt
>>> from sithom.plot import set_dim
>>> fig, ax = plt.subplots(1, 1)
>>> set_dim(fig, fraction_of_line_width=1, ratio=(5 ** 0.5 - 1) / 2)

sithom.time module

Time Utilities Module.

exception sithom.time.TimeoutException

Bases: Exception

The function has timed out, as the time limit has been reached.

sithom.time.hr_time(time_in)

Return human readable time as string.

I got fed up with converting the number in my head. Probably runs very quickly.

Parameters:

time_in (float) – time in seconds

Returns:

string to print.

Return type:

str

Example

120 seconds to human readable string:

>>> from sithom.time import hr_time
>>> hr_time(120)
    '02 min 00 s'
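
A conversion of this kind is a pair of divmod calls. The sketch below reproduces the documented output for 120 seconds; the format used once the time reaches an hour, and the rounding to whole seconds, are hypothetical choices, since this documentation only shows the minutes-and-seconds case.

```python
def hr_time_sketch(time_in):
    """Format a duration in seconds as a short human readable string."""
    minutes, seconds = divmod(int(round(time_in)), 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        # Hypothetical format for durations of an hour or more.
        return f"{hours:02d} hr {minutes:02d} min {seconds:02d} s"
    return f"{minutes:02d} min {seconds:02d} s"


two_minutes = hr_time_sketch(120)
over_an_hour = hr_time_sketch(3725)
```
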
sithom.time.time_limit(seconds)

Time limit manager.

Function taken from:

https://stackoverflow.com/a/601168

Parameters:

seconds (int) – how many seconds to wait until timeout.

Return type:

None

Example

Call a function which will take longer than the time limit:

>>> import time
>>> from sithom.time import time_limit, TimeoutException
>>> def long_function_call():
...     for t in range(0, 5):
...         print("t=", t, "seconds")
...         time.sleep(1.1)
>>> try:
...     with time_limit(3):
...         long_function_call()
...         assert False
... except TimeoutException as e:
...     print("Timed out!")
... except Exception as e:
...     print("A different exception        ", e)
t= 0 seconds
t= 1 seconds
t= 2 seconds
Timed out!

Seems not to work on Windows:

https://github.com/sdat2/sithom/runs/7343392694?check_suite_focus=true
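
A common way to build such a context manager is with the standard library's signal.alarm, sketched below. Whether sithom uses exactly this approach is an assumption, but it would be consistent with the Windows note above, since signal.SIGALRM is only available on Unix. The names time_limit_sketch and SketchTimeout are hypothetical, and the sketch must run in the main thread.

```python
import signal
import time
from contextlib import contextmanager


class SketchTimeout(Exception):
    """Raised when the time limit is reached."""


@contextmanager
def time_limit_sketch(seconds):
    def handler(signum, frame):
        raise SketchTimeout("Timed out!")

    # Unix only: SIGALRM does not exist on Windows.
    previous = signal.signal(signal.SIGALRM, handler)
    signal.alarm(seconds)  # deliver SIGALRM after `seconds` whole seconds
    try:
        yield
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


try:
    with time_limit_sketch(1):
        time.sleep(2)
    timed_out = False
except SketchTimeout:
    timed_out = True
print(timed_out)
```

The finally clause cancels the alarm even when the body finishes early, so a later, unrelated piece of code is not interrupted.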

sithom.time.time_stamp()

Return the current local time.

Returns:

Time string format “%Y-%m-%d %H:%M:%S”.

Return type:

str

sithom.time.timeit(method)

sithom.time.timeit is a wrapper for performance analysis.

It should return the time taken for a function to run. Alters log_time dict if fed in. Add @timeit to the function you want to time. Function needs **kwargs if you want it to be able to feed in log_time dict.

Parameters:

method (Callable) – the function that it takes as an input

Return type:

Callable

Examples

Here is an example with the tracking functionality and without:

from sithom.time import timeit
@timeit
def loop(**kwargs):
    total = 0
    for i in range(int(10e2)):
        for j in range(int(10e2)):
            total += 1
tmp_log_d = {}
loop(log_time=tmp_log_d)
print(tmp_log_d["loop"])
loop()
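
The reason the decorated function needs **kwargs is that log_time is passed straight through to it. A minimal sketch of a decorator with this behaviour follows; it assumes the elapsed time is stored in seconds under the function's name, which matches the tmp_log_d["loop"] lookup above but is otherwise an assumption. The name timeit_sketch is hypothetical.

```python
import functools
import time


def timeit_sketch(method):
    """Time a call; store the result in kwargs['log_time'] if given, else print it."""

    @functools.wraps(method)
    def timed(*args, **kwargs):
        start = time.perf_counter()
        result = method(*args, **kwargs)  # log_time is forwarded to the method too
        elapsed = time.perf_counter() - start
        if "log_time" in kwargs:
            kwargs["log_time"][method.__name__] = elapsed  # seconds (assumed unit)
        else:
            print(f"{method.__name__!r} took {elapsed:.6f} s")
        return result

    return timed


@timeit_sketch
def loop(**kwargs):
    total = 0
    for _ in range(1000):
        total += 1
    return total


tmp_log_d = {}
result = loop(log_time=tmp_log_d)  # records the time instead of printing it
```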

sithom.unc module

Uncertainties Utilities Module.

sithom.unc.tex_uf(ufloat_input, bracket=False, force_latex=False, exponential=True)

A function to take an uncertainties.ufloat, and return a tex containing string for plotting, which has the right number of decimal places.

Parameters:
  • ufloat_input (ufloat) – The uncertainties ufloat object.

  • bracket (bool, optional) – Whether or not to add latex brackets around the parameter. Defaults to False.

  • force_latex (bool, optional) – Whether to force latex output. Defaults to False. If false will check matplotlib.rcParams first.

  • exponential (bool, optional) – Whether to put in scientific notation. Defaults to True.

Returns:

String ready to be added to a graph label.

Return type:

str

Example usage::
>>> from uncertainties import ufloat
>>> from sithom.unc import tex_uf
>>> uf = ufloat(1, 0.5)
>>> tex_uf(uf, bracket=True, force_latex=True)
    '$\\left( 1.0 \\pm 0.5 \\right)$'
>>> uf = ufloat(10, 5)
>>> tex_uf(uf, bracket=True, force_latex=True)
    '$\\left( \\left(1.0 \\pm 0.5\\right) \\times 10^{1} \\right)$'
>>> tex_uf(ufloat(0.0, 0.0), bracket=True, force_latex=True) # works weirdly
    '$\\left( 0.0 \\pm 0 \\right)$'
>>> tex_uf(ufloat(2, 0.06), bracket=True, force_latex=True)
    '$\\left( 2.0 \\pm 0.1 \\right)$'

(Had to add twice as many backslashes for pytest to run.)

TODO: Fix consistency to get bullet points right.

sithom.xr module

Xarray utilities module.

sithom.xr.mon_increase(xr_obj, x_dim='longitude', y_dim='latitude')

Make sure that an xarray axes has monotonically increasing values.

Parameters:
  • xr_obj (Union[xr.Dataset, xr.DataArray]) – The xarray object to check.

  • x_dim (str, optional) – x dimension name. Defaults to “longitude”.

  • y_dim (str, optional) – y dimension name. Defaults to “latitude”.

Returns:

xarray object.

Return type:

Union[xr.Dataset, xr.DataArray]

Examples::
>>> import xarray as xr
>>> from sithom.xr import mon_increase
>>> da = xr.tutorial.open_dataset("air_temperature").air
>>> improved_da = mon_increase(da, x_dim="lon", y_dim="lat")
sithom.xr.plot_units(xr_obj, x_dim='longitude', y_dim='latitude')

Adding good latex units to make the xarray object plottable.

Xarray uses “long_name” and “units” attributes for plotting.

Fails softly.

Parameters:
  • xr_obj (Union[xr.DataArray, xr.Dataset]) – Initial dataarray/dataset (potentially with units for axes).

  • x_dim (str) – Defaults to “longitude”.

  • y_dim (str) – Defaults to “latitude”.

Returns:

Dataarray/Dataset with correct units/names for plotting, assuming that you’ve given the correct x_dim and y_dim for the object.

Return type:

Union[xr.DataArray, xr.Dataset]

Examples of using it::
>>> import xarray as xr
>>> from sithom.xr import plot_units
>>> da = plot_units(xr.tutorial.open_dataset("air_temperature").air)
>>> da.attrs["units"]
'K'
sithom.xr.spatial_mean(dataarray, x_dim='longitude', y_dim='latitude')

Average a dataarray over “longitude” and “latitude” coordinates.

Spatially weighted.

Originally from: https://ncar.github.io/PySpark4Climate/tutorials/Oceanic-Ni%C3%B1o-Index/ (although their version is wrong as it assumes numpy input is degrees)

https://numpy.org/doc/stable/reference/generated/numpy.cos.html https://numpy.org/doc/stable/reference/generated/numpy.radians.html

The average should behave like:

\begin{equation} \bar{T}_{\text{Lat}} = \frac{1}{\text{nLon}} \sum_{i=1}^{\text{nLon}} T_{\text{Lon}, \; i} \end{equation}

\begin{equation} \bar{T}_{\text{month}} = \frac{\sum_{j=1}^{\text{nLat}} \cos \left( \text{Lat}_{j} \right) \bar{T}_{\text{Lat}, \; j}}{\sum_{j=1}^{\text{nLat}} \cos \left( \text{Lat}_{j} \right)} \end{equation}

Parameters:
  • dataarray (xr.DataArray) – The dataarray to average.

  • x_dim (str) – The longitude dimension name. Defaults to “longitude”.

  • y_dim (str) – The latitude dimension name. Defaults to “latitude”.

Returns:

Average of the dataarray.

Return type:

xr.DataArray

Example of calculating and plotting mean timeseries of dataarray:

>>> import xarray as xr
>>> from sithom.xr import spatial_mean
>>> da = xr.tutorial.open_dataset("air_temperature").air
>>> timeseries_mean = spatial_mean(da, x_dim="lon", y_dim="lat")

timeseries_mean.plot.line()
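
The two equations above amount to a plain mean along longitude followed by a cosine-weighted mean along latitude. A minimal pure-Python sketch for a single time slice follows. It converts latitudes from degrees to radians before taking the cosine, which is the step the function description singles out. The function name and the nested-list layout are illustrative assumptions, not the sithom implementation.

```python
import math


def spatial_mean_sketch(field, lats_deg):
    """Cosine-latitude weighted mean of field[j][i] (latitude j, longitude i)."""
    # Step 1: unweighted mean along longitude for each latitude row.
    lat_means = [sum(row) / len(row) for row in field]
    # Step 2: weight each row by cos(latitude), with latitude in radians.
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    return sum(w * m for w, m in zip(weights, lat_means)) / sum(weights)


# Two latitude rows: the equator (weight 1) and 60 degrees North (weight 0.5).
field = [[1.0, 1.0], [4.0, 4.0]]
weighted = spatial_mean_sketch(field, [0.0, 60.0])
print(weighted)
```

The unweighted mean of this field would be 2.5; the weighting pulls the result towards the equatorial row, giving (1 * 1 + 0.5 * 4) / 1.5, which is approximately 2.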

Module contents

Sithom module full of utilities.