plots

Time Series Plotting and Date Handling Module.

This module provides functions for handling date-related operations on DataFrames and for visualizing time series data, including historical, forecast, and actual values. It supports both Matplotlib and Plotly as plotting engines, offering flexibility in visualization options.

Functions

  • convert_to_datetime: convert a column in a DataFrame to datetime format.

  • filter_by_date: filter a DataFrame by a start date.

  • plot_historical_and_forecast: plot historical data with optional forecast and actual values.

Notes

This module is designed to assist in the preparation and visualization of time series data. The plot_historical_and_forecast function is particularly useful for comparing historical data with forecasted and actual values, with options to highlight peaks and add custom plot elements using either Matplotlib or Plotly.

iowa_forecast.plots.convert_to_datetime(dataframe: DataFrame, col: str) → DataFrame

Convert a specified column in a DataFrame to datetime format.

This function takes a DataFrame and converts the specified column to pandas’ datetime format, enabling datetime operations on that column.

Parameters:
  • dataframe (pd.DataFrame) – The DataFrame containing the column to convert.

  • col (str) – The name of the column in the DataFrame to convert to datetime format.

Returns:

pd.DataFrame – The original DataFrame with the specified column converted to datetime format.

Return type:

DataFrame

Notes

You can also chain this function using pandas.DataFrame.pipe:

df = pd.DataFrame({
    'date': ['2023-01-01', '2023-01-02'],
    'value': [10, 15]
}).pipe(convert_to_datetime, 'date')
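
As a rough illustration of the conversion described above, the sketch below uses pandas.to_datetime on a copy of the input. The helper name convert_to_datetime_sketch is hypothetical, and the library's actual implementation may differ (for example, it may modify the DataFrame in place rather than copy it).

```python
import pandas as pd


def convert_to_datetime_sketch(dataframe: pd.DataFrame, col: str) -> pd.DataFrame:
    """Hypothetical stand-in: convert `col` to datetime and return the frame."""
    # Assumption: work on a copy so the caller's DataFrame is left untouched.
    result = dataframe.copy()
    result[col] = pd.to_datetime(result[col])
    return result


df = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-02"],
    "value": [10, 15],
}).pipe(convert_to_datetime_sketch, "date")

# Once converted, datetime accessors such as .dt.year work on the column.
years = df["date"].dt.year.tolist()
```

Returning the DataFrame is what makes the function usable with pandas.DataFrame.pipe.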

Examples

Convert the ‘date’ column in a DataFrame to datetime format:

>>> import pandas as pd
>>> df = pd.DataFrame({
...     'date': ['2023-01-01', '2023-01-02'],
...     'value': [10, 15]
... })
>>> df = convert_to_datetime(df, 'date')
>>> df['date'].dtype
dtype('<M8[ns]')

iowa_forecast.plots.filter_by_date(dataframe: DataFrame, col: str, start_date: str)

Filter a DataFrame by a start date.

Parameters:
  • dataframe (pd.DataFrame) – The DataFrame to filter.

  • col (str) – The name of the datetime column to filter by.

  • start_date (str) – The start date to filter the DataFrame. If None, no filtering is done.

Returns:

pd.DataFrame – The filtered DataFrame, or the original if no filtering is applied.
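
The documented behavior can be approximated as follows. This is a minimal sketch, not the library's code: the helper name filter_by_date_sketch is hypothetical, and treating the start date as inclusive is an assumption, since the documentation does not say whether rows dated exactly on start_date are kept.

```python
import pandas as pd


def filter_by_date_sketch(dataframe, col, start_date=None):
    """Hypothetical stand-in for the documented filtering behavior."""
    # With no start date, return the original DataFrame unchanged.
    if start_date is None:
        return dataframe
    # Assumption: the start date itself is kept (inclusive comparison).
    return dataframe[dataframe[col] >= pd.to_datetime(start_date)]


df = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"]),
    "value": [10, 15, 12],
})

recent = filter_by_date_sketch(df, "date", "2023-01-02")
unfiltered = filter_by_date_sketch(df, "date", None)
```

The column should already hold datetime values, for example after a call to convert_to_datetime, so that the comparison is chronological rather than lexical.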

iowa_forecast.plots.plot_series(x_data, y_data, label: str, linestyle: str = '-', **kwargs) → None

Plot a series of data with optional markers.

This function plots a series of data using Matplotlib, with options to customize the line style, add markers, and change the marker color.

Parameters:
  • x_data (array-like) – The data for the x-axis.

  • y_data (array-like) – The data for the y-axis.

  • label (str) – The label for the plot legend.

  • linestyle (str, default "-") – The line style for the plot, e.g., ‘-’ for a solid line, ‘--’ for a dashed line.

  • **kwargs (dict, optional) –

    Additional keyword arguments for customizing the plot. Available options:

    • marker: str

      The marker style for scatter points.

    • color: str

      The color of the markers.

Returns:

None

Return type:

None
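
To make the role of the keyword arguments concrete, here is one way such a helper could be written with Matplotlib. The name plot_series_sketch is hypothetical, and drawing the markers with a separate scatter call is an assumption based on the parameter descriptions above, not the library's confirmed implementation.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
import matplotlib.pyplot as plt


def plot_series_sketch(x_data, y_data, label, linestyle="-", **kwargs):
    """Hypothetical stand-in: draw a line and, optionally, scatter markers."""
    marker = kwargs.get("marker")
    color = kwargs.get("color")
    # The line itself carries the legend label and the line style.
    plt.plot(x_data, y_data, label=label, linestyle=linestyle)
    # Assumption: markers are drawn only when a marker style is given.
    if marker is not None:
        plt.scatter(x_data, y_data, marker=marker, color=color)


fig, ax = plt.subplots()
x = [1, 2, 3, 4]
y = [10, 15, 10, 20]
plot_series_sketch(x, y, label="Sample Data", linestyle="--", marker="o", color="red")
ax.legend()
```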

Examples

Plot a series of data with default settings:

>>> x = [1, 2, 3, 4]
>>> y = [10, 15, 10, 20]
>>> plot_series(x, y, label="Sample Data")

Plot a series with markers:

>>> plot_series(x, y, label="Sample Data", marker="o", color="red")

iowa_forecast.plots.plot_historical_and_forecast(input_timeseries: pd.DataFrame, timestamp_col_name: str, data_col_name: str, forecast_output: pd.DataFrame | None = None, forecast_col_names: dict | None = None, actual: pd.DataFrame | None = None, actual_col_names: dict | None = None, title: str | None = None, plot_start_date: str | None = None, show_peaks: bool = True, engine: str = 'matplotlib', **plot_kwargs) → None

Plot historical data with optional forecast and actual values.

This function visualizes time series data with options for forecasting, actual values, and peak highlighting. It supports both Matplotlib and Plotly as plotting engines.

Parameters:
  • input_timeseries (pd.DataFrame) – The DataFrame containing historical time series data.

  • timestamp_col_name (str) – The name of the column containing timestamps.

  • data_col_name (str) – The name of the column containing the data values.

  • forecast_output (pd.DataFrame, optional) – The DataFrame containing forecast data. Provide it to plot the forecasted values alongside the historical or actual data, with each series drawn in a different color.

  • forecast_col_names (dict, optional) – Dictionary mapping forecast DataFrame columns, by default None. Keys: ‘timestamp’, ‘value’, ‘confidence_level’, ‘lower_bound’, ‘upper_bound’.

  • actual (pd.DataFrame, optional) – The pandas.DataFrame containing actual data values. Specify this parameter if you want to compare forecasted values with their actual values.

  • actual_col_names (dict, optional) – Dictionary mapping actual DataFrame columns, by default None. Keys: ‘timestamp’, ‘value’.

  • title (str, optional) – The title of the plot. If no value is provided, no title is added to the plot.

  • plot_start_date (str, optional) – The start date for plotting data. If no value is provided, the plot uses all available dates in the data.

  • show_peaks (bool, default True) – Whether to highlight peaks in the data.

  • engine (str {'matplotlib', 'plotly'}, default 'matplotlib') – The plotting engine to use, either ‘matplotlib’ or ‘plotly’. See the ‘Notes’ section for additional details.

  • **plot_kwargs – Additional keyword arguments for customization.

Raises:

ValueError – if the specified engine is neither ‘matplotlib’ nor ‘plotly’.

Return type:

None

Notes

The engine parameter selects the library used to generate the plot. Setting engine='plotly' generates prettier plots, but it requires the plotly library to be installed. plotly is included in the project's requirements.txt file by default.
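
The engine check and the optional plotly dependency can be sketched as below. The function resolve_engine is hypothetical and not part of the module. Falling back to Matplotlib when plotly is missing is an illustrative choice made here, not documented library behavior.

```python
import importlib.util

SUPPORTED_ENGINES = ("matplotlib", "plotly")


def resolve_engine(engine: str = "matplotlib") -> str:
    """Hypothetical helper: validate the engine name before plotting."""
    # Mirrors the documented ValueError for unsupported engine names.
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(
            f"engine must be one of {SUPPORTED_ENGINES}, got {engine!r}"
        )
    # Assumption: fall back to Matplotlib when plotly is not installed.
    if engine == "plotly" and importlib.util.find_spec("plotly") is None:
        return "matplotlib"
    return engine


chosen = resolve_engine("plotly")
```

The returned name could then be passed as the engine argument of plot_historical_and_forecast.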

Examples

>>> import pandas as pd
>>> df = pd.DataFrame({
...     "date": ["2023-01-01", "2023-01-02"],
...     "value": [10, 15]
... })
>>> plot_historical_and_forecast(df, "date", "value", title="Sample Plot", engine="matplotlib")
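
A fuller call that includes forecast and actual values might be prepared as in the following sketch. All column names in the sample DataFrames (forecast_date, predicted, observed, and so on) are made up for illustration. The sketch also assumes that the dictionaries map the documented keys to the column names found in each DataFrame, a direction the parameter descriptions do not spell out. The final call is left commented out so the block runs without iowa_forecast installed.

```python
import pandas as pd

history = pd.DataFrame({
    "date": pd.date_range("2023-01-01", periods=4, freq="D"),
    "value": [10, 15, 12, 18],
})

# Hypothetical forecast table; the column names are illustrative only.
forecast = pd.DataFrame({
    "forecast_date": pd.date_range("2023-01-05", periods=2, freq="D"),
    "predicted": [17.0, 19.0],
    "confidence": [0.9, 0.9],
    "lower": [14.0, 15.5],
    "upper": [20.0, 22.5],
})

# Hypothetical table of observed values for the forecast period.
actual = pd.DataFrame({
    "date": pd.date_range("2023-01-05", periods=2, freq="D"),
    "observed": [16, 21],
})

# Assumption: documented key -> column name in the corresponding DataFrame.
forecast_col_names = {
    "timestamp": "forecast_date",
    "value": "predicted",
    "confidence_level": "confidence",
    "lower_bound": "lower",
    "upper_bound": "upper",
}
actual_col_names = {"timestamp": "date", "value": "observed"}

# Check that every mapped column exists before plotting.
missing = [c for c in forecast_col_names.values() if c not in forecast.columns]
missing += [c for c in actual_col_names.values() if c not in actual.columns]

# With iowa_forecast installed, the call would then look like:
# plot_historical_and_forecast(
#     history, "date", "value",
#     forecast_output=forecast, forecast_col_names=forecast_col_names,
#     actual=actual, actual_col_names=actual_col_names,
#     title="Sample Plot", plot_start_date="2023-01-02", engine="matplotlib",
# )
```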