AGS to Pandas converter reference

class groundhog.general.agsconversion.AGSConverter(path, encoding='utf8', errors='replace', removedoublequotes=True, removeheadinglinebreaks=True, agsformat='4', **kwargs)
__init__(path, encoding='utf8', errors='replace', removedoublequotes=True, removeheadinglinebreaks=True, agsformat='4', **kwargs)

Initializes an AGS conversion object using the path to the AGS file. The AGS file needs to be properly formatted, with at least one blank line between each group. Each group should have four lines before the data starts:

  • A line with the group name;

  • A line with column headers;

  • A line with the units of the values in the columns;

  • A line with the data type of the columns
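
The layout described above can be pictured with a small, made-up AGS4-style sample and a standard-library sketch that splits the text into groups on blank lines. The sample values and the helper name split_groups are hypothetical; this only illustrates the expected file structure and is not how AGSConverter itself is implemented.

```python
import csv
import io
import re

# Made-up sample with two groups separated by a blank line (illustrative only)
SAMPLE_AGS = '''"GROUP","LOCA"
"HEADING","LOCA_ID","LOCA_NATE","LOCA_NATN"
"UNIT","","m","m"
"TYPE","ID","2DP","2DP"
"DATA","BH01","523145.00","178456.12"

"GROUP","GEOL"
"HEADING","LOCA_ID","GEOL_TOP","GEOL_BASE"
"UNIT","","m","m"
"TYPE","ID","2DP","2DP"
"DATA","BH01","0.00","1.50"
"DATA","BH01","1.50","4.20"
'''


def split_groups(text):
    """Hypothetical helper: split AGS text into groups on blank lines."""
    groups = {}
    for block in re.split(r"\n\s*\n", text.strip()):
        rows = list(csv.reader(io.StringIO(block)))
        name = rows[0][1]  # line 1: group name
        groups[name] = {
            "headers": rows[1][1:],  # line 2: column headers
            "units": rows[2][1:],    # line 3: units of the columns
            "types": rows[3][1:],    # line 4: data types of the columns
            "data": [row[1:] for row in rows[4:]],  # remaining lines: data
        }
    return groups


groups = split_groups(SAMPLE_AGS)
print(sorted(groups))
print(groups["GEOL"]["headers"], len(groups["GEOL"]["data"]))
```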

The functionality is developed for AGS4.x files, but AGS3.1 files are also supported by passing agsformat="3.1" as an optional keyword argument.

Parameters:
  • path – Path to the AGS file

  • encoding – Encoding of the file (default='utf8')

  • errors – Specifies the file reading behaviour in case of encoding errors (default='replace')

  • removedoublequotes – Boolean determining whether double quotes need to be removed after file loading (default=True)

  • removeheadinglinebreaks – Boolean determining whether line breaks in heading rows need to be removed after file loading (default=True)

  • agsformat – Format of the AGS file (default=``"4"``). AGS 3.1 (``"3.1"``) is also available

convert_ags_group(groupname, verbose_keys=False, additional_keys={}, use_shorthands=False, drop_heading_col=True, **kwargs)

Isolate the data for a certain group and convert it to a Pandas dataframe.

Parameters:
  • groupname – Name of the group to be converted

  • verbose_keys – Boolean determining whether AGS code keys or their verbose equivalents are used. Conversion happens using the dictionaries in tables.py (default=False for AGS code keys)

  • additional_keys – Additional custom keys used in dataframe column name conversion

  • use_shorthands – Boolean determining whether shorthand codes should be used. If True, a first pass is done using these.

  • drop_heading_col – Boolean determining whether the column HEADING [UNIT] should be dropped (default=True)

Returns:

Dataframe with the requested data

static convert_ags_headers(df, agsformat)

Converts the headers of an AGS-based dataframe from the three rows in the AGS file to a single column header. Numerical data is also converted into the correct datatype.

Parameters:
  • df – Dataframe with the group data

Returns:

Dataframe with updated headers
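
As a rough illustration of this kind of header conversion, the sketch below merges the heading and unit rows into single column labels and casts numeric columns based on the type row. It works on plain lists rather than a dataframe. The NAME [unit] label format, the rule that types ending in DP are numeric, and the helper name collapse_headers are all assumptions made for illustration, not confirmed behaviour of the library.

```python
def collapse_headers(headers, units, types, rows):
    """Hypothetical helper: build single-row labels and cast numeric columns."""
    # Assumed label format: "NAME [unit]", or just "NAME" when the unit is empty
    labels = [f"{h} [{u}]" if u else h for h, u in zip(headers, units)]
    # Assumption: AGS4 types such as "2DP" (two decimal places) are numeric
    numeric = [t.endswith("DP") for t in types]
    records = []
    for row in rows:
        record = {}
        for label, is_num, value in zip(labels, numeric, row):
            if value == "":
                record[label] = None  # keep empty cells as missing values
            elif is_num:
                record[label] = float(value)
            else:
                record[label] = value
        records.append(record)
    return records


records = collapse_headers(
    ["LOCA_ID", "GEOL_TOP", "GEOL_BASE"],
    ["", "m", "m"],
    ["ID", "2DP", "2DP"],
    [["BH01", "0.00", "1.50"], ["BH01", "1.50", ""]],
)
print(records[0])
```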

create_dataframes(selectedgroups=None, verbose_keys=False, use_shorthands=False, drop_heading_col=True, **kwargs)

Create a dictionary with Pandas dataframes for each group name. The set of groups to convert can be restricted through the selectedgroups argument.

Parameters:
  • selectedgroups – List of group names to limit the conversion to (default=None, in which case all groups are converted)

  • verbose_keys – Boolean determining whether AGS code keys or their verbose equivalents are used. Conversion happens using the dictionaries in tables.py (default=False for AGS code keys)

  • use_shorthands – Boolean determining whether shorthand codes should be used. If True, a first pass is done using these.

  • drop_heading_col – Boolean determining whether the column HEADING [UNIT] should be dropped (default=True)

Returns:

Sets the data attribute for the AGSConverter object
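
A typical call sequence, based only on the signatures documented here, might look like the following sketch. The wrapper function load_ags_groups and the file name in the usage note are hypothetical, and it is an assumption that the data attribute is a dictionary keyed by group name.

```python
def load_ags_groups(path, groups=None, agsformat="4"):
    """Hypothetical wrapper around the documented AGSConverter calls."""
    # Imported lazily so the sketch can be defined without groundhog installed
    from groundhog.general.agsconversion import AGSConverter

    converter = AGSConverter(path=path, encoding="utf8", agsformat=agsformat)
    # selectedgroups=None converts every group found in the file
    converter.create_dataframes(selectedgroups=groups, verbose_keys=False)
    # Assumption: converter.data maps group names to Pandas dataframes
    return converter.data
```

For example, load_ags_groups("site_investigation.ags", groups=["LOCA", "GEOL"]) would limit the conversion to those two groups.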

extract_groupnames()

Scans the AGS file and extracts all group names.

Returns:

Sets the attribute groupnames of the AGSConverter object
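
The scan for group names can be pictured with the following standard-library sketch for AGS4-style text, in which each group starts with a "GROUP" line. The regular expression and the function name find_group_names are assumptions for illustration; the documented method stores its result on the groupnames attribute rather than returning it.

```python
import re


def find_group_names(text):
    """Hypothetical helper: list group names from lines like "GROUP","LOCA"."""
    # re.MULTILINE lets ^ match at the start of every line, not just the text
    return re.findall(r'^"GROUP","([^"]+)"', text, flags=re.MULTILINE)


sample = (
    '"GROUP","PROJ"\n"HEADING","PROJ_ID"\n\n'
    '"GROUP","LOCA"\n"HEADING","LOCA_ID"\n'
)
names = find_group_names(sample)
print(names)
```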

remove_doublequotes(replace_by='')

Removes double quotes (``""``) which are not preceded by a comma and replaces each of them by the value defined in replace_by followed by a single quote. This is done because the read_csv function of Pandas is used at a later stage, and double quotes can cause errors when reading. Such expressions are common when coordinates in ° ' " format are included in the AGS file.

Parameters:

replace_by – String to replace the first quote of the double quote with
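
One plausible reading of this rule is a regular expression with a negative lookbehind, sketched below. The function name strip_double_quotes and the sample line are hypothetical, and the pattern is an assumption rather than the library's confirmed implementation.

```python
import re


def strip_double_quotes(text, replace_by=""):
    """Hypothetical helper: replace "" (not after a comma) by replace_by + one quote."""
    # (?<!,) is a negative lookbehind: a "" that directly follows a comma is
    # left alone, since that pattern marks an empty field
    return re.sub(r'(?<!,)""', replace_by + '"', text)


# The seconds mark of the coordinate sits right before the closing quote
line = '"BH01","51° 30\' 26"","","x"'
cleaned = strip_double_quotes(line)
print(cleaned)
```

In this sketch the empty field after the coordinate is preserved because its quotes follow a comma, while the seconds mark is dropped and the closing quote of the coordinate field is kept.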

Returns:

remove_heading_linebreaks()

Removes line breaks in header rows which would prevent further AGS parsing. If a comma is followed by a line break (``,\n``), it is replaced by a comma without a line break.
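
The replacement described above can be sketched with a single regular expression. The function name join_broken_heading_lines is hypothetical, and this sketch applies the rule to the whole text it is given, which may be broader than what the documented method does for header rows.

```python
import re


def join_broken_heading_lines(text):
    """Hypothetical helper: drop a line break that directly follows a comma."""
    # \r?\n covers both Unix and Windows line endings
    return re.sub(r",\r?\n", ",", text)


# Made-up heading row that was wrapped after a trailing comma
broken = '"HEADING","LOCA_ID",\n"LOCA_NATE","LOCA_NATN"\n"UNIT","","m","m"\n'
fixed = join_broken_heading_lines(broken)
print(fixed)
```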