- builtins.object
  - Data
- collections.UserDict(collections.abc.MutableMapping)
  - LazyDataset
class Data(builtins.object)
Data(values: Union[numpy.ndarray, pandas.core.frame.DataFrame], variables: List = None, variable_types: Dict = None)
Methods defined here:
- __copy__(self)
- Create a copy of the Data object.
Returns:
Data: A copy of the Data object.
- __getitem__(self, *args)
- Retrieve data for the specified key(s).
Parameters:
*args: Key(s) to retrieve data for.
Returns:
Any: Data corresponding to the specified key(s).
- __init__(self, values: Union[numpy.ndarray, pandas.core.frame.DataFrame], variables: List = None, variable_types: Dict = None)
- Initialize the Data object with dataset values, variable names, and types.
Parameters:
values (Union[np.ndarray, pd.DataFrame]): Dataset values.
variables (List, optional): List of variable names. Required if values is a numpy array.
variable_types (Dict, optional): Dictionary mapping variable names to their types.
Raises:
Exception: If variable names are not supplied when values is a numpy array.
- __len__(self)
- Get the number of rows in the dataset.
Returns:
int: Number of rows in the dataset.
- k_fold(self, k=5, shuffle=True, seed=None)
- Perform k-fold splitting of the dataset.
Parameters:
k (int): Number of folds. Default is 5.
shuffle (bool): Whether to shuffle the data before splitting. Default is True.
seed (int, optional): Random seed for reproducibility.
Yields:
Tuple[Data, Data]: Training and testing datasets for each fold.
- min_max_scale(self, variables: List = None)
- Scale the specified variables in the dataset to a range of [0, 1].
Parameters:
variables (List, optional): List of variable names to scale. Defaults to all continuous variables.
Returns:
Data: A new Data object with scaled variables.
- normalise(self, variables: List = None)
- Normalize the specified variables in the dataset.
Parameters:
variables (List, optional): List of variable names to normalize. Defaults to all continuous variables.
Returns:
Data: A new Data object with normalized variables.
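The k_fold entry above gives the parameters but not the splitting rule. As a rough illustration only, the following sketch shows one common way to produce k train/test partitions from row indices. It is not the library's implementation: the helper name k_fold_indices is hypothetical, it yields index lists rather than Data objects, and the choice to spread leftover rows over the first folds is an assumption.

```python
import random
from typing import Iterator, List, Optional, Tuple


def k_fold_indices(
    n_rows: int, k: int = 5, shuffle: bool = True, seed: Optional[int] = None
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k folds (hypothetical helper)."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    indices = list(range(n_rows))
    if shuffle:
        # A local Random instance leaves the global random state untouched.
        random.Random(seed).shuffle(indices)
    # Assumption: leftover rows go to the first folds, so fold sizes differ by at most one.
    base, extra = divmod(n_rows, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop


for train, test in k_fold_indices(10, k=3, shuffle=False):
    print(len(train), len(test))
```

Each row lands in exactly one test fold, and passing the same seed reproduces the same partition.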
Readonly properties defined here:
- columns
- Get the list of variable names.
Returns:
List: List of variable names.
- shape
- Get the shape of the dataset.
Returns:
Tuple[int, int]: Shape of the dataset (rows, columns).
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
Data and other attributes defined here:
- BINARY_TYPE = 'binary'
- CATEGORICAL_TYPE = 'categorical'
- CONTINUOUS_TYPE = 'continuous'
- ORDINAL_TYPE = 'ordinal'
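The type constants above are what min_max_scale and normalise presumably consult when they default to "all continuous variables". The sketch below illustrates that idea on a plain dict of columns. It is an assumption-laden stand-in, not the class's code: min_max_scale_columns is a hypothetical name, columns are Python lists rather than a DataFrame, and mapping a constant column to 0.0 is an arbitrary choice.

```python
from typing import Dict, List, Optional

CONTINUOUS_TYPE = "continuous"  # mirrors the value of Data.CONTINUOUS_TYPE


def min_max_scale_columns(
    table: Dict[str, List[float]],
    variable_types: Dict[str, str],
    variables: Optional[List[str]] = None,
) -> Dict[str, List[float]]:
    """Return a new table with the chosen columns rescaled to [0, 1] (hypothetical helper)."""
    if variables is None:
        # Default: every column whose declared type is continuous.
        variables = [name for name in table if variable_types.get(name) == CONTINUOUS_TYPE]
    # Copy every column so the input table is not mutated.
    scaled = {name: list(column) for name, column in table.items()}
    for name in variables:
        column = table[name]
        low, high = min(column), max(column)
        span = high - low
        # Assumption: a constant column has no range, so it is mapped to 0.0.
        scaled[name] = [(x - low) / span if span else 0.0 for x in column]
    return scaled


table = {"age": [20.0, 30.0, 40.0], "smoker": [0, 1, 0]}
types = {"age": "continuous", "smoker": "binary"}
print(min_max_scale_columns(table, types))
# {'age': [0.0, 0.5, 1.0], 'smoker': [0, 1, 0]}
```

Only the continuous column is rescaled; the binary column passes through unchanged, and the original table is left intact.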
class LazyDataset(collections.UserDict)
LazyDataset(dict=None, /, **kwargs)
- Method resolution order:
- LazyDataset
- collections.UserDict
- collections.abc.MutableMapping
- collections.abc.Mapping
- collections.abc.Collection
- collections.abc.Sized
- collections.abc.Iterable
- collections.abc.Container
- builtins.object
Methods defined here:
- __getitem__(self, key: Any) -> Any
- Retrieve the value associated with the given key. If the value is a file path, it will be loaded as a pandas DataFrame.
Parameters:
key (Any): Key to retrieve the value for.
Returns:
Any: The value associated with the key, or a pandas DataFrame if the value is a file path.
- __setitem__(self, key: Any, item: Any) -> None
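The load-on-access behaviour described for __getitem__ can be sketched with a small UserDict subclass. This is a hypothetical stand-in, not LazyDataset itself: the class name LazyTable is invented, the stdlib csv module is swapped in for pandas to keep the example dependency-free, and caching the loaded value after first access is an assumption the documentation does not confirm.

```python
import csv
import os
import tempfile
from collections import UserDict
from typing import Any


class LazyTable(UserDict):
    """Hypothetical stand-in: values that are existing file paths load on access."""

    def __getitem__(self, key: Any) -> Any:
        value = self.data[key]
        if isinstance(value, str) and os.path.isfile(value):
            # csv.DictReader stands in for a pandas CSV load in this sketch.
            with open(value, newline="") as handle:
                value = list(csv.DictReader(handle))
            self.data[key] = value  # assumption: cache so the file is read only once
        return value


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "train.csv")
    with open(path, "w", newline="") as handle:
        handle.write("x,y\n1,2\n3,4\n")

    datasets = LazyTable({"train": path, "note": "not a file"})
    print(datasets["train"])  # rows parsed from the file
    print(datasets["note"])   # ordinary values are returned unchanged
```

Storing paths and deferring the read keeps construction cheap when a mapping names many datasets but only a few are ever used.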
Data and other attributes defined here:
- __abstractmethods__ = frozenset()
Methods inherited from collections.UserDict:
- __contains__(self, key)
- # Modify __contains__ and get() to work like dict
# does when __missing__ is present.
- __copy__(self)
- __delitem__(self, key)
- __init__(self, dict=None, /, **kwargs)
- Initialize self. See help(type(self)) for accurate signature.
- __ior__(self, other)
- __iter__(self)
- __len__(self)
- __or__(self, other)
- Return self|value.
- __repr__(self)
- Return repr(self).
- __ror__(self, other)
- Return value|self.
- copy(self)
- get(self, key, default=None)
- D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.
Class methods inherited from collections.UserDict:
- fromkeys(iterable, value=None)
Data descriptors inherited from collections.UserDict:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
Methods inherited from collections.abc.MutableMapping:
- clear(self)
- D.clear() -> None. Remove all items from D.
- pop(self, key, default=<object object at 0x...>)
- D.pop(k[,d]) -> v, remove specified key and return the corresponding value.
If key is not found, d is returned if given, otherwise KeyError is raised.
- popitem(self)
- D.popitem() -> (k, v), remove and return some (key, value) pair
as a 2-tuple; but raise KeyError if D is empty.
- setdefault(self, key, default=None)
- D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D
- update(self, other=(), /, **kwds)
- D.update([E, ]**F) -> None. Update D from mapping/iterable E and F.
If E present and has a .keys() method, does: for k in E: D[k] = E[k]
If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v
In either case, this is followed by: for k, v in F.items(): D[k] = v
Methods inherited from collections.abc.Mapping:
- __eq__(self, other)
- Return self==value.
- items(self)
- D.items() -> a set-like object providing a view on D's items
- keys(self)
- D.keys() -> a set-like object providing a view on D's keys
- values(self)
- D.values() -> an object providing a view on D's values
Data and other attributes inherited from collections.abc.Mapping:
- __hash__ = None
- __reversed__ = None
Class methods inherited from collections.abc.Collection:
- __subclasshook__(C)
- Abstract classes can override this to customize issubclass().
This is invoked early on by abc.ABCMeta.__subclasscheck__().
It should return True, False or NotImplemented. If it returns
NotImplemented, the normal algorithm is used. Otherwise, it
overrides the normal algorithm (and the outcome is cached).
Class methods inherited from collections.abc.Iterable:
- __class_getitem__ = GenericAlias(...)
- Represent a PEP 585 generic type
E.g. for t = list[int], t.__origin__ is list and t.__args__ is (int,).