class PC (gpucsl/pc/pc.py)

Abstract base class. Inherit from it to add support for a new data distribution.

__init__

def __init__(self, data: np.ndarray, max_level: int, alpha=0.05, kernels=None, is_debug: bool = False, should_log: bool = False)

Parameters

data: The data to analyze
data_distribution: Either DataDistribution.DISCRETE or DataDistribution.GAUSSIAN depending on the assumed distribution of the data
max_level: Maximum level up to which the PC algorithm runs (inclusive). The size of the data structures allocated on the GPU depends on the max level, so keep it small to avoid out-of-memory problems
alpha: Alpha value for the statistical tests
kernels: Precompiled kernels to use. Useful for time measurements that should exclude the compile time. If None, GPUCSL compiles the kernels for you
is_debug: If set to True, the kernels are compiled in debug mode
should_log: Sets a macro 'LOG' while compiling the CUDA kernels. Can be used for custom logging from kernels

(abstract) discover_skeleton

Subclasses should implement the skeleton discovery for their respective distribution.

(abstract) set_distribution_specific_options

Subclasses should accept their specific parameters here, validate them, and save them as instance variables. As a convenience, this method should return the current object so that the execute method can be chained to it.
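The contract of the two abstract methods can be sketched with a small stand-in. The classes below are hypothetical and do not import GPUCSL: `SketchPC` only mirrors the method names listed here, and `ToyPC` with its `threshold` option is invented purely for illustration.

```python
from abc import ABC, abstractmethod


class SketchPC(ABC):
    """Stand-in for the PC base class (hypothetical, not GPUCSL's code)."""

    def __init__(self, data, max_level, alpha=0.05):
        self.data = data
        self.max_level = max_level
        self.alpha = alpha

    @abstractmethod
    def set_distribution_specific_options(self, **options):
        """Validate and store the options, then return self."""

    @abstractmethod
    def discover_skeleton(self):
        """Run the distribution-specific skeleton discovery."""

    def execute(self):
        # The real execute also orients edges and measures time; omitted here.
        return self.discover_skeleton()


class ToyPC(SketchPC):
    def set_distribution_specific_options(self, threshold=0.5):
        # Validate the option before storing it as an instance variable.
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be in [0, 1]")
        self.threshold = threshold
        return self  # returning self enables chaining into execute()

    def discover_skeleton(self):
        return {"threshold": self.threshold, "max_level": self.max_level}


result = ToyPC(data=[[0, 1]], max_level=2).set_distribution_specific_options(0.25).execute()
```

Returning `self` from the options method is what makes the one-line chain at the end possible.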

execute

execute()

Returns

The CPDAG that results from causal structure learning on your data, the separation sets, the maximum p values, and time measurements for the skeleton discovery and edge orientation of the pc algorithm, as well as time measurement for the execution of the kernels.

Return Value

PCResult

Executes the PC algorithm. Requires that set_distribution_specific_options has been called beforehand.
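A minimal usage sketch of the call order described here, assuming the classes can be imported from `gpucsl.pc.pc` (the module path given in the headings). The helper `run_pc` and its `pc_class` parameter are hypothetical conveniences, not part of GPUCSL. The import is deferred so the snippet loads on machines without GPUCSL or a GPU.

```python
def run_pc(data, max_level, alpha=0.05, pc_class=None):
    """Run the PC algorithm with default distribution-specific options."""
    if pc_class is None:
        # Imported lazily so this module loads without GPUCSL installed.
        from gpucsl.pc.pc import GaussianPC
        pc_class = GaussianPC
    pc = pc_class(data, max_level, alpha)
    # The options must be set before execute(); the call returns the same object.
    return pc.set_distribution_specific_options().execute()
```

Passing `pc_class=DiscretePC` would select the discrete variant, since both option methods have defaults for all of their parameters.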

class GaussianPC(PC) (gpucsl/pc/pc.py)

Concrete implementation of the PC algorithm for the multivariate normal data distribution.

set_distribution_specific_options

set_distribution_specific_options(self, devices: List[int] = [0], sync_device: int = None, correlation_matrix: np.ndarray = None)

Parameters

devices: Device IDs of GPUs to be used.
sync_device: Device ID of the GPU used for state synchronization in the multi-GPU case. sync_device has to be in the devices list.
correlation_matrix: The correlation matrix calculated from data. It can be passed in so that time measurements do not include its calculation. If None is given, GPUCSL calculates it itself.

Returns

Itself for convenience

Return Value

self
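The options above could be prepared as in the following sketch. It assumes that rows of `data` are observations and columns are variables, and it uses NumPy's Pearson correlation. Whether this matches GPUCSL's internal calculation exactly is not confirmed here.

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumption: rows are observations, columns are variables.
data = rng.normal(size=(200, 4))

# Pearson correlation between columns; rowvar=False treats columns as variables.
correlation_matrix = np.corrcoef(data, rowvar=False)

devices = [0, 1]
sync_device = 0
# The documentation requires sync_device to be one of the listed devices.
if sync_device not in devices:
    raise ValueError("sync_device has to be in the devices list")
```

These values would then be passed as `set_distribution_specific_options(devices=devices, sync_device=sync_device, correlation_matrix=correlation_matrix)`.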

class DiscretePC(PC) (gpucsl/pc/pc.py)

Concrete implementation of the PC algorithm for the discrete data distribution.

set_distribution_specific_options

set_distribution_specific_options(self, memory_restriction=None)

Parameters

memory_restriction: The maximum space to allocate for the working structures. Small values decrease the parallelisation. If None is given, it defaults to 95% of the total available memory on the GPU.

Returns

Itself for convenience

Return Value

self
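The documented default can be illustrated with a small helper. `resolve_memory_restriction` is hypothetical and not part of GPUCSL; it assumes both values are given in the same unit and leaves the unit itself open.

```python
def resolve_memory_restriction(memory_restriction, total_gpu_memory):
    """Return the working-memory budget (hypothetical helper)."""
    if memory_restriction is None:
        # Documented default: 95% of the total GPU memory (integer arithmetic).
        return total_gpu_memory * 95 // 100
    if memory_restriction <= 0:
        raise ValueError("memory_restriction must be positive")
    return memory_restriction
```

A smaller explicit budget leaves more GPU memory free, at the cost of less parallelisation, as noted above.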

discover_skeleton_gpu_gaussian (gpucsl/pc/discover_skeleton_gaussian.py)

discover_skeleton_gpu_gaussian(skeleton: np.ndarray, data: np.ndarray, correlation_matrix: np.ndarray, alpha: float, max_level: int, num_variables: int, num_observations: int, kernels: Kernels = None, is_debug: bool = False, should_log: bool = False, devices: List[int] = [0], sync_device: int = None) -> SkeletonResult

Performs the skeleton discovery using a conditional independence test for a multivariate normal data distribution. Supports multiple GPUs: pass a list of device IDs as the devices parameter (provided your system includes multiple GPUs).

Parameters

skeleton: A numpy array representing a fully connected graph with as many vertices as there are variables contained in data.
data: The original data
correlation_matrix: The correlation matrix calculated from data
alpha: Alpha value to do the statistical tests against
max_level: Maximum level up to which the PC algorithm runs. The amount of data allocated on the GPU depends on the max level, so keep it small to avoid out-of-memory problems
num_variables: The number of variables contained in data
num_observations: The number of observations per variable
kernels: You can compile the kernels that should be used yourself and pass them to the function. Used for time measurements where the compile time should be excluded. Leave None and GPUCSL will compile the kernels for you
is_debug: If set to true, kernels will get compiled in debug mode
should_log: Sets a macro 'LOG' while compiling the CUDA kernels. Can be used for custom logging from kernels
devices: Device IDs of GPUs to be used.
sync_device: The device from devices that is used to synchronize state when using multiple GPUs. sync_device has to be part of devices.

Returns

A SkeletonResult wrapped into TimedReturn. This represents the resulting undirected graph, helper structures such as the separation sets (which are needed for the edge orientation), and the execution time.

Return Value

SkeletonResult
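The inputs listed above can be derived from the data as in this sketch. The observations-by-variables layout of `data`, the `uint16` dtype, and the ones on the diagonal of `skeleton` are assumptions made for illustration. Only the shape follows directly from the parameter description.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 5))  # assumption: observations x variables

num_observations, num_variables = data.shape

# Fully connected graph as an adjacency matrix: every entry set to 1.
# Assumption: dtype and the self-loops on the diagonal are guesses.
skeleton = np.ones((num_variables, num_variables), dtype=np.uint16)

# Correlation between columns, one row/column per variable.
correlation_matrix = np.corrcoef(data, rowvar=False)
```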

discover_skeleton_gpu_discrete (gpucsl/pc/discover_skeleton_discrete.py)

discover_skeleton_gpu_discrete(skeleton: np.ndarray, data: np.ndarray, alpha: float, max_level: int, num_variables: int, num_observations: int, kernels=None, memory_restriction: int = None, is_debug: bool = False, should_log: bool = False) -> SkeletonResult

Performs the skeleton discovery using a conditional independence test for a discrete data distribution.

Parameters

skeleton: A numpy array representing a fully connected graph with as many vertices as there are variables contained in data.
data: The original data
alpha: Alpha value to do the statistical tests against
max_level: Maximum level up to which the PC algorithm runs. The amount of data allocated on the GPU depends on the max level, so keep it small to avoid out-of-memory problems
num_variables: The number of variables contained in data
num_observations: The number of observations per variable
kernels: You can compile the kernels that should be used yourself and pass them to the function. Used for time measurements where the compile time should be excluded. Leave None and GPUCSL will compile the kernels for you
memory_restriction: The maximum space to allocate for the working structures. Small values decrease the parallelisation. If None is given, it defaults to 95% of the total available memory on the GPU.
is_debug: If set to True, the kernels are compiled in debug mode
should_log: Sets a macro 'LOG' while compiling the CUDA kernels. Can be used for custom logging from kernels

Returns

A SkeletonResult wrapped into TimedReturn. This represents the resulting undirected graph, helper structures such as the separation sets (which are needed for the edge orientation), and the execution time.

Return Value

SkeletonResult

orient_edges (gpucsl/pc/edge_orientation/edge_orientation.py)

orient_edges(skeleton: nx.Graph, separation_sets: np.ndarray) -> nx.DiGraph

Orients the edges of the skeleton.

Parameters

skeleton: The result of the skeleton discovery. The undirected graph to be directed.
separation_sets: A numpy array with the shape (num_variables, num_variables, max_level) representing the found separation sets. num_variables is the number of vertices of the skeleton, and max_level is the same value that was passed to the skeleton discovery function that generated the separation sets.

Returns

The CPDAG that represents the causal dependencies

Return Value

networkx.DiGraph
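The layout of `separation_sets` can be pictured with a small NumPy sketch. The `-1` padding for unused slots, the symmetric storage, and the helper `separating_set` are assumptions for illustration, not confirmed GPUCSL behaviour. With this padding, an empty separation set and a missing one look the same.

```python
import numpy as np

num_variables, max_level = 4, 2

# Assumption: unused slots are padded with -1.
separation_sets = np.full((num_variables, num_variables, max_level), -1, dtype=np.int32)

# Record that variables 0 and 2 were separated by the set {1}.
separation_sets[0, 2, 0] = 1
separation_sets[2, 0, 0] = 1  # assumption: stored symmetrically


def separating_set(sets, i, j):
    """Return the stored separation set for (i, j) without the padding."""
    return [int(v) for v in sets[i, j] if v >= 0]
```

The last axis has length `max_level` because no separation set found up to that level can contain more variables than that.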
