torch_geometric_signed_directed.data.signed.SignedData
======================================================

.. py:module:: torch_geometric_signed_directed.data.signed.SignedData


Classes
-------

.. autoapisummary::

   torch_geometric_signed_directed.data.signed.SignedData.SignedData


Functions
---------

.. autoapisummary::

   torch_geometric_signed_directed.data.signed.SignedData.sqrtinvdiag


Module Contents
---------------

.. py:function:: sqrtinvdiag(M: scipy.sparse.spmatrix) -> scipy.sparse.csc_matrix

   Inverts and square-roots a positive diagonal matrix.

   :Parameters: **M** (*scipy sparse matrix*) -- Positive diagonal matrix to invert.

   :returns: SciPy sparse matrix holding the inverse square root of the diagonal of ``M``.
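
   As a rough illustration of the operation described above, the following sketch computes the inverse square root of a diagonal matrix with SciPy. The helper name ``sqrtinvdiag_sketch`` is hypothetical and is not the library's implementation; it assumes that every diagonal entry of ``M`` is strictly positive.

   ```python
   import numpy as np
   import scipy.sparse as sp


   def sqrtinvdiag_sketch(M):
       """Return D^{-1/2} for a sparse matrix M with a strictly positive diagonal."""
       d = M.diagonal()                          # 1-D array of diagonal entries
       return sp.diags(1.0 / np.sqrt(d)).tocsc()  # rebuild as a sparse CSC matrix


   D = sp.diags(np.array([4.0, 9.0, 16.0])).tocsc()
   D_inv_sqrt = sqrtinvdiag_sketch(D)
   ```

   Multiplying ``D_inv_sqrt`` by itself and then by ``D`` should give the identity, which is a quick way to sanity-check such a helper.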


.. py:class:: SignedData(x: torch_geometric.typing.OptTensor = None, edge_index: torch_geometric.typing.OptTensor = None, edge_attr: torch_geometric.typing.OptTensor = None, edge_weight: torch_geometric.typing.OptTensor = None, y: torch_geometric.typing.OptTensor = None, pos: torch_geometric.typing.OptTensor = None, A: torch_geometric.typing.Union[torch_geometric.typing.Tuple[scipy.sparse.spmatrix, scipy.sparse.spmatrix], scipy.sparse.spmatrix, None] = None, init_data: Optional[torch_geometric.data.Data] = None, **kwargs)

   Bases: :py:obj:`torch_geometric.data.Data`


   A data object describing a homogeneous signed graph.

   :Parameters: * **x** (*Tensor, optional*) -- Node feature matrix with shape :obj:`[num_nodes,
                  num_node_features]`. (default: :obj:`None`)
                * **edge_index** (*LongTensor, optional*) -- Graph connectivity in COO format
                  with shape :obj:`[2, num_edges]`. (default: :obj:`None`)
                * **edge_attr** (*Tensor, optional*) -- Edge feature matrix with shape
                  :obj:`[num_edges, num_edge_features]`. (default: :obj:`None`)
                * **edge_weight** (*Tensor, optional*) -- Edge weights with shape
                  :obj:`[num_edges,]`. (default: :obj:`None`)
                * **y** (*Tensor, optional*) -- Graph-level or node-level ground-truth labels
                  with arbitrary shape. (default: :obj:`None`)
                * **pos** (*Tensor, optional*) -- Node position matrix with shape
                  :obj:`[num_nodes, num_dimensions]`. (default: :obj:`None`)
                * **A** (*sp.spmatrix or a tuple of sp.spmatrix, optional*) -- SciPy sparse adjacency matrix,
                  or a tuple of the positive and negative parts. (default: :obj:`None`)
                * **init_data** (*Data, optional*) -- Initial data object, whose attributes will be inherited. (default: :obj:`None`)
                * **\*\*kwargs** (*optional*) -- Additional attributes.
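
   The **A** argument accepts either one signed adjacency matrix or a tuple of its positive and negative parts. A minimal sketch of how the two forms can relate is given here. It assumes that both parts are stored with non-negative entries, which is a convention chosen for illustration and is not stated by the parameter description.

   ```python
   import numpy as np
   import scipy.sparse as sp

   # A small signed, directed, weighted adjacency matrix.
   A = sp.csr_matrix(np.array([[0.0, 2.0, -1.0],
                               [0.0, 0.0, 3.0],
                               [-4.0, 0.0, 0.0]]))

   A_pos = A.maximum(0)      # positive part: keeps entries > 0
   A_neg = (-A).maximum(0)   # negative part: magnitudes of entries < 0

   # Under this convention the signed matrix is recovered as a difference.
   A_back = A_pos - A_neg
   ```

   With this convention, either ``A`` or the pair ``(A_pos, A_neg)`` carries the same information.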


   .. py:attribute:: A


   .. py:attribute:: edge_weight


   .. py:attribute:: edge_index


   .. py:attribute:: num_nodes


   .. py:method:: separate_positive_negative()


   .. py:method:: clear_separate_attributes()


   .. py:property:: is_signed
      :type: bool


   .. py:property:: is_directed
      :type: bool


   .. py:property:: is_weighted
      :type: bool


   .. py:method:: to_unweighted()


   .. py:method:: set_signed_Laplacian_features(k: int = 2)

      Generate the graph features using eigenvectors of the signed Laplacian matrix.

      :Parameters: **k** (*int*) -- The dimension of the features. Default is 2.
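
      The idea can be sketched with dense NumPy arrays. The sketch assumes one common definition of the signed Laplacian, :math:`\bar{\mathbf{L}} = \bar{\mathbf{D}} - \mathbf{A}` with :math:`\bar{D}_{ii} = \sum_j |A_{ij}|`, and takes the eigenvectors of the ``k`` smallest eigenvalues as features. The function name, the symmetrization step, and the absence of any normalization are assumptions; the method itself may differ in these details.

      ```python
      import numpy as np


      def signed_laplacian_features(A, k=2):
          """Return the k smallest eigenvalues/eigenvectors of D_bar - A."""
          A = (A + A.T) / 2.0                      # symmetrize (assumed step)
          D_bar = np.diag(np.abs(A).sum(axis=1))   # degrees from absolute weights
          L = D_bar - A
          eigvals, eigvecs = np.linalg.eigh(L)     # eigenvalues in ascending order
          return eigvals[:k], eigvecs[:, :k]


      # Two clusters {0, 1} and {2, 3}: positive edges inside, negative between.
      A = np.array([[0, 1, -1, -1],
                    [1, 0, -1, -1],
                    [-1, -1, 0, 1],
                    [-1, -1, 1, 0]], dtype=float)
      vals, feats = signed_laplacian_features(A, k=2)
      ```

      For this balanced toy graph the smallest eigenvalue is zero, and the signs of the first feature column separate the two clusters.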


   .. py:method:: set_spectral_adjacency_reg_features(k: int = 2, normalization: Optional[int] = None, tau_p=None, tau_n=None, eigens=None, mi=None)

      Generate the graph features using eigenvectors of the regularised adjacency matrix.

      :Parameters: * **k** (*int*) -- The dimension of the features. Default is 2.
                   * **normalization** (*str, optional*) -- How to normalise for cluster size:

                     1. :obj:`None`: No normalization.

                     2. :obj:`"sym"`: Symmetric normalization
                     :math:`\mathbf{A} \leftarrow \mathbf{D}^{-1/2} \mathbf{A}
                     \mathbf{D}^{-1/2}`

                     3. :obj:`"rw"`: Random-walk normalization
                     :math:`\mathbf{A} \leftarrow \mathbf{D}^{-1} \mathbf{A}`

                     4. :obj:`"sym_sep"`: Symmetric normalization for the positive and negative parts separately.

                     5. :obj:`"rw_sep"`: Random-walk normalization for the positive and negative parts separately.
                   * **tau_p** (*int*) -- Regularisation coefficient for positive adjacency matrix.
                   * **tau_n** (*int*) -- Regularisation coefficient for negative adjacency matrix.
                   * **eigens** (*int*) -- The number of eigenvectors to take. Defaults to ``k``.
                   * **mi** (*int*) -- The maximum number of iterations for which to run eigenvalue solvers. Defaults to the number of nodes.
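
      The symmetric and random-walk normalizations listed above can be sketched on a dense matrix as follows. The function name is hypothetical, and computing the degrees from absolute edge weights is an assumption made so that signed graphs do not produce zero or negative degrees. The ``"sym_sep"`` and ``"rw_sep"`` options and the ``tau_p``/``tau_n`` regularisation are not covered.

      ```python
      import numpy as np


      def normalize_adjacency(A, normalization=None):
          """Apply no, 'sym' or 'rw' normalization to a dense signed adjacency matrix."""
          if normalization is None:
              return A
          deg = np.abs(A).sum(axis=1)              # assumed: absolute-weight degrees
          if normalization == "sym":
              d_inv_sqrt = 1.0 / np.sqrt(deg)
              # D^{-1/2} A D^{-1/2}: scale row i and column j by deg^{-1/2}
              return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
          if normalization == "rw":
              # D^{-1} A: scale row i by 1 / deg[i]
              return A / deg[:, None]
          raise ValueError(f"unsupported normalization: {normalization!r}")


      A = np.array([[0.0, 2.0, -1.0],
                    [2.0, 0.0, 1.0],
                    [-1.0, 1.0, 0.0]])
      A_sym = normalize_adjacency(A, "sym")
      A_rw = normalize_adjacency(A, "rw")
      ```

      The symmetric variant keeps a symmetric input symmetric, while the random-walk variant makes the absolute values in each row sum to one.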


   .. py:method:: inherit_attributes(data: torch_geometric.data.Data)


   .. py:method:: node_split(train_size: torch_geometric.typing.Union[int, float] = None, val_size: torch_geometric.typing.Union[int, float] = None, test_size: torch_geometric.typing.Union[int, float] = None, seed_size: torch_geometric.typing.Union[int, float] = None, train_size_per_class: torch_geometric.typing.Union[int, float] = None, val_size_per_class: torch_geometric.typing.Union[int, float] = None, test_size_per_class: torch_geometric.typing.Union[int, float] = None, seed_size_per_class: torch_geometric.typing.Union[int, float] = None, seed: List[int] = [], data_split: int = 2)

      Train/Val/Test/Seed split for node classification tasks.
      Each size parameter can be either an int or a float:
      an int gives the actual number of nodes, while a float gives a ratio.
      Either ``train_size`` or ``train_size_per_class`` is mandatory; the former samples nodes regardless of class labels.
      Validation and seed masks are optional. Seed masks select nodes within the training set, e.g., in a semi-supervised setting as described in the
      `SSSNET: Semi-Supervised Signed Network Clustering <https://arxiv.org/pdf/2110.06623.pdf>`_ paper.
      If ``test_size`` and ``test_size_per_class`` are both None, all the remaining nodes after selecting training (and validation) nodes are used for testing.

      :Parameters: * **data** (*torch_geometric.data.Data or DirectedData, required*) -- The data object for data split.
                   * **train_size** (*int or float, optional*) -- The size of random splits for the training dataset. If the input is a float number, the ratio of nodes in each class will be sampled.
                   * **val_size** (*int or float, optional*) -- The size of random splits for the validation dataset. If the input is a float number, the ratio of nodes in each class will be sampled.
                   * **test_size** (*int or float, optional*) -- The size of random splits for the testing dataset. If the input is a float number, the ratio of nodes in each class will be sampled.
                     (Default: None. All nodes not selected for training/validation are used for testing)
                   * **seed_size** (*int or float, optional*) -- The size of random splits for the seed nodes within the training set. If the input is a float number, the ratio of nodes in each class will be sampled.
                   * **train_size_per_class** (*int or float, optional*) -- The size per class of random splits for the training dataset. If the input is a float number, the ratio of nodes in each class will be sampled.
                   * **val_size_per_class** (*int or float, optional*) -- The size per class of random splits for the validation dataset. If the input is a float number, the ratio of nodes in each class will be sampled.
                   * **test_size_per_class** (*int or float, optional*) -- The size per class of random splits for the testing dataset. If the input is a float number, the ratio of nodes in each class will be sampled.
                     (Default: None. All nodes not selected for training/validation are used for testing)
                   * **seed_size_per_class** (*int or float, optional*) -- The size per class of random splits for seed nodes within the training set. If the input is a float number, the ratio of nodes in each class will be sampled.
                   * **seed** (*An empty list or a list with the length of data_split, optional*) -- The random seed list for each data split.
                   * **data_split** (*int, optional*) -- Number of splits. (Default: 2)
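
      The int-versus-float convention for the size parameters can be sketched as below. Both helper names are hypothetical, the rounding rule is an assumption, and the sketch covers only a single class-agnostic split without seed masks or per-class sizes.

      ```python
      import numpy as np


      def resolve_size(size, num_nodes):
          """Read an int as a node count and a float as a ratio of num_nodes."""
          if isinstance(size, float):
              return int(round(size * num_nodes))
          return int(size)


      def random_node_masks(num_nodes, train_size, val_size, seed=0):
          """Draw disjoint train/val masks; all remaining nodes form the test mask."""
          rng = np.random.default_rng(seed)
          perm = rng.permutation(num_nodes)
          n_train = resolve_size(train_size, num_nodes)
          n_val = resolve_size(val_size, num_nodes)

          train_mask = np.zeros(num_nodes, dtype=bool)
          val_mask = np.zeros(num_nodes, dtype=bool)
          train_mask[perm[:n_train]] = True
          val_mask[perm[n_train:n_train + n_val]] = True
          test_mask = ~(train_mask | val_mask)   # everything not yet selected
          return train_mask, val_mask, test_mask


      # A float ratio for training, an absolute count for validation.
      train_mask, val_mask, test_mask = random_node_masks(100, train_size=0.5, val_size=10)
      ```

      Leaving the test size unspecified here mirrors the documented default, where all nodes not selected for training or validation are used for testing.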


   .. py:method:: link_split(size: int = None, splits: int = 2, prob_test: float = 0.15, prob_val: float = 0.05, task: str = 'sign', seed: int = 0, ratio: float = 1.0, maintain_connect: bool = False, device: str = 'cpu') -> dict

      Get train/val/test dataset for the link sign prediction task.

      Arg types:
          * **data** (torch_geometric.data.Data or DirectedData object) - The input dataset.
          * **prob_val** (float, optional) - The proportion of edges selected for validation (Default: 0.05).
          * **prob_test** (float, optional) - The proportion of edges selected for testing (Default: 0.15).
          * **splits** (int, optional) - The number of splits (Default: 2).
          * **size** (int, optional) - The size of the input graph. If none, the graph size is the maximum index of nodes plus 1 (Default: None).
          * **task** (str, optional) - The evaluation task: four_class_signed_digraph (four-class sign and direction prediction); five_class_signed_digraph (five-class sign, direction and existence prediction); sign (link sign prediction). (Default: 'sign')
          * **seed** (int, optional) - The random seed for positive edge selection (Default: 0). Negative edges are selected by PyTorch Geometric ``negative_sampling``.
          * **maintain_connect** (bool, optional) - Whether to maintain connectivity when removing edges for validation and testing. Connectivity is maintained by first obtaining the edges in the minimum spanning tree/forest; these edges are not removed for validation and testing. (Default: False).
          * **ratio** (float, optional) - The maximum ratio of edges used for dataset generation. (Default: 1.0)
          * **device** (str, optional) - The device to hold the return value (Default: 'cpu').

      Return types:
          * **datasets** - A dict containing the training/validation/testing splits of edges and labels. For split index i:

              1. datasets[i]['graph'] (torch.LongTensor): the observed edge list after removing edges for validation and testing.

              2. datasets[i]['train'/'val'/'testing']['edges'] (List): the edge list for training/validation/testing.

              3. datasets[i]['train'/'val'/'testing']['label'] (List): the labels of edges:

                  * If task == "four_class_signed_digraph": 0 (the positive directed edge exists in the graph),
                      1 (the negative directed edge exists in the graph), 2 (the positive edge of the reversed direction exists),
                      3 (the negative edge of the reversed direction exists).
                      The undirected edges in the directed input graph are removed to avoid ambiguity.

                  * If task == "five_class_signed_digraph": 0 (the positive directed edge exists in the graph),
                      1 (the negative directed edge exists in the graph), 2 (the positive edge of the reversed direction exists),
                      3 (the negative edge of the reversed direction exists), 4 (the edge does not exist in either direction).
                      The undirected edges in the directed input graph are removed to avoid ambiguity.

                  * If task == "sign": 0 (negative edge), 1 (positive edge).
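
      The label schemes listed above can be sketched in plain Python. All function names are hypothetical, label 3 is read as "the negative edge of the reversed direction exists" (an interpretation of the listing, by symmetry with label 2), and the graph is represented as a dict from directed edges to signs purely for illustration.

      ```python
      def sign_label(weight):
          """task == 'sign': 0 for a negative edge, 1 for a positive edge."""
          return 1 if weight > 0 else 0


      def four_class_label(u, v, signs):
          """task == 'four_class_signed_digraph'; signs maps (u, v) to +1 or -1."""
          if (u, v) in signs:
              return 0 if signs[(u, v)] > 0 else 1
          if (v, u) in signs:
              return 2 if signs[(v, u)] > 0 else 3
          raise KeyError(f"no edge between {u} and {v}")


      def five_class_label(u, v, signs):
          """task == 'five_class_signed_digraph': adds class 4 for absent edges."""
          if (u, v) not in signs and (v, u) not in signs:
              return 4
          return four_class_label(u, v, signs)


      # One positive edge 0 -> 1 and one negative edge 2 -> 3.
      signs = {(0, 1): 1, (2, 3): -1}
      labels = [five_class_label(u, v, signs)
                for u, v in [(0, 1), (2, 3), (1, 0), (3, 2), (0, 2)]]
      ```

      Querying a node pair in the order opposite to the stored edge yields the "reversed direction" classes, which is why undirected (bidirectional) edges would be ambiguous under this scheme.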