Network interaction 2d
Performs two-dimensional MSA as explained in section 2.3 of [1] for every possible pair of elements and returns a symmetric matrix of interactions between the elements.
Args:
n_permutations (int): Number of permutations (samples) per element.
elements (list): List of the players (elements). Can be strings (names), integers (indices), or tuples.
pairs (Optional[list]): List of element pairs whose interactions you want to analyze.
Defaults to None, which means all possible pairs.
objective_function (Callable):
The game (in-silico experiment). It receives the complement set and should return a single numeric value,
either int or float.
Internally it is called as: objective_function(complement, **objective_function_params)
An example using networkx, with some tips:
(you sometimes need to specify what should happen during edge cases, like an all-lesioned network)

    import networkx as nx

    def local_efficiency(complements, graph):
        if len(complements) == 0:
            # the network is intact so:
            return nx.local_efficiency(graph)
        elif len(complements) == len(graph):
            # the network is fully lesioned so:
            return 0.0
        else:
            # lesion the network, then calculate
            lesioned = graph.copy()
            lesioned.remove_nodes_from(complements)
            return nx.local_efficiency(lesioned)
objective_function_params (Dict, optional): Kwargs for the objective_function. Defaults to {}.
multiprocessing_method (str, optional):
So far, two methods of parallelization are implemented, 'joblib' and 'ray', and the default is 'joblib'.
If using ray, though, you need to decorate your objective function with the @ray.remote decorator. Visit their
documentation to see how to go about it. Ray probably works better on HPC clusters (if they support it!)
and likely doesn't suffer from the sneaky "memory leakage" of joblib. But just by playing around,
I realized joblib is faster for tasks that are themselves small. Remedies are here:
https://docs.ray.io/en/latest/auto_examples/tips-for-first-time.html
Note: As explained above, multiprocessing isn't always faster. Use it when the function itself
takes a while, i.e., each game takes longer than 0.5 seconds or so. For example, a function that sleeps for a
second on a set of 10 elements with 1000 permutations each (1024 games) performs as follows:
- no parallel: 1020 sec
- joblib: 63 sec
- ray: 65 sec
That makes sense since I have 16 cores and 1000/16 is around 62.
Defaults to 'joblib'.
rng (Optional[np.random.Generator], optional): Numpy random generator object used for reproducible results. Defaults to None.
random_seed (Optional[int], optional):
Sets the random seed of the sampling process. Only used when `rng` is None. Defaults to None.
n_parallel_games (int):
Number of parallel jobs (number of to-be-occupied cores),
-1 means all CPU cores and 1 means a serial process.
I suggest using 1 for debugging since things get messy in parallel!
Raises:
NotImplementedError: Raised in case the contribution is a time series or there are
multiple contributions.
Returns:
np.ndarray: the interaction matrix
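To make the returned matrix concrete, here is a minimal, self-contained sketch of the pairwise-interaction idea behind two-dimensional MSA: for each pair (i, j), average v(S ∪ {i, j}) − v(S ∪ {i}) − v(S ∪ {j}) + v(S) over sampled coalitions S of the remaining elements. The function name `pairwise_interaction` and the toy game below are hypothetical illustrations, not msapy's actual implementation or API:

```python
import itertools
import numpy as np

def pairwise_interaction(elements, value, n_permutations=100, rng=None):
    # Conceptual sketch (NOT msapy's implementation): estimate a symmetric
    # interaction matrix by averaging the second-order difference
    # v(S+{i,j}) - v(S+{i}) - v(S+{j}) + v(S) over random coalitions S.
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(elements)
    interactions = np.zeros((n, n))
    for i, j in itertools.combinations(range(n), 2):
        others = [e for k, e in enumerate(elements) if k not in (i, j)]
        total = 0.0
        for _ in range(n_permutations):
            # sample a random coalition S from the remaining elements
            order = rng.permutation(len(others))
            cut = rng.integers(0, len(others) + 1)
            s = frozenset(others[k] for k in order[:cut])
            total += (value(s | {elements[i], elements[j]})
                      - value(s | {elements[i]})
                      - value(s | {elements[j]})
                      + value(s))
        interactions[i, j] = interactions[j, i] = total / n_permutations
    return interactions

# Toy game: each element adds 1, and 'a' with 'b' together add a synergy of 5.
def game(coalition):
    score = float(len(coalition))
    if {'a', 'b'} <= coalition:
        score += 5.0
    return score

m = pairwise_interaction(['a', 'b', 'c'], game)
# m[0, 1] recovers the synergy of 5 between 'a' and 'b';
# the 'c' rows are (near) zero since 'c' interacts with nothing.
```

Because the toy game is exactly additive apart from the (a, b) synergy, the second-order difference is constant across coalitions, so the estimate is exact here; for real objective functions it is a Monte Carlo estimate whose noise shrinks with `n_permutations`.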
Source code in msapy/msa.py