Interface
A wrapper function that calls the other related functions internally and produces an easy-to-use pipeline.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`n_permutations` | int | Number of permutations (samples) per element. | required |
`elements` | list | List of the players (elements). Can be strings (names), integers (indices), or tuples. | required |
`objective_function` | Callable | The game (in-silico experiment). It should take the complement set and return a single numeric value, either int or float. This function is simply called as: `objective_function(complement, **objective_function_params)`. You sometimes need to specify what should happen during edge cases, such as an all-lesioned network; see the networkx example below the table. | required |
`objective_function_params` | Dict | Kwargs for the `objective_function`. | {} |
`permutation_space` | Optional[list] | An already-generated permutation space, in case you want reproducibility or want to reuse the same lesion combinations across several metrics. | None |
`pair` | Optional[Tuple] | Pair of elements that will always appear together in every combination. | None |
`lesioned` | Optional[any] | Lesioned element that will not be present in any combination. | None |
`multiprocessing_method` | str | So far, two parallelization methods are implemented: 'joblib' (the default) and 'ray'. If you use ray, you need to decorate your objective function with the `@ray.remote` decorator (see the sketch below the table) and consult the ray documentation for details. ray probably works better on HPC clusters (if they support it!) and likely avoids joblib's sneaky "memory leakage", but in practice joblib is faster for tasks that are themselves small. Remedies are here: https://docs.ray.io/en/latest/auto_examples/tips-for-first-time.html. Note: multiprocessing isn't always faster, so use it when each game takes longer than roughly 0.5 seconds. For example, a game that sleeps for one second, on a set of 10 elements with 1,000 permutations (1,024 games), took about 1000/16 ≈ 62 seconds in parallel on a 16-core machine, compared to roughly 1,000 seconds serially. | 'joblib' |
`rng` | Optional[Generator] | Numpy random generator object used for reproducible results. Default is None. | None |
`random_seed` | Optional[int] | Sets the random seed of the sampling process. Only used when `rng` is not provided. | None |
`n_parallel_games` | int | Number of parallel jobs (number of to-be-occupied cores); -1 means all CPU cores and 1 means a serial process. I suggest using 1 for debugging, since things get messy in parallel! | -1 |
`lazy` | bool | If set to True, the objective function is called lazily instead of being called for everything at once with the outputs stored in a dict. Setting it to True saves a lot of memory and might even be faster in certain cases. | True |
`save_permutations` | bool | If set to True, the Shapley values are calculated as a running mean over the permutations instead of storing the permutations. This parameter is ignored if the objective function returns a scalar. | False |
`dual_progress_bar` | bool | If set to True, you get two progress bars: a parent that tracks the permutations and a child that tracks the elements. Ignored if `mbar` is provided. | required |
`mbar` | MasterBar | A fastprogress MasterBar. Use it if you're calling the interface multiple times and want a nested progress bar. | None |
Returns:

Type | Description |
---|---|
Tuple[pd.DataFrame, Dict, Dict] | shapley_table, contributions, lesion_effects |

Note that contributions and lesion_effects hold the same values, addressed differently. For example, if from a set ABCD removing AC ends with some value x, you can say the contribution of BD is x and the effect of removing AC is x; the same values are simply addressed differently in the two returned Dicts. Of course, it makes more sense to compare the lesion effects with the intact system, but who am I to judge.
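Putting it together, here is a sketch of calling the interface on a toy networkx graph, assuming the `local_efficiency` function defined above and the three-value return described in the table; the graph size, seed, and permutation count are arbitrary choices:

```python
import networkx as nx
from msapy import msa

# toy system: the nodes of a small random graph are the players
graph = nx.erdos_renyi_graph(10, 0.5, seed=1)
elements = list(graph.nodes)

shapley_table, contributions, lesion_effects = msa.interface(
    n_permutations=1000,
    elements=elements,
    objective_function=local_efficiency,
    objective_function_params={"graph": graph},
    n_parallel_games=1,   # serial run, easier to debug
    random_seed=111,
)
print(shapley_table.mean())  # per-element averages, assuming one row per permutation sample
```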
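And if you call the interface several times, for example once per metric, the `mbar` parameter lets the child progress bars nest under one fastprogress MasterBar. A sketch, with `objective_functions` as a hypothetical mapping from metric name to game function:

```python
from fastprogress.fastprogress import master_bar

# hypothetical mapping: metric name -> game function
objective_functions = {"local_efficiency": local_efficiency}

mb = master_bar(list(objective_functions))
for metric in mb:
    shapley_table, _, _ = msa.interface(
        n_permutations=1000,
        elements=elements,
        objective_function=objective_functions[metric],
        objective_function_params={"graph": graph},
        mbar=mb,  # every call's progress nests under this MasterBar
    )
```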
Source code in msapy/msa.py