Support for source code check #13

GuillaumeDesforges · 2019-05-13T08:15:01Z

Hi, I have an idea for a feature that would really be helpful, especially in a data science experimentation workflow.

Add a boolean argument to the wrapper function, for instance inspect_source. When it is set to True, use the inspect module to read the source code of the function and include its hash in the cache key, the same way you do for the function arguments.
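A minimal sketch of the idea, not cachier's actual implementation: the helper names source_fingerprint and cache_key are hypothetical, and using repr of the arguments for the argument part of the key is an assumption made only for illustration.

```python
import hashlib
import inspect


def source_fingerprint(func):
    """Return a hex digest that changes when the function's code changes."""
    try:
        # Preferred: hash the source text, as proposed above.
        payload = inspect.getsource(func).encode("utf-8")
    except (OSError, TypeError):
        # Source unavailable (e.g. defined in a REPL or via exec):
        # fall back to bytecode and constants. This fallback is only
        # stable within one process if the constants hold nested code.
        code = func.__code__
        payload = code.co_code + repr(code.co_consts).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def cache_key(func, args, kwargs):
    """Combine the source fingerprint with a hash of the call arguments."""
    arg_part = repr((args, sorted(kwargs.items())))
    combined = source_fingerprint(func) + arg_part
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()
```

With a key built this way, editing the body of the decorated function would invalidate its old cache entries, while calls with different arguments still get separate entries.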

Would help a lot !

shaypal5 · 2019-05-30T14:25:50Z

Hi @GuillaumeDesforges !

That's a great idea! I would love to help you write this and add it to the package, if you want to see this feature come to life. :)

GuillaumeDesforges · 2019-06-03T13:14:34Z

Thanks @shaypal5 for the enthusiastic reply :)

The feature is a bit tricky to implement. It must be well thought out to prevent very dangerous situations.

For instance, announcing that the source code is checked for changes before running the caching operations means that the user will expect modifications to cascade. Say you have

from cachier import cachier


def add_some(x):
  return x + 1

@cachier(inspect_source=True)
def some_heavy_operation(x):
  x = add_some(x)
  return x

def run():
  result = some_heavy_operation(1)
  print(result)

run() # prints 2

If the constant in add_some is changed from 1 to 2, some_heavy_operation must be recomputed, even though its own source code has not changed.

However, we also don't want to systematically check the source code of functions called, especially if they are from a package and do not change, because that would cause a huge overhead.

My guess would be that, rather than a boolean parameter inspect_source, it could be preferable to set it to a list of functions and classes to inspect, so that users can define the behaviour themselves.
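A rough sketch of what the list-based variant could look like. The decorator memoize_with_sources is hypothetical and uses a plain in-memory dict; it is not cachier's API, and it only illustrates how the listed objects could feed into the cache key.

```python
import functools
import hashlib
import inspect


def _fingerprint(obj):
    """Hash an object's source text; fall back to bytecode if unavailable."""
    try:
        payload = inspect.getsource(obj)
    except (OSError, TypeError):
        code = getattr(obj, "__code__", None)
        payload = repr((code.co_code, code.co_consts)) if code else repr(obj)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def memoize_with_sources(inspect_source=()):
    """Hypothetical decorator: the cache key covers the wrapped function,
    every object listed in inspect_source, and the call arguments."""
    def decorator(func):
        cache = {}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            sources = tuple(_fingerprint(o) for o in (func, *inspect_source))
            key = (sources, args, tuple(sorted(kwargs.items())))
            if key not in cache:
                cache[key] = func(*args, **kwargs)
            return cache[key]

        wrapper.cache = cache  # exposed for inspection only
        return wrapper
    return decorator
```

In the example above, some_heavy_operation would be decorated with inspect_source=[add_some], so a change to add_some alters the key. Anything not listed is ignored, which keeps the overhead under the user's control but also means an omitted dependency silently serves stale results.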

The more I think about it, the more it feels like a bad idea™ ...

I would be glad to hear your thoughts !

NickCrews · 2020-04-24T08:16:17Z

> However, we also don't want to systematically check the source code of functions called, especially if they are from a package and do not change, because that would cause a huge overhead.

In the general case, it would be impossible to follow the chain of functions called and verify that they are the same. This is essentially the halting problem: in general, you can't determine what a program will do without actually running it.

I would be curious what the exact use case is that you are describing, for instance what inspired you in the first place?

GuillaumeDesforges · 2020-04-24T08:32:03Z

> > However, we also don't want to systematically check the source code of functions called, especially if they are from a package and do not change, because that would cause a huge overhead.
>
> In the general case, it would be impossible to follow the chain of functions called and verify that they are the same. This is essentially the halting problem: in general, you can't determine what a program will do without actually running it.
>
> I would be curious what the exact use case is that you are describing, for instance what inspired you in the first place?

Yes, a fully working mechanism wouldn't be possible, but an approximation would be: track the source files of any code that could possibly be called, wherever that is feasible, much like linters do.
It would not be perfect, since it would distinguish functions by how they are written rather than by what they do (which is completely different), but it would cover most use cases.
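One possible shape for such an approximation, as a sketch under stated assumptions: it follows only plain Python functions that the target refers to by global or closure name (found with inspect.getclosurevars), and the helper names referenced_functions and transitive_fingerprint are hypothetical. It works per function rather than per source file, and a real version would also need to skip installed packages to avoid the overhead mentioned earlier.

```python
import hashlib
import inspect
import types


def referenced_functions(func, seen=None):
    """Collect func plus the plain Python functions it names, recursively.

    Static approximation: only global and closure names are followed, so
    calls made through attributes, getattr or eval are missed.
    """
    seen = {} if seen is None else seen
    if id(func) in seen:
        return seen
    seen[id(func)] = func
    closure = inspect.getclosurevars(func)
    for target in (*closure.globals.values(), *closure.nonlocals.values()):
        if isinstance(target, types.FunctionType):
            referenced_functions(target, seen)
    return seen


def transitive_fingerprint(func):
    """Hash the code of func and everything referenced_functions finds."""
    digest = hashlib.sha256()
    found = referenced_functions(func).values()
    for f in sorted(found, key=lambda f: (f.__module__, f.__qualname__)):
        try:
            text = inspect.getsource(f)
        except (OSError, TypeError):
            # No source available: fall back to bytecode and constants.
            text = repr((f.__code__.co_code, f.__code__.co_consts))
        digest.update(text.encode("utf-8"))
    return digest.hexdigest()
```

As noted, this compares how functions are written, not what they do: reformatting a helper changes the fingerprint even when behaviour is identical, and dynamically dispatched calls go unnoticed.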

The use case is simple. In data science experimentation it is common to build an experiment brick by brick, and storing intermediate results makes it faster to test the next brick, which is built directly on top of the previous functions (instead of writing to some file and loading it manually).

Some tools like DVC provide means to do that, but in a very heavy way in my opinion.

I'm not doing things like that anymore and won't have time to work on it unfortunately.

shaypal5 added the enhancement label May 30, 2019

jmrichardson mentioned this issue Sep 12, 2020

Add function source code to hash #35

Closed

shaypal5 added the complex issue label Nov 15, 2021
