From 953e860f31a1abdd728ad769a32ca52f3bb1581d Mon Sep 17 00:00:00 2001 From: Mike McKerns Date: Fri, 28 Jun 2013 09:42:28 -0700 Subject: [PATCH 01/77] Initial commit --- README.md | 4 ++++ 1 file changed, 4 insertions(+) create mode 100644 README.md diff --git a/README.md b/README.md new file mode 100644 index 00000000..d6049265 --- /dev/null +++ b/README.md @@ -0,0 +1,4 @@ +dill +==== + +serialize all of python From eb9b240744ba7e78b3476b731ec5bf1ddca68112 Mon Sep 17 00:00:00 2001 From: Mike McKerns Date: Thu, 11 Jul 2013 10:26:28 -0700 Subject: [PATCH 02/77] merged changes to README.md from svn --- README.md | 100 +++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 99 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index d6049265..f1d7b1df 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,102 @@ dill ==== - serialize all of python + +About Dill +---------- +Dill extends python's 'pickle' module for serializing and de-serializing +python objects to the majority of the built-in python types. Serialization +is the process of converting an object to a byte stream, and the inverse +of which is converting a byte stream back to a python object hierarchy. + +Dill provides the user the same interface as the 'pickle' module, and +also includes some additional features. In addition to pickling python +objects, dill provides the ability to save the state of an interpreter +session in a single command. Hence, it would be feasible to save an +interpreter session, close the interpreter, ship the pickled file to +another computer, open a new interpreter, unpickle the session and +thus continue from the 'saved' state of the original interpreter +session. + +Dill can be used to store python objects to a file, but the primary +usage is to send python objects across the network as a byte stream. +Dill is quite flexible, and allows arbitrary user defined classes +and functions to be serialized. 
Thus dill is not intended to be +secure against erroneously or maliciously constructed data. It is +left to the user to decide whether the data they unpickle is from +a trustworthy source. + +Dill is part of pathos, a python framework for heterogeneous computing. +Dill is in the early development stages, and any user feedback is +highly appreciated. Contact Mike McKerns [mmckerns at caltech dot edu] +with comments, suggestions, and any bugs you may find. A list of known +issues is maintained at http://trac.mystic.cacr.caltech.edu/project/pathos/query. + + +Major Features +-------------- +Dill can pickle the following standard types:: + * none, type, bool, int, long, float, complex, str, unicode, + * tuple, list, dict, file, buffer, builtin, + * both old and new style classes, + * instances of old and new style classes, + * set, frozenset, array, functions, exceptions + +Dill can also pickle more 'exotic' standard types:: + * functions with yields, nested functions, lambdas + * cell, method, unboundmethod, module, code, + * dictproxy, methoddescriptor, getsetdescriptor, memberdescriptor, + * wrapperdescriptor, xrange, slice, + * notimplemented, ellipsis, quit + +Dill cannot yet pickle these standard types:: + * frame, generator, traceback + +Dill also provides the capability to:: + * save and load python interpreter sessions + +Current Release +--------------- +The latest released version of dill is available from:: + http://trac.mystic.cacr.caltech.edu/project/pathos + +Dill is distributed under a modified BSD license. + +Development Release +------------------- +You can get the latest development release with all the shiny new features at:: + http://dev.danse.us/packages. + +or even better, fork us on our github mirror of the svn trunk:: + https://github.com/uqfoundation + +Citation +-------- +If you use dill to do research that leads to publication, we ask that you +acknowledge use of dill by citing the following in your publication:: + + M.M. McKerns, L. 
Strand, T. Sullivan, A. Fang, M.A.G. Aivazis, + "Building a framework for predictive science", Proceedings of + the 10th Python in Science Conference, 2011; + http://arxiv.org/pdf/1202.1056 + + Michael McKerns and Michael Aivazis, + "pathos: a framework for heterogeneous computing", 2010- ; + http://trac.mystic.cacr.caltech.edu/project/pathos + +More Information +---------------- +Probably the best way to get started is to look at the tests +that are provided within dill. See `dill.tests` for a set of scripts +that test dill's ability to serialize different python objects. +Since dill conforms to the 'pickle' interface, the examples and +documentation at http://docs.python.org/library/pickle.html also +apply to dill if one will `import dill as pickle`. Dill's source code is also generally well documented, +so further questions may be resolved by inspecting the code itself, or through +browsing the reference manual. For those who like to leap before +they look, you can jump right to the installation instructions. If the aforementioned documents +do not adequately address your needs, please send us feedback. + +Dill is an active research tool. There is a growing number of publications and presentations that +discuss real-world examples and new features of dill in greater detail than presented in the user's guide. +If you would like to share how you use dill in your work, please send us a link. 
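Note on PATCH 02/77: the README added above says dill keeps the 'pickle' interface and also handles lambdas. The sketch below is not part of the patch; it illustrates that claim under the assumption that dill is installed (it falls back gracefully if it is not). The helper names `roundtrip` and `stock_pickle_handles` are hypothetical and exist only for this illustration.

```python
import pickle

try:
    import dill  # third-party package; assumed to be installed
except ImportError:
    dill = None  # the sketch still runs, it just skips the dill part


def roundtrip(obj, serializer):
    """Serialize and deserialize obj with any module exposing dumps/loads."""
    return serializer.loads(serializer.dumps(obj))


def stock_pickle_handles(obj):
    """Return True if the standard library pickle can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A lambda has no importable name, so stock pickle cannot store it by reference.
square = lambda x: x * x

if __name__ == "__main__":
    print("stock pickle handles a list:  ", stock_pickle_handles([1, 2, 3]))
    print("stock pickle handles a lambda:", stock_pickle_handles(square))
    if dill is not None:
        restored = roundtrip(square, dill)
        print("dill round-trip of the lambda gives:", restored(4))
```

Because both modules expose `dumps` and `loads`, `roundtrip` accepts either one, which is the practical meaning of `import dill as pickle` in the README text.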
From 3d50bce43d2234406bdf50be6c12ea85588a9b4d Mon Sep 17 00:00:00 2001 From: Mike McKerns Date: Thu, 11 Jul 2013 11:22:37 -0700 Subject: [PATCH 03/77] fixed formatting in README.md --- README.md | 28 ++++++++++++++++------------ 1 file changed, 16 insertions(+), 12 deletions(-) diff --git a/README.md b/README.md index f1d7b1df..485fab99 100644 --- a/README.md +++ b/README.md @@ -36,24 +36,28 @@ issues is maintained at http://trac.mystic.cacr.caltech.edu/project/pathos/query Major Features -------------- Dill can pickle the following standard types:: - * none, type, bool, int, long, float, complex, str, unicode, - * tuple, list, dict, file, buffer, builtin, - * both old and new style classes, - * instances of old and new style classes, - * set, frozenset, array, functions, exceptions + +* none, type, bool, int, long, float, complex, str, unicode, +* tuple, list, dict, file, buffer, builtin, +* both old and new style classes, +* instances of old and new style classes, +* set, frozenset, array, functions, exceptions Dill can also pickle more 'exotic' standard types:: - * functions with yields, nested functions, lambdas - * cell, method, unboundmethod, module, code, - * dictproxy, methoddescriptor, getsetdescriptor, memberdescriptor, - * wrapperdescriptor, xrange, slice, - * notimplemented, ellipsis, quit + +* functions with yields, nested functions, lambdas +* cell, method, unboundmethod, module, code, +* dictproxy, methoddescriptor, getsetdescriptor, memberdescriptor, +* wrapperdescriptor, xrange, slice, +* notimplemented, ellipsis, quit Dill cannot yet pickle these standard types:: - * frame, generator, traceback + +* frame, generator, traceback Dill also provides the capability to:: - * save and load python interpreter sessions + +* save and load python interpreter sessions Current Release --------------- From d797b0be1ae67a18b0a4a1c865caef97ab714096 Mon Sep 17 00:00:00 2001 From: roryk Date: Wed, 23 Oct 2013 01:04:39 -0400 Subject: [PATCH 04/77] Fixed _IS_PY3 
typo. --- dill/dill.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dill/dill.py b/dill/dill.py index 429766a6..84d97025 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -434,7 +434,7 @@ def save_module_dict(pickler, obj): pickler.write('c__builtin__\n__main__\n') elif not is_dill(pickler) and obj is _main_module.__dict__: log.info("D3: Date: Fri, 14 Feb 2014 15:42:12 -0500 Subject: [PATCH 05/77] Set _main_module when extending StockPickler https://github.com/uqfoundation/dill/issues/23 --- dill/dill.py | 8 ++++++++ tests/test_extendpickle.py | 20 ++++++++++++++++++++ 2 files changed, 28 insertions(+) create mode 100644 tests/test_extendpickle.py diff --git a/dill/dill.py b/dill/dill.py index 85c63837..494553ac 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -189,6 +189,10 @@ class Pickler(StockPickler): _byref = False pass + def __init__(self, *args, **kwargs): + StockPickler.__init__(self, *args, **kwargs) + self._main_module = _main_module + class Unpickler(StockUnpickler): """python's Unpickler extended to interpreter sessions and more types""" _main_module = None @@ -200,6 +204,10 @@ def find_class(self, module, name): return StockUnpickler.find_class(self, module, name) pass + def __init__(self, *args, **kwargs): + StockUnpickler.__init__(self, *args, **kwargs) + self._main_module = _main_module + ''' def dispatch_table(): """get the dispatch table of registered types""" diff --git a/tests/test_extendpickle.py b/tests/test_extendpickle.py new file mode 100644 index 00000000..6b37a594 --- /dev/null +++ b/tests/test_extendpickle.py @@ -0,0 +1,20 @@ +import dill as pickle +import StringIO + +def my_fn(x): + return x * 17 + +obj = lambda : my_fn(34) +assert obj() == 578 + +obj_io = StringIO.StringIO() +pickler = pickle.Pickler(obj_io) +pickler.dump(obj) + +obj_str = obj_io.getvalue() + +obj2_io = StringIO.StringIO(obj_str) +unpickler = pickle.Unpickler(obj2_io) +obj2 = unpickler.load() + +assert obj2() == 578 From 
a05aa499ead9bd389ce3b464e7de4295f8d2e40d Mon Sep 17 00:00:00 2001 From: Bob Fischer Date: Mon, 3 Mar 2014 08:00:33 -0500 Subject: [PATCH 06/77] making this test 3.x compatible --- tests/test_extendpickle.py | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tests/test_extendpickle.py b/tests/test_extendpickle.py index 6b37a594..ea0eea43 100644 --- a/tests/test_extendpickle.py +++ b/tests/test_extendpickle.py @@ -1,5 +1,8 @@ import dill as pickle -import StringIO +try: + from StringIO import StringIO +except ImportError: + from io import BytesIO as StringIO def my_fn(x): return x * 17 From 4ccbdcbc51f447335666d9a2011040d8e74e8614 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 1 Apr 2014 16:53:03 +0100 Subject: [PATCH 07/77] Creates a safe mode for _import_module, which returns None when the module cannot be found --- dill/dill.py | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 24358409..d53ea241 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -404,14 +404,19 @@ def _dict_from_dictproxy(dictproxy): _dict.pop('__weakref__', None) return _dict -def _import_module(import_name): - if '.' in import_name: - items = import_name.split('.') - module = '.'.join(items[:-1]) - obj = items[-1] - else: - return __import__(import_name) - return getattr(__import__(module, None, None, [obj]), obj) +def _import_module(import_name, safe=False): + try: + if '.' 
in import_name: + items = import_name.split('.') + module = '.'.join(items[:-1]) + obj = items[-1] + else: + return __import__(import_name) + return getattr(__import__(module, None, None, [obj]), obj) + except ImportError: + if safe: + return None + raise def _locate_function(obj, session=False): if obj.__module__ == '__main__': # and session: From a75d42437ba5e0c3554a4b7c71ad9ad2c0482003 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 1 Apr 2014 16:56:48 +0100 Subject: [PATCH 08/77] Fixes assertion errors caused by pickling a decorator function from a module which is not being run as main. --- dill/dill.py | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/dill/dill.py b/dill/dill.py index d53ea241..2bfe1285 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -464,6 +464,12 @@ def save_module_dict(pickler, obj): pickler.write(bytes('c__main__\n__dict__\n', 'UTF-8')) else: pickler.write('c__main__\n__dict__\n') #XXX: works in general? + elif '__name__' in obj and obj != _main_module.__dict__ \ + and obj is getattr(_import_module(obj['__name__'], safe=True), '__dict__', None): + if PYTHON3: + pickler.write(bytes('c%s\n__dict__\n' % obj['__name__'], 'UTF-8')) + else: + pickler.write('c%s\n__dict__\n' % obj['__name__']) else: log.info("D2: Date: Tue, 1 Apr 2014 17:22:48 +0100 Subject: [PATCH 09/77] Change _import_module safe default to True, and simplified _locate_function. --- dill/dill.py | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 2bfe1285..27a807a7 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -404,7 +404,7 @@ def _dict_from_dictproxy(dictproxy): _dict.pop('__weakref__', None) return _dict -def _import_module(import_name, safe=False): +def _import_module(import_name, safe=True): try: if '.' 
in import_name: items = import_name.split('.') @@ -413,7 +413,7 @@ def _import_module(import_name, safe=False): else: return __import__(import_name) return getattr(__import__(module, None, None, [obj]), obj) - except ImportError: + except (ImportError, AttributeError): if safe: return None raise @@ -421,10 +421,7 @@ def _import_module(import_name, safe=False): def _locate_function(obj, session=False): if obj.__module__ == '__main__': # and session: return False - try: - found = _import_module(obj.__module__ + '.' + obj.__name__) - except: - return False + found = _import_module(obj.__module__ + '.' + obj.__name__) return found is obj @register(CodeType) From 3b05f0ee80f92e1012c9908ea8c7ffbb856d8456 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 1 Apr 2014 17:44:51 +0100 Subject: [PATCH 10/77] Back to safe=False for _import_module --- dill/dill.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 27a807a7..070dbafc 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -404,7 +404,7 @@ def _dict_from_dictproxy(dictproxy): _dict.pop('__weakref__', None) return _dict -def _import_module(import_name, safe=True): +def _import_module(import_name, safe=False): try: if '.' in import_name: items = import_name.split('.') @@ -421,7 +421,7 @@ def _import_module(import_name, safe=True): def _locate_function(obj, session=False): if obj.__module__ == '__main__': # and session: return False - found = _import_module(obj.__module__ + '.' + obj.__name__) + found = _import_module(obj.__module__ + '.' 
+ obj.__name__, safe=True) return found is obj @register(CodeType) From 59664ca99da105019692534fdde18d68c8da6f4a Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 1 Apr 2014 18:24:53 +0100 Subject: [PATCH 11/77] Minor change which fixes some problems with running dill under doctest --- dill/dill.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 070dbafc..919c655d 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -449,13 +449,13 @@ def save_function(pickler, obj): @register(dict) def save_module_dict(pickler, obj): - if is_dill(pickler) and obj is pickler._main_module.__dict__: + if is_dill(pickler) and obj == pickler._main_module.__dict__: log.info("D1: Date: Sat, 26 Apr 2014 16:38:05 +0100 Subject: [PATCH 12/77] Fix problem with pickling files --- dill/dill.py | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index ff468a90..c71e219d 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -92,11 +92,11 @@ def _trace(boolean): SuperType = type(super(Exception, TypeError())) ItemGetterType = type(itemgetter(0)) AttrGetterType = type(attrgetter('__repr__')) -FileType = open(os.devnull, 'rb', buffering=0) -TextWrapperType = open(os.devnull, 'r', buffering=-1) -BufferedRandomType = open(os.devnull, 'r+b', buffering=-1) -BufferedReaderType = open(os.devnull, 'rb', buffering=-1) -BufferedWriterType = open(os.devnull, 'wb', buffering=-1) +FileType = type(open(os.devnull, 'rb', buffering=0)) +TextWrapperType = type(open(os.devnull, 'r', buffering=-1)) +BufferedRandomType = type(open(os.devnull, 'r+b', buffering=-1)) +BufferedReaderType = type(open(os.devnull, 'rb', buffering=-1)) +BufferedWriterType = type(open(os.devnull, 'wb', buffering=-1)) try: from cStringIO import StringIO, InputType, OutputType except ImportError: From 89a2f917abcbe4d0c63cf8891b6d1f53e6a84154 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Sat, 10 May 2014 20:49:27 +0100 Subject: [PATCH 
13/77] Fixes small python3 compatibility issue --- dill/source.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dill/source.py b/dill/source.py index 71a48f08..96b2c852 100644 --- a/dill/source.py +++ b/dill/source.py @@ -124,7 +124,7 @@ def findsource(object): else: # not a lambda, just look for the name if name in line: # need to check for decorator... hats = 0 - for _lnum in xrange(lnum-1,-1,-1): + for _lnum in range(lnum-1,-1,-1): if pat2.match(lines[_lnum]): hats += 1 else: break lnum = lnum - hats From 8e241ddf48a4a773556319760370faf3944dc4a6 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 27 May 2014 12:28:14 +0100 Subject: [PATCH 14/77] Allows saving of a modules __dict__ attribute --- dill/dill.py | 11 ++++++++--- tests/test_module.py | 24 ++++++++++++++++++++++++ 2 files changed, 32 insertions(+), 3 deletions(-) create mode 100644 tests/test_module.py diff --git a/dill/dill.py b/dill/dill.py index 9a8d5c45..600691f0 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -756,11 +756,16 @@ def save_weakproxy(pickler, obj): @register(ModuleType) def save_module(pickler, obj): - if is_dill(pickler) and obj is pickler._main_module: + # if a module file name starts with this, it should be a standard module, + # so should be pickled as a reference + prefix = sys.base_prefix if PY3 else sys.prefix + if obj.__name__ not in ("builtins", "dill") \ + and not getattr(obj, "__file__", "").startswith(prefix): log.info("M1: %s" % obj) _main_dict = obj.__dict__.copy() #XXX: better no copy? option to copy? 
- [_main_dict.pop(item,None) for item in singletontypes] - pickler.save_reduce(__import__, (obj.__name__,), obj=obj, + [_main_dict.pop(item, None) for item in singletontypes + + ["__builtins__", "__loader__"]] + pickler.save_reduce(_import_module, (obj.__name__,), obj=obj, state=_main_dict) else: log.info("M2: %s" % obj) diff --git a/tests/test_module.py b/tests/test_module.py new file mode 100644 index 00000000..bd5cbdcd --- /dev/null +++ b/tests/test_module.py @@ -0,0 +1,24 @@ +#!/usr/bin/env python +# +# Author: Mike McKerns (mmckerns @caltech and @uqfoundation) +# Copyright (c) 2008-2014 California Institute of Technology. +# License: 3-clause BSD. The full license text is available at: +# - http://trac.mystic.cacr.caltech.edu/project/pathos/browser/dill/LICENSE + +import sys +import dill +import test_mixins as module + +module.a = 1234 + +pik_mod = dill.dumps(module) + +module.a = 0 + +# remove module +del sys.modules[module.__name__] +del module + +module = dill.loads(pik_mod) +assert module.a == 1234 +assert module.double_add(1, 2, 3) == 2 * module.fx From 5e57dce84ffe7be7e699af1e2be953d5a65d8435 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 27 May 2014 15:05:50 +0100 Subject: [PATCH 15/77] Add code to clean up --- tests/test_module.py | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/tests/test_module.py b/tests/test_module.py index bd5cbdcd..e3f2f621 100644 --- a/tests/test_module.py +++ b/tests/test_module.py @@ -9,6 +9,9 @@ import dill import test_mixins as module +cached = (module.__cached__ if hasattr(module, "__cached__") + else module.__file__ + "c") + module.a = 1234 pik_mod = dill.dumps(module) @@ -20,5 +23,11 @@ del module module = dill.loads(pik_mod) -assert module.a == 1234 +assert hasattr(module, "a") and module.a == 1234 assert module.double_add(1, 2, 3) == 2 * module.fx + +# clean up +import os +os.remove(cached) +if os.path.exists("__pycache__") and not os.listdir("__pycache__"): + 
os.removedirs("__pycache__") From 5b459b0ea230c879819056c0e8923cf0ba914353 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 27 May 2014 15:28:44 +0100 Subject: [PATCH 16/77] Fixes error with dealing with modules without a file --- dill/dill.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dill/dill.py b/dill/dill.py index 600691f0..6426c3e6 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -760,7 +760,7 @@ def save_module(pickler, obj): # so should be pickled as a reference prefix = sys.base_prefix if PY3 else sys.prefix if obj.__name__ not in ("builtins", "dill") \ - and not getattr(obj, "__file__", "").startswith(prefix): + and not getattr(obj, "__file__", prefix).startswith(prefix): log.info("M1: %s" % obj) _main_dict = obj.__dict__.copy() #XXX: better no copy? option to copy? [_main_dict.pop(item, None) for item in singletontypes From 7bcf040a7aed66aab73d0c6a60e6af9715e62822 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 27 May 2014 17:45:05 +0100 Subject: [PATCH 17/77] Check for __main__ module when saving --- dill/dill.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/dill/dill.py b/dill/dill.py index 6426c3e6..25ddb23c 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -759,8 +759,9 @@ def save_module(pickler, obj): # if a module file name starts with this, it should be a standard module, # so should be pickled as a reference prefix = sys.base_prefix if PY3 else sys.prefix + std_mod = getattr(obj, "__file__", prefix).startswith(prefix) if obj.__name__ not in ("builtins", "dill") \ - and not getattr(obj, "__file__", prefix).startswith(prefix): + and not std_mod or is_dill(pickler) and obj is pickler._main_module: log.info("M1: %s" % obj) _main_dict = obj.__dict__.copy() #XXX: better no copy? option to copy? 
[_main_dict.pop(item, None) for item in singletontypes From 7f11913ca8e2339b5fa2fbc24703ddea922f6140 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Wed, 28 May 2014 12:31:05 +0100 Subject: [PATCH 18/77] Add code to clean up when run by python 3 Also clean whitespace --- tests/test_nested.py | 157 +++++++++++++++++++++++-------------------- 1 file changed, 85 insertions(+), 72 deletions(-) diff --git a/tests/test_nested.py b/tests/test_nested.py index 27789a69..eb32e909 100644 --- a/tests/test_nested.py +++ b/tests/test_nested.py @@ -9,97 +9,110 @@ """ import dill as pickle +import math #import pickle # the nested function: pickle should fail here, but dill is ok. def adder(augend): - zero = [0] - def inner(addend): - return addend+augend+zero[0] - return inner + zero = [0] + + def inner(addend): + return addend + augend + zero[0] + return inner # rewrite the nested function using a class: standard pickle should work here. class cadder(object): - def __init__(self,augend): - self.augend = augend - self.zero = [0] - def __call__(self,addend): - return addend+self.augend+self.zero[0] + def __init__(self, augend): + self.augend = augend + self.zero = [0] + + def __call__(self, addend): + return addend + self.augend + self.zero[0] # rewrite again, but as an old-style class class c2adder: - def __init__(self,augend): - self.augend = augend - self.zero = [0] - def __call__(self,addend): - return addend+self.augend+self.zero[0] + def __init__(self, augend): + self.augend = augend + self.zero = [0] + + def __call__(self, addend): + return addend + self.augend + self.zero[0] # some basic stuff -a = [0,1,2] -import math +a = [0, 1, 2] # some basic class stuff class basic(object): - pass + pass + class basic2: - pass + pass if __name__ == '__main__': - x = 5; y = 1 - - # pickled basic stuff - pa = pickle.dumps(a) - pmath = pickle.dumps(math) #XXX: FAILS in pickle - pmap = pickle.dumps(map) - # ... 
- la = pickle.loads(pa) - lmath = pickle.loads(pmath) - lmap = pickle.loads(pmap) - assert list(map(math.sin,a)) == list(lmap(lmath.sin,la)) - - # pickled basic class stuff - pbasic2 = pickle.dumps(basic2) - _pbasic2 = pickle.loads(pbasic2)() - pbasic = pickle.dumps(basic) - _pbasic = pickle.loads(pbasic)() - - # pickled c2adder - pc2adder = pickle.dumps(c2adder) - pc2add5 = pickle.loads(pc2adder)(x) - assert pc2add5(y) == x+y - - # pickled cadder - pcadder = pickle.dumps(cadder) - pcadd5 = pickle.loads(pcadder)(x) - assert pcadd5(y) == x+y - - # raw adder and inner - add5 = adder(x) - assert add5(y) == x+y - - # pickled adder - padder = pickle.dumps(adder) - padd5 = pickle.loads(padder)(x) - assert padd5(y) == x+y - - # pickled inner - pinner = pickle.dumps(add5) #XXX: FAILS in pickle - p5add = pickle.loads(pinner) - assert p5add(y) == x+y - - # testing moduledict where not __main__ - try: - import test_moduledict - error = None - except: - import sys - error = sys.exc_info()[1] - assert error is None - # clean up - import os - name = 'test_moduledict.py' - if os.path.exists(name) and os.path.exists(name+'c'): os.remove(name+'c') + x = 5 + y = 1 + + # pickled basic stuff + pa = pickle.dumps(a) + pmath = pickle.dumps(math) #XXX: FAILS in pickle + pmap = pickle.dumps(map) + # ... 
+ la = pickle.loads(pa) + lmath = pickle.loads(pmath) + lmap = pickle.loads(pmap) + assert list(map(math.sin, a)) == list(lmap(lmath.sin, la)) + + # pickled basic class stuff + pbasic2 = pickle.dumps(basic2) + _pbasic2 = pickle.loads(pbasic2)() + pbasic = pickle.dumps(basic) + _pbasic = pickle.loads(pbasic)() + + # pickled c2adder + pc2adder = pickle.dumps(c2adder) + pc2add5 = pickle.loads(pc2adder)(x) + assert pc2add5(y) == x+y + + # pickled cadder + pcadder = pickle.dumps(cadder) + pcadd5 = pickle.loads(pcadder)(x) + assert pcadd5(y) == x+y + + # raw adder and inner + add5 = adder(x) + assert add5(y) == x+y + + # pickled adder + padder = pickle.dumps(adder) + padd5 = pickle.loads(padder)(x) + assert padd5(y) == x+y + + # pickled inner + pinner = pickle.dumps(add5) #XXX: FAILS in pickle + p5add = pickle.loads(pinner) + assert p5add(y) == x+y + + # testing moduledict where not __main__ + try: + import test_moduledict + error = None + except: + import sys + error = sys.exc_info()[1] + assert error is None + # clean up + import os + name = 'test_moduledict.py' + if os.path.exists(name) and os.path.exists(name+'c'): + os.remove(name+'c') + + if os.path.exists(name) and hasattr(test_moduledict, "__cached__") \ + and os.path.exists(test_moduledict.__cached__): + os.remove(getattr(test_moduledict, "__cached__")) + + if os.path.exists("__pycache__") and not os.listdir("__pycache__"): + os.removedirs("__pycache__") # EOF From 8ea9759164d2d860cb6292fa41571924b657c74a Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Wed, 28 May 2014 16:49:53 +0100 Subject: [PATCH 19/77] Fix pickling of std(in, out, err) streams --- dill/dill.py | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 25ddb23c..84777570 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -340,7 +340,7 @@ def _create_filehandle(name, mode, position, closed, open=open): # buffering=0 raise UnpicklingError(err) #XXX: python default is closed '' file/mode if 
closed: f.close() - else: f.seek(position) + elif position >= 0: f.seek(position) return f def _create_stringi(value, position, closed): @@ -551,12 +551,15 @@ def save_file(pickler, obj): if obj.closed: position = None else: - position = obj.tell() + if obj in (sys.__stdout__, sys.__stderr__, sys.__stdin__): + position = -1 + else: + position = obj.tell() pickler.save_reduce(_create_filehandle, (obj.name, obj.mode, position, \ obj.closed), obj=obj) return -if PyTextWrapperType: +if PyTextWrapperType: #XXX: are stdout, stderr or stdin ever _pyio files? @register(PyBufferedRandomType) @register(PyBufferedReaderType) @register(PyBufferedWriterType) From 848e49d9a46a8c721cd15d075225bd2a10258ed4 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Wed, 4 Jun 2014 10:01:55 +0100 Subject: [PATCH 20/77] Fix pickling of partials when there are no kwargs --- dill/dill.py | 4 ++++ tests/test_functors.py | 28 ++++++++++++++++++++++++++++ 2 files changed, 32 insertions(+) create mode 100644 tests/test_functors.py diff --git a/dill/dill.py b/dill/dill.py index 84777570..806f1a6a 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -310,6 +310,10 @@ def _create_function(fcode, fglobals, fname=None, fdefaults=None, \ return func def _create_ftype(ftypeobj, func, args, kwds): + if kwds is None: + kwds = {} + if args is None: + args = () return ftypeobj(func, *args, **kwds) def _create_lock(locked, *args): diff --git a/tests/test_functors.py b/tests/test_functors.py new file mode 100644 index 00000000..ac051fa7 --- /dev/null +++ b/tests/test_functors.py @@ -0,0 +1,28 @@ +#!/usr/bin/env python +# +# Author: Mike McKerns (mmckerns @caltech and @uqfoundation) +# Copyright (c) 2008-2014 California Institute of Technology. +# License: 3-clause BSD. 
The full license text is available at: +# - http://trac.mystic.cacr.caltech.edu/project/pathos/browser/dill/LICENSE + +import functools +import dill + +def f(a, b, c): # without keywords + pass + +def g(a, b, c=2): # with keywords + pass + +def h(a=1, b=2, c=3): # without args + pass + +fp = functools.partial(f, 1, 2) +gp = functools.partial(g, 1, c=2) +hp = functools.partial(h, 1, c=2) +bp = functools.partial(int, base=2) + +assert dill.pickles(fp, safe=True) +assert dill.pickles(gp, safe=True) +assert dill.pickles(hp, safe=True) +assert dill.pickles(bp, safe=True) From d97de9e3b5e53ba1a0d2725b5f5dd7ed2cc6e5c6 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Wed, 11 Jun 2014 16:49:06 +0100 Subject: [PATCH 21/77] Add dill.memorise to allow pickling of changed attrs only --- dill/dill.py | 22 ++--- dill/memorise.py | 239 +++++++++++++++++++++++++++++++++++++++++++++ tests/test_memo.py | 61 ++++++++++++ 3 files changed, 310 insertions(+), 12 deletions(-) create mode 100644 dill/memorise.py create mode 100644 tests/test_memo.py diff --git a/dill/dill.py b/dill/dill.py index 6dc46069..2fe26cb4 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -30,6 +30,10 @@ def _trace(boolean): import os import sys +try: + from . import memorise +except ImportError: + import memorise PY3 = (hex(sys.hexversion) >= '0x30000f0') if PY3: #XXX: get types from dill.objtypes ? 
import builtins as __builtin__ @@ -793,21 +797,15 @@ def save_weakproxy(pickler, obj): @register(ModuleType) def save_module(pickler, obj): - # if a module file name starts with this, it should be a standard module, - # so should be pickled as a reference - prefix = sys.base_prefix if PY3 else sys.prefix - std_mod = getattr(obj, "__file__", prefix).startswith(prefix) - if obj.__name__ not in ("builtins", "dill") \ - and not std_mod or is_dill(pickler) and obj is pickler._main_module: + try: + _main_dict = memorise.whats_changed(obj)[0] + except RuntimeError: # not memorised module, probably part of dill + log.info("M2: %s" % obj) + pickler.save_reduce(_import_module, (obj.__name__,), obj=obj) + else: log.info("M1: %s" % obj) - _main_dict = obj.__dict__.copy() #XXX: better no copy? option to copy? - [_main_dict.pop(item, None) for item in singletontypes - + ["__builtins__", "__loader__"]] pickler.save_reduce(_import_module, (obj.__name__,), obj=obj, state=_main_dict) - else: - log.info("M2: %s" % obj) - pickler.save_reduce(_import_module, (obj.__name__,), obj=obj) return @register(TypeType) diff --git a/dill/memorise.py b/dill/memorise.py new file mode 100644 index 00000000..9ce57274 --- /dev/null +++ b/dill/memorise.py @@ -0,0 +1,239 @@ +#!/usr/bin/env python +# +# Author: Mike McKerns (mmckerns @caltech and @uqfoundation) +# Copyright (c) 2008-2014 California Institute of Technology. +# License: 3-clause BSD. 
The full license text is available at: +# - http://trac.mystic.cacr.caltech.edu/project/pathos/browser/dill/LICENSE + +""" +Module to show if an object has changed since it was memorised +""" + +import io +import os +import sys +import gc + +try: + from collections.abc import MutableSequence + import builtins +except ImportError: + from collections import MutableSequence + import __builtin__ as builtins +import types +# memo of objects indexed by id to a tuple (attributes, sequence items) +# attributes is a dict indexed by attribute name to attribute id +# sequence items is either a list of ids, or a dictionary of keys to ids +memo = {} +id_to_obj = {} +# types that +builtins_types = {str, list, dict, set, frozenset, int} + + +def get_attrs(obj): + """ + Gets all the attributes of an object through its __dict__ or return None + """ + if type(obj) in builtins_types \ + or type(obj) is type and obj in builtins_types: + return None + return obj.__dict__ if hasattr(obj, "__dict__") else None + + +def get_seq(obj): + """ + Gets all the items in a sequence or return None + """ + if type(obj) in (str, frozenset): + return None + elif isinstance(obj, dict): + return obj + elif isinstance(obj, MutableSequence): + try: + if len(obj): + return list(iter(obj)) + else: + return [] + except: + return None + return None + + +def get_attrs_id(obj): + """ + Gets the ids of an object's attributes through its __dict__ or return None + """ + if type(obj) in builtins_types \ + or type(obj) is type and obj in builtins_types: + return None + return {key: id(value) for key, value in obj.__dict__.items()} \ + if hasattr(obj, "__dict__") else None + + +def get_seq_id(obj, done=None): + """ + Gets the ids of the items in a sequence or return None + """ + if done is not None: + g = done + else: + g = get_seq(obj) + if g is None: + return None + if isinstance(obj, dict): + return {id(key): id(value) for key, value in g.items()} + return [id(i) for i in g] + + +def memorise(obj, force=False, 
first=True): + """ + Adds an object to the memo, and recursively adds all the object's + attributes, and if it is a container, its items. Use force=True to update + an object already in the memo. Updating is not done recursively. + """ + if first: + # add actions here + pass + if id(obj) in memo and not force: + return + if obj is memo or obj is id_to_obj: + return + g = get_attrs(obj) + s = get_seq(obj) + memo[id(obj)] = get_attrs_id(obj), get_seq_id(obj, done=s) + id_to_obj[id(obj)] = obj + if g is not None: + for key, value in g.items(): + memorise(value, first=False) + if s is not None: + if isinstance(s, dict): + for key, item in s.items(): + memorise(key, first=False) + memorise(item, first=False) + else: + for item in s: + memorise(item, first=False) + + +def release_gone(): + rm = [id_ for id_, obj in id_to_obj.items() if sys.getrefcount(obj) < 4] + for id_ in rm: + del id_to_obj[id_] + del memo[id_] + + +def cmp_seq(obj, seen): + """ + Compares the contents of a container against the version stored in the + memo. Return True if they compare equal, False otherwise. 
+ """ + obj_seq = memo[id(obj)][1] + items = get_seq(obj) + if items is not None: + if len(items) != len(obj_seq): + return False + if isinstance(obj, dict): + for key, item in items.items(): + key_id = id(key) + item_id = id(item) + if key_id not in obj_seq: + return False + if item_id != obj_seq[key_id] \ + or has_changed(key, seen, first=False) \ + or has_changed(item, seen, first=False): + return False + else: + for i, j in zip(items, obj_seq): + if id(i) != j or has_changed(i, seen, first=False): + return False + return True + + +def cmp_attrs(obj, seen, fast=False): + if not fast: + changed_things = {} + attrs = get_attrs(obj) + if attrs is not None: + for key in memo[id(obj)][0]: + if key not in attrs: + if fast: + return False + changed_things[key] = None + for key, o in attrs.items(): + if key not in memo[id(obj)][0] \ + or id(o) != memo[id(obj)][0][key] \ + or has_changed(o, seen, first=False): + if fast: + return False + changed_things[key] = o + return True if fast else changed_things + + +def first_time_only(seen, obj): + # ignore the _ variable, which only appears in interactive sessions + if hasattr(builtins, "_"): + memo[id(builtins)][0]["_"] = id(builtins._) + memorise(builtins._, force=True) + memo[id(builtins.__dict__)][1][id("_")] = id(builtins._) + + +def common_code(seen, obj): + if obj is memo or obj is sys.modules or obj is sys.path_importer_cache \ + or obj is os.environ or obj is id_to_obj: + return False + if id(obj) in seen: + return False + seen.add(id(obj)) + + +def has_changed(obj, seen=None, first=True): + """ + Check an object against the memo. Returns True if the object has changed + since memorisation, False otherwise. 
+ """ + seen = set() if seen is None else seen + if first: + first_time_only(seen, obj) + r = common_code(seen, obj) + if r is not None: + return r + if id(obj) not in memo: + return True + return not cmp_attrs(obj, seen, fast=True) or not cmp_seq(obj, seen) + + +def whats_changed(obj, seen=None, first=True): + """ + Check an object against the memo. Returns a tuple in the form + (attribute changes, container changed). Attribute changes is a dict of + attribute name to attribute value. container changed is a boolean. + """ + seen = set() if seen is None else seen + if first: + first_time_only(seen, obj) + r = common_code(seen, obj) + if r is not None: + return ({}, False) + if id(obj) not in memo: + raise RuntimeError("Object not memorised " + str(obj)) + return cmp_attrs(obj, seen), not cmp_seq(obj, seen) + +__import__ = __import__ + + +def _imp(*args, **kwargs): + """ + Replaces the default __import__, to allow a module to be memorised + before the user can change it + """ + mod = __import__(*args, **kwargs) + memorise(mod) + return mod + +builtins.__import__ = _imp + +# memorise all already imported modules. This implies that this must be +# imported first for any changes to be recorded +for mod in sys.modules.values(): + memorise(mod, first=False) +release_gone() diff --git a/tests/test_memo.py b/tests/test_memo.py new file mode 100644 index 00000000..4ce612f5 --- /dev/null +++ b/tests/test_memo.py @@ -0,0 +1,61 @@ +#!/usr/bin/env python +# +# Author: Mike McKerns (mmckerns @caltech and @uqfoundation) +# Copyright (c) 2008-2014 California Institute of Technology. +# License: 3-clause BSD. 
The full license text is available at: +# - http://trac.mystic.cacr.caltech.edu/project/pathos/browser/dill/LICENSE + +from dill import memorise as m + + +class A: + pass + +a = A() +b = A() +c = A() +a.a = b +b.a = c +m.memorise(a) +assert not m.has_changed(a) +c.a = 1 +assert m.has_changed(a) +m.memorise(c, force=True) +assert not m.has_changed(a) +c.a = 2 +assert m.has_changed(a) +changed = m.whats_changed(a) +assert list(changed[0].keys()) == ["a"] +assert not changed[1] + +a2 = [] +b2 = [a2] +c2 = [b2] +m.memorise(c2) +assert not m.has_changed(c2) +a2.append(1) +assert m.has_changed(c2) +changed = m.whats_changed(c2) +assert changed[0] == {} +assert changed[1] + +a3 = {} +b3 = {1: a3} +c3 = {1: b3} +m.memorise(c3) +assert not m.has_changed(c3) +a3[1] = 1 +assert m.has_changed(c3) +changed = m.whats_changed(c3) +assert changed[0] == {} +assert changed[1] + +import abc +# make sure that the "_abc_invalidation_counter" does not cause the test to fail +m.memorise(abc.ABCMeta, force=True) +assert not m.has_changed(abc) +abc.ABCMeta.zzz = 1 +assert m.has_changed(abc) +changed = m.whats_changed(abc) +assert list(changed[0].keys()) == ["ABCMeta"] +assert not changed[1] From 72dacaf715b7b6375d96b33f5e3503b129fbac6a Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Thu, 12 Jun 2014 11:08:47 +0100 Subject: [PATCH 22/77] Optimisations for dill.memorise --- dill/dill.py | 6 +- dill/memorise.py | 200 ++++++++++++++++++++++------------------------- 2 files changed, 99 insertions(+), 107 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 2fe26cb4..ca753e3d 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -240,6 +240,7 @@ class Pickler(StockPickler): def __init__(self, *args, **kwargs): StockPickler.__init__(self, *args, **kwargs) self._main_module = _main_module + self._dill_memorise_cashe = {} class Unpickler(StockUnpickler): """python's Unpickler extended to interpreter sessions and more types""" @@ -798,14 +799,15 @@ def save_weakproxy(pickler, obj): 
@register(ModuleType) def save_module(pickler, obj): try: - _main_dict = memorise.whats_changed(obj)[0] + changed = memorise.whats_changed(obj, + seen=pickler._dill_memorise_cashe)[0] except RuntimeError: # not memorised module, probably part of dill log.info("M2: %s" % obj) pickler.save_reduce(_import_module, (obj.__name__,), obj=obj) else: log.info("M1: %s" % obj) pickler.save_reduce(_import_module, (obj.__name__,), obj=obj, - state=_main_dict) + state=changed) return @register(TypeType) diff --git a/dill/memorise.py b/dill/memorise.py index 9ce57274..deacadce 100644 --- a/dill/memorise.py +++ b/dill/memorise.py @@ -9,24 +9,20 @@ Module to show if an object has changed since it was memorised """ -import io import os import sys -import gc - +import numpy try: - from collections.abc import MutableSequence import builtins except ImportError: - from collections import MutableSequence import __builtin__ as builtins -import types + # memo of objects indexed by id to a tuple (attributes, sequence items) # attributes is a dict indexed by attribute name to attribute id # sequence items is either a list of ids, of a dictionary of keys to ids memo = {} id_to_obj = {} -# types that +# types that cannot have changing attributes builtins_types = {str, list, dict, set, frozenset, int} @@ -40,22 +36,31 @@ def get_attrs(obj): return obj.__dict__ if hasattr(obj, "__dict__") else None -def get_seq(obj): +def get_seq(obj, cashe={str: False, frozenset: False, list: True, set: True, + dict: True, tuple: True}): """ Gets all the items in a sequence or return None """ - if type(obj) in (str, frozenset): + o_type = type(obj) + if o_type in (numpy.ndarray, numpy.ma.core.MaskedConstant): + if obj.shape and obj.size: + return obj + else: + return [] + if o_type in cashe: + if cashe[o_type]: + if hasattr(obj, "copy"): + return obj.copy() + return obj return None - elif isinstance(obj, dict): + elif hasattr(obj, "__contains__") and hasattr(obj, "__iter__") \ + and hasattr(obj, "__len__") and 
hasattr(o_type, "__contains__") \ + and hasattr(o_type, "__iter__") and hasattr(o_type, "__len__"): + cashe[o_type] = True + if hasattr(obj, "copy"): + return obj.copy() return obj - elif isinstance(obj, MutableSequence): - try: - if len(obj): - return list(iter(obj)) - else: - return [] - except: - return None + cashe[o_type] = None return None @@ -80,7 +85,7 @@ def get_seq_id(obj, done=None): g = get_seq(obj) if g is None: return None - if isinstance(obj, dict): + if hasattr(g, "items"): return {id(key): id(value) for key, value in g.items()} return [id(i) for i in g] @@ -106,7 +111,7 @@ def memorise(obj, force=False, first=True): for key, value in g.items(): memorise(value, first=False) if s is not None: - if isinstance(s, dict): + if hasattr(s, "items"): for key, item in s.items(): memorise(key, first=False) memorise(item, first=False) @@ -122,101 +127,84 @@ def release_gone(): del memo[id_] -def cmp_seq(obj, seen): +def whats_changed(obj, seen=None, first=True, simple=False): """ - Compares the contents of a container against the version stored in the - memo. Return True if they compare equal, False otherwise. + Check an object against the memo. Returns a tuple in the form + (attribute changes, container changed). Attribute changes is a dict of + attribute name to attribute value. container changed is a boolean. 
""" - obj_seq = memo[id(obj)][1] + seen = {} if seen is None else seen + if first: + # ignore the _ variable, which only appears in interactive sessions + if "_" in builtins.__dict__: + del builtins._ + + obj_id = id(obj) + if obj_id not in memo: + if simple: + return True + else: + raise RuntimeError("Object not memorised " + str(obj)) + + if obj_id in seen: + if simple: + return any(seen[obj_id]) + return seen[obj_id] + + if any(obj is i for i in (memo, sys.modules, sys.path_importer_cache, + os.environ, id_to_obj)): + seen[obj_id] = ({}, False) + if simple: + return False + return seen[obj_id] + + seen[obj_id] = ({}, False) + + chngd = whats_changed + id_ = id + + # compare attributes + attrs = get_attrs(obj) + if attrs is not None: + obj_attrs = memo[id(obj)][0] + obj_get = obj_attrs.get + changed = {key: None for key in obj_attrs if key not in attrs} + changed.update({key: o for key, o in attrs.items() + if id(o) != obj_get(key) + or chngd(o, seen, first=False, simple=True)}) + else: + changed = {} + + # compare sequence items = get_seq(obj) if items is not None: + seq_diff = False + obj_seq = memo[id(obj)][1] if len(items) != len(obj_seq): - return False - if isinstance(obj, dict): + seq_diff = True + elif hasattr(obj, "items"): + obj_get = obj_seq.get for key, item in items.items(): - key_id = id(key) - item_id = id(item) - if key_id not in obj_seq: - return False - if item_id != obj_seq[key_id] \ - or has_changed(key, seen, first=False) \ - or has_changed(item, seen, first=False): - return False + if id_(item) != obj_get(id_(key)) \ + or chngd(key, seen, first=False, simple=True) \ + or chngd(item, seen, first=False, simple=True): + seq_diff = True + break else: for i, j in zip(items, obj_seq): - if id(i) != j or has_changed(i, seen, first=False): - return False - return True - - -def cmp_attrs(obj, seen, fast=False): - if not fast: - changed_things = {} - attrs = get_attrs(obj) - if attrs is not None: - for key in memo[id(obj)][0]: - if key not in attrs: - 
if fast: - return False - changed_things[key] = None - for key, o in attrs.items(): - if key not in memo[id(obj)][0] \ - or id(o) != memo[id(obj)][0][key] \ - or has_changed(o, seen, first=False): - if fast: - return False - changed_things[key] = o - return True if fast else changed_things - - -def first_time_only(seen, obj): - # ignore the _ variable, which only appears in interactive sessions - if hasattr(builtins, "_"): - memo[id(builtins)][0]["_"] = id(builtins._) - memorise(builtins._, force=True) - memo[id(builtins.__dict__)][1][id("_")] = id(builtins._) - - -def common_code(seen, obj): - if obj is memo or obj is sys.modules or obj is sys.path_importer_cache \ - or obj is os.environ or obj is id_to_obj: - return False - if id(obj) in seen: - return False - seen.add(id(obj)) - - -def has_changed(obj, seen=None, first=True): - """ - Check an object against the memo. Returns True if the object has changed - since memorisation, False otherwise. - """ - seen = set() if seen is None else seen - if first: - first_time_only(seen, obj) - r = common_code(seen, obj) - if r is not None: - return r - if id(obj) not in memo: - return True - return not cmp_attrs(obj, seen, fast=True) or not cmp_seq(obj, seen) + if id_(i) != j or chngd(i, seen, first=False, simple=True): + seq_diff = True + break + else: + seq_diff = False + seen[obj_id] = changed, seq_diff + if simple: + return changed or seq_diff + return changed, seq_diff -def whats_changed(obj, seen=None, first=True): - """ - Check an object against the memo. Returns a tuple in the form - (attribute changes, container changed). Attribute changes is a dict of - attribute name to attribute value. container changed is a boolean. 
- """ - seen = set() if seen is None else seen - if first: - first_time_only(seen, obj) - r = common_code(seen, obj) - if r is not None: - return ({}, False) - if id(obj) not in memo: - raise RuntimeError("Object not memorised " + str(obj)) - return cmp_attrs(obj, seen), not cmp_seq(obj, seen) +def has_changed(*args, **kwargs): + return whats_changed(*args, simple=True, **kwargs) __import__ = __import__ @@ -231,6 +219,8 @@ def _imp(*args, **kwargs): return mod builtins.__import__ = _imp +if hasattr(builtins, "_"): + del builtins._ # memorise all already imported modules. This implies that this must be # imported first for any changes to be recorded From 5ea2d6d8e84750b503358f67a3b3dce9dc3470a0 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Thu, 12 Jun 2014 16:03:40 +0100 Subject: [PATCH 23/77] Fix test some test failures caused by dill.memorise --- dill/dill.py | 26 +++++++++++++++----------- dill/memorise.py | 12 +++++++++--- 2 files changed, 24 insertions(+), 14 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index ca753e3d..0db6cf8d 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -240,7 +240,7 @@ class Pickler(StockPickler): def __init__(self, *args, **kwargs): StockPickler.__init__(self, *args, **kwargs) self._main_module = _main_module - self._dill_memorise_cashe = {} + self._memorise_cashe = {} class Unpickler(StockUnpickler): """python's Unpickler extended to interpreter sessions and more types""" @@ -798,16 +798,20 @@ def save_weakproxy(pickler, obj): @register(ModuleType) def save_module(pickler, obj): - try: - changed = memorise.whats_changed(obj, - seen=pickler._dill_memorise_cashe)[0] - except RuntimeError: # not memorised module, probably part of dill - log.info("M2: %s" % obj) - pickler.save_reduce(_import_module, (obj.__name__,), obj=obj) - else: - log.info("M1: %s" % obj) - pickler.save_reduce(_import_module, (obj.__name__,), obj=obj, - state=changed) + if obj.__name__ != "dill": + try: + changed = memorise.whats_changed(obj, + 
seen=pickler._memorise_cashe)[0] + except RuntimeError: # not memorised module, probably part of dill + pass + else: + log.info("M1: %s" % obj) + pickler.save_reduce(_import_module, (obj.__name__,), obj=obj, + state=changed) + return + + log.info("M2: %s" % obj) + pickler.save_reduce(_import_module, (obj.__name__,), obj=obj) return @register(TypeType) diff --git a/dill/memorise.py b/dill/memorise.py index deacadce..ac3fe62d 100644 --- a/dill/memorise.py +++ b/dill/memorise.py @@ -33,7 +33,10 @@ def get_attrs(obj): if type(obj) in builtins_types \ or type(obj) is type and obj in builtins_types: return None - return obj.__dict__ if hasattr(obj, "__dict__") else None + try: + return obj.__dict__ if hasattr(obj, "__dict__") else None + except: + return None def get_seq(obj, cashe={str: False, frozenset: False, list: True, set: True, @@ -71,8 +74,11 @@ def get_attrs_id(obj): if type(obj) in builtins_types \ or type(obj) is type and obj in builtins_types: return None - return {key: id(value) for key, value in obj.__dict__.items()} \ - if hasattr(obj, "__dict__") else None + try: + return {key: id(value) for key, value in obj.__dict__.items()} \ + if hasattr(obj, "__dict__") else None + except: + return None def get_seq_id(obj, done=None): From 9d4e4ecebe0dd43903128a2900ffa6dbadb8dfa5 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 15 Jul 2014 15:21:57 +0100 Subject: [PATCH 24/77] Add function to revert changes to the pickle dispatch table --- dill/__init__.py | 4 ++-- dill/dill.py | 5 +++++ 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/dill/__init__.py b/dill/__init__.py index 861f0817..52ae77b0 100644 --- a/dill/__init__.py +++ b/dill/__init__.py @@ -24,8 +24,8 @@ """ + __license__ from .dill import dump, dumps, load, loads, dump_session, load_session, \ - Pickler, Unpickler, register, copy, pickle, pickles, HIGHEST_PROTOCOL, \ - DEFAULT_PROTOCOL, PicklingError, UnpicklingError + Pickler, Unpickler, register, revert_extension, copy, pickle, 
pickles, \ + HIGHEST_PROTOCOL, DEFAULT_PROTOCOL, PicklingError, UnpicklingError from . import source, temp, detect # make sure "trace" is turned off diff --git a/dill/dill.py b/dill/dill.py index 6dc46069..d2ae941e 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -258,6 +258,8 @@ def dispatch_table(): return Pickler.dispatch ''' +pickle_dispatch_copy = StockPickler.dispatch.copy() + def pickle(t, func): """expose dispatch table for user-created extensions""" Pickler.dispatch[t] = func @@ -269,6 +271,9 @@ def proxy(func): return func return proxy +def revert_extension(): + StockPickler.dispatch = pickle_dispatch_copy + def _create_typemap(): import types if PY3: From 23dccbe8a48a1dbf0cf5947d09f14ffd370ca5ec Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 15 Jul 2014 16:37:34 +0100 Subject: [PATCH 25/77] Allow user-defined function to remain in the dispatch table --- dill/dill.py | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index d2ae941e..61547282 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -258,8 +258,6 @@ def dispatch_table(): return Pickler.dispatch ''' -pickle_dispatch_copy = StockPickler.dispatch.copy() - def pickle(t, func): """expose dispatch table for user-created extensions""" Pickler.dispatch[t] = func @@ -272,7 +270,9 @@ def proxy(func): return proxy def revert_extension(): - StockPickler.dispatch = pickle_dispatch_copy + for type, func in list(StockPickler.dispatch.items()): + if func.__module__ == __name__: + del StockPickler.dispatch[type] def _create_typemap(): import types From 61f8078eae40de61ac5347014a8cd66928c1121e Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 15 Jul 2014 17:55:13 +0100 Subject: [PATCH 26/77] Replace dill function with default funcs for revert_extension --- dill/dill.py | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/dill/dill.py b/dill/dill.py index 61547282..3aada427 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -258,6 +258,8 @@ def 
dispatch_table(): return Pickler.dispatch ''' +pickle_dispatch_copy = StockPickler.dispatch.copy() + def pickle(t, func): """expose dispatch table for user-created extensions""" Pickler.dispatch[t] = func @@ -273,6 +275,8 @@ def revert_extension(): for type, func in list(StockPickler.dispatch.items()): if func.__module__ == __name__: del StockPickler.dispatch[type] + if type in pickle_dispatch_copy: + StockPickler.dispatch[type] = pickle_dispatch_copy[type] def _create_typemap(): import types From 066a158240c3d2bc3c5fb7cf3ab84b9a215e5d15 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 15 Jul 2014 18:21:37 +0100 Subject: [PATCH 27/77] Rename revert_extension and expose extend --- dill/__init__.py | 11 ++++------- dill/dill.py | 2 +- 2 files changed, 5 insertions(+), 8 deletions(-) diff --git a/dill/__init__.py b/dill/__init__.py index 52ae77b0..bd3289f0 100644 --- a/dill/__init__.py +++ b/dill/__init__.py @@ -24,8 +24,9 @@ """ + __license__ from .dill import dump, dumps, load, loads, dump_session, load_session, \ - Pickler, Unpickler, register, revert_extension, copy, pickle, pickles, \ - HIGHEST_PROTOCOL, DEFAULT_PROTOCOL, PicklingError, UnpicklingError + Pickler, Unpickler, register, copy, pickle, pickles, HIGHEST_PROTOCOL, \ + DEFAULT_PROTOCOL, PicklingError, UnpicklingError, \ + _revert_extension as revert_extension, _extend as extend from . 
import source, temp, detect # make sure "trace" is turned off @@ -73,11 +74,7 @@ def load_types(pickleable=True, unpickleable=True): # add corresponding types from objects to types reload(types) -def __extend(): - from .dill import _extend - _extend() - return -__extend(); del __extend +extend() def license(): """print license""" diff --git a/dill/dill.py b/dill/dill.py index 3aada427..c158702b 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -271,7 +271,7 @@ def proxy(func): return func return proxy -def revert_extension(): +def _revert_extension(): for type, func in list(StockPickler.dispatch.items()): if func.__module__ == __name__: del StockPickler.dispatch[type] From 72882e949970d9ed81c7dac360998eecfce23feb Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Wed, 23 Jul 2014 18:42:27 +0100 Subject: [PATCH 28/77] Implement changes in #57: improve behaviour of dumped files --- dill/dill.py | 67 +++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 48 insertions(+), 19 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index c158702b..28424d79 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -138,13 +138,14 @@ def copy(obj, *args, **kwds): """use pickling to 'copy' an object""" return loads(dumps(obj, *args, **kwds)) -def dump(obj, file, protocol=None, byref=False): +def dump(obj, file, protocol=None, byref=False, safe_file=False): """pickle an object to a file""" if protocol is None: protocol = DEFAULT_PROTOCOL pik = Pickler(file, protocol) pik._main_module = _main_module _byref = pik._byref pik._byref = bool(byref) + pik._safe_file = safe_file # hack to catch subclassed numpy array instances if NumpyArrayType and ndarrayinstance(obj): @register(type(obj)) @@ -159,10 +160,10 @@ def save_numpy_array(pickler, obj): pik._byref = _byref return -def dumps(obj, protocol=None, byref=False): +def dumps(obj, protocol=None, byref=False, safe_file=False): """pickle an object to a string""" file = StringIO() - dump(obj, file, protocol, byref) + dump(obj, 
file, protocol, byref, safe_file) return file.getvalue() def load(file): @@ -231,6 +232,7 @@ class Pickler(StockPickler): _main_module = None _session = False _byref = False + _safe_file = False pass def __init__(self, *args, **kwargs): @@ -355,27 +357,54 @@ def _create_lock(locked, *args): raise UnpicklingError("Cannot acquire lock") return lock -def _create_filehandle(name, mode, position, closed, open=open): # buffering=0 +def _create_filehandle(name, mode, position, closed, open=open, safe=False): # buffering=0 # only pickles the handle, not the file contents... good? or StringIO(data)? # (for file contents see: http://effbot.org/librarybook/copy-reg.htm) # NOTE: handle special cases first (are there more special cases?) names = {'':sys.__stdin__, '':sys.__stdout__, '':sys.__stderr__} #XXX: better fileno=(0,1,2) ? - if name in list(names.keys()): f = names[name] #XXX: safer "f=sys.stdin" - elif name == '': import os; f = os.tmpfile() - elif name == '': import tempfile; f = tempfile.TemporaryFile(mode) + if name in list(names.keys()): + f = names[name] #XXX: safer "f=sys.stdin" + elif name == '': + import os + f = os.tmpfile() + elif name == '': + import tempfile + f = tempfile.TemporaryFile(mode) else: - try: # try to open the file by name # NOTE: has different fileno - f = open(name, mode)#FIXME: missing: *buffering*, encoding,softspace - except IOError: + import os + # Mode translation + # Mode | Unpickled mode + # --------|--------------- + # r | r + # r+ | r+ + # w | r+ + # w+ | r+ + # a | a + # a+ | a+ + + if os.path.exists(name): + mode = mode.replace("w+", "r+") + mode = mode.replace("w", "r+") + elif safe: + raise IOError("File '%s' does not exist" % name) + elif "w" not in mode: + name = os.devnull + if safe: + if position > os.path.getsize(name): + raise IOError("File '%s' is too short" % name) + # try to open the file by name + # NOTE: has different fileno + try: + f = open(name, mode) #FIXME: missing: *buffering*, encoding, softspace + except 
IOError: err = sys.exc_info()[1] - try: # failing, then use /dev/null #XXX: better to just fail here? - import os; f = open(os.devnull, mode) - except IOError: - raise UnpicklingError(err) - #XXX: python default is closed '' file/mode - if closed: f.close() - elif position >= 0: f.seek(position) + raise UnpicklingError(err) + #XXX: python default is closed '' file/mode + if closed: + f.close() + elif position >= 0: + f.seek(position) return f def _create_stringi(value, position, closed): @@ -599,7 +628,7 @@ def save_file(pickler, obj): else: position = obj.tell() pickler.save_reduce(_create_filehandle, (obj.name, obj.mode, position, \ - obj.closed), obj=obj) + obj.closed, open, pickler._safe_file), obj=obj) return if PyTextWrapperType: #XXX: are stdout, stderr or stdin ever _pyio files? @@ -614,7 +643,7 @@ def save_file(pickler, obj): else: position = obj.tell() pickler.save_reduce(_create_filehandle, (obj.name, obj.mode, position, \ - obj.closed, _open), obj=obj) + obj.closed, _open, pickler._safe_file), obj=obj) return # The following two functions are based on 'saveCStringIoInput' From 58b83b98c90d7663c2f9e9d92257d0ae360b70d0 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Fri, 1 Aug 2014 10:02:08 +0100 Subject: [PATCH 29/77] Add test for files --- tests/test_file.py | 134 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 134 insertions(+) create mode 100644 tests/test_file.py diff --git a/tests/test_file.py b/tests/test_file.py new file mode 100644 index 00000000..2da518e7 --- /dev/null +++ b/tests/test_file.py @@ -0,0 +1,134 @@ +#!/usr/bin/env python +# +# Author: Mike McKerns (mmckerns @caltech and @uqfoundation) +# Copyright (c) 2008-2014 California Institute of Technology. +# License: 3-clause BSD. 
The full license text is available at: +# - http://trac.mystic.cacr.caltech.edu/project/pathos/browser/dill/LICENSE + +import dill +import random +import os + +fname = "test_file.txt" +rand_chars = list(map(chr, range(32, 255))) + ["\n"] * 40 # bias newline + + +def write_randomness(number=200): + with open(fname, "w") as f: + for i in range(number): + f.write(random.choice(rand_chars)) + + +def throws(op, args, exc): + try: + op(*args) + except exc: + return True + else: + return False + + +def test(testsafefmode=False, kwargs={}): + # file exists, with same contents + # read + + write_randomness() + + f = open(fname, "r") + assert dill.loads(dill.dumps(f, **kwargs)).read() == f.read() + f.close() + + # write + + f = open(fname, "w") + f.write("hello") + f_dumped = dill.dumps(f, **kwargs) + f.close() + f2 = dill.loads(f_dumped) + f2.write(" world!") + f2.close() + + assert open(fname).read() == "hello world!" + + # file exists, with different contents (smaller size) + # read + + write_randomness() + + f = open(fname, "r") + f.read() + f_dumped = dill.dumps(f, **kwargs) + f.close() + write_randomness(number=150) + + if testsafefmode: + assert throws(dill.loads, (f_dumped,), IOError) + else: + f2 = dill.loads(f_dumped) + assert f2.read() == "" + f2.close() + + # write + + write_randomness() + + f = open(fname, "w") + f.write("hello") + f_dumped = dill.dumps(f, **kwargs) + f.close() + + f = open(fname, "w") + f.write("h") + f.close() + + if testsafefmode: + assert throws(dill.loads, (f_dumped,), IOError) + else: + f2 = dill.loads(f_dumped) + f2.write(" world!") + f2.close() + assert open(fname).read() == "h\x00\x00\x00\x00 world!" 
+ + # file does not exist + # read + + write_randomness() + + f = open(fname, "r") + f.read() + f_dumped = dill.dumps(f, **kwargs) + f.close() + + os.remove(fname) + + if testsafefmode: + assert throws(dill.loads, (f_dumped,), IOError) + else: + f2 = dill.loads(f_dumped) + assert f2.read() == "" + f2.close() + + # write + + write_randomness() + + f = open(fname, "w+") + f.write("hello") + f_dumped = dill.dumps(f, **kwargs) + f.close() + + os.remove(fname) + + if testsafefmode: + assert throws(dill.loads, (f_dumped,), IOError) + else: + f2 = dill.loads(f_dumped) + f2.write(" world!") + f2.close() + assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" + +test() +# TODO: switch this on when #57 is closed +# test(True, {"safe_file": True}) +if os.path.exists(fname): + os.remove(fname) From 7044dbe47309cf463e9f2f17e77adb4381209833 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Sat, 2 Aug 2014 09:46:54 +0100 Subject: [PATCH 30/77] Add tests for files with append mode --- tests/test_file.py | 146 ++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 139 insertions(+), 7 deletions(-) diff --git a/tests/test_file.py b/tests/test_file.py index 37a68877..07523fa0 100644 --- a/tests/test_file.py +++ b/tests/test_file.py @@ -8,9 +8,10 @@ import dill import random import os +import string fname = "_test_file.txt" -rand_chars = list(map(chr, range(32, 255))) + ["\n"] * 40 # bias newline +rand_chars = list(string.ascii_letters) + ["\n"] * 40 # bias newline def write_randomness(number=200): @@ -22,6 +23,10 @@ def write_randomness(number=200): return contents +def trunc_file(): + open(fname, "w").close() + + def throws(op, args, exc): try: op(*args) @@ -58,7 +63,7 @@ def test(safefmode=False, kwargs={}): f2tell = f2.tell() f2.write(" world!") f2.close() - + # 1) preserve mode and position #FIXME assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" 
assert f2mode == fmode @@ -80,6 +85,28 @@ def test(safefmode=False, kwargs={}): # assert f2mode == fmode # assert f2tell == ftell + # append + + trunc_file() + + f = open(fname, "a") + f.write("hello") + f_dumped = dill.dumps(f, **kwargs) + fmode = f.mode + ftell = f.tell() + f.close() + f2 = dill.loads(f_dumped) + f2mode = f2.mode + f2tell = f2.tell() + f2.write(" world!") + f2.close() + + # 1) preserve mode and position + assert open(fname).read() == "hello world!" + assert f2mode == fmode + assert f2tell == ftell + # XXX No other options? + # file exists, with different contents (smaller size) # read @@ -180,6 +207,48 @@ def test(safefmode=False, kwargs={}): # assert f2tell == _ftell f2.close() + # append + + write_randomness() + + f = open(fname, "a") + f.write("hello") + f_dumped = dill.dumps(f, **kwargs) + fmode = f.mode + ftell = f.tell() + f.close() + fstr = open(fname).read() + + f = open(fname, "w") + f.write("h") + _ftell = f.tell() + f.close() + + if safefmode: # throw error if ftell > EOF + assert throws(dill.loads, (f_dumped,), IOError) + else: + f2 = dill.loads(f_dumped) + f2mode = f2.mode + f2tell = f2.tell() + f2.write(" world!") + f2.close() + # 1) preserve mode and position #FIXME + # position of writes cannot be changed on some OSs + assert open(fname).read() == "h world!" + assert f2mode == fmode + # 2) treat as if new filehandle, will truncate file + # assert open(fname).read() == " world!" + # assert f2mode == fmode + # assert f2tell == 0 + # 5) pickle data along with filehandle #XXX: Yikes + # assert open(fname).read() == "hello world!" + # assert f2mode == fmode + # assert f2tell == ftell + # 4) use "r" to read data, then use "w" to write new file + # assert open(fname).read() == "h world!" + # assert f2mode == fmode + f2.close() + # file does not exist # read @@ -200,7 +269,7 @@ def test(safefmode=False, kwargs={}): f2 = dill.loads(f_dumped) assert f2.mode == fmode # 1) preserve mode and position #XXX: ? 
- assert f2.tell() == ftell # 200 + #assert f2.tell() == ftell # 200 assert f2.read() == "" f2.seek(0) assert f2.read() == "" @@ -275,6 +344,40 @@ def test(safefmode=False, kwargs={}): # assert f2mode == fmode # assert f2tell == 0 + # append + + trunc_file() + + f = open(fname, "a") + f.write("hello") + f_dumped = dill.dumps(f, **kwargs) + ftell = f.tell() + fmode = f.mode + f.close() + + os.remove(fname) + + if safefmode: # throw error if file DNE + assert throws(dill.loads, (f_dumped,), IOError) + else: + f2 = dill.loads(f_dumped) + f2mode = f2.mode + f2tell = f2.tell() + f2.write(" world!") + f2.close() + # 1) preserve mode and position + assert open(fname).read() == " world!" + assert f2mode == fmode + assert f2tell == 5 + # 2) treat as if new filehandle, will truncate file + # assert open(fname).read() == " world!" + # assert f2mode == fmode + # assert f2tell == 0 + # 5) pickle data along with filehandle #XXX: Yikes + # assert open(fname).read() == "hello world!" + # assert f2mode == fmode + # assert f2tell == ftell + # file exists, with different contents (larger size) # read @@ -325,8 +428,6 @@ def test(safefmode=False, kwargs={}): # write - write_randomness() - f = open(fname, "w") f.write("hello") f_dumped = dill.dumps(f, **kwargs) @@ -369,10 +470,41 @@ def test(safefmode=False, kwargs={}): # assert f2tell == ftell f2.close() - # TODO: - # # append + trunc_file() + + f = open(fname, "a") + f.write("hello") + f_dumped = dill.dumps(f, **kwargs) + fmode = f.mode + ftell = f.tell() + fstr = open(fname).read() + + f.write(" and goodbye!") + _ftell = f.tell() + f.close() + + #XXX: no safefmode: no way to be 'safe'? + + f2 = dill.loads(f_dumped) + f2mode = f2.mode + f2tell = f2.tell() + f2.write(" world!") + f2.close() + # 1) preserve mode and position #FIXME + assert open(fname).read() == "hello and goodbye! world!" + assert f2mode == fmode + # 2) treat as if new filehandle, will truncate file + # assert open(fname).read() == " world!" 
+ # assert f2mode == fmode + # assert f2tell == 0 + # 5) pickle data along with filehandle #XXX: Yikes + # assert open(fname).read() == "hello world!" + # assert f2mode == fmode + # assert f2tell == ftell + f2.close() + test() # TODO: switch this on when #57 is closed From 98e79e6acd86e4fd82ea2489cfb7ce1efb7bf47e Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Sat, 2 Aug 2014 16:12:36 +0100 Subject: [PATCH 31/77] Further optimisations and enhancements for memorise --- dill/memorise.py | 183 ++++++++++++++++++++++----------------------- tests/test_memo.py | 18 +++++ 2 files changed, 106 insertions(+), 95 deletions(-) diff --git a/dill/memorise.py b/dill/memorise.py index ac3fe62d..7d71f0b9 100644 --- a/dill/memorise.py +++ b/dill/memorise.py @@ -11,7 +11,12 @@ import os import sys -import numpy +import types +try: + import numpy + HAS_NUMPY = True +except: + HAS_NUMPY = False try: import builtins except ImportError: @@ -24,6 +29,8 @@ id_to_obj = {} # types that cannot have changing attributes builtins_types = {str, list, dict, set, frozenset, int} +dont_memo = {id(i) for i in (memo, sys.modules, sys.path_importer_cache, + os.environ, id_to_obj)} def get_attrs(obj): @@ -32,137 +39,122 @@ def get_attrs(obj): """ if type(obj) in builtins_types \ or type(obj) is type and obj in builtins_types: - return None + return try: - return obj.__dict__ if hasattr(obj, "__dict__") else None + return obj.__dict__ except: - return None + return def get_seq(obj, cashe={str: False, frozenset: False, list: True, set: True, - dict: True, tuple: True}): + dict: True, tuple: True, type: False, + types.ModuleType: False, types.FunctionType: False, + types.BuiltinFunctionType: False}): """ Gets all the items in a sequence or return None """ - o_type = type(obj) - if o_type in (numpy.ndarray, numpy.ma.core.MaskedConstant): - if obj.shape and obj.size: - return obj - else: - return [] + o_type = obj.__class__ + hsattr = hasattr if o_type in cashe: if cashe[o_type]: - if hasattr(obj, 
"copy"): + if hsattr(obj, "copy"): return obj.copy() return obj - return None - elif hasattr(obj, "__contains__") and hasattr(obj, "__iter__") \ - and hasattr(obj, "__len__") and hasattr(o_type, "__contains__") \ - and hasattr(o_type, "__iter__") and hasattr(o_type, "__len__"): + elif HAS_NUMPY and o_type in (numpy.ndarray, numpy.ma.core.MaskedConstant): + if obj.shape and obj.size: + return obj + else: + return [] + elif hsattr(obj, "__contains__") and hsattr(obj, "__iter__") \ + and hsattr(obj, "__len__") and hsattr(o_type, "__contains__") \ + and hsattr(o_type, "__iter__") and hsattr(o_type, "__len__"): cashe[o_type] = True - if hasattr(obj, "copy"): + if hsattr(obj, "copy"): return obj.copy() return obj - cashe[o_type] = None - return None - - -def get_attrs_id(obj): - """ - Gets the ids of an object's attributes though its __dict__ or return None - """ - if type(obj) in builtins_types \ - or type(obj) is type and obj in builtins_types: - return None - try: - return {key: id(value) for key, value in obj.__dict__.items()} \ - if hasattr(obj, "__dict__") else None - except: - return None - - -def get_seq_id(obj, done=None): - """ - Gets the ids of the items in a sequence or return None - """ - if done is not None: - g = done else: - g = get_seq(obj) - if g is None: + cashe[o_type] = False return None - if hasattr(g, "items"): - return {id(key): id(value) for key, value in g.items()} - return [id(i) for i in g] -def memorise(obj, force=False, first=True): +def memorise(obj, force=False): """ Adds an object to the memo, and recursively adds all the objects attributes, and if it is a container, its items. Use force=True to update an object already in the memo. Updating is not recursively done. 
""" - if first: - # add actions here - pass - if id(obj) in memo and not force: - return - if obj is memo or obj is id_to_obj: + obj_id = id(obj) + if obj_id in memo and not force or obj_id in dont_memo: return + id_ = id g = get_attrs(obj) + if g is None: + attrs_id = None + else: + attrs_id = {key: id_(value) for key, value in g.items()} + s = get_seq(obj) - memo[id(obj)] = get_attrs_id(obj), get_seq_id(obj, done=s) - id_to_obj[id(obj)] = obj + if s is None: + seq_id = None + elif hasattr(s, "items"): + seq_id = {id_(key): id_(value) for key, value in s.items()} + else: + seq_id = [id_(i) for i in s] + + memo[obj_id] = attrs_id, seq_id + id_to_obj[obj_id] = obj + mem = memorise if g is not None: - for key, value in g.items(): - memorise(value, first=False) + [mem(value) for key, value in g.items()] + if s is not None: if hasattr(s, "items"): - for key, item in s.items(): - memorise(key, first=False) - memorise(item, first=False) + [(mem(key), mem(item)) + for key, item in s.items()] else: - for item in s: - memorise(item, first=False) + [mem(item) for item in s] def release_gone(): - rm = [id_ for id_, obj in id_to_obj.items() if sys.getrefcount(obj) < 4] - for id_ in rm: - del id_to_obj[id_] - del memo[id_] + itop, mp, src = id_to_obj.pop, memo.pop, sys.getrefcount + [(itop(id_), mp(id_)) for id_, obj in list(id_to_obj.items()) + if src(obj) < 4] -def whats_changed(obj, seen=None, first=True, simple=False): +def whats_changed(obj, seen=None, simple=False, first=True): """ - Check an object against the memo. Returns a tuple in the form + Check an object against the memo. Returns a list in the form (attribute changes, container changed). Attribute changes is a dict of attribute name to attribute value. container changed is a boolean. + If simple is true, just returns a boolean. 
None for either item means + that it has not been checked yet """ - seen = {} if seen is None else seen + # Special cases if first: # ignore the _ variable, which only appears in interactive sessions if "_" in builtins.__dict__: del builtins._ + if seen is None: + seen = {} obj_id = id(obj) - if obj_id not in memo: - if simple: - return True - else: - raise RuntimeError("Object not memorised " + str(obj)) if obj_id in seen: if simple: return any(seen[obj_id]) return seen[obj_id] - if any(obj is i for i in (memo, sys.modules, sys.path_importer_cache, - os.environ, id_to_obj)): - seen[obj_id] = ({}, False) + # Safety checks + if obj_id in dont_memo: + seen[obj_id] = [{}, False] if simple: return False return seen[obj_id] + elif obj_id not in memo: + if simple: + return True + else: + raise RuntimeError("Object not memorised " + str(obj)) seen[obj_id] = ({}, False) @@ -171,38 +163,36 @@ def whats_changed(obj, seen=None, first=True, simple=False): # compare attributes attrs = get_attrs(obj) - if attrs is not None: - obj_attrs = memo[id(obj)][0] + if attrs is None: + changed = {} + else: + obj_attrs = memo[obj_id][0] obj_get = obj_attrs.get changed = {key: None for key in obj_attrs if key not in attrs} - changed.update({key: o for key, o in attrs.items() - if id(o) != obj_get(key) - or chngd(o, seen, first=False, simple=True)}) - else: - changed = {} + for key, o in attrs.items(): + if id_(o) != obj_get(key, None) or chngd(o, seen, True, False): + changed[key] = o # compare sequence items = get_seq(obj) + seq_diff = False if items is not None: - seq_diff = False - obj_seq = memo[id(obj)][1] + obj_seq = memo[obj_id][1] if len(items) != len(obj_seq): seq_diff = True - elif hasattr(obj, "items"): + elif hasattr(obj, "items"): # dict type obj obj_get = obj_seq.get for key, item in items.items(): if id_(item) != obj_get(id_(key)) \ - or chngd(key, seen, first=False, simple=True) \ - or chngd(item, seen, first=False, simple=True): + or chngd(key, seen, True, False) \ + or 
chngd(item, seen, True, False): seq_diff = True break else: - for i, j in zip(items, obj_seq): - if id_(i) != j or chngd(i, seen, first=False, simple=True): + for i, j in zip(items, obj_seq): # list type obj + if id_(i) != j or chngd(i, seen, True, False): seq_diff = True break - else: - seq_diff = False seen[obj_id] = changed, seq_diff if simple: return changed or seq_diff @@ -220,8 +210,11 @@ def _imp(*args, **kwargs): Replaces the default __import__, to allow a module to be memorised before the user can change it """ + before = set(sys.modules.keys()) mod = __import__(*args, **kwargs) - memorise(mod) + after = set(sys.modules.keys()).difference(before) + for m in after: + memorise(sys.modules[m]) return mod builtins.__import__ = _imp @@ -231,5 +224,5 @@ def _imp(*args, **kwargs): # memorise all already imported modules. This implies that this must be # imported first for any changes to be recorded for mod in sys.modules.values(): - memorise(mod, first=False) + memorise(mod) release_gone() diff --git a/tests/test_memo.py b/tests/test_memo.py index 4ce612f5..b4f37334 100644 --- a/tests/test_memo.py +++ b/tests/test_memo.py @@ -59,3 +59,21 @@ class A: changed = m.whats_changed(abc) assert list(changed[0].keys()) == ["ABCMeta"] assert not changed[1] + + +a = A() +b = A() +c = A() +a.a = b +b.a = c +m.memorise(a) +assert not m.has_changed(a) +c.a = 1 +assert m.has_changed(a) +m.memorise(c, force=True) +assert not m.has_changed(a) +del c.a +assert m.has_changed(a) +changed = m.whats_changed(a) +assert list(changed[0].keys()) == ["a"] +assert not changed[1] From 9e16901fcd6b2b8b91fafc1d0c43676f1fd341a1 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Sun, 3 Aug 2014 16:07:19 +0100 Subject: [PATCH 32/77] Small python 2 fix --- dill/memorise.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dill/memorise.py b/dill/memorise.py index 7d71f0b9..3cb6d5c6 100644 --- a/dill/memorise.py +++ b/dill/memorise.py @@ -53,7 +53,7 @@ def get_seq(obj, cashe={str: 
False, frozenset: False, list: True, set: True, """ Gets all the items in a sequence or return None """ - o_type = obj.__class__ + o_type = type(obj) hsattr = hasattr if o_type in cashe: if cashe[o_type]: From ac9ff01eb9fcb45f9167f74366e5a7b99aa1b2fc Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Mon, 4 Aug 2014 09:43:50 +0100 Subject: [PATCH 33/77] Fixes for test and due to test --- dill/dill.py | 6 +++++- tests/test_file.py | 4 ++-- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 28424d79..dcec4927 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -382,13 +382,17 @@ def _create_filehandle(name, mode, position, closed, open=open, safe=False): # b # w+ | r+ # a | a # a+ | a+ + # Note: If the file does not exist, the mode is not translated + + if mode == "x": + mode = "w" if os.path.exists(name): mode = mode.replace("w+", "r+") mode = mode.replace("w", "r+") elif safe: raise IOError("File '%s' does not exist" % name) - elif "w" not in mode: + elif "r" in mode: name = os.devnull if safe: if position > os.path.getsize(name): diff --git a/tests/test_file.py b/tests/test_file.py index 829e9a2f..fe86d3ba 100644 --- a/tests/test_file.py +++ b/tests/test_file.py @@ -335,7 +335,7 @@ def test(safefmode=False, kwargs={}): assert f2tell == ftell # 3) prefer data over filehandle state # assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" - # assert f2mode == 'r+' #FIXME: spec'd as 'r+' but is 'w+' + # assert f2mode == 'w+' # assert f2tell == ftell # 2) treat as if new filehandle, will truncate file # assert open(fname).read() == " world!" @@ -377,7 +377,7 @@ def test(safefmode=False, kwargs={}): f2.close() assert f2mode == fmode # 1) preserve mode and position #XXX: also 3) - assert open(fname).read() == " world!" # 3) FIXME: throws, should not? + assert open(fname).read() == " world!" # 3) assert f2tell == ftell # 2) treat as if new filehandle, will seek(EOF) # assert open(fname).read() == " world!" 
From 06150ac4100ab60dc25741faf19d07eef5aa86be Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 5 Aug 2014 21:19:14 +0100 Subject: [PATCH 34/77] Change file pickling to avoid \x00 chars --- dill/dill.py | 3 +- tests/test_file.py | 87 +++++++++++++++++++++++----------------------- 2 files changed, 46 insertions(+), 44 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index dcec4927..6a3fe137 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -408,7 +408,8 @@ def _create_filehandle(name, mode, position, closed, open=open, safe=False): # b if closed: f.close() elif position >= 0: - f.seek(position) + eof = os.path.getsize(name) + f.seek(position if position < eof else eof) return f def _create_stringi(value, position, closed): diff --git a/tests/test_file.py b/tests/test_file.py index fe86d3ba..3fd03d14 100644 --- a/tests/test_file.py +++ b/tests/test_file.py @@ -65,17 +65,17 @@ def test(safefmode=False, kwargs={}): f2.close() # 1) preserve mode and position - assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" - assert f2mode == fmode - assert f2tell == ftell + # assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" + # assert f2mode == fmode + # assert f2tell == ftell # 2) treat as if new filehandle, will truncate file # assert open(fname).read() == " world!" # assert f2mode == fmode # assert f2tell == 0 # 3) prefer data over filehandle state - # assert open(fname).read() == "hello world!" - # assert f2mode == 'r+' #XXX: have to decide 'r+', 'a', ...? - # assert f2tell == ftell + assert open(fname).read() == "hello world!" + assert f2mode == 'r+' #XXX: have to decide 'r+', 'a', ...? + assert f2tell == ftell # 4) use "r" to read data, then use "w" to write new file # assert open(fname).read() == "hello world!" # assert f2mode == fmode @@ -135,17 +135,17 @@ def test(safefmode=False, kwargs={}): f2 = dill.loads(f_dumped) assert f2.mode == fmode # 1) preserve mode and position #XXX: ? 
- assert f2.tell() == ftell # 200 - assert f2.read() == "" - f2.seek(0) - assert f2.read() == _fstr - assert f2.tell() == _flen # 150 - # 3) prefer data over filehandle state # assert f2.tell() == ftell # 200 # assert f2.read() == "" # f2.seek(0) # assert f2.read() == _fstr # assert f2.tell() == _flen # 150 + # 3) prefer data over filehandle state + assert f2.tell() == _flen + assert f2.read() == "" + f2.seek(0) + assert f2.read() == _fstr + assert f2.tell() == _flen # 150 # 4) preserve mode and position, seek(EOF) if ftell > EOF # assert f2.tell() == _flen # 150 # assert f2.read() == "" @@ -190,13 +190,13 @@ def test(safefmode=False, kwargs={}): f2.write(" world!") f2.close() # 1) preserve mode and position - assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" - assert f2mode == fmode - assert f2tell == ftell - # 3) prefer data over filehandle state - # assert open(fname).read() == "h\x00\x00\x00\x00 world!" - # assert f2mode == 'r+' #XXX: have to decide 'r+', 'a', ...? + # assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" + # assert f2mode == fmode # assert f2tell == ftell + # 3) prefer data over filehandle state + assert open(fname).read() == "h world!" + assert f2mode == 'r+' #XXX: have to decide 'r+', 'a', ...? + assert f2tell == _ftell # 2) treat as if new filehandle, will truncate file # assert open(fname).read() == " world!" # assert f2mode == fmode @@ -244,7 +244,7 @@ def test(safefmode=False, kwargs={}): # 1) preserve mode and position # also 3) # position of writes cannot be changed on some OSs assert open(fname).read() == "h world!" - assert f2tell == ftell + assert f2tell == _ftell # 2) treat as if new filehandle, will seek(EOF) # assert open(fname).read() == "h world!" # assert f2tell == _ftell @@ -279,17 +279,18 @@ def test(safefmode=False, kwargs={}): f2 = dill.loads(f_dumped) assert f2.mode == fmode # 1) preserve mode and position #XXX: ? 
- assert f2.tell() == ftell # 200 - assert f2.read() == "" - f2.seek(0) - assert f2.read() == "" - assert f2.tell() == 0 - # 3) prefer data over filehandle state # assert f2.tell() == ftell # 200 # assert f2.read() == "" # f2.seek(0) # assert f2.read() == "" # assert f2.tell() == 0 + # 3) prefer data over filehandle state + # FIXME: this fails on systems where f2.tell() always returns 0 + # assert f2.tell() == ftell # 200 + assert f2.read() == "" + f2.seek(0) + assert f2.read() == "" + assert f2.tell() == 0 # 5) pickle data along with filehandle # assert f2.tell() == ftell # 200 # assert f2.read() == "" @@ -330,13 +331,13 @@ def test(safefmode=False, kwargs={}): f2.write(" world!") f2.close() # 1) preserve mode and position - assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" - assert f2mode == fmode - assert f2tell == ftell - # 3) prefer data over filehandle state # assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" - # assert f2mode == 'w+' + # assert f2mode == fmode # assert f2tell == ftell + # 3) prefer data over filehandle state + assert open(fname).read() == " world!" + assert f2mode == 'w+' + assert f2tell == 0 # 2) treat as if new filehandle, will truncate file # assert open(fname).read() == " world!" # assert f2mode == fmode @@ -378,7 +379,7 @@ def test(safefmode=False, kwargs={}): assert f2mode == fmode # 1) preserve mode and position #XXX: also 3) assert open(fname).read() == " world!" # 3) - assert f2tell == ftell + assert f2tell == 0 # 2) treat as if new filehandle, will seek(EOF) # assert open(fname).read() == " world!" # assert f2tell == 0 @@ -411,17 +412,17 @@ def test(safefmode=False, kwargs={}): f2 = dill.loads(f_dumped) assert f2.mode == fmode # 1) preserve mode and position #XXX: ? 
- assert f2.tell() == ftell # 200 - assert f2.read() == _fstr[ftell:] - f2.seek(0) - assert f2.read() == _fstr - assert f2.tell() == _flen # 250 - # 3) prefer data over filehandle state # assert f2.tell() == ftell # 200 # assert f2.read() == _fstr[ftell:] # f2.seek(0) # assert f2.read() == _fstr # assert f2.tell() == _flen # 250 + # 3) prefer data over filehandle state + assert f2.tell() == ftell # 200 + assert f2.read() == _fstr[ftell:] + f2.seek(0) + assert f2.read() == _fstr + assert f2.tell() == _flen # 250 # 4) preserve mode and position, seek(EOF) if ftell > EOF # assert f2.tell() == ftell # 200 # assert f2.read() == _fstr[ftell:] @@ -463,13 +464,13 @@ def test(safefmode=False, kwargs={}): f2.write(" world!") f2.close() # 1) preserve mode and position - assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" - assert f2mode == fmode - assert f2tell == ftell - # 3) prefer data over filehandle state - # assert open(fname).read() == "hello world!odbye!" - # assert f2mode == 'r+' #XXX: have to decide 'r+', 'a', ...? + # assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" + # assert f2mode == fmode # assert f2tell == ftell + # 3) prefer data over filehandle state + assert open(fname).read() == "hello world!odbye!" + assert f2mode == 'r+' #XXX: have to decide 'r+', 'a', ...? + assert f2tell == ftell # 2) treat as if new filehandle, will truncate file # assert open(fname).read() == " world!" 
# assert f2mode == fmode From a7fbbcf5112afade90cf5dbf0a71ca90fed07b10 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Thu, 14 Aug 2014 11:32:39 +0100 Subject: [PATCH 35/77] Add different file pickling modes --- dill/__init__.py | 3 +- dill/dill.py | 119 +++++++----- tests/test_file.py | 445 +++++++++++++++++++-------------------------- 3 files changed, 265 insertions(+), 302 deletions(-) diff --git a/dill/__init__.py b/dill/__init__.py index bd3289f0..5f8a9940 100644 --- a/dill/__init__.py +++ b/dill/__init__.py @@ -26,7 +26,8 @@ from .dill import dump, dumps, load, loads, dump_session, load_session, \ Pickler, Unpickler, register, copy, pickle, pickles, HIGHEST_PROTOCOL, \ DEFAULT_PROTOCOL, PicklingError, UnpicklingError, \ - _revert_extension as revert_extension, _extend as extend + _revert_extension as revert_extension, _extend as extend, FMODE_NEWHANDLE, \ + FMODE_PRESERVEDATA, FMODE_PICKLECONTENTS from . import source, temp, detect # make sure "trace" is turned off diff --git a/dill/dill.py b/dill/dill.py index 6a3fe137..2a6aa5bc 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -133,12 +133,23 @@ def ndarrayinstance(obj): return False ExitType = type(exit) singletontypes = [] +### File modes +# pickles the file handle, preserving mode, with position of the unpickled +# object as for a new file handle. 
+FMODE_NEWHANDLE = 0 +# preserves existing data or create file if is does not exist, with +# position = min(pickled position, EOF), and mode which preserves behaviour +FMODE_PRESERVEDATA = 1 +# pickles the file handle, preserving mode and position, as well as the file +# contents +FMODE_PICKLECONTENTS = 2 + ### Shorthands (modified from python2.5/lib/pickle.py) def copy(obj, *args, **kwds): """use pickling to 'copy' an object""" return loads(dumps(obj, *args, **kwds)) -def dump(obj, file, protocol=None, byref=False, safe_file=False): +def dump(obj, file, protocol=None, byref=False, file_mode=FMODE_NEWHANDLE, safe_file=False): """pickle an object to a file""" if protocol is None: protocol = DEFAULT_PROTOCOL pik = Pickler(file, protocol) @@ -146,6 +157,7 @@ def dump(obj, file, protocol=None, byref=False, safe_file=False): _byref = pik._byref pik._byref = bool(byref) pik._safe_file = safe_file + pik._file_mode = file_mode # hack to catch subclassed numpy array instances if NumpyArrayType and ndarrayinstance(obj): @register(type(obj)) @@ -160,10 +172,10 @@ def save_numpy_array(pickler, obj): pik._byref = _byref return -def dumps(obj, protocol=None, byref=False, safe_file=False): +def dumps(obj, protocol=None, byref=False, file_mode=FMODE_NEWHANDLE, safe_file=False): """pickle an object to a string""" file = StringIO() - dump(obj, file, protocol, byref, safe_file) + dump(obj, file, protocol, byref, file_mode, safe_file) return file.getvalue() def load(file): @@ -233,6 +245,7 @@ class Pickler(StockPickler): _session = False _byref = False _safe_file = False + _file_mode = FMODE_NEWHANDLE pass def __init__(self, *args, **kwargs): @@ -357,7 +370,7 @@ def _create_lock(locked, *args): raise UnpicklingError("Cannot acquire lock") return lock -def _create_filehandle(name, mode, position, closed, open=open, safe=False): # buffering=0 +def _create_filehandle(name, mode, position, closed, open, safe, file_mode, fdata): # buffering=0 # only pickles the handle, not the file 
contents... good? or StringIO(data)? # (for file contents see: http://effbot.org/librarybook/copy-reg.htm) # NOTE: handle special cases first (are there more special cases?) @@ -373,43 +386,57 @@ def _create_filehandle(name, mode, position, closed, open=open, safe=False): # b f = tempfile.TemporaryFile(mode) else: import os - # Mode translation - # Mode | Unpickled mode - # --------|--------------- - # r | r - # r+ | r+ - # w | r+ - # w+ | r+ - # a | a - # a+ | a+ - # Note: If the file does not exist, the mode is not translated if mode == "x": mode = "w" - if os.path.exists(name): + if not os.path.exists(name): + if safe: + raise IOError("File '%s' does not exist" % name) + elif "r" in mode and file_mode != FMODE_PICKLECONTENTS: + name = os.devnull + current_size = 0 + else: + current_size = os.path.getsize(name) + + if file_mode == FMODE_PRESERVEDATA and os.path.exists(name): + # Mode translation + # Mode | Unpickled mode + # --------|--------------- + # r | r + # r+ | r+ + # w | r+ + # w+ | r+ + # a | a + # a+ | a+ + # Note: If the file does not exist, the mode is not translated mode = mode.replace("w+", "r+") mode = mode.replace("w", "r+") - elif safe: - raise IOError("File '%s' does not exist" % name) - elif "r" in mode: - name = os.devnull - if safe: - if position > os.path.getsize(name): + + if position > current_size: + if safe: raise IOError("File '%s' is too short" % name) + elif file_mode == FMODE_PRESERVEDATA: + position = current_size # try to open the file by name # NOTE: has different fileno try: - f = open(name, mode) #FIXME: missing: *buffering*, encoding, softspace + if file_mode == FMODE_PICKLECONTENTS: + f = open(name, mode if "w" in mode else "w") + f.write(fdata) + if "w" not in mode: + f.close() + f = open(name, mode) #FIXME: missing: *buffering*, encoding, softspace + else: + f = open(name, mode) #FIXME: missing: *buffering*, encoding, softspace except IOError: err = sys.exc_info()[1] raise UnpicklingError(err) #XXX: python default is closed 
'' file/mode if closed: f.close() - elif position >= 0: - eof = os.path.getsize(name) - f.seek(position if position < eof else eof) + elif position >= 0 and file_mode != FMODE_NEWHANDLE: + f.seek(position) return f def _create_stringi(value, position, closed): @@ -616,15 +643,8 @@ def save_attrgetter(pickler, obj): pickler.save_reduce(type(obj), tuple(attrs), obj=obj) return -# __getstate__ explicitly added to raise TypeError when pickling: -# http://www.gossamer-threads.com/lists/python/bugs/871199 -@register(FileType) #XXX: in 3.x has buffer=0, needs different _create? -@register(BufferedRandomType) -@register(BufferedReaderType) -@register(BufferedWriterType) -@register(TextWrapperType) -def save_file(pickler, obj): - log.info("Fi: %s" % obj) +def _save_file(pickler, obj, open_): + obj.flush() if obj.closed: position = None else: @@ -632,24 +652,35 @@ def save_file(pickler, obj): position = -1 else: position = obj.tell() - pickler.save_reduce(_create_filehandle, (obj.name, obj.mode, position, \ - obj.closed, open, pickler._safe_file), obj=obj) + if pickler._file_mode == FMODE_PICKLECONTENTS: + f = open_(obj.name, "r") + fdata = f.read() + f.close() + else: + fdata = "" + pickler.save_reduce(_create_filehandle, (obj.name, obj.mode, position, + obj.closed, open_, pickler._safe_file, + pickler._file_mode, fdata), obj=obj) return -if PyTextWrapperType: #XXX: are stdout, stderr or stdin ever _pyio files? + +@register(FileType) #XXX: in 3.x has buffer=0, needs different _create? 
+@register(BufferedRandomType) +@register(BufferedReaderType) +@register(BufferedWriterType) +@register(TextWrapperType) +def save_file(pickler, obj): + log.info("Fi: %s" % obj) + return _save_file(pickler, obj, open) + +if PyTextWrapperType: @register(PyBufferedRandomType) @register(PyBufferedReaderType) @register(PyBufferedWriterType) @register(PyTextWrapperType) def save_file(pickler, obj): log.info("Fi: %s" % obj) - if obj.closed: - position = None - else: - position = obj.tell() - pickler.save_reduce(_create_filehandle, (obj.name, obj.mode, position, \ - obj.closed, _open, pickler._safe_file), obj=obj) - return + return _save_file(pickler, obj, _open) # The following two functions are based on 'saveCStringIoInput' # and 'saveCStringIoOutput' from spickle diff --git a/tests/test_file.py b/tests/test_file.py index 3fd03d14..1fc5ce35 100644 --- a/tests/test_file.py +++ b/tests/test_file.py @@ -1,4 +1,4 @@ -#usr/bin/env python +# usr/bin/env python # # Author: Mike McKerns (mmckerns @caltech and @uqfoundation) # Copyright (c) 2008-2014 California Institute of Technology. @@ -36,14 +36,14 @@ def throws(op, args, exc): return False -def test(safefmode=False, kwargs={}): +def test(safe_file, file_mode): # file exists, with same contents # read write_randomness() f = open(fname, "r") - _f = dill.loads(dill.dumps(f, **kwargs)) + _f = dill.loads(dill.dumps(f, safe_file=safe_file, file_mode=file_mode)) assert _f.mode == f.mode assert _f.tell() == f.tell() assert _f.read() == f.read() @@ -54,7 +54,7 @@ def test(safefmode=False, kwargs={}): f = open(fname, "w") f.write("hello") - f_dumped = dill.dumps(f, **kwargs) + f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() @@ -64,26 +64,20 @@ def test(safefmode=False, kwargs={}): f2.write(" world!") f2.close() - # 1) preserve mode and position - # assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" 
- # assert f2mode == fmode - # assert f2tell == ftell - # 2) treat as if new filehandle, will truncate file - # assert open(fname).read() == " world!" - # assert f2mode == fmode - # assert f2tell == 0 - # 3) prefer data over filehandle state - assert open(fname).read() == "hello world!" - assert f2mode == 'r+' #XXX: have to decide 'r+', 'a', ...? - assert f2tell == ftell - # 4) use "r" to read data, then use "w" to write new file - # assert open(fname).read() == "hello world!" - # assert f2mode == fmode - # assert f2tell == ftell - # 5) pickle data along with filehandle - # assert open(fname).read() == "hello world!" - # assert f2mode == fmode - # assert f2tell == ftell + if file_mode == dill.FMODE_NEWHANDLE: + assert open(fname).read() == " world!" + assert f2mode == fmode + assert f2tell == 0 + elif file_mode == dill.FMODE_PRESERVEDATA: + assert open(fname).read() == "hello world!" + assert f2mode == 'r+' + assert f2tell == ftell + elif file_mode == dill.FMODE_PICKLECONTENTS: + assert open(fname).read() == "hello world!" + assert f2mode == fmode + assert f2tell == ftell + else: + raise RuntimeError("Uncovered file mode!") # append @@ -91,7 +85,7 @@ def test(safefmode=False, kwargs={}): f = open(fname, "a") f.write("hello") - f_dumped = dill.dumps(f, **kwargs) + f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() @@ -102,18 +96,17 @@ def test(safefmode=False, kwargs={}): f2.close() assert f2mode == fmode - # 1) preserve mode and position # also 3) - assert open(fname).read() == "hello world!" - assert f2tell == ftell - # 2) treat as if new filehandle, will seek(EOF) - # assert open(fname).read() == "hello world!" - # assert f2tell == ftell - # 4) use "r" to read data, then use "a" to write new file - # assert open(fname).read() == "hello world!" - # assert f2tell == ftell - # 5) pickle data along with filehandle - # assert open(fname).read() == "hello world!" 
- # assert f2tell == ftell + if file_mode == dill.FMODE_PRESERVEDATA: + assert open(fname).read() == "hello world!" + assert f2tell == ftell + elif file_mode == dill.FMODE_NEWHANDLE: + assert open(fname).read() == "hello world!" + assert f2tell == ftell + elif file_mode == dill.FMODE_PICKLECONTENTS: + assert open(fname).read() == "hello world!" + assert f2tell == ftell + else: + raise RuntimeError("Uncovered file mode!") # file exists, with different contents (smaller size) # read @@ -122,46 +115,36 @@ def test(safefmode=False, kwargs={}): f = open(fname, "r") fstr = f.read() - f_dumped = dill.dumps(f, **kwargs) + f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() _flen = 150 _fstr = write_randomness(number=_flen) - if safefmode: # throw error if ftell > EOF + if safe_file: # throw error if ftell > EOF assert throws(dill.loads, (f_dumped,), IOError) else: f2 = dill.loads(f_dumped) assert f2.mode == fmode - # 1) preserve mode and position #XXX: ? 
- # assert f2.tell() == ftell # 200 - # assert f2.read() == "" - # f2.seek(0) - # assert f2.read() == _fstr - # assert f2.tell() == _flen # 150 - # 3) prefer data over filehandle state - assert f2.tell() == _flen - assert f2.read() == "" - f2.seek(0) - assert f2.read() == _fstr - assert f2.tell() == _flen # 150 - # 4) preserve mode and position, seek(EOF) if ftell > EOF - # assert f2.tell() == _flen # 150 - # assert f2.read() == "" - # f2.seek(0) - # assert f2.read() == _fstr - # assert f2.tell() == _flen # 150 - # 2) treat as if new filehandle, will seek(0) - # assert f2.tell() == 0 - # assert f2.read() == _fstr - # assert f2.tell() == _flen # 150 - # 5) pickle data along with filehandle - # assert f2.tell() == ftell # 200 - # assert f2.read() == "" - # f2.seek(0) - # assert f2.read() == fstr - # assert f2.tell() == ftell # 200 + if file_mode == dill.FMODE_PRESERVEDATA: + assert f2.tell() == _flen + assert f2.read() == "" + f2.seek(0) + assert f2.read() == _fstr + assert f2.tell() == _flen # 150 + elif file_mode == dill.FMODE_NEWHANDLE: + assert f2.tell() == 0 + assert f2.read() == _fstr + assert f2.tell() == _flen # 150 + elif file_mode == dill.FMODE_PICKLECONTENTS: + assert f2.tell() == ftell # 200 + assert f2.read() == "" + f2.seek(0) + assert f2.read() == fstr + assert f2.tell() == ftell # 200 + else: + raise RuntimeError("Uncovered file mode!") f2.close() # write @@ -170,7 +153,7 @@ def test(safefmode=False, kwargs={}): f = open(fname, "w") f.write("hello") - f_dumped = dill.dumps(f, **kwargs) + f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() @@ -181,7 +164,7 @@ def test(safefmode=False, kwargs={}): _ftell = f.tell() f.close() - if safefmode: # throw error if ftell > EOF + if safe_file: # throw error if ftell > EOF assert throws(dill.loads, (f_dumped,), IOError) else: f2 = dill.loads(f_dumped) @@ -189,39 +172,29 @@ def test(safefmode=False, kwargs={}): f2tell = f2.tell() f2.write(" world!") 
f2.close() - # 1) preserve mode and position - # assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" - # assert f2mode == fmode - # assert f2tell == ftell - # 3) prefer data over filehandle state - assert open(fname).read() == "h world!" - assert f2mode == 'r+' #XXX: have to decide 'r+', 'a', ...? - assert f2tell == _ftell - # 2) treat as if new filehandle, will truncate file - # assert open(fname).read() == " world!" - # assert f2mode == fmode - # assert f2tell == 0 - # 5) pickle data along with filehandle - # assert open(fname).read() == "hello world!" - # assert f2mode == fmode - # assert f2tell == ftell - # 4a) use "r" to read data, then use "w" to write new file - # assert open(fname).read() == "h\x00\x00\x00\x00 world!" - # assert f2mode == fmode - # assert f2tell == ftell - # 4b) preserve mode and position, seek(EOF) if ftell > EOF - # assert open(fname).read() == "h world!" - # assert f2mode == fmode - # assert f2tell == _ftell + if file_mode == dill.FMODE_PRESERVEDATA: + assert open(fname).read() == "h world!" + assert f2mode == 'r+' + assert f2tell == _ftell + elif file_mode == dill.FMODE_NEWHANDLE: + assert open(fname).read() == " world!" + assert f2mode == fmode + assert f2tell == 0 + elif file_mode == dill.FMODE_PICKLECONTENTS: + assert open(fname).read() == "hello world!" 
+ assert f2mode == fmode + assert f2tell == ftell + else: + raise RuntimeError("Uncovered file mode!") f2.close() # append - write_randomness() + trunc_file() f = open(fname, "a") f.write("hello") - f_dumped = dill.dumps(f, **kwargs) + f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() @@ -232,7 +205,7 @@ def test(safefmode=False, kwargs={}): _ftell = f.tell() f.close() - if safefmode: # throw error if ftell > EOF + if safe_file: # throw error if ftell > EOF assert throws(dill.loads, (f_dumped,), IOError) else: f2 = dill.loads(f_dumped) @@ -241,22 +214,18 @@ def test(safefmode=False, kwargs={}): f2.write(" world!") f2.close() assert f2mode == fmode - # 1) preserve mode and position # also 3) - # position of writes cannot be changed on some OSs - assert open(fname).read() == "h world!" - assert f2tell == _ftell - # 2) treat as if new filehandle, will seek(EOF) - # assert open(fname).read() == "h world!" - # assert f2tell == _ftell - # 5) pickle data along with filehandle - # assert open(fname).read() == "hello world!" - # assert f2tell == ftell - # 4a) use "r" to read data, then use "a" to write new file - # assert open(fname).read() == "h world!" - # assert f2tell == ftell - # 4b) preserve mode and position, seek(EOF) if ftell > EOF - # assert open(fname).read() == "h world!" - # assert f2tell == _ftell + if file_mode == dill.FMODE_PRESERVEDATA: + # position of writes cannot be changed on some OSs + assert open(fname).read() == "h world!" + assert f2tell == _ftell + elif file_mode == dill.FMODE_NEWHANDLE: + assert open(fname).read() == "h world!" + assert f2tell == _ftell + elif file_mode == dill.FMODE_PICKLECONTENTS: + assert open(fname).read() == "hello world!" 
+ assert f2tell == ftell + else: + raise RuntimeError("Uncovered file mode!") f2.close() # file does not exist @@ -266,47 +235,37 @@ def test(safefmode=False, kwargs={}): f = open(fname, "r") fstr = f.read() - f_dumped = dill.dumps(f, **kwargs) + f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() os.remove(fname) - if safefmode: # throw error if file DNE + if safe_file: # throw error if file DNE assert throws(dill.loads, (f_dumped,), IOError) else: f2 = dill.loads(f_dumped) assert f2.mode == fmode - # 1) preserve mode and position #XXX: ? - # assert f2.tell() == ftell # 200 - # assert f2.read() == "" - # f2.seek(0) - # assert f2.read() == "" - # assert f2.tell() == 0 - # 3) prefer data over filehandle state - # FIXME: this fails on systems where f2.tell() always returns 0 - # assert f2.tell() == ftell # 200 - assert f2.read() == "" - f2.seek(0) - assert f2.read() == "" - assert f2.tell() == 0 - # 5) pickle data along with filehandle - # assert f2.tell() == ftell # 200 - # assert f2.read() == "" - # f2.seek(0) - # assert f2.read() == fstr - # assert f2.tell() == ftell # 200 - # 2) treat as if new filehandle, will seek(0) - # assert f2.tell() == 0 - # assert f2.read() == "" - # assert f2.tell() == 0 - # 4) preserve mode and position, seek(EOF) if ftell > EOF - # assert f2.tell() == 0 - # assert f2.read() == "" - # f2.seek(0) - # assert f2.read() == "" - # assert f2.tell() == 0 + if file_mode == dill.FMODE_PRESERVEDATA: + # FIXME: this fails on systems where f2.tell() always returns 0 + # assert f2.tell() == ftell # 200 + assert f2.read() == "" + f2.seek(0) + assert f2.read() == "" + assert f2.tell() == 0 + elif file_mode == dill.FMODE_PICKLECONTENTS: + assert f2.tell() == ftell # 200 + assert f2.read() == "" + f2.seek(0) + assert f2.read() == fstr + assert f2.tell() == ftell # 200 + elif file_mode == dill.FMODE_NEWHANDLE: + assert f2.tell() == 0 + assert f2.read() == "" + assert f2.tell() == 0 + else: + raise 
RuntimeError("Uncovered file mode!") f2.close() # write @@ -315,14 +274,14 @@ def test(safefmode=False, kwargs={}): f = open(fname, "w+") f.write("hello") - f_dumped = dill.dumps(f, **kwargs) + f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) ftell = f.tell() fmode = f.mode f.close() os.remove(fname) - if safefmode: # throw error if file DNE + if safe_file: # throw error if file DNE assert throws(dill.loads, (f_dumped,), IOError) else: f2 = dill.loads(f_dumped) @@ -330,30 +289,20 @@ def test(safefmode=False, kwargs={}): f2tell = f2.tell() f2.write(" world!") f2.close() - # 1) preserve mode and position - # assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" - # assert f2mode == fmode - # assert f2tell == ftell - # 3) prefer data over filehandle state - assert open(fname).read() == " world!" - assert f2mode == 'w+' - assert f2tell == 0 - # 2) treat as if new filehandle, will truncate file - # assert open(fname).read() == " world!" - # assert f2mode == fmode - # assert f2tell == 0 - # 5) pickle data along with filehandle - # assert open(fname).read() == "hello world!" - # assert f2mode == fmode - # assert f2tell == ftell - # 4a) use "r" to read data, then use "w" to write new file - # assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" - # assert f2mode == fmode - # assert f2tell == ftell - # 4b) preserve mode and position, seek(EOF) if ftell > EOF - # assert open(fname).read() == " world!" - # assert f2mode == fmode - # assert f2tell == 0 + if file_mode == dill.FMODE_PRESERVEDATA: + assert open(fname).read() == " world!" + assert f2mode == 'w+' + assert f2tell == 0 + elif file_mode == dill.FMODE_NEWHANDLE: + assert open(fname).read() == " world!" + assert f2mode == fmode + assert f2tell == 0 + elif file_mode == dill.FMODE_PICKLECONTENTS: + assert open(fname).read() == "hello world!" 
+ assert f2mode == fmode + assert f2tell == ftell + else: + raise RuntimeError("Uncovered file mode!") # append @@ -361,14 +310,14 @@ def test(safefmode=False, kwargs={}): f = open(fname, "a") f.write("hello") - f_dumped = dill.dumps(f, **kwargs) + f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) ftell = f.tell() fmode = f.mode f.close() os.remove(fname) - if safefmode: # throw error if file DNE + if safe_file: # throw error if file DNE assert throws(dill.loads, (f_dumped,), IOError) else: f2 = dill.loads(f_dumped) @@ -377,21 +326,17 @@ def test(safefmode=False, kwargs={}): f2.write(" world!") f2.close() assert f2mode == fmode - # 1) preserve mode and position #XXX: also 3) - assert open(fname).read() == " world!" # 3) - assert f2tell == 0 - # 2) treat as if new filehandle, will seek(EOF) - # assert open(fname).read() == " world!" - # assert f2tell == 0 - # 5) pickle data along with filehandle - # assert open(fname).read() == "hello world!" - # assert f2tell == ftell - # 4a) use "r" to read data, then use "a" to write new file - # assert open(fname).read() == " world!" - # assert f2tell == ftell - # 4b) preserve mode and position, seek(EOF) if ftell > EOF - # assert open(fname).read() == " world!" - # assert f2tell == 0 + if file_mode == dill.FMODE_PRESERVEDATA: + assert open(fname).read() == " world!" + assert f2tell == 0 + elif file_mode == dill.FMODE_NEWHANDLE: + assert open(fname).read() == " world!" + assert f2tell == 0 + elif file_mode == dill.FMODE_PICKLECONTENTS: + assert open(fname).read() == "hello world!" 
+ assert f2tell == ftell + else: + raise RuntimeError("Uncovered file mode!") # file exists, with different contents (larger size) # read @@ -400,89 +345,72 @@ def test(safefmode=False, kwargs={}): f = open(fname, "r") fstr = f.read() - f_dumped = dill.dumps(f, **kwargs) + f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() _flen = 250 _fstr = write_randomness(number=_flen) - #XXX: no safefmode: no way to be 'safe'? + # XXX: no safe_file: no way to be 'safe'? f2 = dill.loads(f_dumped) assert f2.mode == fmode - # 1) preserve mode and position #XXX: ? - # assert f2.tell() == ftell # 200 - # assert f2.read() == _fstr[ftell:] - # f2.seek(0) - # assert f2.read() == _fstr - # assert f2.tell() == _flen # 250 - # 3) prefer data over filehandle state - assert f2.tell() == ftell # 200 - assert f2.read() == _fstr[ftell:] - f2.seek(0) - assert f2.read() == _fstr - assert f2.tell() == _flen # 250 - # 4) preserve mode and position, seek(EOF) if ftell > EOF - # assert f2.tell() == ftell # 200 - # assert f2.read() == _fstr[ftell:] - # f2.seek(0) - # assert f2.read() == _fstr - # assert f2.tell() == _flen # 250 - # 2) treat as if new filehandle, will seek(0) - # assert f2.tell() == 0 - # assert f2.read() == _fstr - # assert f2.tell() == _flen # 250 - # 5) pickle data along with filehandle - # assert f2.tell() == ftell # 200 - # assert f2.read() == "" - # f2.seek(0) - # assert f2.read() == fstr - # assert f2.tell() == ftell # 200 - f2.close() #XXX: other alternatives? 
+ if file_mode == dill.FMODE_PRESERVEDATA: + assert f2.tell() == ftell # 200 + assert f2.read() == _fstr[ftell:] + f2.seek(0) + assert f2.read() == _fstr + assert f2.tell() == _flen # 250 + elif file_mode == dill.FMODE_NEWHANDLE: + assert f2.tell() == 0 + assert f2.read() == _fstr + assert f2.tell() == _flen # 250 + elif file_mode == dill.FMODE_PICKLECONTENTS: + assert f2.tell() == ftell # 200 + assert f2.read() == "" + f2.seek(0) + assert f2.read() == fstr + assert f2.tell() == ftell # 200 + else: + raise RuntimeError("Uncovered file mode!") + f2.close() # XXX: other alternatives? # write f = open(fname, "w") f.write("hello") - f_dumped = dill.dumps(f, **kwargs) + f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) fmode = f.mode ftell = f.tell() -# f.close() + fstr = open(fname).read() -# f = open(fname, "a") f.write(" and goodbye!") _ftell = f.tell() f.close() - #XXX: no safefmode: no way to be 'safe'? + # XXX: no safe_file: no way to be 'safe'? f2 = dill.loads(f_dumped) f2mode = f2.mode f2tell = f2.tell() f2.write(" world!") f2.close() - # 1) preserve mode and position - # assert open(fname).read() == "\x00\x00\x00\x00\x00 world!" - # assert f2mode == fmode - # assert f2tell == ftell - # 3) prefer data over filehandle state - assert open(fname).read() == "hello world!odbye!" - assert f2mode == 'r+' #XXX: have to decide 'r+', 'a', ...? - assert f2tell == ftell - # 2) treat as if new filehandle, will truncate file - # assert open(fname).read() == " world!" - # assert f2mode == fmode - # assert f2tell == 0 - # 5) pickle data along with filehandle - # assert open(fname).read() == "hello world!" - # assert f2mode == fmode - # assert f2tell == ftell - # 4) use "r" to read data, then use "w" to write new file - # assert open(fname).read() == "hello world!odbye!" - # assert f2mode == fmode - # assert f2tell == ftell + if file_mode == dill.FMODE_PRESERVEDATA: + assert open(fname).read() == "hello world!odbye!" 
+ assert f2mode == 'r+' # XXX: have to decide 'r+', 'a', ...? + assert f2tell == ftell + elif file_mode == dill.FMODE_NEWHANDLE: + assert open(fname).read() == " world!" + assert f2mode == fmode + assert f2tell == 0 + elif file_mode == dill.FMODE_PICKLECONTENTS: + assert open(fname).read() == "hello world!" + assert f2mode == fmode + assert f2tell == ftell + else: + raise RuntimeError("Uncovered file mode!") f2.close() # append @@ -491,7 +419,7 @@ def test(safefmode=False, kwargs={}): f = open(fname, "a") f.write("hello") - f_dumped = dill.dumps(f, **kwargs) + f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) fmode = f.mode ftell = f.tell() fstr = open(fname).read() @@ -500,7 +428,7 @@ def test(safefmode=False, kwargs={}): _ftell = f.tell() f.close() - #XXX: no safefmode: no way to be 'safe'? + # XXX: no safe_file: no way to be 'safe'? f2 = dill.loads(f_dumped) f2mode = f2.mode @@ -508,23 +436,26 @@ def test(safefmode=False, kwargs={}): f2.write(" world!") f2.close() assert f2mode == fmode - # 1) preserve mode and position # also 3) - assert open(fname).read() == "hello and goodbye! world!" - assert f2tell == ftell - # 2) treat as if new filehandle, will seek(EOF) - # assert open(fname).read() == "hello and goodbye! world!" - # assert f2tell == _ftell - # 5) pickle data along with filehandle - # assert open(fname).read() == "hello world!" - # assert f2tell == ftell - # 4) use "r" to read data, then use "a" to write new file - # assert open(fname).read() == "hello and goodbye! world!" - # assert f2tell == ftell + if file_mode == dill.FMODE_PRESERVEDATA: + assert open(fname).read() == "hello and goodbye! world!" + assert f2tell == ftell + elif file_mode == dill.FMODE_NEWHANDLE: + assert open(fname).read() == "hello and goodbye! world!" + assert f2tell == _ftell + elif file_mode == dill.FMODE_PICKLECONTENTS: + assert open(fname).read() == "hello world!" 
+ assert f2tell == ftell + else: + raise RuntimeError("Uncovered file mode!") f2.close() -test() +test(safe_file=False, file_mode=dill.FMODE_NEWHANDLE) +test(safe_file=False, file_mode=dill.FMODE_PRESERVEDATA) +test(safe_file=False, file_mode=dill.FMODE_PICKLECONTENTS) # TODO: switch this on when #57 is closed -# test(True, {"safe_file": True}) +# test(safe_file=True, file_mode=dill.FMODE_NEWHANDLE) +# test(safe_file=True, file_mode=dill.FMODE_PRESERVEDATA) +# test(safe_file=True, file_mode=dill.FMODE_PICKLECONTENTS) if os.path.exists(fname): os.remove(fname) From 087c00899ef55f31d36e7aee51a958b17daf8c91 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Thu, 14 Aug 2014 15:05:41 +0100 Subject: [PATCH 36/77] Change to use os.open to avoid visible mode change --- dill/dill.py | 25 ++++++++----------------- tests/test_file.py | 8 +++++--- 2 files changed, 13 insertions(+), 20 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 2a6aa5bc..b525c80d 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -399,20 +399,6 @@ def _create_filehandle(name, mode, position, closed, open, safe, file_mode, fdat else: current_size = os.path.getsize(name) - if file_mode == FMODE_PRESERVEDATA and os.path.exists(name): - # Mode translation - # Mode | Unpickled mode - # --------|--------------- - # r | r - # r+ | r+ - # w | r+ - # w+ | r+ - # a | a - # a+ | a+ - # Note: If the file does not exist, the mode is not translated - mode = mode.replace("w+", "r+") - mode = mode.replace("w", "r+") - if position > current_size: if safe: raise IOError("File '%s' is too short" % name) @@ -421,18 +407,23 @@ def _create_filehandle(name, mode, position, closed, open, safe, file_mode, fdat # try to open the file by name # NOTE: has different fileno try: + #FIXME: missing: *buffering*, encoding, softspace if file_mode == FMODE_PICKLECONTENTS: f = open(name, mode if "w" in mode else "w") f.write(fdata) if "w" not in mode: f.close() - f = open(name, mode) #FIXME: missing: *buffering*, encoding, 
softspace + f = open(name, mode) + elif file_mode == FMODE_PRESERVEDATA and "w" in mode: + # stop truncation when opening + def opener(file, flags): + return os.open(file, flags ^ os.O_TRUNC) + f = open(name, mode, opener=opener) else: - f = open(name, mode) #FIXME: missing: *buffering*, encoding, softspace + f = open(name, mode) except IOError: err = sys.exc_info()[1] raise UnpicklingError(err) - #XXX: python default is closed '' file/mode if closed: f.close() elif position >= 0 and file_mode != FMODE_NEWHANDLE: diff --git a/tests/test_file.py b/tests/test_file.py index 1fc5ce35..e3294fac 100644 --- a/tests/test_file.py +++ b/tests/test_file.py @@ -61,6 +61,7 @@ def test(safe_file, file_mode): f2 = dill.loads(f_dumped) f2mode = f2.mode f2tell = f2.tell() + f2name = f2.name f2.write(" world!") f2.close() @@ -70,8 +71,9 @@ def test(safe_file, file_mode): assert f2tell == 0 elif file_mode == dill.FMODE_PRESERVEDATA: assert open(fname).read() == "hello world!" - assert f2mode == 'r+' + assert f2mode == fmode assert f2tell == ftell + assert f2name == fname elif file_mode == dill.FMODE_PICKLECONTENTS: assert open(fname).read() == "hello world!" assert f2mode == fmode @@ -174,7 +176,7 @@ def test(safe_file, file_mode): f2.close() if file_mode == dill.FMODE_PRESERVEDATA: assert open(fname).read() == "h world!" - assert f2mode == 'r+' + assert f2mode == fmode assert f2tell == _ftell elif file_mode == dill.FMODE_NEWHANDLE: assert open(fname).read() == " world!" @@ -399,7 +401,7 @@ def test(safe_file, file_mode): f2.close() if file_mode == dill.FMODE_PRESERVEDATA: assert open(fname).read() == "hello world!odbye!" - assert f2mode == 'r+' # XXX: have to decide 'r+', 'a', ...? + assert f2mode == fmode assert f2tell == ftell elif file_mode == dill.FMODE_NEWHANDLE: assert open(fname).read() == " world!" 
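Note on the patch above: the goal is to reopen a file that was pickled in "w" mode without truncating the data already on disk, while still reporting the original mode. A minimal standalone sketch of that idea follows, assuming Python 3. The helper name `open_without_truncating` is hypothetical and is not part of dill. The sketch builds the flags explicitly rather than toggling `os.O_TRUNC` with XOR, since XOR would set the flag if it were not already present; clearing a flag is normally written `flags & ~os.O_TRUNC`.

```python
import os
import tempfile


def open_without_truncating(path, mode="w"):
    """Open `path` for writing while keeping its existing contents.

    Hypothetical helper for illustration only; not dill's implementation.
    """
    # O_CREAT creates a missing file; O_TRUNC is deliberately left out,
    # so data that is already on disk survives the open.
    flags = os.O_CREAT
    flags |= os.O_RDWR if "+" in mode else os.O_WRONLY
    fd = os.open(path, flags, 0o666)
    # os.fdopen wraps the existing descriptor; it does not reopen the
    # file, so the "w" in `mode` cannot truncate anything here.
    return os.fdopen(fd, mode)
```

With a file that already contains "hello world!", writing "HELLO" through the returned handle overwrites only the first five characters. One caveat: on Python 3 the handle returned by `os.fdopen` reports the integer descriptor as its `name`, so code that needs the original path has to restore it separately.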
From 47cc8b0f1b8e895b9b4681cdb260daf84c9097f1 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Thu, 14 Aug 2014 18:14:27 +0100 Subject: [PATCH 37/77] Add file modes to `__all__`, switch on safe_file tests, improve mode x handling and fix shabang --- dill/dill.py | 16 +++++++--------- tests/test_file.py | 11 ++++++----- 2 files changed, 13 insertions(+), 14 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index b525c80d..03a0094d 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -14,10 +14,11 @@ Test against "all" python types (Std. Lib. CH 1-15 @ 2.7) by mmckerns. Test against CH16+ Std. Lib. ... TBD. """ -__all__ = ['dump','dumps','load','loads','dump_session','load_session',\ - 'Pickler','Unpickler','register','copy','pickle','pickles',\ - 'HIGHEST_PROTOCOL','DEFAULT_PROTOCOL',\ - 'PicklingError','UnpicklingError'] +__all__ = ['dump','dumps','load','loads','dump_session','load_session', + 'Pickler','Unpickler','register','copy','pickle','pickles', + 'HIGHEST_PROTOCOL','DEFAULT_PROTOCOL', + 'PicklingError','UnpicklingError','FMODE_NEWHANDLE', + 'FMODE_PRESERVEDATA','FMODE_PICKLECONTENTS'] import logging log = logging.getLogger("dill") @@ -379,16 +380,13 @@ def _create_filehandle(name, mode, position, closed, open, safe, file_mode, fdat if name in list(names.keys()): f = names[name] #XXX: safer "f=sys.stdin" elif name == '': - import os f = os.tmpfile() elif name == '': import tempfile f = tempfile.TemporaryFile(mode) else: - import os - - if mode == "x": - mode = "w" + # treat x mode as w mode + mode = mode.replace("x", "w") if not os.path.exists(name): if safe: diff --git a/tests/test_file.py b/tests/test_file.py index e3294fac..c0252711 100644 --- a/tests/test_file.py +++ b/tests/test_file.py @@ -1,4 +1,4 @@ -# usr/bin/env python +#!/usr/bin/env python # # Author: Mike McKerns (mmckerns @caltech and @uqfoundation) # Copyright (c) 2008-2014 California Institute of Technology. 
@@ -455,9 +455,10 @@ def test(safe_file, file_mode): test(safe_file=False, file_mode=dill.FMODE_NEWHANDLE) test(safe_file=False, file_mode=dill.FMODE_PRESERVEDATA) test(safe_file=False, file_mode=dill.FMODE_PICKLECONTENTS) -# TODO: switch this on when #57 is closed -# test(safe_file=True, file_mode=dill.FMODE_NEWHANDLE) -# test(safe_file=True, file_mode=dill.FMODE_PRESERVEDATA) -# test(safe_file=True, file_mode=dill.FMODE_PICKLECONTENTS) + +test(safe_file=True, file_mode=dill.FMODE_NEWHANDLE) +test(safe_file=True, file_mode=dill.FMODE_PRESERVEDATA) +test(safe_file=True, file_mode=dill.FMODE_PICKLECONTENTS) + if os.path.exists(fname): os.remove(fname) From 0a2fa3139951a8ef43377413e646f98ebe8907a8 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Mon, 18 Aug 2014 21:08:51 +0100 Subject: [PATCH 38/77] Add backwards compatibility to file handling --- dill/dill.py | 31 ++++++++++++++++++++++++++++--- 1 file changed, 28 insertions(+), 3 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 03a0094d..c8d76d5b 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -414,9 +414,34 @@ def _create_filehandle(name, mode, position, closed, open, safe, file_mode, fdat f = open(name, mode) elif file_mode == FMODE_PRESERVEDATA and "w" in mode: # stop truncation when opening - def opener(file, flags): - return os.open(file, flags ^ os.O_TRUNC) - f = open(name, mode, opener=opener) + flags = os.O_CREAT + if "+" in mode: + flags |= os.O_RDWR + else: + flags |= os.O_WRONLY + f = os.fdopen(os.open(name, flags), mode) + # set name to the correct value + if PY3: + r = getattr(f, "buffer", f) + r = getattr(r, "raw", r) + r.name = name + else: + class FILE(ctypes.Structure): + _fields_ = [("refcount", ctypes.c_long), + ("type_obj", ctypes.py_object), + ("file_pointer", ctypes.c_voidp), + ("name", ctypes.py_object)] + + class PyObject(ctypes.Structure): + _fields_ = [ + ("ob_refcnt", ctypes.c_int), + ("ob_type", ctypes.py_object) + ] + if not HAS_CTYPES: + raise RuntimeError("Need ctypes to 
set file name") + ctypes.cast(id(f), ctypes.POINTER(FILE)).contents.name = name + ctypes.cast(id(name), ctypes.POINTER(PyObject)).contents.ob_refcnt += 1 + assert f.name == name else: f = open(name, mode) except IOError: From 4edc3c0b5e181023d6784f1b39d995a51b196fb3 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Wed, 20 Aug 2014 20:21:12 +0100 Subject: [PATCH 39/77] Fix for python 2.5 --- tests/test_file.py | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/tests/test_file.py b/tests/test_file.py index c0252711..8e387ad2 100644 --- a/tests/test_file.py +++ b/tests/test_file.py @@ -15,11 +15,13 @@ def write_randomness(number=200): - with open(fname, "w") as f: - for i in range(number): - f.write(random.choice(rand_chars)) - with open(fname, "r") as f: - contents = f.read() + f = open(fname, "w") + for i in range(number): + f.write(random.choice(rand_chars)) + f.close() + f = open(fname, "r") + contents = f.read() + f.close() return contents From b435bb7b1cd892ca0446e190d06088424795c278 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Wed, 20 Aug 2014 21:19:39 +0100 Subject: [PATCH 40/77] Add extra handling for x mode --- dill/dill.py | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index c8d76d5b..2cfdcb56 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -386,7 +386,8 @@ def _create_filehandle(name, mode, position, closed, open, safe, file_mode, fdat f = tempfile.TemporaryFile(mode) else: # treat x mode as w mode - mode = mode.replace("x", "w") + if "x" in mode and sys.hexversion < 0x03030000: + raise IOError("invalid mode 'x'") if not os.path.exists(name): if safe: @@ -412,7 +413,8 @@ def _create_filehandle(name, mode, position, closed, open, safe, file_mode, fdat if "w" not in mode: f.close() f = open(name, mode) - elif file_mode == FMODE_PRESERVEDATA and "w" in mode: + elif file_mode == FMODE_PRESERVEDATA \ + and "w" in mode or "x" in mode: # stop truncation when opening flags = 
os.O_CREAT if "+" in mode: From e256d57d5da03a3b4f1d39f36be813dd7cdf2c3c Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Thu, 21 Aug 2014 10:15:24 +0100 Subject: [PATCH 41/77] Change safe_file to a more generic name safeio --- dill/dill.py | 18 ++++++++--------- tests/test_file.py | 50 +++++++++++++++++++++++----------------------- 2 files changed, 34 insertions(+), 34 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index e094b841..201a6c7f 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -151,14 +151,14 @@ def copy(obj, *args, **kwds): """use pickling to 'copy' an object""" return loads(dumps(obj, *args, **kwds)) -def dump(obj, file, protocol=None, byref=False, file_mode=FMODE_NEWHANDLE, safe_file=False): +def dump(obj, file, protocol=None, byref=False, file_mode=FMODE_NEWHANDLE, safeio=False): """pickle an object to a file""" if protocol is None: protocol = DEFAULT_PROTOCOL pik = Pickler(file, protocol) pik._main_module = _main_module _byref = pik._byref pik._byref = bool(byref) - pik._safe_file = safe_file + pik._safeio = safeio pik._file_mode = file_mode # hack to catch subclassed numpy array instances if NumpyArrayType and ndarrayinstance(obj): @@ -174,10 +174,10 @@ def save_numpy_array(pickler, obj): pik._byref = _byref return -def dumps(obj, protocol=None, byref=False, file_mode=FMODE_NEWHANDLE, safe_file=False): +def dumps(obj, protocol=None, byref=False, file_mode=FMODE_NEWHANDLE, safeio=False): """pickle an object to a string""" file = StringIO() - dump(obj, file, protocol, byref, file_mode, safe_file) + dump(obj, file, protocol, byref, file_mode, safeio) return file.getvalue() def load(file): @@ -372,7 +372,7 @@ def _create_lock(locked, *args): raise UnpicklingError("Cannot acquire lock") return lock -def _create_filehandle(name, mode, position, closed, open, safe, file_mode, fdata): # buffering=0 +def _create_filehandle(name, mode, position, closed, open, safeio, file_mode, fdata): # buffering=0 # only pickles the handle, not the file 
contents... good? or StringIO(data)? # (for file contents see: http://effbot.org/librarybook/copy-reg.htm) # NOTE: handle special cases first (are there more special cases?) @@ -391,7 +391,7 @@ def _create_filehandle(name, mode, position, closed, open, safe, file_mode, fdat raise IOError("invalid mode 'x'") if not os.path.exists(name): - if safe: + if safeio: raise IOError("File '%s' does not exist" % name) elif "r" in mode and file_mode != FMODE_PICKLECONTENTS: name = os.devnull @@ -400,7 +400,7 @@ def _create_filehandle(name, mode, position, closed, open, safe, file_mode, fdat current_size = os.path.getsize(name) if position > current_size: - if safe: + if safeio: raise IOError("File '%s' is too short" % name) elif file_mode == FMODE_PRESERVEDATA: position = current_size @@ -415,7 +415,7 @@ def _create_filehandle(name, mode, position, closed, open, safe, file_mode, fdat f.close() f = open(name, mode) elif file_mode == FMODE_PRESERVEDATA \ - and "w" in mode or "x" in mode: + and ("w" in mode or "x" in mode): # stop truncation when opening flags = os.O_CREAT if "+" in mode: @@ -677,7 +677,7 @@ def _save_file(pickler, obj, open_): fdata = "" pickler.save_reduce(_create_filehandle, (obj.name, obj.mode, position, obj.closed, open_, pickler._safe_file, - pickler._file_mode, fdata), obj=obj) + pickler._safeio, fdata), obj=obj) return diff --git a/tests/test_file.py b/tests/test_file.py index 8e387ad2..e0b5ad0a 100644 --- a/tests/test_file.py +++ b/tests/test_file.py @@ -38,14 +38,14 @@ def throws(op, args, exc): return False -def test(safe_file, file_mode): +def test(safeio, file_mode): # file exists, with same contents # read write_randomness() f = open(fname, "r") - _f = dill.loads(dill.dumps(f, safe_file=safe_file, file_mode=file_mode)) + _f = dill.loads(dill.dumps(f, safeio=safeio, file_mode=file_mode)) assert _f.mode == f.mode assert _f.tell() == f.tell() assert _f.read() == f.read() @@ -56,7 +56,7 @@ def test(safe_file, file_mode): f = open(fname, "w") 
f.write("hello") - f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) + f_dumped = dill.dumps(f, safeio=safeio, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() @@ -89,7 +89,7 @@ def test(safe_file, file_mode): f = open(fname, "a") f.write("hello") - f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) + f_dumped = dill.dumps(f, safeio=safeio, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() @@ -119,14 +119,14 @@ def test(safe_file, file_mode): f = open(fname, "r") fstr = f.read() - f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) + f_dumped = dill.dumps(f, safeio=safeio, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() _flen = 150 _fstr = write_randomness(number=_flen) - if safe_file: # throw error if ftell > EOF + if safeio: # throw error if ftell > EOF assert throws(dill.loads, (f_dumped,), IOError) else: f2 = dill.loads(f_dumped) @@ -157,7 +157,7 @@ def test(safe_file, file_mode): f = open(fname, "w") f.write("hello") - f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) + f_dumped = dill.dumps(f, safeio=safeio, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() @@ -168,7 +168,7 @@ def test(safe_file, file_mode): _ftell = f.tell() f.close() - if safe_file: # throw error if ftell > EOF + if safeio: # throw error if ftell > EOF assert throws(dill.loads, (f_dumped,), IOError) else: f2 = dill.loads(f_dumped) @@ -198,7 +198,7 @@ def test(safe_file, file_mode): f = open(fname, "a") f.write("hello") - f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) + f_dumped = dill.dumps(f, safeio=safeio, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() @@ -209,7 +209,7 @@ def test(safe_file, file_mode): _ftell = f.tell() f.close() - if safe_file: # throw error if ftell > EOF + if safeio: # throw error if ftell > EOF assert throws(dill.loads, (f_dumped,), IOError) else: f2 = dill.loads(f_dumped) @@ -239,14 +239,14 @@ def test(safe_file, 
file_mode): f = open(fname, "r") fstr = f.read() - f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) + f_dumped = dill.dumps(f, safeio=safeio, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() os.remove(fname) - if safe_file: # throw error if file DNE + if safeio: # throw error if file DNE assert throws(dill.loads, (f_dumped,), IOError) else: f2 = dill.loads(f_dumped) @@ -278,14 +278,14 @@ def test(safe_file, file_mode): f = open(fname, "w+") f.write("hello") - f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) + f_dumped = dill.dumps(f, safeio=safeio, file_mode=file_mode) ftell = f.tell() fmode = f.mode f.close() os.remove(fname) - if safe_file: # throw error if file DNE + if safeio: # throw error if file DNE assert throws(dill.loads, (f_dumped,), IOError) else: f2 = dill.loads(f_dumped) @@ -314,14 +314,14 @@ def test(safe_file, file_mode): f = open(fname, "a") f.write("hello") - f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) + f_dumped = dill.dumps(f, safeio=safeio, file_mode=file_mode) ftell = f.tell() fmode = f.mode f.close() os.remove(fname) - if safe_file: # throw error if file DNE + if safeio: # throw error if file DNE assert throws(dill.loads, (f_dumped,), IOError) else: f2 = dill.loads(f_dumped) @@ -349,7 +349,7 @@ def test(safe_file, file_mode): f = open(fname, "r") fstr = f.read() - f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) + f_dumped = dill.dumps(f, safeio=safeio, file_mode=file_mode) fmode = f.mode ftell = f.tell() f.close() @@ -384,7 +384,7 @@ def test(safe_file, file_mode): f = open(fname, "w") f.write("hello") - f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) + f_dumped = dill.dumps(f, safeio=safeio, file_mode=file_mode) fmode = f.mode ftell = f.tell() @@ -423,7 +423,7 @@ def test(safe_file, file_mode): f = open(fname, "a") f.write("hello") - f_dumped = dill.dumps(f, safe_file=safe_file, file_mode=file_mode) + f_dumped = dill.dumps(f, 
safeio=safeio, file_mode=file_mode) fmode = f.mode ftell = f.tell() fstr = open(fname).read() @@ -454,13 +454,13 @@ def test(safe_file, file_mode): f2.close() -test(safe_file=False, file_mode=dill.FMODE_NEWHANDLE) -test(safe_file=False, file_mode=dill.FMODE_PRESERVEDATA) -test(safe_file=False, file_mode=dill.FMODE_PICKLECONTENTS) +test(safeio=False, file_mode=dill.FMODE_NEWHANDLE) +test(safeio=False, file_mode=dill.FMODE_PRESERVEDATA) +test(safeio=False, file_mode=dill.FMODE_PICKLECONTENTS) -test(safe_file=True, file_mode=dill.FMODE_NEWHANDLE) -test(safe_file=True, file_mode=dill.FMODE_PRESERVEDATA) -test(safe_file=True, file_mode=dill.FMODE_PICKLECONTENTS) +test(safeio=True, file_mode=dill.FMODE_NEWHANDLE) +test(safeio=True, file_mode=dill.FMODE_PRESERVEDATA) +test(safeio=True, file_mode=dill.FMODE_PICKLECONTENTS) if os.path.exists(fname): os.remove(fname) From fb87158da8e99a2cf3d15d22cb9862b7effd5a9a Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Thu, 21 Aug 2014 10:24:00 +0100 Subject: [PATCH 42/77] Fix test failure (stupid mistake) --- dill/dill.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 201a6c7f..4cd528cc 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -676,8 +676,8 @@ def _save_file(pickler, obj, open_): else: fdata = "" pickler.save_reduce(_create_filehandle, (obj.name, obj.mode, position, - obj.closed, open_, pickler._safe_file, - pickler._safeio, fdata), obj=obj) + obj.closed, open_, pickler._safeio, + pickler._file_mode, fdata), obj=obj) return From ed19d4f560a549a658c7c49e59f6dd7bcf168383 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Thu, 21 Aug 2014 10:30:06 +0100 Subject: [PATCH 43/77] Fix 3.0 - 3.2 incompatibility in save_module --- dill/dill.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dill/dill.py b/dill/dill.py index 9600ad81..1649c730 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -805,7 +805,7 @@ def save_weakproxy(pickler, obj): def 
save_module(pickler, obj): # if a module file name starts with this, it should be a standard module, # so should be pickled as a reference - prefix = sys.base_prefix if PY3 else sys.prefix + prefix = getattr(sys, "base_prefix", sys.prefix) std_mod = getattr(obj, "__file__", prefix).startswith(prefix) if obj.__name__ not in ("builtins", "dill") \ and not std_mod or is_dill(pickler) and obj is pickler._main_module: From 0e857f48be4e97ecfeae6bfa98d1345729ce14eb Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Sun, 5 Oct 2014 17:23:59 +0100 Subject: [PATCH 44/77] Change memorise to diff, add turn on function and extra test --- dill/__init__.py | 4 +- dill/{memorise.py => diff.py} | 0 dill/dill.py | 68 ++++++++++++++++++++-------- tests/{test_memo.py => test_diff.py} | 54 +++++++++++----------- tests/test_module.py | 28 ++++++++++-- 5 files changed, 104 insertions(+), 50 deletions(-) rename dill/{memorise.py => diff.py} (100%) rename tests/{test_memo.py => test_diff.py} (53%) diff --git a/dill/__init__.py b/dill/__init__.py index d2671b65..f446279c 100644 --- a/dill/__init__.py +++ b/dill/__init__.py @@ -24,8 +24,8 @@ """ + __license__ from .dill import dump, dumps, load, loads, dump_session, load_session, \ - Pickler, Unpickler, register, copy, pickle, pickles, HIGHEST_PROTOCOL, \ - DEFAULT_PROTOCOL, PicklingError, UnpicklingError, \ + Pickler, Unpickler, register, copy, pickle, pickles, use_diff, \ + HIGHEST_PROTOCOL, DEFAULT_PROTOCOL, PicklingError, UnpicklingError, \ FMODE_NEWHANDLE, FMODE_PRESERVEDATA, FMODE_PICKLECONTENTS from . 
import source, temp, detect diff --git a/dill/memorise.py b/dill/diff.py similarity index 100% rename from dill/memorise.py rename to dill/diff.py diff --git a/dill/dill.py b/dill/dill.py index 563b34aa..e1def51d 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -16,7 +16,7 @@ """ __all__ = ['dump','dumps','load','loads','dump_session','load_session', 'Pickler','Unpickler','register','copy','pickle','pickles', - 'HIGHEST_PROTOCOL','DEFAULT_PROTOCOL', + 'use_diff', 'HIGHEST_PROTOCOL','DEFAULT_PROTOCOL', 'PicklingError','UnpicklingError','FMODE_NEWHANDLE', 'FMODE_PRESERVEDATA','FMODE_PICKLECONTENTS'] @@ -31,10 +31,8 @@ def _trace(boolean): import os import sys -try: - from . import memorise -except ImportError: - import memorise +diff = None +_use_diff = False PY3 = (hex(sys.hexversion) >= '0x30000f0') if PY3: #XXX: get types from dill.objtypes ? import builtins as __builtin__ @@ -257,7 +255,7 @@ class Pickler(StockPickler): def __init__(self, *args, **kwargs): StockPickler.__init__(self, *args, **kwargs) self._main_module = _main_module - self._memorise_cashe = {} + self._diff_cashe = {} class Unpickler(StockUnpickler): """python's Unpickler extended to interpreter sessions and more types""" @@ -300,6 +298,22 @@ def _revert_extension(): if type in pickle_dispatch_copy: StockPickler.dispatch[type] = pickle_dispatch_copy[type] +def use_diff(on=True): + """ + reduces size of pickles by only including object which have changed. + Decreases pickle size but increases CPU time needed. + Also helps avoid some unpicklable objects. + MUST be called at start of script, otherwise changes will not be recorded. + """ + global _use_diff, diff + _use_diff = on + if _use_diff and diff is None: + try: + from . 
import diff as d + except: + import diff as d + diff = d + def _create_typemap(): import types if PY3: @@ -890,20 +904,38 @@ def save_weakproxy(pickler, obj): @register(ModuleType) def save_module(pickler, obj): - if obj.__name__ != "dill": - try: - changed = memorise.whats_changed(obj, - seen=pickler._memorise_cashe)[0] - except RuntimeError: # not memorised module, probably part of dill - pass - else: + if _use_diff: + if obj.__name__ != "dill": + try: + changed = diff.whats_changed(obj, seen=pickler._diff_cashe)[0] + except RuntimeError: # not memorised module, probably part of dill + pass + else: + log.info("M1: %s with diff" % obj) + log.info("Diff: %s", changed.keys()) + pickler.save_reduce(_import_module, (obj.__name__,), obj=obj, + state=changed) + return + + log.info("M2: %s" % obj) + pickler.save_reduce(_import_module, (obj.__name__,), obj=obj) + else: + # if a module file name starts with prefx, it should be a builtin + # module, so should be pickled as a reference + prefix = getattr(sys, "base_prefix", sys.prefix) + std_mod = getattr(obj, "__file__", prefix).startswith(prefix) + if obj.__name__ not in ("builtins", "dill") \ + and not std_mod or is_dill(pickler) and obj is pickler._main_module: log.info("M1: %s" % obj) + _main_dict = obj.__dict__.copy() #XXX: better no copy? option to copy? 
+ [_main_dict.pop(item, None) for item in singletontypes + + ["__builtins__", "__loader__"]] pickler.save_reduce(_import_module, (obj.__name__,), obj=obj, - state=changed) - return - - log.info("M2: %s" % obj) - pickler.save_reduce(_import_module, (obj.__name__,), obj=obj) + state=_main_dict) + else: + log.info("M2: %s" % obj) + pickler.save_reduce(_import_module, (obj.__name__,), obj=obj) + return return @register(TypeType) diff --git a/tests/test_memo.py b/tests/test_diff.py similarity index 53% rename from tests/test_memo.py rename to tests/test_diff.py index b4f37334..1b25e88e 100644 --- a/tests/test_memo.py +++ b/tests/test_diff.py @@ -5,7 +5,7 @@ # License: 3-clause BSD. The full license text is available at: # - http://trac.mystic.cacr.caltech.edu/project/pathos/browser/dill/LICENSE -from dill import memorise as m +from dill import diff class A: @@ -16,47 +16,47 @@ class A: c = A() a.a = b b.a = c -m.memorise(a) -assert not m.has_changed(a) +diff.memorise(a) +assert not diff.has_changed(a) c.a = 1 -assert m.has_changed(a) -m.memorise(c, force=True) -assert not m.has_changed(a) +assert diff.has_changed(a) +diff.memorise(c, force=True) +assert not diff.has_changed(a) c.a = 2 -assert m.has_changed(a) -changed = m.whats_changed(a) +assert diff.has_changed(a) +changed = diff.whats_changed(a) assert list(changed[0].keys()) == ["a"] assert not changed[1] a2 = [] b2 = [a2] c2 = [b2] -m.memorise(c2) -assert not m.has_changed(c2) +diff.memorise(c2) +assert not diff.has_changed(c2) a2.append(1) -assert m.has_changed(c2) -changed = m.whats_changed(c2) +assert diff.has_changed(c2) +changed = diff.whats_changed(c2) assert changed[0] == {} assert changed[1] a3 = {} b3 = {1: a3} c3 = {1: b3} -m.memorise(c3) -assert not m.has_changed(c3) +diff.memorise(c3) +assert not diff.has_changed(c3) a3[1] = 1 -assert m.has_changed(c3) -changed = m.whats_changed(c3) +assert diff.has_changed(c3) +changed = diff.whats_changed(c3) assert changed[0] == {} assert changed[1] import abc # make 
sure that the "_abc_invaldation_counter" does not cause the test to fail -m.memorise(abc.ABCMeta, force=True) -assert not m.has_changed(abc) +diff.memorise(abc.ABCMeta, force=True) +assert not diff.has_changed(abc) abc.ABCMeta.zzz = 1 -assert m.has_changed(abc) -changed = m.whats_changed(abc) +assert diff.has_changed(abc) +changed = diff.whats_changed(abc) assert list(changed[0].keys()) == ["ABCMeta"] assert not changed[1] @@ -66,14 +66,14 @@ class A: c = A() a.a = b b.a = c -m.memorise(a) -assert not m.has_changed(a) +diff.memorise(a) +assert not diff.has_changed(a) c.a = 1 -assert m.has_changed(a) -m.memorise(c, force=True) -assert not m.has_changed(a) +assert diff.has_changed(a) +diff.memorise(c, force=True) +assert not diff.has_changed(a) del c.a -assert m.has_changed(a) -changed = m.whats_changed(a) +assert diff.has_changed(a) +changed = diff.whats_changed(a) assert list(changed[0].keys()) == ["a"] assert not changed[1] diff --git a/tests/test_module.py b/tests/test_module.py index e3f2f621..69f52da5 100644 --- a/tests/test_module.py +++ b/tests/test_module.py @@ -8,9 +8,30 @@ import sys import dill import test_mixins as module +import imp cached = (module.__cached__ if hasattr(module, "__cached__") - else module.__file__ + "c") + else module.__file__.split(".", 1)[0] + ".pyc") + +module.a = 1234 + +pik_mod = dill.dumps(module) + +module.a = 0 + +# remove module +del sys.modules[module.__name__] +del module + +module = dill.loads(pik_mod) +assert hasattr(module, "a") and module.a == 1234 +assert module.double_add(1, 2, 3) == 2 * module.fx + +# Restart, and test use_diff + +imp.reload(module) + +dill.use_diff() module.a = 1234 @@ -29,5 +50,6 @@ # clean up import os os.remove(cached) -if os.path.exists("__pycache__") and not os.listdir("__pycache__"): - os.removedirs("__pycache__") +pycache = os.path.join(os.path.dirname(module.__file__), "__pycache__") +if os.path.exists(pycache) and not os.listdir(pycache): + os.removedirs(pycache) From 
9c43af619ec0acc57848626a7e7665e7a0397f63 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Wed, 4 Mar 2015 17:16:46 +0000 Subject: [PATCH 45/77] Add support for properties --- dill/dill.py | 6 ++++++ tests/test_properties.py | 43 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 49 insertions(+) create mode 100644 tests/test_properties.py diff --git a/dill/dill.py b/dill/dill.py index f4a08228..470b6fe5 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -1000,6 +1000,12 @@ def save_type(pickler, obj): StockPickler.save_global(pickler, obj) return +@register(property) +def save_property(pickler, obj): + log.info("Pr: %s" % obj) + pickler.save_reduce(property, (obj.fget, obj.fset, obj.fdel, obj.__doc__), + obj=obj) + # quick sanity checking def pickles(obj,exact=False,safe=False,**kwds): """quick check if object pickles with dill""" diff --git a/tests/test_properties.py b/tests/test_properties.py new file mode 100644 index 00000000..078c9c2f --- /dev/null +++ b/tests/test_properties.py @@ -0,0 +1,43 @@ +#!/usr/bin/env python +# +# Author: Mike McKerns (mmckerns @caltech and @uqfoundation) +# Copyright (c) 2008-2015 California Institute of Technology. +# License: 3-clause BSD. 
The full license text is available at: +# - http://trac.mystic.cacr.caltech.edu/project/pathos/browser/dill/LICENSE + +import dill + + +class Foo(object): + def __init__(self): + self._data = 1 + + @property + def data(self): + return self._data + + @data.setter + def data(self, x): + self._data = x + +FooS = dill.copy(Foo) + +assert FooS.data.fget is not None +assert FooS.data.fset is not None +assert FooS.data.fdel is None + +try: + res = FooS().data +except Exception as e: + raise AssertionError(str(e)) +else: + assert res == 1 + +try: + f = FooS() + f.data = 1024 + res = f.data +except Exception as e: + raise AssertionError(str(e)) +else: + assert res == 1024 From 65a436203f41365857ca1cf542c3855247ca1dea Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Fri, 17 Apr 2015 15:49:27 +0100 Subject: [PATCH 46/77] Make sure metaclasses are handled by dill --- dill/dill.py | 18 ++++++++++++++++-- tests/test_classdef.py | 22 +++++++++++++++++++--- 2 files changed, 35 insertions(+), 5 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 470b6fe5..5a9c9f38 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -253,10 +253,24 @@ def load_session(filename='/tmp/session.pkl', main_module=_main_module): ### End: Pickle the Interpreter +class MetaCatchingDict(dict): + def get(self, key, default=None): + try: + return self[key] + except KeyError: + return default + + def __missing__(self, key): + if issubclass(key, type): + return save_type + else: + raise KeyError() + + ### Extend the Picklers class Pickler(StockPickler): """python's Pickler extended to interpreter sessions""" - dispatch = StockPickler.dispatch.copy() + dispatch = MetaCatchingDict(StockPickler.dispatch.copy()) _main_module = None _session = False _byref = False @@ -973,7 +987,7 @@ def save_type(pickler, obj): StockPickler.save_global(pickler, obj) return except AttributeError: pass - if type(obj) == type: + if issubclass(type(obj), type): # try: # used when pickling the class as code (or the interpreter) if 
is_dill(pickler) and not pickler._byref: # thanks to Tom Stepleton pointing out pickler._session unneeded diff --git a/tests/test_classdef.py b/tests/test_classdef.py index 99fd70e9..840bd727 100644 --- a/tests/test_classdef.py +++ b/tests/test_classdef.py @@ -6,6 +6,7 @@ # - http://trac.mystic.cacr.caltech.edu/project/pathos/browser/dill/LICENSE import dill +import sys # test classdefs class _class: @@ -32,13 +33,27 @@ def __call__(self): def ok(self): return True +class _meta(type): + pass + +def __call__(self): + pass +def ok(self): + return True + +_mclass = _meta("_mclass", (object,), {"__call__": __call__, "ok": ok}) + +del __call__ +del ok + o = _class() oc = _class2() n = _newclass() nc = _newclass2() +m = _mclass() -clslist = [_class,_class2,_newclass,_newclass2] -objlist = [o,oc,n,nc] +clslist = [_class,_class2,_newclass,_newclass2,_mclass] +objlist = [o,oc,n,nc,m] _clslist = [dill.dumps(obj) for obj in clslist] _objlist = [dill.dumps(obj) for obj in objlist] @@ -55,9 +70,10 @@ def ok(self): _obj = dill.loads(obj) assert _obj.ok() assert _cls.ok(_cls()) + if _cls.__name__ == "_mclass": + assert type(_cls).__name__ == "_meta" # test namedtuple -import sys if hex(sys.hexversion) >= '0x20600f0': from collections import namedtuple From 6f919b8b78a6fdfeff697249cc16682390c8277c Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Fri, 24 Apr 2015 18:40:19 +0100 Subject: [PATCH 47/77] Check for sys.real_prefix when serialising module --- dill/dill.py | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 470b6fe5..3031ca5a 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -944,8 +944,10 @@ def save_module(pickler, obj): else: # if a module file name starts with prefx, it should be a builtin # module, so should be pickled as a reference - prefix = getattr(sys, "base_prefix", sys.prefix) - std_mod = getattr(obj, "__file__", prefix).startswith(prefix) + base_prefix = getattr(sys, "base_prefix", sys.prefix) + 
real_prefix = getattr(sys, "real_prefix", base_prefix) + std_mod = (getattr(obj, "__file__", base_prefix).startswith(base_prefix) + or getattr(obj, "__file__", real_prefix).startswith(real_prefix)) if obj.__name__ not in ("builtins", "dill") \ and not std_mod or is_dill(pickler) and obj is pickler._main_module: log.info("M1: %s" % obj) From e8b625bcc17b29ea0b973269041f8fdf073cb5c8 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 26 May 2015 15:56:07 +0100 Subject: [PATCH 48/77] Check a few more prefixes for the builtin_mod test --- dill/dill.py | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 3031ca5a..07350600 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -944,12 +944,15 @@ def save_module(pickler, obj): else: # if a module file name starts with prefx, it should be a builtin # module, so should be pickled as a reference - base_prefix = getattr(sys, "base_prefix", sys.prefix) - real_prefix = getattr(sys, "real_prefix", base_prefix) - std_mod = (getattr(obj, "__file__", base_prefix).startswith(base_prefix) - or getattr(obj, "__file__", real_prefix).startswith(real_prefix)) + if hasattr(obj, "__file__"): + names = ["base_prefix", "base_exec_prefix", "exec_prefix", + "prefix", "real_prefix"] + builtin_mod = any([obj.__file__.startswith(getattr(sys, name)) + for name in names if hasattr(sys, name)]) + else: + builtin_mod = True if obj.__name__ not in ("builtins", "dill") \ - and not std_mod or is_dill(pickler) and obj is pickler._main_module: + and not builtin_mod or is_dill(pickler) and obj is pickler._main_module: log.info("M1: %s" % obj) _main_dict = obj.__dict__.copy() #XXX: better no copy? option to copy? 
[_main_dict.pop(item, None) for item in singletontypes From e6d3f1d242eb9d172edafd5bf66b817a2094bf34 Mon Sep 17 00:00:00 2001 From: James Laird-Wah Date: Fri, 29 May 2015 02:25:00 +1000 Subject: [PATCH 49/77] dump_session: add byref Setting byref=True will preprocess the module by groveling through its top-level objects for things that are identical to those in the other available modules. If it finds matches, it records the names and discards the objects during pickling. On load, it then recovers the objects by "import x as y" using the stored x, y. This fixes issues with nasty unpickleable things in the depths of libraries, as well as making the sessions smaller. Fixes #79 and most of #78. --- dill/dill.py | 48 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/dill/dill.py b/dill/dill.py index 882093cb..84371847 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -222,10 +222,55 @@ def loads(str): ### End: Shorthands ### ### Pickle the Interpreter Session -def dump_session(filename='/tmp/session.pkl', main_module=_main_module): +def _module_map(): + from collections import defaultdict + modmap = defaultdict(list) + items = 'items' if PY3 else 'iteritems' + for name, module in getattr(sys.modules, items)(): + if module is None: + continue + for objname, obj in module.__dict__.items(): + modmap[objname].append((obj, name)) + return modmap + +def _find_source_module(modmap, name, obj, main_module): + for modobj, modname in modmap[name]: + if modobj is obj and modname != main_module.__name__: + return modname + +def _split_module_imports(main_module): + modmap = _module_map() + imported = [] + original = {} + items = 'items' if PY3 else 'iteritems' + for name, obj in getattr(main_module.__dict__, items)(): + source_module = _find_source_module(modmap, name, obj, main_module) + if source_module: + imported.append((source_module, name)) + else: + original[name] = obj + if len(imported): + import types + newmod = 
types.ModuleType(main_module.__name__) + newmod.__dict__.update(original) + newmod.__dill_imported = imported + return newmod + else: + return original + +def _restore_module_imports(main_module): + if '__dill_imported' not in main_module.__dict__: + return + imports = main_module.__dict__.pop('__dill_imported') + for module, name in imports: + exec("from %s import %s" % (module, name), main_module.__dict__) + +def dump_session(filename='/tmp/session.pkl', main_module=_main_module, byref=False): """pickle the current state of __main__ to a file""" f = open(filename, 'wb') try: + if byref: + main_module = _split_module_imports(main_module) pickler = Pickler(f, 2) pickler._main_module = main_module _byref = pickler._byref @@ -248,6 +293,7 @@ def load_session(filename='/tmp/session.pkl', main_module=_main_module): module = unpickler.load() unpickler._session = False main_module.__dict__.update(module.__dict__) + _restore_module_imports(main_module) finally: f.close() return From bd18c69386a25daac97f968819118825c91d8ef1 Mon Sep 17 00:00:00 2001 From: Gabi Davar Date: Fri, 26 Jun 2015 10:29:36 +0300 Subject: [PATCH 50/77] fix some win32 issues: * handle lack of curses gracefully * CDLL(NULL) throws an exception in py27 windows. 
--- dill/_objects.py | 122 ++++++++++++++++++++++-------------------- tests/test_objects.py | 4 +- 2 files changed, 65 insertions(+), 61 deletions(-) diff --git a/dill/_objects.py b/dill/_objects.py index 77724b5c..24ad5c53 100644 --- a/dill/_objects.py +++ b/dill/_objects.py @@ -56,7 +56,6 @@ import os import logging import optparse -import curses #import __hello__ import threading import socket @@ -69,12 +68,8 @@ HAS_ALL = True except ImportError: # Ubuntu HAS_ALL = False -try: - import ctypes - HAS_CTYPES = True -except ImportError: # MacPorts - HAS_CTYPES = False -from curses import textpad, panel + +import ctypes # helper objects class _class: @@ -106,11 +101,11 @@ def _function2(): from sys import exc_info e, er, tb = exc_info() return er, tb -if HAS_CTYPES: - class _Struct(ctypes.Structure): - pass - _Struct._fields_ = [("_field", ctypes.c_int),("next", ctypes.POINTER(_Struct))] +class _Struct(ctypes.Structure): + pass +_Struct._fields_ = [("_field", ctypes.c_int),("next", ctypes.POINTER(_Struct))] _filedescrip, _tempfile = tempfile.mkstemp('r') # deleted in cleanup +os.close(_filedescrip) _tmpf = tempfile.TemporaryFile('w') # put the objects in order, if possible @@ -188,25 +183,24 @@ class _Struct(ctypes.Structure): a['OptionParserType'] = _oparser = optparse.OptionParser() # pickle ok a['OptionGroupType'] = optparse.OptionGroup(_oparser,"foo") # pickle ok a['OptionType'] = optparse.Option('--foo') # pickle ok -if HAS_CTYPES: - a['CCharType'] = _cchar = ctypes.c_char() - a['CWCharType'] = ctypes.c_wchar() # fail == 2.6 - a['CByteType'] = ctypes.c_byte() - a['CUByteType'] = ctypes.c_ubyte() - a['CShortType'] = ctypes.c_short() - a['CUShortType'] = ctypes.c_ushort() - a['CIntType'] = ctypes.c_int() - a['CUIntType'] = ctypes.c_uint() - a['CLongType'] = ctypes.c_long() - a['CULongType'] = ctypes.c_ulong() - a['CLongLongType'] = ctypes.c_longlong() - a['CULongLongType'] = ctypes.c_ulonglong() - a['CFloatType'] = ctypes.c_float() - a['CDoubleType'] = 
ctypes.c_double() - a['CSizeTType'] = ctypes.c_size_t() - a['CLibraryLoaderType'] = ctypes.cdll - a['StructureType'] = _Struct - a['BigEndianStructureType'] = ctypes.BigEndianStructure() +a['CCharType'] = _cchar = ctypes.c_char() +a['CWCharType'] = ctypes.c_wchar() # fail == 2.6 +a['CByteType'] = ctypes.c_byte() +a['CUByteType'] = ctypes.c_ubyte() +a['CShortType'] = ctypes.c_short() +a['CUShortType'] = ctypes.c_ushort() +a['CIntType'] = ctypes.c_int() +a['CUIntType'] = ctypes.c_uint() +a['CLongType'] = ctypes.c_long() +a['CULongType'] = ctypes.c_ulong() +a['CLongLongType'] = ctypes.c_longlong() +a['CULongLongType'] = ctypes.c_ulonglong() +a['CFloatType'] = ctypes.c_float() +a['CDoubleType'] = ctypes.c_double() +a['CSizeTType'] = ctypes.c_size_t() +a['CLibraryLoaderType'] = ctypes.cdll +a['StructureType'] = _Struct +a['BigEndianStructureType'] = ctypes.BigEndianStructure() #NOTE: also LittleEndianStructureType and UnionType... abstract classes #NOTE: remember for ctypesobj.contents creates a new python object #NOTE: ctypes.c_int._objects is memberdescriptor for object's __dict__ @@ -229,9 +223,8 @@ class _Struct(ctypes.Structure): a['BufferedIOBaseType'] = io.BufferedIOBase() a['UnicodeIOType'] = TextIO() # the new StringIO a['LoggingAdapterType'] = logging.LoggingAdapter(_logger,_dict) # pickle ok - if HAS_CTYPES: - a['CBoolType'] = ctypes.c_bool(1) - a['CLongDoubleType'] = ctypes.c_longdouble() + a['CBoolType'] = ctypes.c_bool(1) + a['CLongDoubleType'] = ctypes.c_longdouble() except ImportError: pass try: # python 2.7 @@ -239,8 +232,7 @@ class _Struct(ctypes.Structure): # data types (CH 8) a['OrderedDictType'] = collections.OrderedDict(_dict) a['CounterType'] = collections.Counter(_dict) - if HAS_CTYPES: - a['CSSizeTType'] = ctypes.c_ssize_t() + a['CSSizeTType'] = ctypes.c_ssize_t() # generic operating system services (CH 15) a['NullHandlerType'] = logging.NullHandler() # pickle ok # new 2.7 a['ArgParseFileType'] = argparse.FileType() # pickle ok @@ -287,16 
+279,20 @@ class _Struct(ctypes.Structure): d['WrapperDescriptorType'] = type.__repr__ a['WrapperDescriptorType2'] = type.__dict__['__module__'] # built-in functions (CH 2) -if PY3: _methodwrap = (1).__lt__ -else: _methodwrap = (1).__cmp__ +if PY3: + _methodwrap = (1).__lt__ +else: + _methodwrap = (1).__cmp__ d['MethodWrapperType'] = _methodwrap a['StaticMethodType'] = staticmethod(_method) a['ClassMethodType'] = classmethod(_method) a['PropertyType'] = property() d['SuperType'] = super(Exception, _exception) # string services (CH 7) -if PY3: _in = _bytes -else: _in = _str +if PY3: + _in = _bytes +else: + _in = _str a['InputType'] = _cstrI = StringIO(_in) a['OutputType'] = _cstrO = StringIO() # data types (CH 8) @@ -468,27 +464,35 @@ class _Struct(ctypes.Structure): x['HashType'] = hashlib.md5() x['HMACType'] = hmac.new(_in) # generic operating system services (CH 15) -#x['CursesWindowType'] = _curwin = curses.initscr() #FIXME: messes up tty -#x['CursesTextPadType'] = textpad.Textbox(_curwin) -#x['CursesPanelType'] = panel.new_panel(_curwin) -if HAS_CTYPES: - x['CCharPType'] = ctypes.c_char_p() - x['CWCharPType'] = ctypes.c_wchar_p() - x['CVoidPType'] = ctypes.c_void_p() +try: + import curses + from curses import textpad, panel + #x['CursesWindowType'] = _curwin = curses.initscr() #FIXME: messes up tty + #x['CursesTextPadType'] = textpad.Textbox(_curwin) + #x['CursesPanelType'] = panel.new_panel(_curwin) +except ImportError: + pass + +x['CCharPType'] = ctypes.c_char_p() +x['CWCharPType'] = ctypes.c_wchar_p() +x['CVoidPType'] = ctypes.c_void_p() +if sys.platform == 'win32': + x['CDLLType'] = _cdll = ctypes.cdll.msvcrt +else: x['CDLLType'] = _cdll = ctypes.CDLL(None) - x['PyDLLType'] = _pydll = ctypes.pythonapi - x['FuncPtrType'] = _cdll._FuncPtr() - x['CCharArrayType'] = ctypes.create_string_buffer(1) - x['CWCharArrayType'] = ctypes.create_unicode_buffer(1) - x['CParamType'] = ctypes.byref(_cchar) - x['LPCCharType'] = ctypes.pointer(_cchar) - x['LPCCharObjType'] = 
_lpchar = ctypes.POINTER(ctypes.c_char) - x['NullPtrType'] = _lpchar() - x['NullPyObjectType'] = ctypes.py_object() - x['PyObjectType'] = ctypes.py_object(1) - x['FieldType'] = _field = _Struct._field - x['CFUNCTYPEType'] = _cfunc = ctypes.CFUNCTYPE(ctypes.c_char) - x['CFunctionType'] = _cfunc(str) +x['PyDLLType'] = _pydll = ctypes.pythonapi +x['FuncPtrType'] = _cdll._FuncPtr() +x['CCharArrayType'] = ctypes.create_string_buffer(1) +x['CWCharArrayType'] = ctypes.create_unicode_buffer(1) +x['CParamType'] = ctypes.byref(_cchar) +x['LPCCharType'] = ctypes.pointer(_cchar) +x['LPCCharObjType'] = _lpchar = ctypes.POINTER(ctypes.c_char) +x['NullPtrType'] = _lpchar() +x['NullPyObjectType'] = ctypes.py_object() +x['PyObjectType'] = ctypes.py_object(1) +x['FieldType'] = _field = _Struct._field +x['CFUNCTYPEType'] = _cfunc = ctypes.CFUNCTYPE(ctypes.c_char) +x['CFunctionType'] = _cfunc(str) try: # python 2.6 # numeric and mathematical types (CH 9) x['MethodCallerType'] = operator.methodcaller('mro') # 2.6 diff --git a/tests/test_objects.py b/tests/test_objects.py index f9cdbc19..fee68b81 100644 --- a/tests/test_objects.py +++ b/tests/test_objects.py @@ -15,15 +15,15 @@ #import pickle # get all objects for testing -from dill import load_types +from dill import load_types, objects load_types(pickleable=True,unpickleable=False) #load_types(pickleable=True,unpickleable=True) -from dill import objects # helper objects class _class: def _method(self): pass + # objects that *fail* if imported special = {} special['LambdaType'] = _lambda = lambda x: lambda y: x From 3b641f07effb0cd8e6e1c37ab64021d496916447 Mon Sep 17 00:00:00 2001 From: Gabi Davar Date: Fri, 26 Jun 2015 15:10:51 +0300 Subject: [PATCH 51/77] revert ctypes assumption fix setup error on windows --- dill/_objects.py | 102 +++++++++++++++++++++++++---------------------- setup.py | 2 +- 2 files changed, 56 insertions(+), 48 deletions(-) diff --git a/dill/_objects.py b/dill/_objects.py index 24ad5c53..bd3b0170 100644 --- 
a/dill/_objects.py +++ b/dill/_objects.py @@ -68,8 +68,11 @@ HAS_ALL = True except ImportError: # Ubuntu HAS_ALL = False - -import ctypes +try: + import ctypes + HAS_CTYPES = True +except ImportError: # MacPorts + HAS_CTYPES = False # helper objects class _class: @@ -101,9 +104,10 @@ def _function2(): from sys import exc_info e, er, tb = exc_info() return er, tb -class _Struct(ctypes.Structure): - pass -_Struct._fields_ = [("_field", ctypes.c_int),("next", ctypes.POINTER(_Struct))] +if HAS_CTYPES: + class _Struct(ctypes.Structure): + pass + _Struct._fields_ = [("_field", ctypes.c_int),("next", ctypes.POINTER(_Struct))] _filedescrip, _tempfile = tempfile.mkstemp('r') # deleted in cleanup os.close(_filedescrip) _tmpf = tempfile.TemporaryFile('w') @@ -183,24 +187,25 @@ class _Struct(ctypes.Structure): a['OptionParserType'] = _oparser = optparse.OptionParser() # pickle ok a['OptionGroupType'] = optparse.OptionGroup(_oparser,"foo") # pickle ok a['OptionType'] = optparse.Option('--foo') # pickle ok -a['CCharType'] = _cchar = ctypes.c_char() -a['CWCharType'] = ctypes.c_wchar() # fail == 2.6 -a['CByteType'] = ctypes.c_byte() -a['CUByteType'] = ctypes.c_ubyte() -a['CShortType'] = ctypes.c_short() -a['CUShortType'] = ctypes.c_ushort() -a['CIntType'] = ctypes.c_int() -a['CUIntType'] = ctypes.c_uint() -a['CLongType'] = ctypes.c_long() -a['CULongType'] = ctypes.c_ulong() -a['CLongLongType'] = ctypes.c_longlong() -a['CULongLongType'] = ctypes.c_ulonglong() -a['CFloatType'] = ctypes.c_float() -a['CDoubleType'] = ctypes.c_double() -a['CSizeTType'] = ctypes.c_size_t() -a['CLibraryLoaderType'] = ctypes.cdll -a['StructureType'] = _Struct -a['BigEndianStructureType'] = ctypes.BigEndianStructure() +if HAS_CTYPES: + a['CCharType'] = _cchar = ctypes.c_char() + a['CWCharType'] = ctypes.c_wchar() # fail == 2.6 + a['CByteType'] = ctypes.c_byte() + a['CUByteType'] = ctypes.c_ubyte() + a['CShortType'] = ctypes.c_short() + a['CUShortType'] = ctypes.c_ushort() + a['CIntType'] = ctypes.c_int() + 
a['CUIntType'] = ctypes.c_uint() + a['CLongType'] = ctypes.c_long() + a['CULongType'] = ctypes.c_ulong() + a['CLongLongType'] = ctypes.c_longlong() + a['CULongLongType'] = ctypes.c_ulonglong() + a['CFloatType'] = ctypes.c_float() + a['CDoubleType'] = ctypes.c_double() + a['CSizeTType'] = ctypes.c_size_t() + a['CLibraryLoaderType'] = ctypes.cdll + a['StructureType'] = _Struct + a['BigEndianStructureType'] = ctypes.BigEndianStructure() #NOTE: also LittleEndianStructureType and UnionType... abstract classes #NOTE: remember for ctypesobj.contents creates a new python object #NOTE: ctypes.c_int._objects is memberdescriptor for object's __dict__ @@ -223,8 +228,9 @@ class _Struct(ctypes.Structure): a['BufferedIOBaseType'] = io.BufferedIOBase() a['UnicodeIOType'] = TextIO() # the new StringIO a['LoggingAdapterType'] = logging.LoggingAdapter(_logger,_dict) # pickle ok - a['CBoolType'] = ctypes.c_bool(1) - a['CLongDoubleType'] = ctypes.c_longdouble() + if HAS_CTYPES: + a['CBoolType'] = ctypes.c_bool(1) + a['CLongDoubleType'] = ctypes.c_longdouble() except ImportError: pass try: # python 2.7 @@ -232,7 +238,8 @@ class _Struct(ctypes.Structure): # data types (CH 8) a['OrderedDictType'] = collections.OrderedDict(_dict) a['CounterType'] = collections.Counter(_dict) - a['CSSizeTType'] = ctypes.c_ssize_t() + if HAS_CTYPES: + a['CSSizeTType'] = ctypes.c_ssize_t() # generic operating system services (CH 15) a['NullHandlerType'] = logging.NullHandler() # pickle ok # new 2.7 a['ArgParseFileType'] = argparse.FileType() # pickle ok @@ -472,27 +479,28 @@ class _Struct(ctypes.Structure): #x['CursesPanelType'] = panel.new_panel(_curwin) except ImportError: pass - -x['CCharPType'] = ctypes.c_char_p() -x['CWCharPType'] = ctypes.c_wchar_p() -x['CVoidPType'] = ctypes.c_void_p() -if sys.platform == 'win32': - x['CDLLType'] = _cdll = ctypes.cdll.msvcrt -else: - x['CDLLType'] = _cdll = ctypes.CDLL(None) -x['PyDLLType'] = _pydll = ctypes.pythonapi -x['FuncPtrType'] = _cdll._FuncPtr() 
-x['CCharArrayType'] = ctypes.create_string_buffer(1) -x['CWCharArrayType'] = ctypes.create_unicode_buffer(1) -x['CParamType'] = ctypes.byref(_cchar) -x['LPCCharType'] = ctypes.pointer(_cchar) -x['LPCCharObjType'] = _lpchar = ctypes.POINTER(ctypes.c_char) -x['NullPtrType'] = _lpchar() -x['NullPyObjectType'] = ctypes.py_object() -x['PyObjectType'] = ctypes.py_object(1) -x['FieldType'] = _field = _Struct._field -x['CFUNCTYPEType'] = _cfunc = ctypes.CFUNCTYPE(ctypes.c_char) -x['CFunctionType'] = _cfunc(str) + +if HAS_CTYPES: + x['CCharPType'] = ctypes.c_char_p() + x['CWCharPType'] = ctypes.c_wchar_p() + x['CVoidPType'] = ctypes.c_void_p() + if sys.platform == 'win32': + x['CDLLType'] = _cdll = ctypes.cdll.msvcrt + else: + x['CDLLType'] = _cdll = ctypes.CDLL(None) + x['PyDLLType'] = _pydll = ctypes.pythonapi + x['FuncPtrType'] = _cdll._FuncPtr() + x['CCharArrayType'] = ctypes.create_string_buffer(1) + x['CWCharArrayType'] = ctypes.create_unicode_buffer(1) + x['CParamType'] = ctypes.byref(_cchar) + x['LPCCharType'] = ctypes.pointer(_cchar) + x['LPCCharObjType'] = _lpchar = ctypes.POINTER(ctypes.c_char) + x['NullPtrType'] = _lpchar() + x['NullPyObjectType'] = ctypes.py_object() + x['PyObjectType'] = ctypes.py_object(1) + x['FieldType'] = _field = _Struct._field + x['CFUNCTYPEType'] = _cfunc = ctypes.CFUNCTYPE(ctypes.c_char) + x['CFunctionType'] = _cfunc(str) try: # python 2.6 # numeric and mathematical types (CH 9) x['MethodCallerType'] = operator.methodcaller('mro') # 2.6 diff --git a/setup.py b/setup.py index 0a07ec6d..db42d628 100644 --- a/setup.py +++ b/setup.py @@ -275,7 +275,7 @@ def write_info_py(filename='dill/info.py'): if sys.platform[:3] == 'win': setup_code += """ install_requires = ['pyreadline%s'], -""" % (pyreadline_version) +""" % (pyreadline) # verrrry unlikely that this is still relevant elif hex(sys.hexversion) < '0x20500f0': setup_code += """ From 3430f0a195888458d485273069a19549ea91bcb8 Mon Sep 17 00:00:00 2001 From: Gabi Davar Date: Fri, 26 Jun 2015 
15:59:39 +0300 Subject: [PATCH 52/77] cosmetics --- dill/_objects.py | 3 ++- setup.py | 6 +++--- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/dill/_objects.py b/dill/_objects.py index bd3b0170..670612f6 100644 --- a/dill/_objects.py +++ b/dill/_objects.py @@ -109,7 +109,6 @@ class _Struct(ctypes.Structure): pass _Struct._fields_ = [("_field", ctypes.c_int),("next", ctypes.POINTER(_Struct))] _filedescrip, _tempfile = tempfile.mkstemp('r') # deleted in cleanup -os.close(_filedescrip) _tmpf = tempfile.TemporaryFile('w') # put the objects in order, if possible @@ -536,6 +535,8 @@ class _Struct(ctypes.Structure): # -- cleanup ---------------------------------------------------------------- a.update(d) # registered also succeed + +os.close(_filedescrip) # required on win32 os.remove(_tempfile) diff --git a/setup.py b/setup.py index db42d628..fe710501 100644 --- a/setup.py +++ b/setup.py @@ -265,8 +265,8 @@ def write_info_py(filename='dill/info.py'): # add dependencies ctypes_version = '>=1.0.1' -objgraph = '>=1.7.2' -pyreadline = '>=1.7.1' +objgraph_version = '>=1.7.2' +pyreadline_version = '>=1.7.1' import sys if has_setuptools: setup_code += """ @@ -275,7 +275,7 @@ def write_info_py(filename='dill/info.py'): if sys.platform[:3] == 'win': setup_code += """ install_requires = ['pyreadline%s'], -""" % (pyreadline) +""" % (pyreadline_version) # verrrry unlikely that this is still relevant elif hex(sys.hexversion) < '0x20500f0': setup_code += """ From 83df5f41bc8f89808e0a16e922b1488f613885c2 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Thu, 30 Jul 2015 15:47:00 +0100 Subject: [PATCH 53/77] Handle cases where os.path.exists throws --- dill/dill.py | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 30acf3b1..5f9a6b35 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -494,8 +494,11 @@ def _create_filehandle(name, mode, position, closed, open, strictio, fmode, fdat # treat x mode as w mode if "x" 
in mode and sys.hexversion < 0x03030000: raise ValueError("invalid mode: '%s'" % mode) - - if not os.path.exists(name): + try: + exists = os.path.exists(name) + except: + exists = False + if not exists: if strictio: raise FileNotFoundError("[Errno 2] No such file or directory: '%s'" % name) elif "r" in mode and fmode != FILE_FMODE: @@ -537,6 +540,8 @@ def _create_filehandle(name, mode, position, closed, open, strictio, fmode, fdat r = getattr(r, "raw", r) r.name = name else: + if not HAS_CTYPES: + raise ImportError("No module named 'ctypes'") class FILE(ctypes.Structure): _fields_ = [("refcount", ctypes.c_long), ("type_obj", ctypes.py_object), @@ -548,8 +553,6 @@ class PyObject(ctypes.Structure): ("ob_refcnt", ctypes.c_int), ("ob_type", ctypes.py_object) ] - if not HAS_CTYPES: - raise ImportError("No module named 'ctypes'") ctypes.cast(id(f), ctypes.POINTER(FILE)).contents.name = name ctypes.cast(id(name), ctypes.POINTER(PyObject)).contents.ob_refcnt += 1 assert f.name == name From 047a253b30355932e4ed2f22f796f4f991ccd69c Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Thu, 20 Aug 2015 11:05:57 +0100 Subject: [PATCH 54/77] Add handling for classmethods and staticmethods --- dill/dill.py | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/dill/dill.py b/dill/dill.py index dea04215..7bdf4ecf 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -1108,6 +1108,18 @@ def save_property(pickler, obj): pickler.save_reduce(property, (obj.fget, obj.fset, obj.fdel, obj.__doc__), obj=obj) +@register(staticmethod) +@register(classmethod) +def save_classmethod(pickler, obj): + log.info("Cm: %s" % obj) + try: + orig_func = obj.__func__ + except AttributeError: # Python 2.6 + orig_func = obj.__get__(None, object) + if isinstance(obj, classmethod): + orig_func = orig_func.__func__ # Unbind + pickler.save_reduce(type(obj), (orig_func,), obj=obj) + # quick sanity checking def pickles(obj,exact=False,safe=False,**kwds): """quick check if object pickles with dill""" From 
ade1bd0cea2613df437ab221995cf6f9be50aff5 Mon Sep 17 00:00:00 2001 From: Futrell Date: Sat, 3 Oct 2015 17:17:39 -0400 Subject: [PATCH 55/77] allow pickling functions in pypy --- dill/dill.py | 72 ++++++++++++++++++++++++++-------------------------- 1 file changed, 36 insertions(+), 36 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 5c7c9bd3..5ef080bd 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -698,42 +698,6 @@ def save_code(pickler, obj): log.info("# Co") return -@register(FunctionType) -def save_function(pickler, obj): - if not _locate_function(obj): #, pickler._session): - log.info("F1: %s" % obj) - if getattr(pickler, '_recurse', False): - # recurse to get all globals referred to by obj - from .detect import globalvars - globs = globalvars(obj, recurse=True, builtin=True) - # remove objects that have already been serialized - #stacktypes = (ClassType, TypeType, FunctionType) - #for key,value in list(globs.items()): - # if isinstance(value, stacktypes) and value in stack: - # del globs[key] - # ABORT: if self-references, use _recurse=False - if obj in globs.values(): # or obj in stack: - globs = obj.__globals__ if PY3 else obj.func_globals - else: - globs = obj.__globals__ if PY3 else obj.func_globals - #stack.add(obj) - if PY3: - pickler.save_reduce(_create_function, (obj.__code__, - globs, obj.__name__, - obj.__defaults__, obj.__closure__, - obj.__dict__), obj=obj) - else: - pickler.save_reduce(_create_function, (obj.func_code, - globs, obj.func_name, - obj.func_defaults, obj.func_closure, - obj.__dict__), obj=obj) - log.info("# F1") - else: - log.info("F2: %s" % obj) - StockPickler.save_global(pickler, obj) #NOTE: also takes name=... 
- log.info("# F2") - return - @register(dict) def save_module_dict(pickler, obj): if is_dill(pickler) and obj == pickler._main.__dict__ and not pickler._session: @@ -1187,6 +1151,42 @@ def save_classmethod(pickler, obj): pickler.save_reduce(type(obj), (orig_func,), obj=obj) log.info("# Cm") +@register(FunctionType) +def save_function(pickler, obj): + if not _locate_function(obj): #, pickler._session): + log.info("F1: %s" % obj) + if getattr(pickler, '_recurse', False): + # recurse to get all globals referred to by obj + from .detect import globalvars + globs = globalvars(obj, recurse=True, builtin=True) + # remove objects that have already been serialized + #stacktypes = (ClassType, TypeType, FunctionType) + #for key,value in list(globs.items()): + # if isinstance(value, stacktypes) and value in stack: + # del globs[key] + # ABORT: if self-references, use _recurse=False + if obj in globs.values(): # or obj in stack: + globs = obj.__globals__ if PY3 else obj.func_globals + else: + globs = obj.__globals__ if PY3 else obj.func_globals + #stack.add(obj) + if PY3: + pickler.save_reduce(_create_function, (obj.__code__, + globs, obj.__name__, + obj.__defaults__, obj.__closure__, + obj.__dict__), obj=obj) + else: + pickler.save_reduce(_create_function, (obj.func_code, + globs, obj.func_name, + obj.func_defaults, obj.func_closure, + obj.__dict__), obj=obj) + log.info("# F1") + else: + log.info("F2: %s" % obj) + StockPickler.save_global(pickler, obj) #NOTE: also takes name=... 
+ log.info("# F2") + return + # quick sanity checking def pickles(obj,exact=False,safe=False,**kwds): """quick check if object pickles with dill""" From 132234430f2d34e06734c48ecd6f1fbab0176123 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Wed, 23 Dec 2015 21:54:18 +0000 Subject: [PATCH 56/77] Only flush if file is open --- dill/dill.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dill/dill.py b/dill/dill.py index 0822a6c8..daca4760 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -881,10 +881,10 @@ def save_attrgetter(pickler, obj): return def _save_file(pickler, obj, open_): - obj.flush() if obj.closed: position = None else: + obj.flush() if obj in (sys.__stdout__, sys.__stderr__, sys.__stdin__): position = -1 else: From aa1355aa65a961dfd863a895b7d584217e0af306 Mon Sep 17 00:00:00 2001 From: Jason Myers Date: Thu, 28 Jan 2016 17:38:29 -0500 Subject: [PATCH 57/77] Fix for CellTypes when ctypes is missing --- dill/dill.py | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 83699cda..47aeeef9 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -665,6 +665,10 @@ def _create_cell(contents): except AttributeError: IS_PYPY = False +else: + def _create_cell(contents): + return (lambda x: lambda: x)(contents).func_closure[0] + def _create_weakref(obj, *args): from weakref import ref if obj is None: # it's dead @@ -1036,13 +1040,12 @@ def save_wrapper_descriptor(pickler, obj): log.info("# Wr") return -if HAS_CTYPES and IS_PYPY: - @register(CellType) - def save_cell(pickler, obj): - log.info("Ce: %s" % obj) - pickler.save_reduce(_create_cell, (obj.cell_contents,), obj=obj) - log.info("# Ce") - return +@register(CellType) +def save_cell(pickler, obj): + log.info("Ce: %s" % obj) + pickler.save_reduce(_create_cell, (obj.cell_contents,), obj=obj) + log.info("# Ce") + return # The following function is based on 'saveDictProxy' from spickle # Copyright (c) 2011 by science+computing ag From 
e48f23997a7a6a37b481e56e3f9a5d4075efc568 Mon Sep 17 00:00:00 2001 From: Jason Myers Date: Thu, 28 Jan 2016 19:53:08 -0500 Subject: [PATCH 58/77] PY3 support for non-ctypes cell creation --- dill/dill.py | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index 47aeeef9..28205446 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -133,11 +133,15 @@ def numpyufunc(obj): def ndarraysubclassinstance(obj): return False def numpyufunc(obj): return False -# make sure to add these 'hand-built' types to _typemap if PY3: - CellType = type((lambda x: lambda y: x)(0).__closure__[0]) + def _create_cell(contents): + return (lambda: contents).__closure__[0] else: - CellType = type((lambda x: lambda y: x)(0).func_closure[0]) + def _create_cell(contents): + return (lambda: contents).func_closure[0] + +# make sure to add these 'hand-built' types to _typemap +CellType = type(_create_cell(0)) WrapperDescriptorType = type(type.__repr__) MethodDescriptorType = type(type.__dict__['mro']) MethodWrapperType = type([].__repr__) @@ -665,10 +669,6 @@ def _create_cell(contents): except AttributeError: IS_PYPY = False -else: - def _create_cell(contents): - return (lambda x: lambda: x)(contents).func_closure[0] - def _create_weakref(obj, *args): from weakref import ref if obj is None: # it's dead From ef454b3398a6d6570fb186c4d9e67c5b43cce4cf Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Tue, 2 Feb 2016 20:57:05 +0000 Subject: [PATCH 59/77] Fix pickling of __main__ classes with __slots__ --- dill/dill.py | 2 ++ 1 file changed, 2 insertions(+) diff --git a/dill/dill.py b/dill/dill.py index a6261df2..83169a93 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -1214,6 +1214,8 @@ def save_type(pickler, obj): #print (_dict) #print ("%s\n%s" % (type(obj), obj.__name__)) #print ("%s\n%s" % (obj.__bases__, obj.__dict__)) + for name in _dict.get("__slots__", []): + del _dict[name] pickler.save_reduce(_create_type, (type(obj), obj.__name__, 
obj.__bases__, _dict), obj=obj) log.info("# %s" % _t) From ab2ac17491173531dc980498fb5b6c3c078b1510 Mon Sep 17 00:00:00 2001 From: Robert Bradshaw Date: Wed, 2 Mar 2016 17:39:49 -0800 Subject: [PATCH 60/77] Update dill.py --- dill/dill.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dill/dill.py b/dill/dill.py index a7aca4fc..66667d9c 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -964,7 +964,7 @@ def save_functor(pickler, obj): return @register(SuperType) -def save_functor(pickler, obj): +def save_super(pickler, obj): log.info("Su: %s" % obj) pickler.save_reduce(super, (obj.__thisclass__, obj.__self__), obj=obj) log.info("# Su") From f542feccb943d64eaaf2875b3855c85e68959622 Mon Sep 17 00:00:00 2001 From: Nick White Date: Fri, 13 May 2016 11:09:29 +0100 Subject: [PATCH 61/77] Normalize module paths when checking prefixes --- dill/dill.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index df23665a..85944835 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -1139,12 +1139,12 @@ def save_module(pickler, obj): pickler.save_reduce(_import_module, (obj.__name__,), obj=obj) log.info("# M2") else: - # if a module file name starts with prefx, it should be a builtin + # if a module file name starts with prefix, it should be a builtin # module, so should be pickled as a reference if hasattr(obj, "__file__"): names = ["base_prefix", "base_exec_prefix", "exec_prefix", "prefix", "real_prefix"] - builtin_mod = any([obj.__file__.startswith(getattr(sys, name)) + builtin_mod = any([obj.__file__.startswith(os.path.normpath(getattr(sys, name))) for name in names if hasattr(sys, name)]) builtin_mod = builtin_mod or 'site-packages' in obj.__file__ else: From da93e9777b44769ced6c2d710c7de9f0585a2e02 Mon Sep 17 00:00:00 2001 From: Kernc Date: Fri, 18 Nov 2016 13:22:12 +0100 Subject: [PATCH 62/77] Enable Travis CI Closes https://github.com/uqfoundation/dill/issues/3 --- .travis.yml | 28 ++++++++++++++++++++++++++++ 1 
file changed, 28 insertions(+) create mode 100644 .travis.yml diff --git a/.travis.yml b/.travis.yml new file mode 100644 index 00000000..69c3b89b --- /dev/null +++ b/.travis.yml @@ -0,0 +1,28 @@ +language: python + +sudo: false + +matrix: + include: + - python: '2.7' + - python: '3.4' + - python: '3.5' + - python: '3.6-dev' + - python: 'nightly' + allow_failures: + - python: '3.5' + - python: '3.6-dev' + - python: 'nightly' + fast_finish: true + +cache: + pip: true + +before_install: + - set -e # fail on any error + +install: + - python setup.py build && python setup.py install + +script: + - for test in tests/*.py; do echo $test ; python $test > /dev/null ; done From ead3f958fdda38084993bfc412ea24fe07a838e5 Mon Sep 17 00:00:00 2001 From: Kernc Date: Fri, 18 Nov 2016 10:33:30 +0100 Subject: [PATCH 63/77] On Py3, use cPickle Unpickler Pure-Python unpickler, available as _Unpickler, uses `struct` to unpack data and the latter can also fail with `struct.error`. The default C Unpickler fails with documented and expectable UnpicklingError, and is also faster. --- dill/dill.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dill/dill.py b/dill/dill.py index 41784333..91f31f5a 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -37,7 +37,7 @@ def _trace(boolean): PY3 = (sys.hexversion >= 0x30000f0) if PY3: #XXX: get types from .objtypes ? 
import builtins as __builtin__ - from pickle import _Pickler as StockPickler, _Unpickler as StockUnpickler + from pickle import _Pickler as StockPickler, Unpickler as StockUnpickler from _thread import LockType #from io import IOBase from types import CodeType, FunctionType, MethodType, GeneratorType, \ From 19c21622105842d40e9706691882588be1d1f2d3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Miguel=20S=C3=A1nchez=20de=20Le=C3=B3n=20Peque?= Date: Wed, 28 Dec 2016 22:39:58 +0100 Subject: [PATCH 64/77] Initial tox support --- .gitignore | 4 ++++ MANIFEST.in | 1 + tests/__init__.py | 0 tox.ini | 17 +++++++++++++++++ 4 files changed, 22 insertions(+) create mode 100644 .gitignore create mode 100644 MANIFEST.in create mode 100644 tests/__init__.py create mode 100644 tox.ini diff --git a/.gitignore b/.gitignore new file mode 100644 index 00000000..a1eb9281 --- /dev/null +++ b/.gitignore @@ -0,0 +1,4 @@ +.tox/ +.cache/ +*.egg-info/ +*.pyc diff --git a/MANIFEST.in b/MANIFEST.in new file mode 100644 index 00000000..1aba38f6 --- /dev/null +++ b/MANIFEST.in @@ -0,0 +1 @@ +include LICENSE diff --git a/tests/__init__.py b/tests/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/tox.ini b/tox.ini new file mode 100644 index 00000000..31b6ecc3 --- /dev/null +++ b/tox.ini @@ -0,0 +1,17 @@ +[tox] +envlist = + py36 + py35 + py34 + py33 + py27 + py26 + pypy3 + pypy + +[testenv] +whitelist_externals = + bash +commands = + bash -c "failed=0; for file in tests/*.py; do echo $file; \ + python $file || failed=1; done; exit $failed" From b049e984005208bcba3ec35300e6bbb2c28ae74a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Miguel=20S=C3=A1nchez=20de=20Le=C3=B3n=20Peque?= Date: Wed, 28 Dec 2016 23:16:11 +0100 Subject: [PATCH 65/77] Make test_check.py pytest friendly --- tests/test_check.py | 51 ++++++++++++++++++++++++++++++++------------- 1 file changed, 36 insertions(+), 15 deletions(-) diff --git a/tests/test_check.py b/tests/test_check.py index e357dbc7..4b3bad73 100644 --- 
a/tests/test_check.py +++ b/tests/test_check.py @@ -8,14 +8,14 @@ from __future__ import with_statement from dill import check +import sys + from dill.temp import capture from dill.dill import PY3 -import sys -f = lambda x:x**2 #FIXME: this doesn't catch output... it's from the internal call -def test(func, **kwds): +def raise_check(func, **kwds): try: with capture('stdout') as out: check(func, **kwds) @@ -28,19 +28,40 @@ def test(func, **kwds): out.close() -if __name__ == '__main__': - test(f) - test(f, recurse=True) - test(f, byref=True) - test(f, protocol=0) - #TODO: test incompatible versions - # SyntaxError: invalid syntax +f = lambda x:x**2 + + +def test_simple(): + raise_check(f) + + +def test_recurse(): + raise_check(f, recurse=True) + + +def test_byref(): + raise_check(f, byref=True) + + +def test_protocol(): + raise_check(f, protocol=True) + + +def test_python(): if PY3: - test(f, python='python3.4') + raise_check(f, python='python3.4') else: - test(f, python='python2.7') - #TODO: test dump failure - #TODO: test load failure + raise_check(f, python='python2.7') + +#TODO: test incompatible versions +#TODO: test dump failure +#TODO: test load failure -# EOF + +if __name__ == '__main__': + test_simple() + test_recurse() + test_byref() + test_protocol() + test_python() From a916af016639dd4d00db74ca6a5e6801b831ddcb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Miguel=20S=C3=A1nchez=20de=20Le=C3=B3n=20Peque?= Date: Wed, 28 Dec 2016 23:17:51 +0100 Subject: [PATCH 66/77] Make test_extendpickle.py pytest-friendly --- tests/test_extendpickle.py | 27 +++++++++++++++++---------- 1 file changed, 17 insertions(+), 10 deletions(-) diff --git a/tests/test_extendpickle.py b/tests/test_extendpickle.py index 553b6653..59d84400 100644 --- a/tests/test_extendpickle.py +++ b/tests/test_extendpickle.py @@ -12,20 +12,27 @@ except ImportError: from io import BytesIO as StringIO + def my_fn(x): return x * 17 -obj = lambda : my_fn(34) -assert obj() == 578 -obj_io = StringIO() -pickler = 
pickle.Pickler(obj_io) -pickler.dump(obj) +def test_extend(): + obj = lambda : my_fn(34) + assert obj() == 578 + + obj_io = StringIO() + pickler = pickle.Pickler(obj_io) + pickler.dump(obj) + + obj_str = obj_io.getvalue() + + obj2_io = StringIO(obj_str) + unpickler = pickle.Unpickler(obj2_io) + obj2 = unpickler.load() -obj_str = obj_io.getvalue() + assert obj2() == 578 -obj2_io = StringIO(obj_str) -unpickler = pickle.Unpickler(obj2_io) -obj2 = unpickler.load() -assert obj2() == 578 +if __name__ == '__main__': + test_extend() From d1f415b853291a4f82a04daf9d15e2064b75b238 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Miguel=20S=C3=A1nchez=20de=20Le=C3=B3n=20Peque?= Date: Wed, 28 Dec 2016 23:20:16 +0100 Subject: [PATCH 67/77] Make test_functors.py pytest-friendly --- tests/test_functors.py | 25 +++++++++++++++++-------- 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/tests/test_functors.py b/tests/test_functors.py index 1d408294..5d70f756 100644 --- a/tests/test_functors.py +++ b/tests/test_functors.py @@ -10,21 +10,30 @@ import dill dill.settings['recurse'] = True + def f(a, b, c): # without keywords pass + def g(a, b, c=2): # with keywords pass + def h(a=1, b=2, c=3): # without args pass -fp = functools.partial(f, 1, 2) -gp = functools.partial(g, 1, c=2) -hp = functools.partial(h, 1, c=2) -bp = functools.partial(int, base=2) -assert dill.pickles(fp, safe=True) -assert dill.pickles(gp, safe=True) -assert dill.pickles(hp, safe=True) -assert dill.pickles(bp, safe=True) +def test_functools(): + fp = functools.partial(f, 1, 2) + gp = functools.partial(g, 1, c=2) + hp = functools.partial(h, 1, c=2) + bp = functools.partial(int, base=2) + + assert dill.pickles(fp, safe=True) + assert dill.pickles(gp, safe=True) + assert dill.pickles(hp, safe=True) + assert dill.pickles(bp, safe=True) + + +if __name__ == '__main__': + test_functools() From ed0bb7a4c0e3577b2b994b557170133c3edf7b42 Mon Sep 17 00:00:00 2001 From: 
=?UTF-8?q?Miguel=20S=C3=A1nchez=20de=20Le=C3=B3n=20Peque?= Date: Wed, 28 Dec 2016 23:22:14 +0100 Subject: [PATCH 68/77] Make test_mixins.py pytest-friendly --- tests/test_mixins.py | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/tests/test_mixins.py b/tests/test_mixins.py index 8313410d..65afc2fb 100644 --- a/tests/test_mixins.py +++ b/tests/test_mixins.py @@ -9,6 +9,7 @@ import dill dill.settings['recurse'] = True + def wtf(x,y,z): def zzz(): return x @@ -18,6 +19,7 @@ def xxx(): return z return zzz,yyy + def quad(a=1, b=1, c=0): inverted = [False] def invert(): @@ -38,8 +40,10 @@ def func(*args, **kwds): def double_add(*args): return sum(args) + fx = sum([1,2,3]) + ### to make it interesting... def quad_factory(a=1,b=1,c=0): def dec(f): @@ -49,25 +53,28 @@ def func(*args,**kwds): return func return dec + @quad_factory(a=0,b=4,c=0) def quadish(x): return x+1 + quadratic = quad_factory() + def doubler(f): def inner(*args, **kwds): fx = f(*args, **kwds) return 2*fx return inner + @doubler def quadruple(x): return 2*x -if __name__ == '__main__': - +def test_mixins(): # test mixins assert double_add(1,2,3) == 2*fx double_add.invert() @@ -110,4 +117,5 @@ def quadruple(x): #***** -# EOF +if __name__ == '__main__': + test_mixins() From 968fcc4bbf870e8e41435e78a3b33a21691b721b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Miguel=20S=C3=A1nchez=20de=20Le=C3=B3n=20Peque?= Date: Wed, 28 Dec 2016 23:34:48 +0100 Subject: [PATCH 69/77] Make test_objects.py pytest-friendly --- tests/test_objects.py | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/tests/test_objects.py b/tests/test_objects.py index 989cc35d..72b9e1e3 100644 --- a/tests/test_objects.py +++ b/tests/test_objects.py @@ -27,7 +27,7 @@ class _class: def _method(self): pass - + # objects that *fail* if imported special = {} special['LambdaType'] = _lambda = lambda x: lambda y: x @@ -50,14 +50,13 @@ def pickles(name, exact=False): assert type(obj) == type(pik) except 
Exception: print ("fails: %s %s" % (name, type(obj))) - return - -if __name__ == '__main__': +def test_objects(): for member in objects.keys(): #pickles(member, exact=True) pickles(member, exact=False) -# EOF +if __name__ == '__main__': + test_objects() From 6aa69bd7076516fe310b1a1ea0c54dd44ffe4655 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Miguel=20S=C3=A1nchez=20de=20Le=C3=B3n=20Peque?= Date: Wed, 28 Dec 2016 23:40:14 +0100 Subject: [PATCH 70/77] Make test_properties.py pytest-friendly --- tests/test_properties.py | 61 ++++++++++++++++++++++++---------------- 1 file changed, 37 insertions(+), 24 deletions(-) diff --git a/tests/test_properties.py b/tests/test_properties.py index 093428e7..1428d32b 100644 --- a/tests/test_properties.py +++ b/tests/test_properties.py @@ -6,9 +6,10 @@ # License: 3-clause BSD. The full license text is available at: # - http://trac.mystic.cacr.caltech.edu/project/pathos/browser/dill/LICENSE +import sys + import dill dill.settings['recurse'] = True -import sys class Foo(object): @@ -24,26 +25,38 @@ def _set_data(self, x): data = property(_get_data, _set_data) -FooS = dill.copy(Foo) - -assert FooS.data.fget is not None -assert FooS.data.fset is not None -assert FooS.data.fdel is None - -try: - res = FooS().data -except Exception: - e = sys.exc_info()[1] - raise AssertionError(str(e)) -else: - assert res == 1 - -try: - f = FooS() - f.data = 1024 - res = f.data -except Exception: - e = sys.exc_info()[1] - raise AssertionError(str(e)) -else: - assert res == 1024 +def test_data_not_none(): + FooS = dill.copy(Foo) + assert FooS.data.fget is not None + assert FooS.data.fset is not None + assert FooS.data.fdel is None + + +def test_data_unchanged(): + FooS = dill.copy(Foo) + try: + res = FooS().data + except Exception: + e = sys.exc_info()[1] + raise AssertionError(str(e)) + else: + assert res == 1 + + +def test_data_changed(): + FooS = dill.copy(Foo) + try: + f = FooS() + f.data = 1024 + res = f.data + except Exception: + e = sys.exc_info()[1] + 
raise AssertionError(str(e)) + else: + assert res == 1024 + + +if __name__ == '__main__': + test_data_not_none() + test_data_unchanged() + test_data_changed() From 54533cdcf8003634d413e1584e3d43b165500cd4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Miguel=20S=C3=A1nchez=20de=20Le=C3=B3n=20Peque?= Date: Wed, 28 Dec 2016 23:40:49 +0100 Subject: [PATCH 71/77] Make test_weakref.py pytest-friendly --- tests/test_weakref.py | 76 +++++++++++++++++++++++-------------------- 1 file changed, 41 insertions(+), 35 deletions(-) diff --git a/tests/test_weakref.py b/tests/test_weakref.py index 3ac4e7f6..082e5cce 100644 --- a/tests/test_weakref.py +++ b/tests/test_weakref.py @@ -29,43 +29,49 @@ def __call__(self): def _function(): pass -o = _class() -oc = _class2() -n = _newclass() -nc = _newclass2() -f = _function -z = _class -x = _newclass -r = weakref.ref(o) -dr = weakref.ref(_class()) -p = weakref.proxy(o) -dp = weakref.proxy(_class()) -c = weakref.proxy(oc) -dc = weakref.proxy(_class2()) +def test_weakref(): + o = _class() + oc = _class2() + n = _newclass() + nc = _newclass2() + f = _function + z = _class + x = _newclass -m = weakref.ref(n) -dm = weakref.ref(_newclass()) -t = weakref.proxy(n) -dt = weakref.proxy(_newclass()) -d = weakref.proxy(nc) -dd = weakref.proxy(_newclass2()) + r = weakref.ref(o) + dr = weakref.ref(_class()) + p = weakref.proxy(o) + dp = weakref.proxy(_class()) + c = weakref.proxy(oc) + dc = weakref.proxy(_class2()) -fr = weakref.ref(f) -fp = weakref.proxy(f) -#zr = weakref.ref(z) #XXX: weakrefs not allowed for classobj objects -#zp = weakref.proxy(z) #XXX: weakrefs not allowed for classobj objects -xr = weakref.ref(x) -xp = weakref.proxy(x) + m = weakref.ref(n) + dm = weakref.ref(_newclass()) + t = weakref.proxy(n) + dt = weakref.proxy(_newclass()) + d = weakref.proxy(nc) + dd = weakref.proxy(_newclass2()) -objlist = [r,dr,m,dm,fr,xr, p,dp,t,dt, c,dc,d,dd, fp,xp] + fr = weakref.ref(f) + fp = weakref.proxy(f) + #zr = weakref.ref(z) #XXX: weakrefs not 
allowed for classobj objects + #zp = weakref.proxy(z) #XXX: weakrefs not allowed for classobj objects + xr = weakref.ref(x) + xp = weakref.proxy(x) -#dill.detect.trace(True) -for obj in objlist: - res = dill.detect.errors(obj) - if res: - print ("%s" % res) - #print ("%s:\n %s" % (obj, res)) -# else: -# print ("PASS: %s" % obj) - assert not res + objlist = [r,dr,m,dm,fr,xr, p,dp,t,dt, c,dc,d,dd, fp,xp] + #dill.detect.trace(True) + + for obj in objlist: + res = dill.detect.errors(obj) + if res: + print ("%s" % res) + #print ("%s:\n %s" % (obj, res)) + # else: + # print ("PASS: %s" % obj) + assert not res + + +if __name__ == '__main__': + test_weakref() From 1e1fb2ca0b5da35b39e0258a04d2e945360b801e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Miguel=20S=C3=A1nchez=20de=20Le=C3=B3n=20Peque?= Date: Thu, 29 Dec 2016 00:01:21 +0100 Subject: [PATCH 72/77] Make test_file.py pytest friendly --- tests/test_file.py | 49 +++++++++++++++++++++++++++++++++------------- 1 file changed, 35 insertions(+), 14 deletions(-) diff --git a/tests/test_file.py b/tests/test_file.py index f18879c2..e0b6c38b 100644 --- a/tests/test_file.py +++ b/tests/test_file.py @@ -6,11 +6,16 @@ # License: 3-clause BSD. 
The full license text is available at: # - http://trac.mystic.cacr.caltech.edu/project/pathos/browser/dill/LICENSE -import dill -import random import os import sys import string +import random + +import pytest + +import dill + + dill.settings['recurse'] = True fname = "_test_file.txt" @@ -21,6 +26,7 @@ buffer_error = ValueError("invalid buffer size") dne_error = FileNotFoundError("[Errno 2] No such file or directory: '%s'" % fname) + def write_randomness(number=200): f = open(fname, "w") for i in range(number): @@ -45,7 +51,16 @@ def throws(op, args, exc): return False -def test(strictio, fmode): +def teardown_module(): + if os.path.exists(fname): + os.remove(fname) + + +def bench(strictio, fmode, skippypy): + import platform + if skippypy and platform.python_implementation() == 'PyPy': + pytest.skip('Skip for PyPy...') + # file exists, with same contents # read @@ -462,18 +477,24 @@ def test(strictio, fmode): f2.close() -if __name__ == '__main__': +def test_nostrictio_handlefmode(): + bench(False, dill.HANDLE_FMODE, False) - test(strictio=False, fmode=dill.HANDLE_FMODE) - test(strictio=False, fmode=dill.FILE_FMODE) - if not dill.dill.IS_PYPY: #FIXME: fails due to pypy/issues/1233 - test(strictio=False, fmode=dill.CONTENTS_FMODE) - #test(strictio=True, fmode=dill.HANDLE_FMODE) - #test(strictio=True, fmode=dill.FILE_FMODE) - #test(strictio=True, fmode=dill.CONTENTS_FMODE) +def test_nostrictio_filefmode(): + bench(False, dill.FILE_FMODE, False) -if os.path.exists(fname): - os.remove(fname) -# EOF +def test_nostrictio_contentsfmode(): + bench(False, dill.CONTENTS_FMODE, True) + + +#bench(True, dill.HANDLE_FMODE, False) +#bench(True, dill.FILE_FMODE, False) +#bench(True, dill.CONTENTS_FMODE, True) + + +if __name__ == '__main__': + test_nostrictio_handlefmode() + test_nostrictio_filefmode() + test_nostrictio_contentsfmode() From 7b0c89a31fb77f1d61f4538b6f85f344773b14b7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Miguel=20S=C3=A1nchez=20de=20Le=C3=B3n=20Peque?= Date: Thu, 
29 Dec 2016 00:45:06 +0100 Subject: [PATCH 73/77] Make test_nested.py pytest-friendly --- tests/test_nested.py | 61 +++++++++++++++++++++++++++----------------- 1 file changed, 38 insertions(+), 23 deletions(-) diff --git a/tests/test_nested.py b/tests/test_nested.py index bce97af9..bb49ebe1 100644 --- a/tests/test_nested.py +++ b/tests/test_nested.py @@ -9,10 +9,12 @@ test dill's ability to handle nested functions """ +import os +import math + import dill as pickle pickle.settings['recurse'] = True -import math -#import pickle + # the nested function: pickle should fail here, but dill is ok. def adder(augend): @@ -22,6 +24,7 @@ def inner(addend): return addend + augend + zero[0] return inner + # rewrite the nested function using a class: standard pickle should work here. class cadder(object): def __init__(self, augend): @@ -31,6 +34,7 @@ def __init__(self, augend): def __call__(self, addend): return addend + self.augend + self.zero[0] + # rewrite again, but as an old-style class class c2adder: def __init__(self, augend): @@ -40,22 +44,22 @@ def __init__(self, augend): def __call__(self, addend): return addend + self.augend + self.zero[0] -# some basic stuff -a = [0, 1, 2] # some basic class stuff class basic(object): pass + class basic2: pass -if __name__ == '__main__': - x = 5 - y = 1 +x = 5 +y = 1 + - # pickled basic stuff +def test_basic(): + a = [0, 1, 2] pa = pickle.dumps(a) pmath = pickle.dumps(math) #XXX: FAILS in pickle pmap = pickle.dumps(map) @@ -65,46 +69,49 @@ class basic2: lmap = pickle.loads(pmap) assert list(map(math.sin, a)) == list(lmap(lmath.sin, la)) - # pickled basic class stuff + +def test_basic_class(): pbasic2 = pickle.dumps(basic2) _pbasic2 = pickle.loads(pbasic2)() pbasic = pickle.dumps(basic) _pbasic = pickle.loads(pbasic)() - # pickled c2adder + +def test_c2adder(): pc2adder = pickle.dumps(c2adder) pc2add5 = pickle.loads(pc2adder)(x) assert pc2add5(y) == x+y - # pickled cadder + +def test_pickled_cadder(): pcadder = pickle.dumps(cadder) 
pcadd5 = pickle.loads(pcadder)(x) assert pcadd5(y) == x+y - # raw adder and inner + +def test_raw_adder_and_inner(): add5 = adder(x) assert add5(y) == x+y - # pickled adder + +def test_pickled_adder(): padder = pickle.dumps(adder) padd5 = pickle.loads(padder)(x) assert padd5(y) == x+y - # pickled inner + +def test_pickled_inner(): + add5 = adder(x) pinner = pickle.dumps(add5) #XXX: FAILS in pickle p5add = pickle.loads(pinner) assert p5add(y) == x+y - # testing moduledict where not __main__ + +def test_moduledict_where_not_main(): try: - import test_moduledict - error = None + from . import test_moduledict except: - import sys - error = sys.exc_info()[1] - assert error is None - # clean up - import os + import test_moduledict name = 'test_moduledict.py' if os.path.exists(name) and os.path.exists(name+'c'): os.remove(name+'c') @@ -117,4 +124,12 @@ class basic2: os.removedirs("__pycache__") -# EOF +if __name__ == '__main__': + test_basic() + test_basic_class() + test_c2adder() + test_pickled_cadder() + test_raw_adder_and_inner() + test_pickled_adder() + test_pickled_inner() + test_moduledict_where_not_main() From d9147956afb72d9b3e2960b8913a29d5ea45b6da Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Miguel=20S=C3=A1nchez=20de=20Le=C3=B3n=20Peque?= Date: Fri, 6 Jan 2017 18:28:25 +0100 Subject: [PATCH 74/77] Remove pytest dependency and fix test teardown --- tests/test_file.py | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/tests/test_file.py b/tests/test_file.py index e0b6c38b..64fc9fac 100644 --- a/tests/test_file.py +++ b/tests/test_file.py @@ -11,8 +11,6 @@ import string import random -import pytest - import dill @@ -59,7 +57,8 @@ def teardown_module(): def bench(strictio, fmode, skippypy): import platform if skippypy and platform.python_implementation() == 'PyPy': - pytest.skip('Skip for PyPy...') + # Skip for PyPy... 
+ return # file exists, with same contents # read @@ -479,14 +478,17 @@ def bench(strictio, fmode, skippypy): def test_nostrictio_handlefmode(): bench(False, dill.HANDLE_FMODE, False) + teardown_module() def test_nostrictio_filefmode(): bench(False, dill.FILE_FMODE, False) + teardown_module() def test_nostrictio_contentsfmode(): bench(False, dill.CONTENTS_FMODE, True) + teardown_module() #bench(True, dill.HANDLE_FMODE, False) From 664c1c643e7c22d46ac2abc55cc8fd182948eb5e Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Mon, 8 Dec 2014 19:05:40 +0000 Subject: [PATCH 75/77] Add fallback option for badly named namedtuples --- dill/dill.py | 28 +++++++++++++++++++--------- tests/test_classdef.py | 18 +++++++++++------- 2 files changed, 30 insertions(+), 16 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index e4f2b562..e941c7d6 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -717,6 +717,18 @@ def _create_array(f, args, state, npdict=None): array.__dict__.update(npdict) return array +def _create_namedtuple(name, fieldnames, modulename): + mod = _import_module(modulename, safe=True) + if mod is not None: + try: + return getattr(mod, name) + except: + pass + import collections + t = collections.namedtuple(name, fieldnames) + t.__module__ = modulename + return t + def _getattr(objclass, name, repr_str): # hack to grab the reference directly try: #XXX: works only for __builtin__ ? 
@@ -1087,7 +1099,7 @@ def _proxy_helper(obj): # a dead proxy returns a reference to None return id(None) if _str == _repr: return id(obj) # it's a repr try: # either way, it's a proxy from here - address = int(_str.rstrip('>').split(' at ')[-1], base=16) + address = int(_str.rstrip('>').split(' at ')[-1], base=16) except ValueError: # special case: proxy of a 'type' if not IS_PYPY: address = int(_repr.rstrip('>').split(' at ')[-1], base=16) @@ -1198,15 +1210,13 @@ def save_type(pickler, obj): log.info("T1: %s" % obj) pickler.save_reduce(_load_type, (_typemap[obj],), obj=obj) log.info("# T1") + elif issubclass(obj, tuple) and all([hasattr(obj, attr) for attr in ('_fields','_asdict','_make','_replace')]): + # special case: namedtuples + log.info("T6: %s" % obj) + pickler.save_reduce(_create_namedtuple, (getattr(obj, "__qualname__", obj.__name__), obj._fields, obj.__module__), obj=obj) + log.info("# T6") + return elif obj.__module__ == '__main__': - try: # use StockPickler for special cases [namedtuple,] - [getattr(obj, attr) for attr in ('_fields','_asdict', - '_make','_replace')] - log.info("T6: %s" % obj) - StockPickler.save_global(pickler, obj) - log.info("# T6") - return - except AttributeError: pass if issubclass(type(obj), type): # try: # used when pickling the class as code (or the interpreter) if is_dill(pickler) and not pickler._byref: diff --git a/tests/test_classdef.py b/tests/test_classdef.py index 0e47473f..21a99c90 100644 --- a/tests/test_classdef.py +++ b/tests/test_classdef.py @@ -93,20 +93,24 @@ def test_none(): Z = namedtuple("Z", ['a','b']) Zi = Z(0,1) X = namedtuple("Y", ['a','b']) - if hex(sys.hexversion) >= '0x30500f0': + X.__name__ = "X" + if hex(sys.hexversion) >= '0x30300f0': X.__qualname__ = "X" #XXX: name must 'match' or fails to pickle - else: - X.__name__ = "X" Xi = X(0,1) + Bad = namedtuple("FakeName", ['a','b']) + Badi = Bad(0,1) else: - Z = Zi = X = Xi = None + Z = Zi = X = Xi = Bad = Badi = None # test namedtuple def 
test_namedtuple(): - assert Z == dill.loads(dill.dumps(Z)) + assert Z is dill.loads(dill.dumps(Z)) assert Zi == dill.loads(dill.dumps(Zi)) - assert X == dill.loads(dill.dumps(X)) + assert X is dill.loads(dill.dumps(X)) assert Xi == dill.loads(dill.dumps(Xi)) + assert Bad is not dill.loads(dill.dumps(Bad)) + assert Bad._fields == dill.loads(dill.dumps(Bad))._fields + assert tuple(Badi) == tuple(dill.loads(dill.dumps(Badi))) def test_array_subclass(): try: @@ -115,7 +119,7 @@ def test_array_subclass(): class TestArray(np.ndarray): def __new__(cls, input_array, color): obj = np.asarray(input_array).view(cls) - obj.color = color + obj.color = color return obj def __array_finalize__(self, obj): if obj is None: From 616d1b2aaf287814a1cc51265cd7c6b8abff5937 Mon Sep 17 00:00:00 2001 From: Matthew Joyce Date: Wed, 29 Mar 2017 12:21:13 +0100 Subject: [PATCH 76/77] Fix issue #210 --- dill/dill.py | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/dill/dill.py b/dill/dill.py index e79eaa3e..4f742da5 100644 --- a/dill/dill.py +++ b/dill/dill.py @@ -163,17 +163,25 @@ class _member(object): SuperType = type(super(Exception, TypeError())) ItemGetterType = type(itemgetter(0)) AttrGetterType = type(attrgetter('__repr__')) -FileType = type(open(os.devnull, 'rb', buffering=0)) -TextWrapperType = type(open(os.devnull, 'r', buffering=-1)) -BufferedRandomType = type(open(os.devnull, 'r+b', buffering=-1)) -BufferedReaderType = type(open(os.devnull, 'rb', buffering=-1)) -BufferedWriterType = type(open(os.devnull, 'wb', buffering=-1)) + +def get_file_type(*args, **kwargs): + open = kwargs.pop("open", __builtin__.open) + f = open(os.devnull, *args, **kwargs) + t = type(f) + f.close() + return t + +FileType = get_file_type('rb', buffering=0) +TextWrapperType = get_file_type('r', buffering=-1) +BufferedRandomType = get_file_type('r+b', buffering=-1) +BufferedReaderType = get_file_type('rb', buffering=-1) +BufferedWriterType = get_file_type('wb', 
buffering=-1) try: from _pyio import open as _open - PyTextWrapperType = type(_open(os.devnull, 'r', buffering=-1)) - PyBufferedRandomType = type(_open(os.devnull, 'r+b', buffering=-1)) - PyBufferedReaderType = type(_open(os.devnull, 'rb', buffering=-1)) - PyBufferedWriterType = type(_open(os.devnull, 'wb', buffering=-1)) + PyTextWrapperType = get_file_type('r', buffering=-1, open=_open) + PyBufferedRandomType = get_file_type('r+b', buffering=-1, open=_open) + PyBufferedReaderType = get_file_type('rb', buffering=-1, open=_open) + PyBufferedWriterType = get_file_type('wb', buffering=-1, open=_open) except ImportError: PyTextWrapperType = PyBufferedRandomType = PyBufferedReaderType = PyBufferedWriterType = None try: @@ -555,7 +563,7 @@ def _create_rlock(count, owner, *args): #XXX: ignores 'blocking' lock = RLockType() if owner is not None: lock._acquire_restore((count, owner)) - if owner and not lock._is_owned(): + if owner and not lock._is_owned(): raise UnpicklingError("Cannot acquire lock") return lock From 9ec4ddf8c114f4285d9da1e2d51e250aa004dcd4 Mon Sep 17 00:00:00 2001 From: Jean Boussier Date: Thu, 15 Jun 2017 00:07:00 -0400 Subject: [PATCH 77/77] Fix typo in REAMDE --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 33a2c7cf..7b490a85 100644 --- a/README.md +++ b/README.md @@ -21,7 +21,7 @@ session. `dill` can be used to store python objects to a file, but the primary usage is to send python objects across the network as a byte stream. `dill` is quite flexible, and allows arbitrary user defined classes -and funcitons to be serialized. Thus `dill` is not intended to be +and functions to be serialized. Thus `dill` is not intended to be secure against erroneously or maliciously constructed data. It is left to the user to decide whether the data they unpickle is from a trustworthy source.
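
Note on PATCH 76/77: the `get_file_type` helper opens `os.devnull`, records the type of the returned file object, and closes the handle instead of leaving it open as the old `type(open(...))` expressions did. Below is a minimal standalone sketch of that pattern for Python 3 only. It assumes the standard `builtins` module in place of the `__builtin__` name used inside dill, and the `try`/`finally` is an addition for illustration, not part of the patch.

```python
import builtins
import io
import os


def get_file_type(*args, **kwargs):
    # An alternative opener (e.g. _pyio.open) can be passed as open=...
    opener = kwargs.pop("open", builtins.open)
    f = opener(os.devnull, *args, **kwargs)
    try:
        # Only the type is needed, not the handle itself.
        return type(f)
    finally:
        # Close the handle so no ResourceWarning is raised later.
        f.close()


FileType = get_file_type('rb', buffering=0)
TextWrapperType = get_file_type('r', buffering=-1)
BufferedReaderType = get_file_type('rb', buffering=-1)

# On CPython 3 these resolve to the concrete classes in the io module.
assert FileType is io.FileIO
assert TextWrapperType is io.TextIOWrapper
assert BufferedReaderType is io.BufferedReader
```

Passing `open=_open` with `from _pyio import open as _open` would, under the same assumptions, yield the pure-Python equivalents, as the patched `dill.py` does for the `Py*Type` names.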