numpy.memmap¶
- class numpy.memmap[source]¶
Create a memory-map to an array stored in a binary file on disk.
Memory-mapped files are used for accessing small segments of large files on disk, without reading the entire file into memory. Numpy’s memmap’s are array-like objects. This differs from Python’s mmap module, which uses file-like objects.
This subclass of ndarray has some unpleasant interactions with some operations, because it doesn’t quite fit properly as a subclass. An alternative to using this subclass is to create the mmap object yourself, then create an ndarray with ndarray.__new__ directly, passing the object created in its ‘buffer=’ parameter.
This class may at some point be turned into a factory function which returns a view into an mmap buffer.
Parameters: filename : str or file-like object
The file name or file object to be used as the array data buffer.
dtype : data-type, optional
The data-type used to interpret the file contents. Default is uint8.
mode : {‘r+’, ‘r’, ‘w+’, ‘c’}, optional
The file is opened in this mode:
‘r’ Open existing file for reading only. ‘r+’ Open existing file for reading and writing. ‘w+’ Create or overwrite existing file for reading and writing. ‘c’ Copy-on-write: assignments affect data in memory, but changes are not saved to disk. The file on disk is read-only. Default is ‘r+’.
offset : int, optional
In the file, array data starts at this offset. Since offset is measured in bytes, it should normally be a multiple of the byte-size of dtype. When mode != 'r', even positive offsets beyond end of file are valid; The file will be extended to accommodate the additional data. The default offset is 0.
shape : tuple, optional
order : {‘C’, ‘F’}, optional
Specify the order of the ndarray memory layout: C (row-major) or Fortran (column-major). This only has an effect if the shape is greater than 1-D. The default order is ‘C’.
Notes
The memmap object can be used anywhere an ndarray is accepted. Given a memmap fp, isinstance(fp, numpy.ndarray) returns True.
Memory-mapped arrays use the Python memory-map object which (prior to Python 2.5) does not allow files to be larger than a certain size depending on the platform. This size is always < 2GB even on 64-bit systems.
Examples
>>> data = np.arange(12, dtype='float32') >>> data.resize((3,4))
This example uses a temporary file so that doctest doesn’t write files to your directory. You would use a ‘normal’ filename.
>>> from tempfile import mkdtemp >>> import os.path as path >>> filename = path.join(mkdtemp(), 'newfile.dat')
Create a memmap with dtype and shape that matches our data:
>>> fp = np.memmap(filename, dtype='float32', mode='w+', shape=(3,4)) >>> fp memmap([[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]], dtype=float32)
Write data to memmap array:
>>> fp[:] = data[:] >>> fp memmap([[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.]], dtype=float32)
>>> fp.filename == path.abspath(filename) True
Deletion flushes memory changes to disk before removing the object:
>>> del fp
Load the memmap and verify data was stored:
>>> newfp = np.memmap(filename, dtype='float32', mode='r', shape=(3,4)) >>> newfp memmap([[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.]], dtype=float32)
Read-only memmap:
>>> fpr = np.memmap(filename, dtype='float32', mode='r', shape=(3,4)) >>> fpr.flags.writeable False
Copy-on-write memmap:
>>> fpc = np.memmap(filename, dtype='float32', mode='c', shape=(3,4)) >>> fpc.flags.writeable True
It’s possible to assign to copy-on-write array, but values are only written into the memory copy of the array, and not written to disk:
>>> fpc memmap([[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.]], dtype=float32) >>> fpc[0,:] = 0 >>> fpc memmap([[ 0., 0., 0., 0.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.]], dtype=float32)
File on disk is unchanged:
>>> fpr memmap([[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.]], dtype=float32)
Offset into a memmap:
>>> fpo = np.memmap(filename, dtype='float32', mode='r', offset=16) >>> fpo memmap([ 4., 5., 6., 7., 8., 9., 10., 11.], dtype=float32)
Attributes
filename str Path to the mapped file. offset int Offset position in the file. mode str File mode. Methods