This tutorial explains what memory-mapped files are, what the advantages and drawbacks of using them are, and how to map a file into memory, with example code.
A file can be accessed in one of two ways:
1. Simple File I/O
2. Memory-Mapped Files
Simple file I/O (the usual read() and write()) has the following drawback.
When an application needs data that lives outside its virtual (process) address space, such as file data on disk, each call to the usual file I/O functions (e.g., read() and write()) is a system call that copies the file data into an intermediate buffer. The data is then transferred from that buffer to the process or to the physical file. This intermediate buffering adds an extra copy on every call, which is slow and expensive and reduces I/O performance.
The alternative mechanism is memory-mapped files. Memory-mapped files map the file data into an area of virtual memory (the process address space). This lets an application, or several processes at once, read and write the file data directly in memory without performing any explicit read or write operations on the physical file. When we access a part of the file that is not in memory, the OS pages it in automatically. Subsequent reads from and writes to that page are treated as ordinary memory accesses. There is no separate save step: changes made to the mapped memory are written back to the file by the OS, although not necessarily immediately.
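The idea of reading a file without explicit read() calls can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name MappedReadSketch, the helper readAll() and the temporary demo file are all hypothetical names introduced here, and the file is assumed to be small UTF-8 text.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedReadSketch {

    // Hypothetical helper: returns the file's text, read through a memory mapping.
    static String readAll(Path path) throws IOException {
        try (FileChannel fc = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            // No read() call here: decoding takes the bytes straight from the mapped pages.
            return StandardCharsets.UTF_8.decode(mbb).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a small temporary file stands in for real data.
        Path demo = Files.createTempFile("mapped-read", ".txt");
        demo.toFile().deleteOnExit();
        Files.write(demo, "hello, mapped world".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(demo));
    }
}
```

The mapping stays valid after the channel is closed, which is why the buffer can be decoded inside the try block without keeping any stream position in sync.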
Some of the benefits of using memory-mapped files (accessing data directly from main memory) are:
1. Eliminates intermediate buffering.
2. Increases I/O performance in many cases.
3. More than one process can map the same file, i.e. pages in memory can be shared among processes, which saves memory and supports inter-process communication.
4. Supports lazy loading, i.e. allocating and loading pages in main memory is deferred as long as possible. A page is loaded into RAM only when it is actually needed, so you don't need enough memory for the entire file. This makes it possible to read a large file with a small amount of RAM.
5. File data can be accessed and modified without executing any explicit I/O operations on the file.
6. Reading or writing large files this way is often more efficient than invoking the usual read or write methods.
In Java, mapping a file into memory is done through a FileChannel object. FileChannel belongs to the java.nio.channels package, part of the NIO API available since JDK 1.4. The map() method of a FileChannel object maps a portion or all of the channel’s file into memory and returns a reference to a buffer of type MappedByteBuffer.
The syntax of the map() method is:
public abstract MappedByteBuffer map(FileChannel.MapMode mode, long position, long size) throws IOException
- Maps a region of this channel's file directly into memory. The map() method returns a MappedByteBuffer, which is a subclass of ByteBuffer, so all the methods of ByteBuffer can be used on a MappedByteBuffer.
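Because MappedByteBuffer is a ByteBuffer, the typed accessors such as putInt() and getInt() work on mapped file data. The sketch below is only an illustration under stated assumptions: the class TypedAccessSketch, the helper writeHeader() and the 16-byte layout are hypothetical, and the file is pre-sized to 16 bytes before it is mapped.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TypedAccessSketch {

    // Hypothetical helper: stores an int at offset 0 and a long at offset 4 of the file.
    static void writeHeader(Path path, int id, long stamp) throws IOException {
        try (FileChannel fc = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_WRITE, 0, 16);
            mbb.putInt(0, id);      // absolute put inherited from ByteBuffer
            mbb.putLong(4, stamp);  // absolute put inherited from ByteBuffer
            mbb.force();            // ask the OS to write the changed pages to the file
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temporary file pre-sized to 16 zero bytes.
        Path demo = Files.createTempFile("typed-access", ".bin");
        demo.toFile().deleteOnExit();
        Files.write(demo, new byte[16]);

        writeHeader(demo, 42, 123456789L);

        // Read the values back with ordinary file I/O to confirm they reached the file.
        ByteBuffer check = ByteBuffer.wrap(Files.readAllBytes(demo));
        System.out.println(check.getInt(0) + " " + check.getLong(4));
    }
}
```

Both the mapped buffer and the wrapped array use the default big-endian byte order, so the values read back match the values written.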
A region of a file may be mapped into memory in one of the following three modes:
1. MapMode.READ_ONLY - The resulting buffer cannot be modified; any attempt to do so throws a ReadOnlyBufferException.
2. MapMode.READ_WRITE - The resulting buffer can be modified, and the changes are eventually propagated to the file.
3. MapMode.PRIVATE - Creates private copies of the modified portions of the buffer (copy-on-write). The changes are not visible to other processes that have mapped the same file, and they are not propagated to the file.
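The difference between the modes can be observed with a small experiment. This is a sketch under assumptions: the class MapModesSketch, its two helper methods and the temporary file are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.ReadOnlyBufferException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MapModesSketch {

    // Returns true if a write to a READ_ONLY mapping is rejected.
    static boolean readOnlyRejectsWrite(Path path) throws IOException {
        try (FileChannel fc = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer ro = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            try {
                ro.put(0, (byte) 'X');
                return false;
            } catch (ReadOnlyBufferException e) {
                return true;
            }
        }
    }

    // Modifies a PRIVATE mapping, then returns the first byte as stored in the file.
    static byte firstByteAfterPrivateWrite(Path path) throws IOException {
        // A PRIVATE mapping needs a channel opened for both reading and writing.
        try (FileChannel fc = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer priv = fc.map(FileChannel.MapMode.PRIVATE, 0, fc.size());
            priv.put(0, (byte) 'X'); // changes only this mapping's private copy
        }
        return Files.readAllBytes(path)[0];
    }

    public static void main(String[] args) throws IOException {
        Path demo = Files.createTempFile("map-modes", ".txt"); // assumed demo file
        demo.toFile().deleteOnExit();
        Files.write(demo, "abc".getBytes(StandardCharsets.US_ASCII));

        System.out.println("READ_ONLY rejects write: " + readOnlyRejectsWrite(demo));
        System.out.println("File starts with: " + (char) firstByteAfterPrivateWrite(demo));
    }
}
```

With the three-byte file "abc", the read-only write is rejected and the file still starts with 'a' after the private mapping has been modified.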
The following line of code maps the first 1024 bytes of a file into memory in read/write mode. For this mode, the channel fc must have been opened for both reading and writing.
MappedByteBuffer mbb = fc.map( FileChannel.MapMode.READ_WRITE, 0, 1024 );
To map the entire file, specify zero as the start position and the length of the file as the size of the mapping.
MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_WRITE, 0L, fc.size()).load();
The buffer is created in READ_WRITE mode, which permits it to be both read and modified, and it covers the entire file. The map() method returns a reference to the MappedByteBuffer object; the chained load() call asks the OS to bring the buffer's content into physical memory and returns the same buffer.
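A complete, runnable version of the whole-file mapping might look like the sketch below. It is hedged in several ways: the class MapWholeFileSketch and the helper upperCaseInPlace() are hypothetical, the file is assumed to contain ASCII text, and the channel comes from a RandomAccessFile opened in "rw" mode because a channel obtained from a FileInputStream is read-only and cannot be mapped READ_WRITE.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class MapWholeFileSketch {

    // Hypothetical helper: upper-cases ASCII letters of the file through a mapping.
    static void upperCaseInPlace(Path path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw");
             FileChannel fc = raf.getChannel()) {
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_WRITE, 0L, fc.size());
            mbb.load(); // best-effort request to bring the pages into physical memory
            for (int i = 0; i < mbb.limit(); i++) {
                byte b = mbb.get(i);
                if (b >= 'a' && b <= 'z') {
                    mbb.put(i, (byte) (b - 32)); // plain memory write, no write() call
                }
            }
            mbb.force(); // flush the modified pages to the file
        }
    }

    public static void main(String[] args) throws IOException {
        Path demo = Files.createTempFile("map-whole", ".txt"); // assumed demo file
        demo.toFile().deleteOnExit();
        Files.write(demo, "hello mapped file".getBytes(StandardCharsets.US_ASCII));

        upperCaseInPlace(demo);
        System.out.println(new String(Files.readAllBytes(demo), StandardCharsets.US_ASCII));
    }
}
```

Calling force() is optional for correctness within one process, but it makes the point at which the changes reach the file explicit.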
Drawbacks of memory-mapped files
1. Wasted memory for small files. With memory-mapped files, disk blocks are mapped to pages in memory, and the page size is usually 4 KB. To map a 9 KB file, 3 pages (12 KB in total) are allocated, so 3 KB of memory is wasted.
2. When a requested page is not in main memory, a page fault occurs and the process waits while the OS loads the page from disk, which reduces performance.
3. The amount of file content that can be mapped depends on the available virtual address space. A 32-bit process can address only 2^32 bytes (4 GB), i.e. addresses 0 to 4,294,967,295 (2^32 - 1), and part of that range is typically reserved by the OS.
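The page-rounding arithmetic from the first drawback can be worked through in a few lines. The class PageRounding and its method names are hypothetical, and the 4 KB page size is the assumption stated in the text, not a value queried from the OS.

```java
public class PageRounding {

    // Number of whole pages needed to hold fileSize bytes (rounds up).
    static long pagesNeeded(long fileSize, long pageSize) {
        return (fileSize + pageSize - 1) / pageSize;
    }

    // Bytes allocated in the last page but not used by the file.
    static long wastedBytes(long fileSize, long pageSize) {
        return pagesNeeded(fileSize, pageSize) * pageSize - fileSize;
    }

    public static void main(String[] args) {
        long pageSize = 4 * 1024; // assumed 4 KB page
        long fileSize = 9 * 1024; // the 9 KB file from the example
        System.out.println(pagesNeeded(fileSize, pageSize) + " pages, "
                + wastedBytes(fileSize, pageSize) + " bytes unused");
        // prints "3 pages, 3072 bytes unused"
    }
}
```

The waste is always less than one page, so it matters for many small files and is negligible for large ones.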
Now let us look at a code example of memory-mapped files. I have written two programs that read a large log file, one using standard file I/O and one using memory-mapped I/O. You can run both to compare the time each takes to read the same large log file. For large files, memory-mapped I/O is often faster than standard file I/O, but the result depends on the OS, the file size and whether the file is already in the OS cache, so measure on your own system.
Using Memory-mapped I/O
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MemoryMappedIO1 {
    public static void main(String[] args) {
        final String fileName = "CBS.log";
        // try-with-resources closes the channel and the stream automatically
        try (FileInputStream fis = new FileInputStream(fileName);
             FileChannel fc = fis.getChannel()) {
            byte[] buf = new byte[1024];
            long tm = System.currentTimeMillis();
            // Map the whole file read-only; a single mapping cannot exceed Integer.MAX_VALUE bytes
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            while (mbb.hasRemaining()) {
                // Copy at most buf.length bytes per iteration
                int n = Math.min(buf.length, mbb.remaining());
                mbb.get(buf, 0, n);
                // System.out.println(new String(buf, 0, n));
            }
            System.out.printf("Time to read file %s: %d ms%n", fileName,
                    System.currentTimeMillis() - tm);
        } catch (IOException ex) {
            ex.printStackTrace(System.err);
        }
    }
}
Using Standard File IO:
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class StandardBufferedIO1 {
    public static void main(String[] args) {
        // try-with-resources closes the stream automatically
        try (InputStream in = new FileInputStream("CBS.log")) {
            long ts = System.currentTimeMillis();
            byte[] buf = new byte[1024];
            int len;
            // read() returns the number of bytes read, or -1 at end of file
            while ((len = in.read(buf)) != -1) {
                // System.out.println(new String(buf, 0, len));
            }
            long te = System.currentTimeMillis();
            System.out.printf("Time taken to read log file: %d ms%n", te - ts);
        } catch (IOException e) {
            e.printStackTrace(System.err);
        }
    }
}