Using the DataOutput/InputBuffer classes

alakshman
Hi All

I have been trying to use the DataOutputBuffer class for its obvious memory
efficiency. I basically write some data into the buffer and then write the
buffer into a file (an instance of RandomAccessFile) by passing the result of
buffer.getData() to write(). However, I am seeing a lot of garbage being
written into the file, which shows up as a series of '@' characters
on Linux and spaces on Windows.

This is my usage:

DataOutputBuffer buffer = new DataOutputBuffer();
RandomAccessFile raf  = new RandomAccessFile(file, "rw");

for ( each data in some data structure )
{
    buffer.reset();
    serialize data into buffer;
    raf.write(buffer.getData());
}

When I use a ByteArrayOutputStream and a DataOutputStream to do the same task,
the size of the generated file is 29K. However, when I use the
DataOutputBuffer, the size of the file for the same dataset is 507K. Is my
usage correct?

Please advise.

Thanks
A
Re: Using the DataOutput/InputBuffer classes

Brian Harrington
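From the apidocs for DataOutputBuffer.getData(): "Returns the current contents of
the buffer. Data is only valid to getLength()."
<http://lucene.apache.org/hadoop/api/org/apache/hadoop/io/DataOutputBuffer.html#getLength%28%29>

getData() hands back the whole backing array, which is usually larger than
what you have written since the last reset(), so the extra bytes end up in
your file.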

Try:

raf.write(buffer.getData(), 0, buffer.getLength());

Brian
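
For anyone who wants to try the fix above without Hadoop on the classpath,
here is a minimal sketch. ReusableBuffer is a hypothetical stand-in, not the
real DataOutputBuffer: it is assumed to behave the same way in the one respect
that matters here, namely that getData() returns the whole backing array and
getLength() returns the number of valid bytes. The class name BufferSketch,
the method writeRecords and the int records are also made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class BufferSketch {
    // Hypothetical stand-in for DataOutputBuffer: a reusable byte array
    // whose valid region is [0, getLength()).
    static final class ReusableBuffer extends ByteArrayOutputStream {
        ReusableBuffer(int capacity) { super(capacity); }
        byte[] getData() { return buf; }    // whole backing array, not a copy
        int getLength() { return count; }   // number of valid bytes
    }

    // Serializes each record into the reused buffer, writes only the valid
    // bytes to the file, and returns the resulting file size.
    static long writeRecords(File file, int[] records) throws IOException {
        ReusableBuffer buffer = new ReusableBuffer(32);
        DataOutputStream out = new DataOutputStream(buffer);
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);               // start from an empty file
            for (int record : records) {
                buffer.reset();             // valid length back to 0, array kept
                out.writeInt(record);       // 4 bytes per record
                // Writing buffer.getData() alone would dump all 32 bytes.
                raf.write(buffer.getData(), 0, buffer.getLength());
            }
            return raf.length();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("buffer-sketch", ".bin");
        file.deleteOnExit();
        System.out.println(writeRecords(file, new int[] {1, 2, 3}));
    }
}
```

With the three-argument write() the file holds 4 bytes per record. Swapping
in raf.write(buffer.getData()) writes the full 32-byte array on every
iteration, which is the kind of padding described in the original post.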

