[gephi-dev] Adding TAR.GZ archives support

Martin Skurla bujacik at gmail.com
Sat Nov 27 00:39:44 CET 2010


@Sébastien

I just saw your commit: you added gzip support, but not tar + gzip.
The code below works and could be added with minor changes if you want
to support tar + gzip, or plain tar. Note that it reads only the first
entry of the archive:

package testingtar;

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

public class Main {
    private static final String TAR_GZ_FILE_PATH =
            "C:\\Users\\Martin\\Documents\\pgadmin.tar.gz";

    // Each tar entry starts with a 512-byte header; the entry size is a
    // 12-byte octal ASCII field at offset 124 of that header.
    private static final int FILE_SIZE_OFFSET = 124;
    private static final int FILE_SIZE_LENGTH = 12;
    private static final int HEADER_LENGTH    = 512;

    public static void main(String[] args) {
        File compressedFile = new File(TAR_GZ_FILE_PATH);

        // try-with-resources closes the stream even if reading fails
        try (InputStream inputStream = new GZIPInputStream(
                                       new FileInputStream(compressedFile))) {
            // Read the size of the first entry from its header
            ignoreBytes(inputStream, FILE_SIZE_OFFSET);
            String fileSizeOctalString =
                    readString(inputStream, FILE_SIZE_LENGTH).trim();
            int fileSize = Integer.parseInt(fileSizeOctalString, 8);

            // Skip the rest of the header; the entry content follows it
            ignoreBytes(inputStream,
                    HEADER_LENGTH - (FILE_SIZE_OFFSET + FILE_SIZE_LENGTH));

            byte[] byteContent = readBytes(inputStream, fileSize);

            BufferedReader reader = new BufferedReader(
                                    new InputStreamReader(
                                    new ByteArrayInputStream(byteContent)));
            String line;
            while ((line = reader.readLine()) != null)
                System.out.println("\\" + line + "/");
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static void ignoreBytes(InputStream inputStream,
            int numberOfBytes) throws IOException {
        for (int counter = 0; counter < numberOfBytes; counter++)
            if (inputStream.read() == -1)
                throw new EOFException("Unexpected end of archive");
    }

    private static String readString(InputStream inputStream,
            int numberOfBytes) throws IOException {
        // tar header fields are ASCII
        return new String(readBytes(inputStream, numberOfBytes),
                StandardCharsets.US_ASCII);
    }

    private static byte[] readBytes(InputStream inputStream,
            int numberOfBytes) throws IOException {
        byte[] bytes = new byte[numberOfBytes];
        // A single read() may return fewer bytes than requested,
        // so keep reading until the buffer is full
        int offset = 0;
        while (offset < numberOfBytes) {
            int count = inputStream.read(bytes, offset, numberOfBytes - offset);
            if (count == -1)
                throw new EOFException("Unexpected end of archive");
            offset += count;
        }
        return bytes;
    }
}

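If more than one file per archive is ever needed, the same header parsing
can be repeated for each entry. What follows is only a minimal sketch,
assuming the plain tar layout (entry name in the first 100 bytes of the
header, size at offset 124, content padded to a multiple of 512 bytes,
archive terminated by all-zero blocks). TarEntriesSketch and readEntries are
hypothetical names, not existing Gephi code, and the sketch ignores the type
flag, long-name extensions and entries too large for an int:

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reads all entries of an uncompressed tar stream.
final class TarEntriesSketch {
    private static final int BLOCK_SIZE  = 512;
    private static final int NAME_OFFSET = 0;
    private static final int NAME_LENGTH = 100;
    private static final int SIZE_OFFSET = 124;
    private static final int SIZE_LENGTH = 12;

    /** Returns entry name -> entry content, in archive order. */
    static Map<String, byte[]> readEntries(InputStream tarStream)
            throws IOException {
        DataInputStream in = new DataInputStream(tarStream);
        Map<String, byte[]> entries = new LinkedHashMap<>();
        byte[] header = new byte[BLOCK_SIZE];

        // Stop at end of stream or at the first all-zero block
        while (readBlock(in, header) && !isAllZero(header)) {
            String name = field(header, NAME_OFFSET, NAME_LENGTH);
            String sizeField = field(header, SIZE_OFFSET, SIZE_LENGTH);
            int size = sizeField.isEmpty() ? 0 : Integer.parseInt(sizeField, 8);

            byte[] content = new byte[size];
            in.readFully(content);

            // Content is padded with zeros up to the next 512-byte boundary
            int padding = (BLOCK_SIZE - size % BLOCK_SIZE) % BLOCK_SIZE;
            in.readFully(new byte[padding]);

            entries.put(name, content);
        }
        return entries;
    }

    // Fills the block completely; returns false if the stream ends first
    private static boolean readBlock(DataInputStream in, byte[] block)
            throws IOException {
        try {
            in.readFully(block);
            return true;
        } catch (EOFException e) {
            return false;
        }
    }

    private static boolean isAllZero(byte[] block) {
        for (byte b : block)
            if (b != 0)
                return false;
        return true;
    }

    // Decodes a NUL-terminated ASCII header field
    private static String field(byte[] header, int offset, int length) {
        int end = offset;
        while (end < offset + length && header[end] != 0)
            end++;
        return new String(header, offset, end - offset,
                StandardCharsets.US_ASCII).trim();
    }
}
```

For a tar.gz file the stream would simply be wrapped first, e.g.
readEntries(new GZIPInputStream(new FileInputStream(file))).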


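On the double-decompression point in the mail quoted below: one option,
sketched here under assumptions, is to decompress once into a temporary file
and hand the same File to every caller. UncompressedFileCache is a
hypothetical helper, not part of the Gephi importer API, and how it would be
wired into ImportControllerImpl and DesktopImportControllerUI is left open:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: remembers where each gzip file was decompressed to,
// so a second request for the same file does not decompress it again.
final class UncompressedFileCache {
    private final Map<File, File> cache = new HashMap<>();

    synchronized File uncompressed(File gzipFile) throws IOException {
        File key = gzipFile.getCanonicalFile();
        File cached = cache.get(key);
        if (cached != null && cached.exists())
            return cached;

        File target = File.createTempFile("import-", ".tmp");
        target.deleteOnExit();

        try (InputStream in = new GZIPInputStream(new FileInputStream(key));
             OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int count;
            while ((count = in.read(buffer)) != -1)
                out.write(buffer, 0, count);
        }

        cache.put(key, target);
        return target;
    }
}
```

The cache does not notice when the source file changes on disk; a real
version would probably also compare the last-modified time.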

2010/11/26 Sébastien Heymann <sebastien.heymann at gephi.org>:
> @Mathieu
>
> There is a big performance issue with archive files: they are uncompressed
> twice, once in ImportControllerImpl.java and once in
> DesktopImportControllerUI.java.
>
> What about storing the reference to the uncompressed file inside the
> FileImporter?
>
> Seb
>
> 2010/11/26 Mathieu Bastian <mathieu.bastian at gmail.com>
>>
>> Hello,
>>
>> > I was looking at the source code and here is what I found:
>> >
>> > 1. zip is supported because of the native zip/jar support included in
>> > the NetBeans platform
>> > 2. it currently supports only jar/zip, because of the native Java
>> > support for the zip format (classes like ZipInputStream, ZipEntry, ...)
>> > 3. the tar format simply packs several files into one file, adding
>> > metadata stored in 512-byte header blocks, and gzip adds compression
>> > on top
>> > 4. Java also supports the gzip format natively (GZIPInputStream,
>> > although it is lower level)
>> >
>> > => I think the patch adding tar.gz will be quite easy to implement.
>> >
>> > Mathieu, am I right about the notes above?
>>
>> Yes, completely correct, and yes, we support only one file in it,
>> without folders.
>>
>> > Does the current jar/zip support handle only one file per zip?
>> > Where should the patch be merged?
>> >
>> > //Martin
>> >
>> >
>> > 2010/11/26 Sébastien Heymann <sebastien.heymann at gephi.org>:
>> >> Hello,
>> >>
>> >> The Gephi File Importer can handle ZIP files. I need to add support
>> >> for TAR.GZ archives.
>> >>
>> >> Could you point me to some docs, or tell me where to look in the
>> >> source code?
>> >>
>> >> Thanks,
>> >> Seb
>> >>
>> >> _______________________________________________
>> >> gephi-dev mailing list
>> >> gephi-dev at lists.gephi.org
>> >> http://gephi.org/mailman/listinfo/gephi-dev
>> >>
>> >>
>
>

