[gephi-dev] Adding TAR.GZ archives support

Martin Skurla bujacik at gmail.com
Sat Nov 27 00:42:13 CET 2010


What you did was only gzip decompression. With tar+gzip, the
decompressed stream still contains the 512-byte tar header and the
trailing zero bytes that pad the content up to a multiple of 512 bytes.
Those would be processed as file content too, so it will not work...

A nice description of the tar format can be found here:
http://en.wikipedia.org/wiki/Tar_(file_format)
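
To make the header and padding handling mentioned above concrete, here is
a rough sketch of walking all entries of a tar+gzip stream. It is not
Gephi code: the class name TarGzSketch and its helpers are made up for
illustration. It assumes the plain tar layout (name at offset 0, octal
size at offset 124, 512-byte blocks), treats every entry as a regular
file, assumes sizes fit in an int, and ignores long-name extensions.

```java
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;

// Hypothetical illustration class, not part of Gephi.
public class TarGzSketch {
    private static final int BLOCK_SIZE  = 512; // header size and padding unit
    private static final int NAME_LENGTH = 100; // file name field, offset 0
    private static final int SIZE_OFFSET = 124; // octal file size field
    private static final int SIZE_LENGTH = 12;

    /** Reads every entry of a gzip-compressed tar stream into memory. */
    public static Map<String, byte[]> readEntries(InputStream tarGz) throws IOException {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        InputStream in = new GZIPInputStream(tarGz);
        byte[] header = new byte[BLOCK_SIZE];

        // The archive ends with zero-filled blocks, so a header whose first
        // byte is 0 (empty name) is treated as the end marker.
        while (readFully(in, header) == BLOCK_SIZE && header[0] != 0) {
            String name = field(header, 0, NAME_LENGTH);
            int size = Integer.parseInt(field(header, SIZE_OFFSET, SIZE_LENGTH), 8);

            byte[] content = new byte[size];
            if (readFully(in, content) != size)
                throw new EOFException("Truncated entry: " + name);

            // Content is padded with zero bytes up to the next 512-byte boundary.
            int padding = (BLOCK_SIZE - size % BLOCK_SIZE) % BLOCK_SIZE;
            if (readFully(in, new byte[padding]) != padding)
                throw new EOFException("Truncated padding after: " + name);

            entries.put(name, content);
        }
        return entries;
    }

    /** Fills the buffer unless the stream ends first; returns bytes read. */
    private static int readFully(InputStream in, byte[] buffer) throws IOException {
        int total = 0;
        while (total < buffer.length) {
            int count = in.read(buffer, total, buffer.length - total);
            if (count < 0)
                break;
            total += count;
        }
        return total;
    }

    /** Extracts a NUL- or space-terminated ASCII header field. */
    private static String field(byte[] header, int offset, int length) {
        String raw = new String(header, offset, length, StandardCharsets.US_ASCII);
        int nul = raw.indexOf('\0');
        return (nul >= 0 ? raw.substring(0, nul) : raw).trim();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TarGzSketch <archive.tar.gz>");
            return;
        }
        try (InputStream file = new FileInputStream(args[0])) {
            for (Map.Entry<String, byte[]> entry : readEntries(file).entrySet())
                System.out.println(entry.getKey() + ": " + entry.getValue().length + " bytes");
        }
    }
}
```

The padding formula is the important part: a 5-byte file is followed by
507 zero bytes, while a file of exactly 512 bytes has no padding at all.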

2010/11/27 Martin Skurla <bujacik at gmail.com>:
> @Sebastian
>
> I just saw your commit: you added GZip support, but not tar + gzip.
> This code works and could be added with minor changes if you want
> to support tar + gzip or just the tar format:
> package testingtar;
>
> import java.io.BufferedReader;
> import java.io.ByteArrayInputStream;
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.IOException;
> import java.io.InputStream;
> import java.io.InputStreamReader;
> import java.util.zip.GZIPInputStream;
>
> public class Main {
>    private static final String TAR_GZ_FILE_PATH =
> "C:\\Users\\Martin\\Documents\\pgadmin.tar.gz";
>
>    private static final int FILE_SIZE_OFFSET = 124;
>    private static final int FILE_SIZE_LENGTH = 12;
>    private static final int HEADER_LENGTH    = 512;
>
>    public static void main(String[] args) {
>        File compressedFile = new File(TAR_GZ_FILE_PATH);
>
>        try {
>            InputStream inputStream = new GZIPInputStream(
>                                      new FileInputStream(compressedFile)) ;
>
>            ignoreBytes(inputStream, FILE_SIZE_OFFSET);
>            String fileSizeLengthOctalString = readString(inputStream,
> FILE_SIZE_LENGTH).trim();
>
>            int fileSize = Integer.parseInt(fileSizeLengthOctalString, 8);
>
>            ignoreBytes(inputStream, HEADER_LENGTH - (FILE_SIZE_OFFSET
> + FILE_SIZE_LENGTH));
>
>            byte[] byteContent = readBytes(inputStream, fileSize);
>
>            BufferedReader reader = new BufferedReader(
>                                    new InputStreamReader(
>                                    new ByteArrayInputStream(byteContent)));
>            String line;
>            while ((line = reader.readLine()) != null)
>                System.out.println("\\" + line + "/");
>        }
>        catch (IOException e) {
>            e.printStackTrace();
>        }
>    }
>
>    private static void ignoreBytes(InputStream inputStream, int
> numberOfBytes) throws IOException {
>        for (int counter = 0; counter < numberOfBytes; counter++)
>            inputStream.read();
>    }
>
>    private static String readString(InputStream inputStream, int
> numberOfBytes) throws IOException {
>        return new String(readBytes(inputStream, numberOfBytes));
>    }
>
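>    private static byte[] readBytes(InputStream inputStream, int
> numberOfBytes) throws IOException {
>        byte[] readBytes = new byte [numberOfBytes];
>        // A single read() may return fewer bytes than requested
>        // (common with GZIPInputStream), so loop until the buffer is full.
>        int offset = 0;
>        while (offset < numberOfBytes) {
>            int count = inputStream.read(readBytes, offset, numberOfBytes - offset);
>            if (count < 0)
>                throw new IOException("Unexpected end of stream");
>            offset += count;
>        }
>
>        return readBytes;
>    }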
> }
>
>
>
>
> 2010/11/26 Sébastien Heymann <sebastien.heymann at gephi.org>:
>> @Mathieu
>>
>> There is a big performance issue with archive files: they are uncompressed
>> twice, once in ImportControllerImpl.java and once in
>> DesktopImportControllerUI.java.
>>
>> What about storing the reference to the uncompressed file inside the
>> FileImporter?
>>
>> Seb
>>
>> 2010/11/26 Mathieu Bastian <mathieu.bastian at gmail.com>
>>>
>>> Hello,
>>>
>>> > I was looking at the source code and found the following:
>>> >
>>> > 1. zip is supported because of the native zip/jar support included in
>>> > the NetBeans platform
>>> > 2. currently only jar/zip is supported, because of Java's native support
>>> > for the zip format (classes like ZipInputStream, ZipEntry, ...)
>>> > 3. the tar format simply packs several files into one file, adding some
>>> > metadata stored in a 512-byte header per file; gzip adds compression
>>> > 4. Java supports the gzip format out of the box too (GZIPInputStream,
>>> > although it is more low level)
>>> >
>>> > => I think the patch adding tar.gz will be quite easy to implement.
>>> >
>>> > Mathieu, am I right about the notes above?
>>>
>>> Yes, completely correct, and yes, we support only one file in it,
>>> without folders.
>>>
>>> > Does the current jar/zip support handle only one file per zip?
>>> > Where should the patch be merged?
>>> >
>>> > //Martin
>>> >
>>> >
>>> > 2010/11/26 Sébastien Heymann <sebastien.heymann at gephi.org>:
>>> >> Hello,
>>> >>
>>> >> The Gephi File Importer can handle ZIP files. I need to add support
>>> >> for TAR.GZ archives.
>>> >>
>>> >> Could you point me to some docs, or tell me where to look in the
>>> >> source code?
>>> >>
>>> >> Thanks,
>>> >> Seb
>>> >>
>>> >> _______________________________________________
>>> >> gephi-dev mailing list
>>> >> gephi-dev at lists.gephi.org
>>> >> http://gephi.org/mailman/listinfo/gephi-dev
>>> >>
>>> >>
>>
>>
>

