java - Decompressing a zlib stream in Clojure

Author: Internet

I have a binary file whose contents were created by zlib.compress in Python. Is there a simple way to open and decompress it in Clojure?

import zlib
import json

with open('data.json.zlib', 'wb') as f:
    f.write(zlib.compress(json.dumps(data).encode('utf-8')))

Basically, it is not a gzip file, just the bytes of the deflated data.
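As a quick sanity check on which container a file uses, the first two bytes can be inspected. The sketch below is only an illustration: the variable names are made up, and it assumes the default zlib settings (deflate with a 32 KiB window), under which a zlib stream starts with 0x78 while a gzip stream starts with the magic bytes 0x1f 0x8b.

```python
import gzip
import zlib

payload = b'{"hello": "world"}'

zlib_bytes = zlib.compress(payload)  # zlib container (RFC 1950)
gzip_bytes = gzip.compress(payload)  # gzip container (RFC 1952)

# A zlib stream begins with a 2-byte header: the first byte is 0x78 for
# deflate with a 32 KiB window, and the two bytes read as a big-endian
# integer are a multiple of 31.
is_zlib = zlib_bytes[0] == 0x78 and int.from_bytes(zlib_bytes[:2], "big") % 31 == 0

# A gzip stream begins with the magic bytes 0x1f 0x8b.
is_gzip = gzip_bytes[:2] == b"\x1f\x8b"

print(is_zlib, is_gzip)
```

The leading `x\x9c` in the Python output of this question is exactly that zlib header, so the file is deflate data inside a zlib wrapper rather than a bare deflate stream.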

I could only find these references, none of which is quite what I'm looking for (I think the first two are the most relevant):

> deflateclj_hatemogi_clojure/deflate.clj
> funcool/buddy-core/deflate.clj
> Compressing / Decompressing strings in clojure
> Reading and Writing Compressed Files
> clj-http

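Do I really have to implement this multi-line wrapper around java.util.zip, or is there a nice library out there? Actually, I'm not even sure whether these byte streams are compatible across libraries, or whether I'm just trying to mix and match the wrong ones.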

Steps in Python:

>>> '{"hello": "world"}'.encode('utf-8')
b'{"hello": "world"}'
>>> zlib.compress(b'{"hello": "world"}')
b'x\x9c\xabV\xcaH\xcd\xc9\xc9W\xb2RP*\xcf/\xcaIQ\xaa\x05\x009\x99\x06\x17'
>>> [int(i) for i in zlib.compress(b'{"hello": "world"}')]
[120, 156, 171, 86, 202, 72, 205, 201, 201, 87, 178, 82, 80, 42, 207, 47, 202, 73, 81, 170, 5, 0, 57, 153, 6, 23]
>>> import numpy
>>> [numpy.int8(i) for i in zlib.compress(b'{"hello": "world"}')]
[120, -100, -85, 86, -54, 72, -51, -55, -55, 87, -78, 82, 80, 42, -49, 47, -54, 73, 81, -86, 5, 0, 57, -103, 6, 23]
>>> zlib.decompress(bytes([120, 156, 171, 86, 202, 72, 205, 201, 201, 87, 178, 82, 80, 42, 207, 47, 202, 73, 81, 170, 5, 0, 57, 153, 6, 23])).decode('utf-8')
'{"hello": "world"}'

Decoding attempt in Clojure:

; https://github.com/funcool/buddy-core/blob/master/src/buddy/util/deflate.clj#L40 without try-catch
(ns so.core
  (:import java.io.ByteArrayInputStream
           java.io.ByteArrayOutputStream
           java.util.zip.Deflater
           java.util.zip.DeflaterOutputStream
           java.util.zip.InflaterInputStream
           java.util.zip.Inflater
           java.util.zip.ZipException)
  (:gen-class))

(defn uncompress
  "Given a compressed data as byte-array, uncompress it and return as an other byte array."
  ([^bytes input] (uncompress input nil))
  ([^bytes input {:keys [nowrap buffer-size]
                  :or {nowrap true buffer-size 2048}
                  :as opts}]
   (let [buf  (byte-array (int buffer-size))
         os   (ByteArrayOutputStream.)
         inf  (Inflater. ^Boolean nowrap)]
     (with-open [is  (ByteArrayInputStream. input)
                 iis (InflaterInputStream. is inf)]
       (loop []
         (let [readed (.read iis buf)]
           (when (pos? readed)
             (.write os buf 0 readed)
             (recur)))))
     (.toByteArray os))))

(uncompress (byte-array [120, -100, -85, 86, -54, 72, -51, -55, -55, 87, -78, 82, 80, 42, -49, 47, -54, 73, 81, -86, 5, 0, 57, -103, 6, 23]))
ZipException invalid stored block lengths  java.util.zip.InflaterInputStream.read (InflaterInputStream.java:164)

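Any help would be appreciated. I'd rather not use zip or gzip files, since I only care about the raw contents, not file names or modification dates in this case. However, I could use a different compression algorithm on the Python side if that is the only option.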

Solution:

Here is a simple way to do it using gzip:

Python code:

import gzip
content = "the quick brown fox"
# gzip.open in 'wb' mode expects bytes on Python 3, so encode the string first
with gzip.open('fox.txt.gz', 'wb') as f:
    f.write(content.encode('utf-8'))

Clojure code:

(with-open [in (java.util.zip.GZIPInputStream.
                (clojure.java.io/input-stream
                 "fox.txt.gz"))]
  (println "result:" (slurp in)))

;=>  result: the quick brown fox

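Keep in mind that "gzip" is both an algorithm and a format; it does not mean you need to use the "gzip" command-line tool.

Note that the input on the Clojure side does not have to be a file. You can send the gzip-compressed data as raw bytes over a socket and still decompress it on the Clojure side. For details, see: https://clojuredocs.org/clojure.java.io/input-stream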
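To illustrate the point about not needing a file, here is a minimal Python sketch of an in-memory gzip round trip. The names `message` and `wire_bytes` are hypothetical, and `io.BytesIO` merely stands in for a socket or any other byte stream.

```python
import gzip
import io

message = "the quick brown fox"

# Compress to bytes in memory; these bytes could be written to a socket
# instead of a file.
wire_bytes = gzip.compress(message.encode("utf-8"))

# On the receiving side any readable byte stream works; BytesIO is a
# stand-in for the socket here.
with gzip.GzipFile(fileobj=io.BytesIO(wire_bytes)) as stream:
    received = stream.read().decode("utf-8")

print(received)
```

The same bytes could be handed to `java.util.zip.GZIPInputStream` on the Clojure side, wrapped around whatever input stream delivers them.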

Update

If you need to use the plain zlib format instead of gzip, the result is very similar:

Python code:

import zlib
# zlib.compress expects bytes on Python 3, hence the b'' literal
with open('balloon.txt.z', 'wb') as fp:
    fp.write(zlib.compress(b'the big red baloon'))

Clojure code:

(with-open [in (java.util.zip.InflaterInputStream.
                (clojure.java.io/input-stream
                 "balloon.txt.z"))]
  (println "result:" (slurp in)))

;=> result: the big red baloon
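This works because `InflaterInputStream`, when constructed without an explicit `Inflater`, uses a default one that expects the zlib header and checksum (nowrap = false). The `uncompress` function in the question defaults to `nowrap true`, which tells the `Inflater` to expect raw deflate data with no wrapper, and that is the likely cause of the `invalid stored block lengths` error. The Python sketch below mimics both modes using the `wbits` argument; it is an illustration of the format difference, not the Java implementation, and the variable names are made up.

```python
import zlib

data = zlib.compress(b'{"hello": "world"}')

# Default wbits expects the zlib header and Adler-32 trailer, comparable
# to java.util.zip.Inflater with nowrap = false.
wrapped = zlib.decompress(data)

# Negative wbits means raw deflate with no header or trailer, comparable
# to Inflater with nowrap = true. Feeding it zlib-wrapped data fails.
try:
    zlib.decompress(data, wbits=-zlib.MAX_WBITS)
    raw_failed = False
except zlib.error:
    raw_failed = True

# Dropping the 2-byte header and the 4-byte checksum leaves raw deflate,
# which the raw mode can read.
stripped = zlib.decompress(data[2:-4], wbits=-zlib.MAX_WBITS)

print(wrapped, raw_failed, stripped)
```

So the byte streams are compatible across the two libraries; with the question's function, passing `{:nowrap false}` as the options map should be enough to read the Python output.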

Tags: deflate, clojure, compression, gzip, java
Source: https://codeday.me/bug/20191111/2022237.html