python – Working with files that are too large to fit in memory?
Author: Internet
I have a 20 GB file in which every line has the following form:
Read name, Start position, Direction, Sequence
Note that read names are not necessarily unique.
For example, a snippet of my file looks like this:
Read1, 40009348, +, AGTTTTCGTA
Read2, 40009349, -, AGCCCTTCGG
Read1, 50994530, -, AGTTTTCGTA
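I would like to store these lines in a way that allows me to
1) keep the file sorted by the second value (the start position)
2) iterate over the sorted file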
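It seems that a database could be used for this.
The documentation seems to imply that dbm cannot be used to sort the file and iterate over it.
So I am wondering whether SQLite3 can do 1) and 2). I know that I can sort the data with an SQL query and iterate over the result set with sqlite3. But can I do this on a computer with 4 GB of RAM without running out of memory?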
Solution:
SQLite is able to do both 1) and 2).
I suggest you try it and report back with any problems you run into.
With the default page size of 1024 bytes, an SQLite database is limited in size to 2 terabytes (2^41 bytes). And even if it could handle larger databases, SQLite stores the entire database in a single disk file and many filesystems limit the maximum size of files to something less than this. So if you are contemplating databases of this magnitude, you would do well to consider using a client/server database engine that spreads its content across multiple disk files, and perhaps across multiple volumes.
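As a rough illustration of how 1) and 2) might look with Python's built-in sqlite3 module, here is a minimal sketch. It is not taken from the original answer. The table name `reads`, the index name `reads_start`, the helper names and the batch size are all hypothetical, and it assumes that every line has exactly four comma-separated fields with no header line. Rows are inserted in batches so the whole file never has to be held in memory, and an index on the start position lets SQLite return rows in sorted order.

```python
import sqlite3


def parse_line(line):
    """Split 'Read1, 40009348, +, AGTTTTCGTA' into a typed 4-tuple."""
    name, start, direction, sequence = (field.strip() for field in line.split(","))
    return name, int(start), direction, sequence


def load_reads(conn, lines, batch_size=100_000):
    """Insert lines into a (hypothetical) 'reads' table, batch_size rows at a time."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reads "
        "(name TEXT, start INTEGER, direction TEXT, sequence TEXT)"
    )
    batch = []
    for line in lines:  # a file object yields one line at a time
        if not line.strip():
            continue  # skip blank lines
        batch.append(parse_line(line))
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO reads VALUES (?, ?, ?, ?)", batch)
            batch.clear()
    if batch:
        conn.executemany("INSERT INTO reads VALUES (?, ?, ?, ?)", batch)
    # Index on the second value so ORDER BY start can use it.
    conn.execute("CREATE INDEX IF NOT EXISTS reads_start ON reads (start)")
    conn.commit()


def iter_sorted(conn):
    """Yield rows ordered by start position, one row at a time."""
    cursor = conn.execute(
        "SELECT name, start, direction, sequence FROM reads ORDER BY start"
    )
    for row in cursor:
        yield row


if __name__ == "__main__":
    sample = [
        "Read1, 50994530, -, AGTTTTCGTA",
        "Read1, 40009348, +, AGTTTTCGTA",
        "Read2, 40009349, -, AGCCCTTCGG",
    ]
    conn = sqlite3.connect(":memory:")  # use a file path for real data
    load_reads(conn, sample)
    for row in iter_sorted(conn):
        print(row)
```

For the real 20 GB file, one would presumably pass an open file object instead of the sample list, and a path on disk instead of ":memory:". Iterating over the cursor fetches rows incrementally, so memory use should stay well below the size of the data, though this is worth verifying on the actual machine.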
Tags: python, large-files, sqlite3, dbm Source: https://codeday.me/bug/20190718/1494402.html