WhiteDB, a lightweight NoSQL database

Posted 2014-2-19 11:55:14

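WhiteDB is a lightweight NoSQL database written in C. The entire database runs in main memory with no server process: data is read and written directly through shared memory, with no sockets involved.
Features:
Indexes (T-tree)
Persistence through logging and memory dumps
Concurrency through locking
Limited queries
JSON, CSV and RDF support
Linux and Windows support
Python bindings
Command line tools
JSON REST tools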

千问 · 2014-2-19 11:55:14

Official site: http://whitedb.org/

Project goals
speed
portability
simplicity and small footprint
low memory usage
easy to use in embedded systems
graph database applications
extended rdf database applications
fast interprocess communication
seamless integration with a wGandalf rule engine (work in progress)

Data storage
Data is kept in shared memory by default, making all the data accessible to separate processes.
Each database record is a tuple of N elements, encoded in WhiteDB's simple compact format. You can store both conventional datatypes and direct pointers to records: the latter enables highly efficient traversal of complex data.

Supported features
indexes (T-tree)
persistence through logging and memory dumps
concurrency through locking
limited queries (conjunctive only)
json, CSV and RDF support
Linux and Windows
Python bindings
command line utility tools
json REST tools

Direct memory access
Each record is stored as an array (N-tuple) of integers: configurable as either 32 or 64 bits. The integers in the tuple encode values directly or as pointers. Columns have no type: any encoded value can be stored to any field.
You can always get a direct pointer to a record, store it into a field of a record or use it in your own program directly. A record pointer can thus be used as an automatically assigned id of the record which requires no search at all to access the record.
To search for a record, either scan the chain of all records, scan a sublist/tree you have built yourself or perform an index search on an indexed field.
Data encoding
The low bits of an integer in a record indicate the type of data. Anything which does not fit into the remaining bits is allocated separately and pointed to by the same integer.
The datatypes are null, record(pointer), integer, double, string, xml literal, uri, blob, char, date, time, pointer to record.
Long strings are allocated uniquely, i.e. using the same string in many fields does not take up additional space and allows fast string equality check.
A record pointer is a persistent offset of the record, usable as an automatic id of the record. Pointers allow fast traversal of complex data without search.
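The encoding idea above can be sketched in a few lines of C. This is a minimal illustration only: the tag values, the 3-bit tag width and every name below (word_t, encode_smallint and so on) are assumptions made for the example, not WhiteDB's actual bit layout or API. It shows a small integer stored directly next to its tag, and an aligned offset carrying a tag for data that is allocated elsewhere.

```c
#include <stdint.h>

/* Hypothetical layout for illustration; WhiteDB's real tags differ. */
typedef intptr_t word_t;

enum { TAG_BITS = 3, TAG_MASK = (1 << TAG_BITS) - 1 };
enum { TAG_RECORD = 0, TAG_SMALLINT = 1, TAG_LONGSTR = 2 };

/* Range of integers that still fit beside the tag bits. */
#define SMALLINT_MAX (INTPTR_MAX / (1 << TAG_BITS))
#define SMALLINT_MIN (-SMALLINT_MAX - 1)

int fits_smallint(intptr_t v) {
    return v >= SMALLINT_MIN && v <= SMALLINT_MAX;
}

/* Store the value in the high bits and the type tag in the low bits.
   The caller is expected to check fits_smallint() first. */
word_t encode_smallint(intptr_t v) {
    return v * (1 << TAG_BITS) + TAG_SMALLINT;
}

intptr_t decode_smallint(word_t w) {
    return (w - TAG_SMALLINT) / (1 << TAG_BITS);
}

/* Offsets of separately allocated data are assumed to be 8-byte aligned,
   so their low three bits are free to hold the tag. */
word_t encode_offset(uintptr_t offset, int tag) {
    return (word_t)(offset | (uintptr_t)tag);
}

uintptr_t decode_offset(word_t w) {
    return (uintptr_t)w & ~(uintptr_t)TAG_MASK;
}

int get_tag(word_t w) {
    return (int)((uintptr_t)w & TAG_MASK);
}
```

A value that fails fits_smallint() would be allocated separately and stored through encode_offset() with a different tag, which matches the "pointed to by the same integer" case described above.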
Allocation and garbage collection
Conventional malloc does not function in shared memory, since we have to use offsets instead of conventional pointers. Hence WhiteDB uses its own implementation of malloc for shared memory.
A record and a uniquely kept long string can be pointed to from several fields. Hence we use reference counting garbage collection embedded into our allocation algorithm when deleting records and long strings. Reference counting is incremental and does not cause long pauses.
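To make the offset and reference counting ideas concrete, here is a small sketch under simplifying assumptions: fixed-size cells, a plain array standing in for the shared memory segment, and a free list linked by offsets. None of the names (arena_t, cell_alloc, cell_release) come from WhiteDB, and its real allocator handles variable sizes and is considerably more involved.

```c
#include <stddef.h>

#define CELL_WORDS 4   /* one refcount word + three payload words */
#define NCELLS 64

/* Stand-in for a shared memory segment. Everything inside refers to other
   cells by offset (an index into words), never by absolute address. */
typedef struct {
    size_t free_head;                      /* offset of first free cell, 0 = none */
    size_t words[1 + NCELLS * CELL_WORDS]; /* word 0 unused so offset 0 means null */
} arena_t;

/* Thread every cell onto the free list, linking by offset. */
void arena_init(arena_t *a) {
    a->free_head = 0;
    for (size_t i = 0; i < NCELLS; i++) {
        size_t off = 1 + i * CELL_WORDS;
        a->words[off] = a->free_head;
        a->free_head = off;
    }
}

/* Pop a cell from the free list; returns its offset, or 0 if none is left.
   The new cell starts with a reference count of 1 and a zeroed payload. */
size_t cell_alloc(arena_t *a) {
    size_t off = a->free_head;
    if (off == 0) return 0;
    a->free_head = a->words[off];
    a->words[off] = 1;
    for (size_t i = 1; i < CELL_WORDS; i++) a->words[off + i] = 0;
    return off;
}

/* Another field now points at this cell. */
void cell_retain(arena_t *a, size_t off) {
    a->words[off]++;
}

/* Drop one reference; the cell returns to the free list when the count
   reaches zero. Returns 1 if the cell was freed, 0 otherwise. */
int cell_release(arena_t *a, size_t off) {
    if (--a->words[off] > 0) return 0;
    a->words[off] = a->free_head;
    a->free_head = off;
    return 1;
}

/* Translate an offset into a pointer valid for this mapping of the arena. */
size_t *cell_payload(arena_t *a, size_t off) {
    return &a->words[off + 1];
}
```

Because only offsets are stored, copying the whole arena_t to a different address (as a second process mapping the segment would see it) leaves every link valid, which is the property the paragraph above relies on.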

Locking
We use a database level lock implemented via a task-fair atomic spinlock queue for concurrency control, but alternative faster and simpler preference policies can be configured: either a reader-preference or a writer-preference spinlock.
Generally, a database-level lock has very low overhead but the maximum possible contention. Processes should therefore spend as little time as possible between acquiring the lock and releasing it.
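As a rough illustration of a fair spinlock and of keeping the locked region short, the sketch below uses a ticket lock built on C11 atomics. A ticket lock is a stand-in chosen for brevity; it serves waiters in arrival order but is not WhiteDB's queue implementation, and all names here (ticket_lock_t, add_total) are hypothetical.

```c
#include <stdatomic.h>

/* Ticket lock: each waiter takes a number and spins until it is served,
   so the lock is granted in arrival order. */
typedef struct {
    atomic_uint next_ticket;
    atomic_uint now_serving;
} ticket_lock_t;

void ticket_lock_init(ticket_lock_t *l) {
    atomic_init(&l->next_ticket, 0);
    atomic_init(&l->now_serving, 0);
}

void ticket_lock_acquire(ticket_lock_t *l) {
    unsigned my = atomic_fetch_add(&l->next_ticket, 1);
    while (atomic_load(&l->now_serving) != my) {
        /* spin until our ticket comes up */
    }
}

void ticket_lock_release(ticket_lock_t *l) {
    atomic_fetch_add(&l->now_serving, 1);
}

/* Do the expensive work outside the lock and hold it only for the
   shared update, as recommended for a database-level lock. */
void add_total(ticket_lock_t *l, long *shared_total, const int *vals, int n) {
    long local = 0;
    for (int i = 0; i < n; i++) local += vals[i];  /* unlocked */
    ticket_lock_acquire(l);
    *shared_total += local;                        /* locked, brief */
    ticket_lock_release(l);
}
```

The summation in add_total could just as well have been done under the lock, but every other process would then wait for the whole loop instead of a single addition.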
Indexes
The simplest index provided is a T-tree index on any field containing any mixture of objects (integers, strings, etc). The index is automatically maintained when records are added, deleted or changed.
The efficiency of indexing can be greatly enhanced by using template indexes, which create an index only for records having a given value in a given field. For example, create an index on column 0 that only contains records where the 2nd column is equal to 6.
Unique storage of long strings, XML literals and URIs uses hash indexes internally.
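The template index idea can be sketched as a filtered index. For brevity this example uses a sorted array with binary search instead of a T-tree, builds the index once rather than maintaining it on every change, and reads "2nd column" as field index 2; all three are assumptions of the sketch, and the names (rec_t, tindex_build, tindex_find) are not part of WhiteDB.

```c
#include <stddef.h>
#include <stdlib.h>

#define NFIELDS 3
typedef struct { long f[NFIELDS]; } rec_t;

typedef struct { long key; const rec_t *rec; } entry_t;
typedef struct { entry_t *entries; size_t count; } tindex_t;

static int cmp_entry(const void *a, const void *b) {
    long ka = ((const entry_t *)a)->key;
    long kb = ((const entry_t *)b)->key;
    return (ka > kb) - (ka < kb);
}

/* Index key_col, but only for records whose match_col equals match_val.
   Returns 0 on success, -1 if memory could not be allocated. */
int tindex_build(tindex_t *ix, const rec_t *recs, size_t n,
                 int key_col, int match_col, long match_val) {
    ix->entries = malloc((n ? n : 1) * sizeof *ix->entries);
    if (!ix->entries) return -1;
    ix->count = 0;
    for (size_t i = 0; i < n; i++) {
        if (recs[i].f[match_col] != match_val) continue;  /* template filter */
        ix->entries[ix->count].key = recs[i].f[key_col];
        ix->entries[ix->count].rec = &recs[i];
        ix->count++;
    }
    qsort(ix->entries, ix->count, sizeof *ix->entries, cmp_entry);
    return 0;
}

/* Look up a record by key; records excluded by the template are never found. */
const rec_t *tindex_find(const tindex_t *ix, long key) {
    entry_t probe = { key, NULL };
    entry_t *hit = bsearch(&probe, ix->entries, ix->count,
                           sizeof *ix->entries, cmp_entry);
    return hit ? hit->rec : NULL;
}

void tindex_free(tindex_t *ix) {
    free(ix->entries);
    ix->entries = NULL;
    ix->count = 0;
}
```

The gain comes from the filter: the index holds only the matching records, so it is smaller to store and faster to search than an index over the whole column.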
Persistent storage
Two mechanisms are available for storing the shared memory database to disk. First, the whole database can be dumped and restored. Since the database uses offsets instead of conventional pointers, the absolute address locations are not important.
Second, all inserts, deletions and updates can be logged to a file. The compact log thus created can be played back to restore the contents of the database (normally after the last dump). Logging can be switched on and off, depending on the data criticality/performance requirements.
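How a dump and a log combine can be shown with a toy model. In this sketch the "database" is a handful of slots, and the dump and journal are kept in memory rather than written to files; the types and functions (db_t, journal_t, db_write, db_restore) are invented for the example and do not reflect WhiteDB's log format or API.

```c
#include <stddef.h>

#define SLOTS 8
#define LOG_CAP 32

typedef struct { int used[SLOTS]; long val[SLOTS]; } db_t;

typedef enum { OP_INSERT, OP_UPDATE, OP_DELETE } op_t;
typedef struct { op_t op; int slot; long val; } logrec_t;
typedef struct { logrec_t recs[LOG_CAP]; size_t count; int enabled; } journal_t;

/* Apply one operation to a database image. */
static void apply(db_t *db, const logrec_t *r) {
    switch (r->op) {
    case OP_INSERT:
    case OP_UPDATE:
        db->used[r->slot] = 1;
        db->val[r->slot] = r->val;
        break;
    case OP_DELETE:
        db->used[r->slot] = 0;
        db->val[r->slot] = 0;
        break;
    }
}

/* Change the database and, if logging is switched on, record the change.
   Returns 0 on success, -1 on a bad slot or a full journal. */
int db_write(db_t *db, journal_t *j, op_t op, int slot, long val) {
    logrec_t r = { op, slot, val };
    if (slot < 0 || slot >= SLOTS) return -1;
    if (j->enabled) {
        if (j->count == LOG_CAP) return -1;
        j->recs[j->count++] = r;
    }
    apply(db, &r);
    return 0;
}

/* Dump: snapshot the whole image and start a fresh journal. */
void db_dump(const db_t *db, db_t *dump, journal_t *j) {
    *dump = *db;
    j->count = 0;
}

/* Restore: load the last dump, then replay everything logged after it. */
void db_restore(db_t *db, const db_t *dump, const journal_t *j) {
    *db = *dump;
    for (size_t i = 0; i < j->count; i++) apply(db, &j->recs[i]);
}
```

Writes made while logging is switched off are cheaper but are lost on restore unless a new dump is taken afterwards, which is the criticality versus performance trade-off mentioned above.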

1. Introduction
This tutorial will cover the basic usage of WhiteDB’s C API. Most examples you will encounter here will also be available in the Examples directory of the WhiteDB source distribution.
A thorough reference of the API is available in the Manual. If you’re looking for information on how to use WhiteDB from Python, please see Python documentation (also in semi-tutorial form).
2. Compiling the examples
Before we can get started with the tutorial, we need to know how to compile and run programs that use WhiteDB.
If you invoked the standard ./configure; make; make install, things are quite simple. Let’s try to compile Examples/demo.c (it’s already compiled, but let’s do it again):
gcc -o Examples/mydemo Examples/demo.c -lwgdb

That’s it! This is how you’d normally compile and link a WhiteDB program. The -lwgdb tells the linker to use libwgdb.a that sits somewhere in your library path. If you get an error at this point, it may be that your computer has libraptor installed. In that case WhiteDB automatically decides that it wants to use it; just add -lraptor to the command line and all should be fine.
You can now run Examples/mydemo and see what it does. Also, you may skip the rest of this section and go directly to Section 3, “Connecting to the database”.
In the event that you haven’t run make install (let’s say you’re still evaluating and getting acquainted with WhiteDB) or you’ve installed it in a location that’s not in your standard library path, you’ll need to add a few things:
gcc -L/alternate/libpath -o Examples/mydemo Examples/demo.c -lwgdb

The alternate libpath is wherever libwgdb.a is on your machine. If you didn’t run make install at all, it is still in Main/.libs. Also, your system probably does not know where to look for the libraries, so if you just run Examples/mydemo, it will exit with "error while loading shared libraries". Running it as
LD_LIBRARY_PATH=/alternate/libpath Examples/mydemo
will take care of that. However, things will definitely be easier once you’ve done make install.
Another issue that you might run into is that the compiler does not know where the API header files are. The programs in the Examples directory work around that by referring to the headers in the Db directory directly, but you might prefer a more flexible way. In this case, try
gcc -L/alternate/libpath -I/path/to/headers -o myprog myprog.c -lwgdb
where /path/to/headers is where dbapi.h is located.
Finally, in the event that your system does not have the make program or some other part of the toolchain required for the standard installation, but you still have the C compiler, such as gcc, you may directly compile the examples or your own programs with the WhiteDB sources. Have a look at Examples/compile_demo.sh to see what source files should be compiled.

2.1. So you’re on Windows
So far there hasn’t been a peep about following the tutorial on a Windows computer, but don’t feel left out: the compilation is different enough that it deserves its own section.
You need the MSVC compiler (provided by Microsoft Visual Studio, Express Edition, for example). Set it up so you can run cl.exe from the command prompt. Visual Studio includes its own command prompt menu entry that has the environment set up correctly for you.
First, we recommend that you compile the wgdb.lib. If you followed the installation documentation, you probably have it in WhiteDB’s directory already. If not, run compile.bat. This produces everything you’ll need for the tutorial. Now try:
cl.exe /FeMYDEMO Examples\demo.c wgdb.lib

This produces mydemo.exe in the current directory. As long as the file wgdb.dll is also in the same directory, you can run mydemo.exe and see the output.
The above command works for the distributed examples, but it should be pointed out that the following way is more flexible, once you’ve started creating your own programs:
cl.exe /I"\path\to\whitedb\headers" yourprog.c \path\to\wgdb.lib
Replace the \path\to placeholders with the actual directories where the WhiteDB files are on your computer.