a

Let’s Build a Simple Database

How Does a Database Work?

  • What format is data saved in? (in memory and on disk)

数据以什么格式保存?(在内存和磁盘上)

  • When does it move from memory to disk?

它何时从内存移动到磁盘?

  • Why can there only be one primary key per table?

为什么每个表只能有一个主键?

  • How does rolling back a transaction work?

回滚事务如何工作?

  • How are indexes formatted?

索引的格式如何?

  • When and how does a full table scan happen?

何时以及如何进行全表扫描?

  • What format is a prepared statement saved in?

预处理语句以什么格式保存?

In short, how does a database work?

简而言之,数据库是如何工作的?

I’m building a clone of sqlite from scratch in C in order to understand, and I’m going to document my process as I go.

为了理解,我正在用 C 从头开始构建 sqlite 的克隆,并且我将随时记录我的流程。

Part 1 - Introduction and Setting up the REPL

第 1 部分 - 简介和设置 REPL

sqlite

sqlite文档

谷歌book

There’s lots of documentation of sqlite internals on their website, plus I’ve got a copy of SQLite Database System: Design and Implementation.

他们的网站上有很多关于sqlite内部的文档,另外我还有一份SQLite数据库系统:设计和实现。

A query goes through a chain of components in order to retrieve or modify data. The front-end consists of the:

查询通过组件链来检索或修改数据。前端包括:

  • tokenizer 分词器
  • parser 解析 器
  • code generator 代码生成器

The input to the front-end is a SQL query. the output is sqlite virtual machine bytecode (essentially a compiled program that can operate on the database).

前端的输入是 SQL 查询。输出是SQLite虚拟机字节码(本质上是一个可以在数据库上运行的编译程序)。

The back-end consists of the:
后端包括:

  • virtual machine 虚拟机
  • B-tree B树
  • pager 呼叫器
  • os interface 界面

The virtual machine takes bytecode generated by the front-end as instructions. It can then perform operations on one or more tables or indexes, each of which is stored in a data structure called a B-tree. The VM is essentially a big switch statement on the type of bytecode instruction.

虚拟机将前端生成的字节码作为指令。然后,它可以对一个或多个表或索引执行操作,每个表或索引都存储在称为 B 树的数据结构中。VM 本质上是关于字节码指令类型的大开关语句。

Each B-tree consists of many nodes. Each node is one page in length. The B-tree can retrieve a page from disk or save it back to disk by issuing commands to the pager.

每个 B 树由许多节点组成。每个节点的长度为一页。B树可以从磁盘检索页面,也可以通过向寻呼机发出命令将其保存回磁盘。

The pager receives commands to read or write pages of data. It is responsible for reading/writing at appropriate offsets in the database file. It also keeps a cache of recently-accessed pages in memory, and determines when those pages need to be written back to disk.

寻呼机接收用于读取或写入数据页的命令。它负责在数据库文件中以适当的偏移量读取/写入。它还在内存中保留最近访问的页面的缓存,并确定何时需要将这些页面写回磁盘。

The os interface is the layer that differs depending on which operating system sqlite was compiled for. In this tutorial, I’m not going to support multiple platforms.

操作系统接口是因编译 sqlite 所针对的操作系统而异的层。在本教程中,我不打算支持多个平台。

A journey of a thousand miles begins with a single step, so let’s start with something a little more straightforward: the REPL.

千里之行始于足下,所以让我们从更直接的东西开始:REPL。

Making a Simple REPL

制作一个简单的 REPL

Sqlite starts a read-execute-print loop when you start it from the command line:

当您从命令行启动 Sqlite 时,它会启动读取-执行-打印循环:

~ sqlite3
SQLite version 3.16.0 2016-11-04 19:09:39
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table users (id int, username varchar(255), email varchar(255));
sqlite> .tables
users
sqlite> .exit
~

To do that, our main function will have an infinite loop that prints the prompt, gets a line of input, then processes that line of input:

为此,我们的 main 函数将有一个无限循环,用于打印提示,获取一行输入,然后处理该行输入:

int main(int argc, char* argv[]) {
  InputBuffer* input_buffer = new_input_buffer();
  while (true) {
    print_prompt();
    read_input(input_buffer);

    if (strcmp(input_buffer->buffer, ".exit") == 0) {
      close_input_buffer(input_buffer);
      exit(EXIT_SUCCESS);
    } else {
      printf("Unrecognized command '%s'.\n", input_buffer->buffer);
    }
  }
}

We’ll define InputBuffer as a small wrapper around the state we need to store to interact with getline(). (More on that in a minute)

我们将 InputBuffer 定义为我们需要存储以与 getline() 交互的状态的小包装器。(稍后会详细介绍)

typedef struct {
  char* buffer;
  size_t buffer_length;
  ssize_t input_length;
} InputBuffer;

InputBuffer* new_input_buffer() {
  InputBuffer* input_buffer = (InputBuffer*)malloc(sizeof(InputBuffer));
  input_buffer->buffer = NULL;
  input_buffer->buffer_length = 0;
  input_buffer->input_length = 0;

  return input_buffer;
}

ext, print_prompt() prints a prompt to the user. We do this before reading each line of input.

接下来, print_prompt() 向用户打印提示。我们在读取每行输入之前执行此操作。

void print_prompt() { printf("db > "); }

To read a line of input, use getline():

要读取输入行,请使用 getline():

ssize_t getline(char **lineptr, size_t *n, FILE *stream);

lineptr : a pointer to the variable we use to point to the buffer containing the read line. If it set to NULL it is mallocatted by getline and should thus be freed by the user, even if the command fails.

lineptr :指向我们用来指向包含读取行的缓冲区的变量的指针。如果它设置为 NULL ,则它被 getline 错误定位,因此即使命令失败,用户也应释放它。

n : a pointer to the variable we use to save the size of allocated buffer.

n :指向我们用来保存分配缓冲区大小的变量的指针。

stream : the input stream to read from. We’ll be reading from standard input.

stream :要从中读取的输入流。我们将从标准输入中读取。

return value : the number of bytes read, which may be less than the size of the buffer.

return value :读取的字节数,可能小于缓冲区的大小。

We tell getline to store the read line in input_buffer->buffer and the size of the allocated buffer in input_buffer->buffer_length. We store the return value in input_buffer->input_length.

我们告诉 getline 将读取行存储在 input_buffer->buffer 中,将分配的缓冲区的大小存储在 input_buffer->buffer_length 中。我们将返回值存储在 input_buffer->input_length 中。

buffer starts as null, so getline allocates enough memory to hold the line of input and makes buffer point to it.

buffer 以 null 开头,因此 getline 分配足够的内存来保存输入行并使其指向 buffer 。

void read_input(InputBuffer* input_buffer) {
  ssize_t bytes_read =
      getline(&(input_buffer->buffer), &(input_buffer->buffer_length), stdin);

  if (bytes_read <= 0) {
    printf("Error reading input\n");
    exit(EXIT_FAILURE);
  }

  // Ignore trailing newline
  input_buffer->input_length = bytes_read - 1;
  input_buffer->buffer[bytes_read - 1] = 0;
}

Now it is proper to define a function that frees the memory allocated for an instance of InputBuffer * and the buffer element of the respective structure (getline allocates memory for input_buffer->buffer in read_input).

现在定义一个函数来释放分配给 InputBuffer * 实例和相应结构的 buffer 元素的内存是合适的( getline 为 input_buffer->buffer in read_input 分配内存)。

void close_input_buffer(InputBuffer* input_buffer) {
    free(input_buffer->buffer);
    free(input_buffer);
}

Finally, we parse and execute the command. There is only one recognized command right now : .exit, which terminates the program. Otherwise we print an error message and continue the loop.

最后,我们解析并执行命令。现在只有一个可识别的命令: .exit ,它终止程序。否则,我们将打印错误消息并继续循环。

if (strcmp(input_buffer->buffer, ".exit") == 0) {
  close_input_buffer(input_buffer);
  exit(EXIT_SUCCESS);
} else {
  printf("Unrecognized command '%s'.\n", input_buffer->buffer);
}

本章代码(实现读取命令行一行,识别.exit并退出)

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char *buffer; // 读取的字符(缓冲区)
    size_t buffer_length;
    ssize_t input_length;
} InputBuffer;

// 创建InputBuffer
InputBuffer *new_input_buffer()
{
    // 分配内存给input
    InputBuffer *input_buffer = (InputBuffer *)malloc(sizeof(InputBuffer));
    input_buffer->buffer = NULL;
    input_buffer->buffer_length = 0;
    input_buffer->input_length = 0;

    return input_buffer;
}

// 打印提示
void print_prompt() { printf("db > "); }

// 读取一行(linux才有(猜的))
// ssize_t getline(char **lineptr, size_t *n, FILE *stream);
// gcc编译报错没有getline(windows)
ssize_t mygetline(char **line, size_t *n, FILE *fp)
{
    char *buf = *line;
    ssize_t c, i = 0; // i来记录字符串长度,c来存储字符
    if (buf == NULL || *n == 0)
    {
        *line = malloc(10);
        buf = *line;
        *n = 10;
    }
    // buf为或n为0时动态为期分配空间
    while ((c = fgetc(fp)) != '\n')
    {
        if (c == EOF)
            return -1;
        if (i < *n - 2) // 留2个空间给\n和\0
        {
            *(buf + i++) = c;
        }
        else
        {
            *n = *n + 10;
            buf = realloc(buf, *n); // 空间不足时,重新进行分配
            *(buf + i++) = c;
        }
    }
    *(buf + i++) = '\n';
    *(buf + i) = '\0';
    return i;
}

// 读取输入
void read_input(InputBuffer *input_buffer)
{
    ssize_t bytes_read =
        mygetline(&(input_buffer->buffer), &(input_buffer->buffer_length), stdin);

    if (bytes_read <= 0)
    {
        printf("Error reading input\n");
        exit(EXIT_FAILURE);
    }

    // Ignore trailing newline
    input_buffer->input_length = bytes_read - 1;
    input_buffer->buffer[bytes_read - 1] = 0;
}

// 释放inputBuffer占用的内存
void close_input_buffer(InputBuffer *input_buffer)
{
    free(input_buffer->buffer);
    free(input_buffer);
}

int main(int argc, char *argv[])
{
    InputBuffer *input_buffer = new_input_buffer();
    while (true)
    {
        print_prompt();
        // 借由InputBuffer接收输入
        read_input(input_buffer);

        // 解析命令 .exit
        // 如果输入的是.exit,就退出
        if (strcmp(input_buffer->buffer, ".exit") == 0)
        {
            // 释放内存
            close_input_buffer(input_buffer);
            exit(EXIT_SUCCESS);
        }
        else
        {
            // 未知命令
            printf("Unrecognized command '%s'.\n", input_buffer->buffer);
        }
    }
}

Part 2 - World’s Simplest SQL Compiler and Virtual Machine

第 2 部分 - 世界上最简单的 SQL 编译器和虚拟机

The “front-end” of sqlite is a SQL compiler that parses a string and outputs an internal representation called bytecode.

我们正在制作 sqlite 的克隆。sqlite 的“前端”是一个 SQL 编译器,它解析字符串并输出称为字节码的内部表示。

This bytecode is passed to the virtual machine, which executes it.

此字节码将传递到虚拟机,由虚拟机执行它。

Breaking things into two steps like this has a couple advantages:

像这样将事情分成两个步骤有几个优点:

  • Reduces the complexity of each part (e.g. virtual machine does not worry about syntax errors)

降低每个部分的复杂性(例如,虚拟机不必担心语法错误)

  • Allows compiling common queries once and caching the bytecode for improved performance

允许编译一次常见查询并缓存字节码以提高性能

With this in mind, let’s refactor our main function and support two new keywords in the process:

考虑到这一点,让我们重构我们的 main 函数,并在此过程中支持两个新关键字:

int main(int argc, char *argv[])
{
    InputBuffer *input_buffer = new_input_buffer();
    while (true)
    {
        print_prompt();
        // 借由InputBuffer接收输入
        read_input(input_buffer);

        if (input_buffer->buffer[0] == '.')
        {
            switch (do_meta_command(input_buffer))
            {
            case (META_COMMAND_SUCCESS):
                continue;
            case (META_COMMAND_UNRECOGNIZED_COMMAND):
                printf("Unrecognized command '%s'\n", input_buffer->buffer);
                continue;
            }
        }
        Statement statement;
        switch (prepare_statement(input_buffer, &statement))
        {
        case (PREPARE_SUCCESS):
            break;
        case (PREPARE_UNRECOGNIZED_STATEMENT):
            printf("Unrecognized keyword at start of '%s'.\n",
                   input_buffer->buffer);
            continue;
        }
        execute_statement(&statement);
        printf("Executed.\n");
    }
}

Non-SQL statements like .exit are called “meta-commands”. They all start with a dot, so we check for them and handle them in a separate function.

像 .exit 这样的非 SQL 语句称为“元命令”。它们都以点开头,因此我们检查它们并在单独的函数中处理它们。

Next, we add a step that converts the line of input into our internal representation of a statement. This is our hacky version of the sqlite front-end.

接下来,我们添加一个步骤,将输入行转换为语句的内部表示形式。这是我们的 sqlite 前端的黑客版本。

Lastly, we pass the prepared statement to execute_statement. This function will eventually become our virtual machine.

最后,我们将准备好的语句传递给 execute_statement 。这个功能最终将成为我们的虚拟机。

Notice that two of our new functions return enums indicating success or failure:

请注意,我们的两个新函数返回指示成功或失败的枚举:

typedef enum {
  META_COMMAND_SUCCESS,
  META_COMMAND_UNRECOGNIZED_COMMAND
} MetaCommandResult;

typedef enum { PREPARE_SUCCESS, PREPARE_UNRECOGNIZED_STATEMENT } PrepareResult;

“Unrecognized statement”? That seems a bit like an exception. I prefer not to use exceptions (and C doesn’t even support them), so I’m using enum result codes wherever practical. The C compiler will complain if my switch statement doesn’t handle a member of the enum, so we can feel a little more confident we handle every result of a function. Expect more result codes to be added in the future.

“无法识别的声明”?这似乎有点像一个例外。我不喜欢使用异常(C 甚至不支持它们),所以我在可行的情况下使用枚举结果代码。如果我的switch语句没有处理枚举的成员,C编译器会抱怨,所以我们可以更有信心地处理函数的每个结果。预计将来会添加更多结果代码。

do_meta_command is just a wrapper for existing functionality that leaves room for more commands:

do_meta_command 只是现有功能的包装器,为更多命令留出了空间:

MetaCommandResult do_meta_command(InputBuffer* input_buffer) {
  if (strcmp(input_buffer->buffer, ".exit") == 0) {
    exit(EXIT_SUCCESS);
  } else {
    return META_COMMAND_UNRECOGNIZED_COMMAND;
  }
}

Our “prepared statement” right now just contains an enum with two possible values. It will contain more data as we allow parameters in statements:

我们现在的“准备好的语句”只包含一个具有两个可能值的枚举。它将包含更多数据,因为我们允许在语句中使用参数:

typedef enum { STATEMENT_INSERT, STATEMENT_SELECT } StatementType;

typedef struct {
  StatementType type;
} Statement;

prepare_statement (our “SQL Compiler”) does not understand SQL right now. In fact, it only understands two words:

prepare_statement (我们的“SQL 编译器”)现在不理解 SQL。其实它只懂两个字:

PrepareResult prepare_statement(InputBuffer* input_buffer,
                                Statement* statement) {
  if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
    statement->type = STATEMENT_INSERT;
    return PREPARE_SUCCESS;
  }
  if (strcmp(input_buffer->buffer, "select") == 0) {
    statement->type = STATEMENT_SELECT;
    return PREPARE_SUCCESS;
  }

  return PREPARE_UNRECOGNIZED_STATEMENT;
}

Note that we use strncmp for “insert” since the “insert” keyword will be followed by data. (e.g. insert 1 cstack foo@bar.com)

请注意,我们使用 strncmp 表示“插入”,因为 “insert” 关键字后跟数据。(例如 insert 1 cstack foo@bar.com

Lastly, execute_statement contains a few stubs:

最后, execute_statement 包含几个存根:

void execute_statement(Statement* statement) {
  switch (statement->type) {
    case (STATEMENT_INSERT):
      printf("This is where we would do an insert.\n");
      break;
    case (STATEMENT_SELECT):
      printf("This is where we would do a select.\n");
      break;
  }
}

Note that it doesn’t return any error codes because there’s nothing that could go wrong yet.

请注意,它不会返回任何错误代码,因为目前还没有可能出错的问题。

With these refactors, we now recognize two new keywords!

通过这些重构,我们现在认识到两个新关键字!

The skeleton of our database is taking shape… wouldn’t it be nice if it stored data? In the next part, we’ll implement insert and select, creating the world’s worst data store. In the mean time, here’s the entire diff from this part:

我们数据库的骨架正在形成…如果它存储数据不是很好吗?在下一部分中,我们将实现 insert 和 select ,创建世界上最差的数据存储。同时,这是与这部分的全部区别:

@@ -10,6 +10,23 @@ struct InputBuffer_t {
 } InputBuffer;
 
+typedef enum {
+  META_COMMAND_SUCCESS,
+  META_COMMAND_UNRECOGNIZED_COMMAND
+} MetaCommandResult;
+
+typedef enum { PREPARE_SUCCESS, PREPARE_UNRECOGNIZED_STATEMENT } PrepareResult;
+
+typedef enum { STATEMENT_INSERT, STATEMENT_SELECT } StatementType;
+
+typedef struct {
+  StatementType type;
+} Statement;
+
 InputBuffer* new_input_buffer() {
   InputBuffer* input_buffer = malloc(sizeof(InputBuffer));
   input_buffer->buffer = NULL;
@@ -40,17 +57,67 @@ void close_input_buffer(InputBuffer* input_buffer) {
     free(input_buffer);
 }
 
+MetaCommandResult do_meta_command(InputBuffer* input_buffer) {
+  if (strcmp(input_buffer->buffer, ".exit") == 0) {
+    close_input_buffer(input_buffer);
+    exit(EXIT_SUCCESS);
+  } else {
+    return META_COMMAND_UNRECOGNIZED_COMMAND;
+  }
+}
+
+PrepareResult prepare_statement(InputBuffer* input_buffer,
+                                Statement* statement) {
+  if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
+    statement->type = STATEMENT_INSERT;
+    return PREPARE_SUCCESS;
+  }
+  if (strcmp(input_buffer->buffer, "select") == 0) {
+    statement->type = STATEMENT_SELECT;
+    return PREPARE_SUCCESS;
+  }
+
+  return PREPARE_UNRECOGNIZED_STATEMENT;
+}
+
+void execute_statement(Statement* statement) {
+  switch (statement->type) {
+    case (STATEMENT_INSERT):
+      printf("This is where we would do an insert.\n");
+      break;
+    case (STATEMENT_SELECT):
+      printf("This is where we would do a select.\n");
+      break;
+  }
+}
+
 int main(int argc, char* argv[]) {
   InputBuffer* input_buffer = new_input_buffer();
   while (true) {
     print_prompt();
     read_input(input_buffer);
 
-    if (strcmp(input_buffer->buffer, ".exit") == 0) {
-      close_input_buffer(input_buffer);
-      exit(EXIT_SUCCESS);
-    } else {
-      printf("Unrecognized command '%s'.\n", input_buffer->buffer);
+    if (input_buffer->buffer[0] == '.') {
+      switch (do_meta_command(input_buffer)) {
+        case (META_COMMAND_SUCCESS):
+          continue;
+        case (META_COMMAND_UNRECOGNIZED_COMMAND):
+          printf("Unrecognized command '%s'\n", input_buffer->buffer);
+          continue;
+      }
     }
+
+    Statement statement;
+    switch (prepare_statement(input_buffer, &statement)) {
+      case (PREPARE_SUCCESS):
+        break;
+      case (PREPARE_UNRECOGNIZED_STATEMENT):
+        printf("Unrecognized keyword at start of '%s'.\n",
+               input_buffer->buffer);
+        continue;
+    }
+
+    execute_statement(&statement);
+    printf("Executed.\n");
   }
 }
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char *buffer; // 读取的字符(缓冲区)
    size_t buffer_length;
    ssize_t input_length;
} InputBuffer;

// 创建InputBuffer
InputBuffer *new_input_buffer()
{
    // 分配内存给input
    InputBuffer *input_buffer = (InputBuffer *)malloc(sizeof(InputBuffer));
    input_buffer->buffer = NULL;
    input_buffer->buffer_length = 0;
    input_buffer->input_length = 0;

    return input_buffer;
}

// 打印提示
void print_prompt() { printf("db > "); }

// 读取一行(linux才有(猜的))
// ssize_t getline(char **lineptr, size_t *n, FILE *stream);
// gcc编译报错没有getline(windows)
ssize_t mygetline(char **line, size_t *n, FILE *fp)
{
    char *buf = *line;
    ssize_t c, i = 0; // i来记录字符串长度,c来存储字符
    if (buf == NULL || *n == 0)
    {
        *line = malloc(10);
        buf = *line;
        *n = 10;
    }
    // buf为或n为0时动态为期分配空间
    while ((c = fgetc(fp)) != '\n')
    {
        if (c == EOF)
            return -1;
        if (i < *n - 2) // 留2个空间给\n和\0
        {
            *(buf + i++) = c;
        }
        else
        {
            *n = *n + 10;
            buf = realloc(buf, *n); // 空间不足时,重新进行分配
            *(buf + i++) = c;
        }
    }
    *(buf + i++) = '\n';
    *(buf + i) = '\0';
    return i;
}

// 读取输入
void read_input(InputBuffer *input_buffer)
{
    ssize_t bytes_read =
        mygetline(&(input_buffer->buffer), &(input_buffer->buffer_length), stdin);

    if (bytes_read <= 0)
    {
        printf("Error reading input\n");
        exit(EXIT_FAILURE);
    }

    // Ignore trailing newline
    input_buffer->input_length = bytes_read - 1;
    input_buffer->buffer[bytes_read - 1] = 0;
}

// 释放inputBuffer占用的内存
void close_input_buffer(InputBuffer *input_buffer)
{
    free(input_buffer->buffer);
    free(input_buffer);
}

typedef enum
{
    META_COMMAND_SUCCESS,
    META_COMMAND_UNRECOGNIZED_COMMAND
} MetaCommandResult;

typedef enum
{
    PREPARE_SUCCESS,
    PREPARE_UNRECOGNIZED_STATEMENT
} PrepareResult;

// 解析 . 开头的参数
MetaCommandResult do_meta_command(InputBuffer *input_buffer)
{
    if (strcmp(input_buffer->buffer, ".exit") == 0)
    {
        // 退出系统
        exit(EXIT_SUCCESS);
    }
    else
    {
        return META_COMMAND_UNRECOGNIZED_COMMAND;
    }
}

// 语句关键字类型(insert/select)
typedef enum
{
    STATEMENT_INSERT,
    STATEMENT_SELECT
} StatementType;

// 语句类型
typedef struct
{
    StatementType type;
} Statement;

// 解析关键字开头的语句(select xxx)
PrepareResult prepare_statement(InputBuffer *input_buffer,
                                Statement *statement)
{
    // C 库函数 int strncmp(const char *str1, const char *str2, size_t n)
    // 把 str1 和 str2 进行比较,最多比较前 n 个字节
    if (strncmp(input_buffer->buffer, "insert", 6) == 0)
    {
        statement->type = STATEMENT_INSERT;
        return PREPARE_SUCCESS;
    }
    if (strcmp(input_buffer->buffer, "select") == 0)
    {
        statement->type = STATEMENT_SELECT;
        return PREPARE_SUCCESS;
    }

    return PREPARE_UNRECOGNIZED_STATEMENT;
}

// 执行语句
void execute_statement(Statement *statement)
{
    switch (statement->type)
    {
    case (STATEMENT_INSERT):
        printf("This is where we would do an insert.\n");
        break;
    case (STATEMENT_SELECT):
        printf("This is where we would do a select.\n");
        break;
    }
}

int main(int argc, char *argv[])
{
    InputBuffer *input_buffer = new_input_buffer();
    while (true)
    {
        print_prompt();
        // 借由InputBuffer接收输入
        read_input(input_buffer);

        if (input_buffer->buffer[0] == '.')
        {
            switch (do_meta_command(input_buffer))
            {
            case (META_COMMAND_SUCCESS):
                continue;
            case (META_COMMAND_UNRECOGNIZED_COMMAND):
                printf("Unrecognized command '%s'\n", input_buffer->buffer);
                continue;
            }
        }
        Statement statement;
        // 解析并判断是否是可解析语句,不是就打印错误
        switch (prepare_statement(input_buffer, &statement))
        {
        case (PREPARE_SUCCESS):
            break;
        case (PREPARE_UNRECOGNIZED_STATEMENT):
            printf("Unrecognized keyword at start of '%s'.\n",
                   input_buffer->buffer);
            continue;
        }
        execute_statement(&statement);
        printf("Executed.\n");
    }
}

Part 3 - An In-Memory, Append-Only, Single-Table Database

第 3 部分 - 内存中、仅追加、单表数据库

We’re going to start small by putting a lot of limitations on our database. For now, it will:

我们将从小处着手,对数据库施加很多限制。目前,它将:

  • support two operations: inserting a row and printing all rows

支持两种操作:插入一行和打印所有行

  • reside only in memory (no persistence to disk)

仅驻留在内存中(无磁盘持久性)

  • support a single, hard-coded table

支持单个硬编码表

Our hard-coded table is going to store users and look like this:

我们的硬编码表将存储用户,如下所示:

This is a simple schema, but it gets us to support multiple data types and multiple sizes of text data types.

这是一个简单的架构,但它使我们能够支持多种数据类型和多种大小的文本数据类型。

insert statements are now going to look like this:

insert 语句现在将如下所示:

insert 1 cstack foo@bar.com

That means we need to upgrade our prepare_statement function to parse arguments

这意味着我们需要升级我们的 prepare_statement 函数来解析参数

PrepareResult prepare_statement(InputBuffer *input_buffer,
                                Statement *statement)
{
    // C 库函数 int strncmp(const char *str1, const char *str2, size_t n)
    // 把 str1 和 str2 进行比较,最多比较前 n 个字节
    if (strncmp(input_buffer->buffer, "insert", 6) == 0)
    {
        statement->type = STATEMENT_INSERT;
        // 读取输入的insert语句参数,这里硬编码(固定输入的是insert id username email)
        int args_assigned = sscanf(
            input_buffer->buffer, "insert %d %s %s",
            &(statement->row_to_insert.id),
            statement->row_to_insert.username,
            statement->row_to_insert.email);
        // 参数不够,记为错
        if (args_assigned < 3)
        {
            return PREPARE_SYNTAX_ERROR;
        }
        return PREPARE_SUCCESS;
    }
    
    // ...
}

We store those parsed arguments into a new Row data structure inside the statement object:

我们将这些解析的参数存储到语句对象内的新 Row 数据结构中:

#include <stdint.h>
#define COLUMN_USERNAME_SIZE 32
#define COLUMN_EMAIL_SIZE 255

typedef struct
{
    uint32_t id;
    char username[COLUMN_USERNAME_SIZE];
    char email[COLUMN_EMAIL_SIZE];

} Row;

typedef struct
{
    StatementType type;
    Row row_to_insert; // only used by insert statement
} Statement;

Now we need to copy that data into some data structure representing the table. SQLite uses a B-tree for fast lookups, inserts and deletes. We’ll start with something simpler. Like a B-tree, it will group rows into pages, but instead of arranging those pages as a tree it will arrange them as an array.

现在我们需要将该数据复制到表示表的某个数据结构中。SQLite使用B树进行快速查找,插入和删除。我们将从更简单的东西开始。像B树一样,它会将行分组到页面中,但不是将这些页面排列为树,而是将它们排列为一个数组。

Here’s my plan: 这是我的计划:

  • Store rows in blocks of memory called pages

将行存储在称为页的内存块中

  • Each page stores as many rows as it can fit

每个页面存储尽可能多的行

  • Rows are serialized into a compact representation with each page

行被序列化为每页的紧凑表示形式

  • Pages are only allocated as needed

仅根据需要分配页面

  • Keep a fixed-size array of pointers to pages

保留指向页面的固定大小的指针数组

First we’ll define the compact representation of a row:

首先,我们将定义一行的紧凑表示:

#define size_of_attribute(Struct, Attribute) sizeof(((Struct *)0)->Attribute)
const uint32_t ID_SIZE = size_of_attribute(Row, id);
const uint32_t USERNAME_SIZE = size_of_attribute(Row, username);
const uint32_t EMAIL_SIZE = size_of_attribute(Row, email);
const uint32_t ID_OFFSET = 0;
const uint32_t USERNAME_OFFSET = ID_OFFSET + ID_SIZE;
const uint32_t EMAIL_OFFSET = USERNAME_OFFSET + USERNAME_SIZE;
const uint32_t ROW_SIZE = ID_SIZE + USERNAME_SIZE + EMAIL_SIZE;

This means the layout of a serialized row will look like this:

这意味着序列化行的布局将如下所示:

We also need code to convert to and from the compact representation.

我们还需要代码来与紧凑表示进行转换。

void serialize_row(Row *source, void *destination)
{
    memcpy(destination + ID_OFFSET, &(source->id), ID_SIZE);
    memcpy(destination + USERNAME_OFFSET, &(source->username), USERNAME_SIZE);
    memcpy(destination + EMAIL_OFFSET, &(source->email), EMAIL_SIZE);
}

void deserialize_row(void *source, Row *destination)
{
    memcpy(&(destination->id), source + ID_OFFSET, ID_SIZE);
    memcpy(&(destination->username), source + USERNAME_OFFSET, USERNAME_SIZE);
    memcpy(&(destination->email), source + EMAIL_OFFSET, EMAIL_SIZE);
}

Next, a Table structure that points to pages of rows and keeps track of how many rows there are:

接下来,一个指向行页并跟踪行数的 Table 结构:

const uint32_t PAGE_SIZE = 4096;
#define TABLE_MAX_PAGES 100
const uint32_t ROWS_PER_PAGE = PAGE_SIZE / ROW_SIZE;
const uint32_t TABLE_MAX_ROWS = ROWS_PER_PAGE * TABLE_MAX_PAGES;

typedef struct
{
    uint32_t num_rows;
    void *pages[TABLE_MAX_PAGES];

} Table;

I’m making our page size 4 kilobytes because it’s the same size as a page used in the virtual memory systems of most computer architectures. This means one page in our database corresponds to one page used by the operating system. The operating system will move pages in and out of memory as whole units instead of breaking them up.

我将页面大小设为 4 KB,因为它与大多数计算机体系结构的虚拟内存系统中使用的页面大小相同。这意味着我们数据库中的一页对应于操作系统使用的一个页面。操作系统会将页面作为整个单元移入和移出内存,而不是分解它们。

I’m setting an arbitrary limit of 100 pages that we will allocate. When we switch to a tree structure, our database’s maximum size will only be limited by the maximum size of a file. (Although we’ll still limit how many pages we keep in memory at once)

我设置了一个任意限制,即我们将分配的 100 页。当我们切换到树结构时,数据库的最大大小将仅受文件最大大小的限制。(尽管我们仍然会限制一次在内存中保留的页面数)

Rows should not cross page boundaries. Since pages probably won’t exist next to each other in memory, this assumption makes it easier to read/write rows.

行不应跨越页面边界。由于页面在内存中可能不会彼此相邻存在,因此此假设使读取/写入行变得更加容易。

Speaking of which, here is how we figure out where to read/write in memory for a particular row:

说到这里,以下是我们如何确定特定行在内存中读取/写入的位置:

void *row_slot(Table *table, uint32_t row_num)
{
    uint32_t page_num = row_num / ROWS_PER_PAGE;
    void *page = table->pages[page_num];
    if (page == NULL)
    {
        // Allocate memory only when we try to access page
        page = table->pages[page_num] = malloc(PAGE_SIZE);
    }
    uint32_t row_offset = row_num % ROWS_PER_PAGE;
    uint32_t byte_offset = row_offset * ROW_SIZE;
    return page + byte_offset;
}

Now we can make execute_statement read/write from our table structure:

现在我们可以从表结构中进行 execute_statement 读/写:

ExecuteResult execute_insert(Statement *statement, Table *table)
{
    if (table->num_rows >= TABLE_MAX_ROWS)
    {
        return EXECUTE_TABLE_FULL;
    }
    Row *row_to_insert = &(statement->row_to_insert);
    serialize_row(row_to_insert, row_slot(table, table->num_rows));
    table->num_rows += 1;
    return EXECUTE_SUCCESS;
}

ExecuteResult execute_select(Statement *statement, Table *table)
{
    Row row;
    for (uint32_t i = 0; i < table->num_rows; i++)
    {
        deserialize_row(row_slot(table, i), &row);
        print_row(&row);
    }
    return EXECUTE_SUCCESS;
}

ExecuteResult execute_statement(Statement *statement, Table *table)
{
    switch (statement->type)
    {
    case (STATEMENT_INSERT):
        break;
        return execute_insert(statement, table);
    case (STATEMENT_SELECT):
        return execute_select(statement, table);
    }
}

Lastly, we need to initialize the table, create the respective memory release function and handle a few more error cases:

最后,我们需要初始化表,创建相应的内存释放函数并处理更多错误情况:

Table *new_table()
{
    Table *table = (Table *)malloc(sizeof(Table));
    table->num_rows = 0;
    for (uint32_t i = 0; i < TABLE_MAX_PAGES; i++)
    {
        table->pages[i] = NULL;
    }
    return table;
}

// 释放数据表空间
void free_table(Table *table)
{
    for (int i = 0; table->pages[i]; i++)
    {
        free(table->pages[i]);
    }
    free(table);
}
int main(int argc, char* argv[]) {
+  Table* table = new_table();
   InputBuffer* input_buffer = new_input_buffer();
   while (true) {
     print_prompt(); 
     switch (prepare_statement(input_buffer, &statement)) {
       case (PREPARE_SUCCESS):
         break;
+      case (PREPARE_SYNTAX_ERROR):
+        printf("Syntax error. Could not parse statement.\n");
+        continue;
       case (PREPARE_UNRECOGNIZED_STATEMENT):
         printf("Unrecognized keyword at start of '%s'.\n",
                input_buffer->buffer);
         continue;
     }

-    execute_statement(&statement);
-    printf("Executed.\n");
+    switch (execute_statement(&statement, table)) {
+      case (EXECUTE_SUCCESS):
+        printf("Executed.\n");
+        break;
+      case (EXECUTE_TABLE_FULL):
+        printf("Error: Table full.\n");
+        break;
+    }
   }
 }

With those changes we can actually save data in our database!

通过这些更改,我们实际上可以将数据保存在数据库中!

@@ -2,6 +2,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <stdint.h>

 typedef struct {
   char* buffer;
@@ -10,6 +11,105 @@ typedef struct {
 } InputBuffer;

+typedef enum { EXECUTE_SUCCESS, EXECUTE_TABLE_FULL } ExecuteResult;
+
+typedef enum {
+  META_COMMAND_SUCCESS,
+  META_COMMAND_UNRECOGNIZED_COMMAND
+} MetaCommandResult;
+
+typedef enum {
+  PREPARE_SUCCESS,
+  PREPARE_SYNTAX_ERROR,
+  PREPARE_UNRECOGNIZED_STATEMENT
+ } PrepareResult;
+
+typedef enum { STATEMENT_INSERT, STATEMENT_SELECT } StatementType;
+
+#define COLUMN_USERNAME_SIZE 32
+#define COLUMN_EMAIL_SIZE 255
+typedef struct {
+  uint32_t id;
+  char username[COLUMN_USERNAME_SIZE];
+  char email[COLUMN_EMAIL_SIZE];
+} Row;
+
+typedef struct {
+  StatementType type;
+  Row row_to_insert; //only used by insert statement
+} Statement;
+
+#define size_of_attribute(Struct, Attribute) sizeof(((Struct*)0)->Attribute)
+
+const uint32_t ID_SIZE = size_of_attribute(Row, id);
+const uint32_t USERNAME_SIZE = size_of_attribute(Row, username);
+const uint32_t EMAIL_SIZE = size_of_attribute(Row, email);
+const uint32_t ID_OFFSET = 0;
+const uint32_t USERNAME_OFFSET = ID_OFFSET + ID_SIZE;
+const uint32_t EMAIL_OFFSET = USERNAME_OFFSET + USERNAME_SIZE;
+const uint32_t ROW_SIZE = ID_SIZE + USERNAME_SIZE + EMAIL_SIZE;
+
+const uint32_t PAGE_SIZE = 4096;
+#define TABLE_MAX_PAGES 100
+const uint32_t ROWS_PER_PAGE = PAGE_SIZE / ROW_SIZE;
+const uint32_t TABLE_MAX_ROWS = ROWS_PER_PAGE * TABLE_MAX_PAGES;
+
+typedef struct {
+  uint32_t num_rows;
+  void* pages[TABLE_MAX_PAGES];
+} Table;
+
+void print_row(Row* row) {
+  printf("(%d, %s, %s)\n", row->id, row->username, row->email);
+}
+
+void serialize_row(Row* source, void* destination) {
+  memcpy(destination + ID_OFFSET, &(source->id), ID_SIZE);
+  memcpy(destination + USERNAME_OFFSET, &(source->username), USERNAME_SIZE);
+  memcpy(destination + EMAIL_OFFSET, &(source->email), EMAIL_SIZE);
+}
+
+void deserialize_row(void *source, Row* destination) {
+  memcpy(&(destination->id), source + ID_OFFSET, ID_SIZE);
+  memcpy(&(destination->username), source + USERNAME_OFFSET, USERNAME_SIZE);
+  memcpy(&(destination->email), source + EMAIL_OFFSET, EMAIL_SIZE);
+}
+
+void* row_slot(Table* table, uint32_t row_num) {
+  uint32_t page_num = row_num / ROWS_PER_PAGE;
+  void *page = table->pages[page_num];
+  if (page == NULL) {
+     // Allocate memory only when we try to access page
+     page = table->pages[page_num] = malloc(PAGE_SIZE);
+  }
+  uint32_t row_offset = row_num % ROWS_PER_PAGE;
+  uint32_t byte_offset = row_offset * ROW_SIZE;
+  return page + byte_offset;
+}
+
+Table* new_table() {
+  Table* table = (Table*)malloc(sizeof(Table));
+  table->num_rows = 0;
+  for (uint32_t i = 0; i < TABLE_MAX_PAGES; i++) {
+     table->pages[i] = NULL;
+  }
+  return table;
+}
+
+void free_table(Table* table) {
+  for (int i = 0; table->pages[i]; i++) {
+     free(table->pages[i]);
+  }
+  free(table);
+}
+
 InputBuffer* new_input_buffer() {
   InputBuffer* input_buffer = (InputBuffer*)malloc(sizeof(InputBuffer));
   input_buffer->buffer = NULL;
@@ -40,17 +140,105 @@ void close_input_buffer(InputBuffer* input_buffer) {
     free(input_buffer);
 }

+MetaCommandResult do_meta_command(InputBuffer* input_buffer, Table *table) {
+  if (strcmp(input_buffer->buffer, ".exit") == 0) {
+    close_input_buffer(input_buffer);
+    free_table(table);
+    exit(EXIT_SUCCESS);
+  } else {
+    return META_COMMAND_UNRECOGNIZED_COMMAND;
+  }
+}
+
+PrepareResult prepare_statement(InputBuffer* input_buffer,
+                                Statement* statement) {
+  if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
+    statement->type = STATEMENT_INSERT;
+    int args_assigned = sscanf(
+    input_buffer->buffer, "insert %d %s %s", &(statement->row_to_insert.id),
+    statement->row_to_insert.username, statement->row_to_insert.email
+    );
+    if (args_assigned < 3) {
+    return PREPARE_SYNTAX_ERROR;
+    }
+    return PREPARE_SUCCESS;
+  }
+  if (strcmp(input_buffer->buffer, "select") == 0) {
+    statement->type = STATEMENT_SELECT;
+    return PREPARE_SUCCESS;
+  }
+
+  return PREPARE_UNRECOGNIZED_STATEMENT;
+}
+
+ExecuteResult execute_insert(Statement* statement, Table* table) {
+  if (table->num_rows >= TABLE_MAX_ROWS) {
+     return EXECUTE_TABLE_FULL;
+  }
+
+  Row* row_to_insert = &(statement->row_to_insert);
+
+  serialize_row(row_to_insert, row_slot(table, table->num_rows));
+  table->num_rows += 1;
+
+  return EXECUTE_SUCCESS;
+}
+
+ExecuteResult execute_select(Statement* statement, Table* table) {
+  Row row;
+  for (uint32_t i = 0; i < table->num_rows; i++) {
+     deserialize_row(row_slot(table, i), &row);
+     print_row(&row);
+  }
+  return EXECUTE_SUCCESS;
+}
+
+ExecuteResult execute_statement(Statement* statement, Table *table) {
+  switch (statement->type) {
+    case (STATEMENT_INSERT):
+           return execute_insert(statement, table);
+    case (STATEMENT_SELECT):
+    return execute_select(statement, table);
+  }
+}
+
 int main(int argc, char* argv[]) {
+  Table* table = new_table();
   InputBuffer* input_buffer = new_input_buffer();
   while (true) {
     print_prompt();
     read_input(input_buffer);

-    if (strcmp(input_buffer->buffer, ".exit") == 0) {
-      close_input_buffer(input_buffer);
-      exit(EXIT_SUCCESS);
-    } else {
-      printf("Unrecognized command '%s'.\n", input_buffer->buffer);
+    if (input_buffer->buffer[0] == '.') {
+      switch (do_meta_command(input_buffer, table)) {
+        case (META_COMMAND_SUCCESS):
+          continue;
+        case (META_COMMAND_UNRECOGNIZED_COMMAND):
+          printf("Unrecognized command '%s'\n", input_buffer->buffer);
+          continue;
+      }
+    }
+
+    Statement statement;
+    switch (prepare_statement(input_buffer, &statement)) {
+      case (PREPARE_SUCCESS):
+        break;
+      case (PREPARE_SYNTAX_ERROR):
+    printf("Syntax error. Could not parse statement.\n");
+    continue;
+      case (PREPARE_UNRECOGNIZED_STATEMENT):
+        printf("Unrecognized keyword at start of '%s'.\n",
+               input_buffer->buffer);
+        continue;
+    }
+
+    switch (execute_statement(&statement, table)) {
+    case (EXECUTE_SUCCESS):
+        printf("Executed.\n");
+        break;
+    case (EXECUTE_TABLE_FULL):
+        printf("Error: Table full.\n");
+        break;
     }
   }
 }
~ ./db
db > insert 1 cstack foo@bar.com
Executed.
db > insert 2 bob bob@example.com
Executed.
db > select
(1, cstack, foo@bar.com)
(2, bob, bob@example.com)
Executed.
db > insert foo bar 1
Syntax error. Could not parse statement.
db > .exit
~

有坑,弃了


  转载请注明: malred-blog a

 上一篇
UdemyMachineLearningDeepLearningAndBayesianLearning UdemyMachineLearningDeepLearningAndBayesianLearning
01 - Introductioninstall something to get start至少需要 python3.6 google drive google colab他使用的编辑器是 vscode 01.04 Jupyter Not
2023-06-10
下一篇 
用go语言实现编译器 用go语言实现编译器
1. 编译器与虚拟机 正如“解释器”和“web 服务器”有多种实现一样,编译器和虚拟机本质上也是某种思想(模式),掌握核心思想,可以更好地理解虚拟机和编译器 1.1 编译器 可以编译的内容不止编程语言,还有正则表达式、数据库查询甚至 HT
  目录